Creating a Financial Data Lake for Academic Fintech Research

Daniel Broby, Huckleberry Hopper

Research output: Working paper


This paper presents the case for a Financial Technology (Fintech) data lake. Fintech is impacting business models, and its concepts require testing. The definition of Fintech is imprecise, but it is characterized by the application of technology to digital financial transformation. The software and programming driving it are evolving and should be evaluated before being introduced into financial markets. Its development affects "client money", which can be risky unless supervised. Fortunately, such experimentation can be done in a controlled way using a regulatory sandbox. This allows Fintech concepts to be checked for reliability and robustness using consenting live accounts (which receive a special regulatory exception). We propose a less risky supplementary approach, namely testing concepts on real but "blinded" financial big data files stored in a data lake. In this way, back-testing, out-of-sample experiments and forward performance checks can be done without the risk of losing money. We investigate how to implement such a data lake for this purpose.
Original language: English
Publisher: University of Strathclyde
Number of pages: 11
Publication status: Published (in print/issue) - 14 Nov 2019


  • Fintech
  • Strategy
  • Business models
  • Innovation
  • Financial Services
  • Disruption
  • Artificial Intelligence
  • Taught finance
  • Fintech Scotland
  • Data lake
  • Data warehouse

