What kind of architecture are you using for your backtesting framework? I have been collecting data to build something similar: around 2.3TB of tick data spanning roughly 20 years across 75 major instruments (indices, FX, major commodities, etc.), and I am always overwhelmed by the design of systems that process datasets this large.
It's been a work in progress for quite a while. It involved writing many different scrapers and other data-ingestion systems that systematically downloaded everything I could get my hands on. I'm building an API to eventually sell some of the data, so I'd prefer not to name specific sources.
I'm not sure what you mean by architecture? The machines I currently run on are ordinary Macs of the latest generation. (Before Julia came and saved me, I ran on Google Cloud.)
That is way more data than I am using! I'm currently a hobo working off Yahoo Finance data. :) That sounds interesting! Could you contact me at kruxigt at gmail.com and tell me more?