> From inception Two Sigma’s early funds, like Eclipse and Spectrum, focused on trading stocks globally. Eclipse was faster, changing positions within weeks, while Spectrum had a longer-term horizon closer to one month.
> The duo eventually used their algorithms to create programs that operate outside the global stock markets, like the trend-following Compass funds that bet on futures markets. In 2014 another important fund, Horizon, was folded into Spectrum, which had diversified its offering beyond stocks. One of Two Sigma’s least visible funds is its Partners Fund, an internal fund of funds, fueled mostly by capital from the founders.
Most, if not all, of the largest hedge funds are actually partnerships that encompass multiple funds. From an internal perspective this allows each fund to focus on a few core competencies, like trend following vs. market making vs. global macro. Each of these strategies will do well at certain times and poorly at others.
From a firm's perspective this allows for diversification, which is almost always good. From an employee's perspective it allows them to get paid for their work while being insulated a little from the performance of their peers in other funds.
And in some cases, like Steve Cohen's SAC, it allows the good ideas to retroactively be put into the blessed fund while the bad ideas are shuffled off to the lesser feeder funds. Yes, this is illegal; yes, it happens.
From an outsider's perspective it usually means that there is one well-performing fund that is closed to new money while several lesser-performing funds are open to outside money. Even RenTech has funds that are open to outside money, and they don't perform anywhere near as well as the master fund that is for employees only.
> Two Sigma researchers spend time testing existing models, and each researcher is expected to come up with two or three new models per year. These are presented to Overdeck in a white paper that is typically less than ten pages long. Since Two Sigma’s trading models can change its forecast in seconds, lots of back-testing goes into each model. It’s not unlike the way Amazon exhaustively tests various Web-page changes in real time to ensure optimal clicks and purchases. At Two Sigma headquarters the model builders, who need to write code, sit with the engineers and collaborate with them all the time.
From the strategy development side, ideas often have a half-life of anywhere from months in the HFT space to years in the global macro space. I've since come to believe that idea generation is equal parts people (i.e., brain power) and platform (i.e., the ability to iterate).
RenTech is a perfect example of this. Two people who were very high up left and went to another fund, and they had 2-3 years of poor performance away from the huge backtesting platform that RenTech had built. It's not like these PhDs suddenly forgot everything. Once the bar has been met for math and intellect, the ability to iterate quickly on ideas is the key.
As they say, it's not the algorithm you use but the features that produce your alpha. If you want to make money in the markets, focus all your time on feature engineering.
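To make that concrete, here's a minimal sketch of what "focus on feature engineering" tends to mean in practice (the prices and feature choices are invented for illustration):

```python
import pandas as pd

# Toy daily closing prices; in practice these come from your data vendor.
prices = pd.Series(
    [101.2, 102.5, 101.8, 103.1, 104.0, 103.2, 105.5, 106.1],
    name="close",
)

# The alpha lives in features like these, not in whichever model consumes them.
features = pd.DataFrame({
    "ret_1d": prices.pct_change(),                        # one-day return
    "mom_5d": prices.pct_change(5),                       # five-day momentum
    "vol_5d": prices.pct_change().rolling(5).std(),       # realized volatility
    "gap_5d": (prices - prices.rolling(5).mean()) / prices,  # distance from MA
})
print(features.tail())
```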
Any given recipe has a history behind it, an environment (ingredients available, staff expertise, equipment, fashion) in which it was popular, and a future that may or may not be good for it.
There's also all sorts of non-recipe things that matter. How is the restaurant decorated? Is it in the right location, serving clientele that like that kind of thing?
And so on. Like many other businesses, there isn't one central piece of business knowledge that determines success.
I won't spoil it for anyone who wants to listen to it (it's only 3 minutes long), but sometimes your secret sauce is not even what you think it is, but a completely unexpected & undocumented interaction:
It was also motivated by the same discussion and recipe analogy.
To be clear, when you talk about the platform, you mean the research platform used to develop the idea, rather than the execution platform used to apply the idea, right?
What is there to such a platform? What does it actually do? Is it just a question of pumping historical data into a model and measuring its performance?
The major hedge funds build in-house big data platforms that allow their quantitative traders to come up with an idea (what if the price of IBM is 22% predicted by soybean futures in Tokyo on Tuesdays?) and then essentially replay history and see the result.
Say you have the soybean-IBM theory. You feed it into the platform. It applies your proposed strategy across billions or trillions of data points. It probably does a lot more than an SQL query -- possibly checking for interactions with other current strategies in use at the firm, running sensitivity analyses, blasting various parts of it with Monte Carlo simulations and so on. A lot of the same tech is also used to perform forecasts of the coming hour, day, week, month or what have you.
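A toy version of that replay loop, stripped of everything that makes a real platform valuable (the data here is random noise standing in for real feeds, and the position rule is deliberately naive):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: daily IBM returns and a lagged soybean-futures signal.
n_days = 2500
ibm_returns = rng.normal(0.0003, 0.01, n_days)
soybean_signal = rng.normal(0.0, 1.0, n_days)

def backtest(signal, returns):
    """Replay history: hold a position proportional to yesterday's signal."""
    positions = np.sign(signal[:-1])   # decided at the close of day t
    pnl = positions * returns[1:]      # realized over day t+1
    sharpe = pnl.mean() / pnl.std() * np.sqrt(252)
    return pnl.sum(), sharpe

total_return, sharpe = backtest(soybean_signal, ibm_returns)
print(f"simulated return: {total_return:.2%}, Sharpe: {sharpe:.2f}")
```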
It comes back with the simulated return. If it's good, and if you think you can disguise your strategy while trading, then and only then can you code it up and deploy it to the trading platform. In this article it's claimed that the models are reviewed by a company founder as well.
By now, going on Google's publications about MillWheel and Dataflow, the hedge funds will already be advanced enough to provide model testing that can converge to a number quickly enough that you can abort the test if it's not shaping up well, thus saving platform time for other uses.
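A rough sketch of that early-abort idea (the threshold, chunk size, and error band are all assumptions on my part, not anything from the Google papers):

```python
import numpy as np

def sequential_backtest(daily_pnl, threshold=0.5, chunk=250):
    """Consume simulated PnL in chunks and abort once the running
    Sharpe estimate is clearly below `threshold`."""
    seen = np.empty(0)
    for start in range(0, len(daily_pnl), chunk):
        seen = np.concatenate([seen, daily_pnl[start:start + chunk]])
        sharpe = seen.mean() / seen.std() * np.sqrt(252)
        # Crude 2-sigma upper bound on the annualized Sharpe estimate.
        upper = sharpe + 2 * np.sqrt(252 / len(seen))
        if upper < threshold:
            return sharpe, len(seen), "aborted early"
    return sharpe, len(seen), "ran to completion"

pnl = np.random.default_rng(2).normal(-0.0002, 0.01, 5000)
# A clearly money-losing model should get cut off before the full run.
print(sequential_backtest(pnl))
```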
Building your own platforms is a mixed blessing. If you're the first to do it you can get an enormous advantage, because you are the only company with that capability. Later, as technology continues its march of commoditisation, others gain the capability "for free". Now you are at a slowly increasing disadvantage -- again, because you are the only company maintaining that platform.
A more specific description of the problem might help though. There's a huge amount of data in many forms, a huge amount of unpredictable major events, a lot of noise, a lot of hidden data (take hidden orders on exchanges as an example), and any action you take is going to affect the market. If you do something dumb, like place a bad order, nobody would take it in simulation but they would in real life. This makes it extremely difficult to understand why your model performed the way it did, what needs to be fixed, what sort of risk it entails, etc.
Research platforms help answer these questions without quants having to do complex bespoke analysis over and over again.
Focus on gearing and slippage.
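To illustrate why those two things dominate (every number here is made up): a strategy that looks fine gross can flip sign once you charge slippage and account for the cost of leverage:

```python
import numpy as np

# Hypothetical per-trade gross returns from a simulation.
gross = np.array([0.0012, -0.0004, 0.0009, 0.0015, -0.0007, 0.0011])

slippage = 0.0008   # assumed cost per round trip (8 bps)
gearing = 4.0       # leverage multiplier
funding = 0.0001    # assumed per-trade financing cost of the borrowed capital

net = gearing * (gross - slippage) - (gearing - 1) * funding

print(f"gross per trade: {gross.mean():.4%}")
print(f"net per trade:   {net.mean():.4%}")  # a different sign entirely
```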
In an age when all hedge funds have the resources to hire the best and brightest engineers and buy the fastest processing hardware, it seems that none of them will have an edge if they are all starting with the same publicly available data.
You can view a traditional market as a process where information from the real world flows in, and people can make money by predicting how others will react to that information, by reacting slightly faster (HFT), or by understanding the patterns the information reveals (e.g. stat arb).
But it's even better if you can get the information before it hits the markets, hence the search for more and more data sets with predictive value.
This changes the actual market (the exchange) from a place where price changes reflect different participants' views being hashed out (the market is performing information processing mediated by price signals) to one where the actual predictive signals are monopolized by private parties before they hit the market.
But note that this is exactly the path that the large tech platforms' ad exchanges have gone down as well. The price per click/conversion of a user and the outcomes of real-time bidding are hidden from everyone except the particular participants of the transactions; we're moving from a world where public markets (no matter their limitations) reflect the sum of public information in real time, to one where the thinking and computation are moved into hidden platforms.
At some point, of course, some bright person is going to create an exchange/marketplace for data signals, and then the financialization/reification of another abstraction will take place.
I used to be part of a research group that sold the so-called "alternative data" you're describing to 30 or so hedge funds in the NYC area, including several of the largest. The example I like to give is that we knew well ahead of time that Tesla would miss on the Model 3 because we knew every vehicle they were selling by model, year, configuration, date and price with <99% accuracy. I still occasionally sell forecasts like this and the methodology is straightforward enough that even a solo investor can consistently beat the market if they know how to source the data. But I've mostly lost faith in this technique as the sole differentiator of a fund's alpha.
Some funds, like Two Sigma, have large divisions with a very sophisticated pipeline for this kind of analysis. They do exactly what you describe. For the most part it works, but there are several obstacles that keep this from being the holy grail of successful trading:
1. First and foremost, this analysis is fundamentally incomplete. You are not forecasting market movements; you're forecasting singular features of market movements. What I mean is that you aren't predicting the future state of a price: if the price of a security is a vector representing many dimensions of inputs, you're predicting one dimension. As a simple example, if I know precisely how many vehicles Tesla has sold, I still don't know how the market will react to this information, which means I have some nontrivial amount of error to account for.
2. This analysis doesn't generalize well. If I have a bunch of information about the number of cars in Walmart parking lots, the number of vehicles sold by Tesla (with configurations), the number of online orders placed at Chipotle, etc., how should I design a data ingestion and processing pipeline to deal with all of this in a unified way (see the sketch after this list)? In other words, my analysis is dependent upon the kind of data I'm looking at, and I'll be doing a lot of different munging to get what I need. Each new hypothesis will require a lot of manual effort. This is fundamentally antagonistic to classification, automation and risk management.
3. It's slow. Under this paradigm you're coming up with hypotheses and seeking out unique and exclusive data to test those hypotheses. That means you're missing a lot of unknown unknowns and increasing the likelihood of finding things that other funds will also be able to find pretty easily. You are only likely to develop strategies which can have somewhat straightforward and intuitive explanations for their relationship with the data.
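For point 2, the ingestion problem might be pictured like this (dataset shapes and field names are invented): every new alternative dataset needs its own hand-written adapter before it can feed a common pipeline:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Signal:
    """The common shape every dataset must be munged into."""
    ticker: str
    as_of: date
    metric: str
    value: float

def from_parking_lots(row: dict) -> Signal:
    # Satellite counts arrive as {"store_id": ..., "cars": ..., "date": ...}.
    return Signal("WMT", row["date"], "lot_traffic", float(row["cars"]))

def from_vehicle_registrations(row: dict) -> Signal:
    # Registration data arrives as {"model": ..., "msrp": ..., "reg_date": ...}.
    return Signal("TSLA", row["reg_date"], "units_sold", 1.0)

# Every new dataset means another hand-written adapter like the two above,
# which is the manual effort the comment is pointing at.
```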
This is not to say the system doesn't work - it very clearly works. But it's also easy to hit relatively low capacity constraints, and it's imperfect for the reasons I've outlined. You might think exclusive data gives you an edge, but for the most part it does not (except for relatively short horizons). It's actually extremely difficult to have data which no other market participant has, and information diffusion happens very quickly. Ironically, in one of the very few times my colleagues and I had truly exclusive data (Tesla), the market did not react in a way that could be predicted by our analysis.
The most successful quantitative hedge funds focus on the math, because most data has a relatively short half-life for secrecy. They don't rely on the exclusivity of the data, they rely on superior methods for efficiently classifying and processing truly staggering amounts of it. They hire people who are extraordinarily talented at the fundamentals of mathematics and computer science because they mostly don't need or want people to come up with unique hypotheses for new trading strategies. They look to hire people who can scale up their research infrastructure even more, so that hypothesis testing and generation is automated almost entirely.
This is why I've said before that the easiest way to be hired by RenTech, DE Shaw, etc. is to be on the verge of re-discovering and publishing one of their trade secrets. People like Simons never really cared about how unique or informative any particular dataset is. They cared about how many diverse sets of data they could get and how efficiently they could find useful correlations between them. The more seemingly disconnected and inexplicable, the better.
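A cartoon of that "find correlations across diverse datasets" loop (all feeds here are synthetic noise; a real platform would be far more careful about multiple testing, regimes, and look-ahead bias):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)

# Synthetic stand-ins for many unrelated data feeds, aligned by date.
names = ["shipping_rates", "wmt_lot_traffic", "tsla_units",
         "soybean_tokyo", "spx_returns"]
feeds = {name: rng.normal(size=1000) for name in names}

# Brute-force every pair and rank by |correlation|. At scale, most hits
# are noise, which is why the cleanup machinery matters more than the loop.
hits = sorted(
    ((a, b, np.corrcoef(feeds[a], feeds[b])[0, 1])
     for a, b in combinations(names, 2)),
    key=lambda t: -abs(t[2]),
)
for a, b, r in hits[:3]:
    print(f"{a} ~ {b}: r = {r:+.3f}")
```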
Now with all of that said, I would still wholeheartedly recommend this paradigm for anyone with technical ability who wants to beat the market on $10 million or less (as a solo investor). A single creative and competent software engineer can reproduce much of this strategy for equities with only one or two revenue streams. You can pour money into earnings positions for which your forecast predicts an outcome significantly at odds with the analyst consensus. You can also use your data to forecast volatility on a per-equity basis and sell options on those which do not indicate much volatility in the near term. Both of these are competitive for holding times ranging from days to months and, with the exception of some very real risk management complexity, do not require a large investment in research infrastructure.
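As a sketch of the first of those two approaches (all figures hypothetical), the core of the earnings trade is just a divergence test between your forecast and the consensus:

```python
# Hypothetical inputs for one upcoming earnings report.
my_forecast_revenue = 6.1e9   # from your alternative-data model
consensus_revenue = 6.8e9     # published analyst consensus
consensus_std = 0.2e9         # dispersion of analyst estimates

# Trade only when your forecast sits far outside the consensus range.
z = (my_forecast_revenue - consensus_revenue) / consensus_std
if abs(z) > 2.0:
    direction = "short" if z < 0 else "long"
    print(f"divergence of {z:+.1f} sigma -> consider a {direction} earnings position")
else:
    print("forecast within consensus noise -> no trade")
```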
Is the way in which you got that information something you can divulge? I mean, was it talking to an employee or was it something exciting and far-fetched? By the way, I presume you meant ">99%" or something similar.
> A single creative and competent software engineer can reproduce much of this strategy
By "this strategy", do you mean prediction based on a source of "alternative data"?
Interesting comment, in any case.
Also...generally speaking, what does this type of information sell to hedge funds for? For something like the Tesla information for example? I would assume it's probably not millions, but somewhere in the 5-6 figures?
I used to work for D. E. Shaw & Co., now I work in Silicon Valley and invest my money in index funds. Much better that way.
"Our definition of success has become narrow, boring, and limited. If we want young people to be creative and innovative, we need to reward them for it."
from "Skip The Hedge Fund: We Need Young People To Take Risks And Build Inspiring Things" at https://www.fastcompany.com/3026586/skip-the-hedge-fund-we-n....
So you have to get creative, and if you want it to keep working, you can't tell anyone about it. Ever.
The beauty in all this is that thousands of people keep trying... but only a select few, less than 1/100 of 1 percent of market participants, succeed in making reasonable profits.
I find this hard to believe. There are entire fields of research based around AI now and an endless number of ideas that have yet to be tried or implemented. There simply aren't enough people working at hedge funds to cover them all.
2. It's not just my opinion, and
3. I didn't say they're "well ahead" unilaterally.
This isn't unique to finance; industry labs in tech also often have novel results in applied mathematics and computer science that are ahead of academia and other industry labs. You don't have to believe me but it's not exactly a controversial topic. Not everything is published or patented.
But I read your claim as saying that there are broad methods and approaches that they hide. And that, while possible, is more peculiar. Most of the tech industry labs don't keep their theoretical research secret. Practically anything that could be published is.
As for 3, the way you described the "rediscovery" made it sound like those Labs were a number of steps ahead, so I hope you pardon my misunderstanding.
Like I said in the original comment: this isn't (to my knowledge at least) pure mathematics that's being kept secret. But there are absolutely families of techniques and algorithms whose applications to finance are nontrivial, non-incremental and very well guarded.
In other words, techniques that are broadly applicable to the field, or techniques that maybe spawn a family of related techniques, but appear to be useful only in a specific subdomain.
Who are "these guys"? The funds discussed in the article have average annual returns well above 9%.
“The firm’s biggest fund, Spectrum, has earned an annual average return of 9.4% net of fees since 2004.”
The S&P 500 has averaged 9% for 80 years. Joe Blow can buy an index fund and do just as well as the quant clients. Sure, these guys are averaging 14% before fees, and it's a great way to get rich. But after fees the client might as well just passively invest.
I am not even anti-passive funds but you are memeing a bit too hard.
After recently reading A Random Walk Down Wall Street, my conclusions are: you are not going to time the market, and managed funds don't do any better than index funds. 90% of the effort goes into squeezing out an additional 10%, so you may as well spend 10% of the effort and settle for 90%. Of course, fortunes are made on that 10%.
Joe Blow slinging garbage cans can retire a multimillionaire by steady, passive investing.
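The arithmetic behind that claim, assuming the ~9% long-run average quoted above and a made-up contribution schedule:

```python
# $1,000/month into an index fund at 9% nominal, compounded monthly, for 40 years.
monthly, rate, years = 1_000, 0.09 / 12, 40
balance = 0.0
for _ in range(years * 12):
    balance = balance * (1 + rate) + monthly
print(f"${balance:,.0f}")  # roughly $4.7 million, before inflation and taxes
```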
So let's say you are a $10 billion fund. You do your research and find a small $100 million company that doesn't have much debt, is growing rapidly, has management that seems good, and has good tailwinds for its industry as a whole.
You have 100x the company's market capitalization in deployable capital. Even if the stock price doubles or triples, your potential upside for the firm as a whole is a relative drop in the bucket. You can't deploy more than a few million into this company without moving the market. For a large fund, it isn't worth wasting time chasing these small opportunities. In fact, "scaling" is a hard problem for funds.
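The capacity arithmetic, using the numbers from this hypothetical:

```python
fund_aum = 10_000_000_000      # $10B fund
company_mkt_cap = 100_000_000  # $100M small cap
max_position = 0.05 * company_mkt_cap  # say 5% is all you can buy without moving the price

# Even if the stock triples, the gain is tiny relative to the fund.
gain = max_position * 2  # a 3x price means a 2x gain on the position
print(f"position: ${max_position:,.0f}")
print(f"gain on a triple: ${gain:,.0f} = {gain / fund_aum:.3%} of the fund")
```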
Hence, there is actually a lot of opportunity out there for small-scale investors who are willing to look under rocks that the bigger guys don't see enough potential opportunity in.