Jane Street Market Prediction ($100k Kaggle competition) (kaggle.com)
357 points by tosh on Nov 24, 2020 | 208 comments



As a frequent Kaggler (perhaps too frequent... it's a bit addictive, in a way I'm sure others on HN will understand), I was fairly intrigued to see this one pop up in the competition list a few days ago. Finance shops have tried their hand at Kaggle before, but I think they've normally been out of their domain, e.g. Two Sigma recently did a reinforcement learning game competition.

I'd caution the HN crowd not to expect production-level quant models out of this, as I'm already seeing some do in the comments. Kagglers are excellent machine learning practitioners and the models that come out of many competitions are top-notch stuff, often making their way into research papers. But this is a short competition on limited data in a non-real-world scenario. The winning models will be very interesting educational exercises and probably wonderful recruiting material for Jane Street, but won't be the underpinnings of a new fund.

That said, I can't wait to see what comes out of this one. It ticks all of my competitive boxes :)


Mathematical analysis of financial markets is more celebrated when applied to relative valuation of different assets, rather than prediction of the market. Black-Scholes, for example, applied calculus with an underlying no-arbitrage assumption to create a thriving market in option pricing, by giving traders a mechanism to reduce risk and thereby reduce bid-offer spreads. Same in fixed income, mortgage, and credit market assets over the years.

The problem with predicting absolute levels is that there is a game-theoretic aspect which undermines any mathematical trading strategy as soon as it is public. Optimal game-theoretic trading strategies don't produce great results, and they are relatively trivial to identify. Instead, strong profits in market long/short macro positions are mostly created by information advantages, which don't really make for interesting Kaggle competitions. For example, big profits in macro trading have historically been consistently achieved by front-running customer orders, by building timing advantages on top of trading infrastructure, by funding research analysts that inspect operations on the ground, by lobbying for regulations that change market directions, and so on.

It's very hard to tell whether a best-performing hedge fund that doesn't have an unfair advantage, and that declares it's only using quantitative strategies, is in fact just a statistical anomaly with a hollow narrative.


This!

>> The problem with predicting absolute levels is that there is a game-theoretic aspect which undermines any mathematical trading strategy as soon as it is public.

I took finance in business school, coming from doing a lot of statistical analysis in a research lab. I hated my finance professors and their pseudoscience. Pricing formulas work great until they don't. The problem is when they don't, they really don't, in a catastrophic way. Read "When Genius Failed." Real traders know this. But some economists and finance professors act like these mathematical models are describing a predictable physical phenomenon.


To clarify, the hedge fund LTCM in "When Genius Failed" collapsed not because it relied on arbitrage 'pricing formulae', but rather because it failed to properly execute arbitrage trades.

LTCM, being overly leveraged, relied on other market participants to maintain short-term price alignment, which meant it was not true arbitrage. Salomon reduced its role as the market-maker maintaining short-term price alignment, which increased short-term price anomalies, and thus increased LTCM's vulnerability. The Asian financial crisis increased the frequency and extent of those pricing anomalies, and the subsequent Russian default crisis did the same. Margin calls were made on LTCM that it couldn't cover, forcing it to close out its positions at very unprofitable points in the trade strategy.

So I don't think "pseudo-science" is a great description for what those B-School profs are teaching. Rather the pricing formulae are just the beginnings of the financial theory you need to run arbitrage strategies, but they are not sufficient. You need to augment them with a broader picture of market dynamics and capital management, just like you'd need to learn about financial law, financial market technology, and a bunch of other stuff to run a successful market-making desk.


'Why aren't they shouting' also has some things to say about LTCM. Broadly in the same vein. Those guys were leveraged to the gills and couldn't take market movements against them.

From what I've read elsewhere, if you had managed to hold an LTCM replica portfolio until expiry, you would have actually made money.


To add to this, unless your model is situated, and can perturb the market, it has no way of knowing what happens when you flex your muscle. I have a friend who did algorithmic trading professionally for a few years, and he said it was amazing to watch the data. Said he could see other bots come along and poke him, trying to look for weaknesses in his algorithm to exploit. I would expect a purely formulaic trader to underperform other traders who can take advantage of others. It's no different from how you have to win a roshambo tournament.


Ok, I'll bite - how, precisely, does one win a roshambo tournament?


I've wanted to start learning about this for a while but I'm really not sure where to start. I have a degree in CS and Math so I'm not a total layman wrt the maths. Do you have any suggestions?



Thanks, I'll give it a shot


Eh, don't bother reading any financial literature that other commenters suggest for this competition. With anonymized input data and even the nature of the problem unclear (is it even short- or long-term prediction? The two are very different beasts), you're not going to use any financial domain knowledge here. Besides, the good stuff is rarely if ever openly published. Just keep in mind two useful unchanging truths about finance here: it's time-series data with a very low signal-to-noise ratio, so overfitting is a major concern.

Start by reading/running notebooks ("kernels") and winners' solutions from this and previous competitions with tabular/time-series data, building models and then grid-searching, stacking, bagging, and boosting 'em up. Here's a somewhat aged but still useful guide to how the pros do it: https://mlwave.com/kaggle-ensembling-guide/
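To make the overfitting warning concrete, here's a minimal sketch of leakage-safe walk-forward validation (synthetic data standing in for the anonymized features; Ridge is just a placeholder model):

  # Walk-forward CV: each fold trains strictly on the past and tests on
  # the future, which is minimum hygiene for low signal-to-noise series.
  import numpy as np
  from sklearn.linear_model import Ridge
  from sklearn.metrics import r2_score
  from sklearn.model_selection import TimeSeriesSplit

  rng = np.random.default_rng(0)
  X = rng.normal(size=(5000, 130))             # stand-in for 130 features
  y = 0.01 * X[:, 0] + rng.normal(size=5000)   # tiny signal, mostly noise

  scores = []
  for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
      model = Ridge(alpha=10.0).fit(X[train_idx], y[train_idx])
      scores.append(r2_score(y[test_idx], model.predict(X[test_idx])))
  print("out-of-sample R^2 per fold:", np.round(scores, 5))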


I have a book called Empirical Asset Pricing by Wayne Ferson. Seems like as good a place to start as any.

If you want something online, check out https://www.quantopian.com/


It looks like I'm a bit too novice for Empirical Asset Pricing. I'll check out the website though, thanks.


Yeah, I haven't read that book myself yet. Borrowed it from work. But it seems like the Real Deal. So you could use it as a stepping stone: see what basic material the author recommends working through beforehand. You can usually find these recommendations in the preface or similar.


What's your goal?


Primarily to learn for the sake of the thrill of learning. I would like to have a better understanding of the analytical methods for understanding markets. I'm interested in the application of game theory from more of a beginners point of view but any mathematical tools to understand the world better are interesting to me. I'm not really focused on making money, just interested in better intuition/understanding of economics.


To have fun exploring and learning.


And also rule the world.


I am not that ambitious, just one or two billion in assets.


Also, prices aren't stationary. For an equity security, you are predicting the price for a company that is compounding capital over time. You can predict the price for the company at one point; the relative valuation of the company (against its peer group, for example) may not change in one year, but that company is investing its capital at X%, so you get price growth.

The reason why relative valuation models are more effective is the same reason why most sports betting models use current odds as an input. Prices contain information but, in my experience, these methods aren't totally effective because they often miss important information about the company itself (big price moves happen because relative valuations are wrong). Value or quality appears to do fundamental work but is often woefully blind (for example, there are proven accounting issues with value strategies...does your average quant understand this? No. Have they ever read a set of accounts? No. They have no hope. None.)

Just imo, I think quant strategies are almost totally worthless beyond liquidity provision (even a strategy like front-running news in FX...humans do this better, and I know people who are still making tons of money doing this). I think there is massive value in that mode of analysis but the people who make the most are always going to be people who know the fundamentals better (I think firms like Marshall Wace that are doing this synthesis will move ahead) because that information is often not in the price at all.


> Also, prices aren't stationary

I'm not sure about the rest of your comment, but this is mathematically the correct reason why we don't predict absolute price levels.

And the reason this is mathematically correct is that, for a non-stationary process, the forecast variance grows without bound as you predict further into the future, which makes the expectation of any statistical model useless: the band around it is ridiculously wide.
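A toy simulation of that point, assuming the simplest non-stationary process, a Gaussian random walk:

  # For a random walk x_t = x_{t-1} + e_t, Var(x_h) = h, so the h-step
  # forecast standard deviation grows like sqrt(h).
  import numpy as np

  paths = np.random.default_rng(42).normal(size=(10000, 250)).cumsum(axis=1)
  for h in (1, 10, 100, 250):
      print(f"h={h:4d}  forecast std ~ {paths[:, h - 1].std():.2f}")
  # prints roughly 1.0, 3.2, 10.0, 15.8 -- any point forecast of the
  # absolute level is swamped by this widening band.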


>The problem with predicting absolute levels is that there is a game-theoretic aspect which undermines any mathematical trading strategy as soon as it is public. Optimal game-theoretic trading strategies don't produce great results, and they are relatively trivial to identify.

This isn't true if you're fast enough. Everybody knows how to do arbitrage, but it's still extremely profitable if you're faster than everyone else. HFTs are consistently more profitable than slow trading firms, earning 40%+ returns; it's just a much more capacity-constrained form of trading, so absolute returns are lower.


Just as an aside: the kind of prediction you are describing as relatively boring is still really useful, because it improves markets as far as the rest of society is concerned.

But yeah, it might not be that useful for the funds themselves.


> relative valuation of different assets, rather than prediction of the market

But isn't prediction an inherent part of valuation?


Relative price prediction is an inherent part of valuation. You predict what the price of, say, a bond is given the discount rate over the life of that bond. You are not predicting the absolute price of the bond, and you are not able to predict whether rates are going to go up or down. That's the appeal of arbitrage: you don't need to see the future, you make money no matter what if you see that a particular asset is 'out of whack' (mis-priced, cheap, expensive) and you buy/sell it (and execute the appropriate arbitrage hedging strategy until maturity of the trade).
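A toy version of the bond example (made-up numbers, simple annual compounding):

  # The bond's fair price follows mechanically from the discount rate;
  # no view on where rates go next is required.
  def bond_price(face, coupon_rate, ytm, years):
      coupon = face * coupon_rate
      pv_coupons = sum(coupon / (1 + ytm) ** t for t in range(1, years + 1))
      return pv_coupons + face / (1 + ytm) ** years

  print(bond_price(100, 0.05, ytm=0.04, years=10))  # ~108.11
  # If the market quotes 105 while the curve implies ~108, the bond is
  # 'out of whack' relative to the rate, whichever way rates move next.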


>But isn't prediction an inherent part of valuation?

Doesn't need to be. You have wealth you have to store somewhere because it is not being spent now. You can store it as equities, debt (including a bank account), paper cash under the mattress, gold bars, BTC, etc.

Let's say you have the list of potential places to store it and can get a price/value ratio for each, but with a common term of "X" in the value:

  - A = 3.1/X
  - B = 2.8/X
  - C = 1.9/X
  - D = 4.2/X
Gotta put it somewhere if you aren't spending it. In the above you'd choose "C": the lowest price-to-value ratio, even though you don't know what the actual value is.

Yes, this is a massive simplification, as it ignores diversification benefits and so on, but those are the basics of the usefulness of valuation when you don't and can't know the value.
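In code the toy above is just an argmin, since the unknown common term X cancels out (purely illustrative):

  ratios = {"A": 3.1, "B": 2.8, "C": 1.9, "D": 4.2}  # each implicitly /X
  print(min(ratios, key=ratios.get))  # 'C', cheapest relative to value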

"prediction of the market" here means "Do you think the S&P500 is going up or down this year?" Warren Buffet completely ignores this, takes no view on it. Timing the market is very hard. Assuming the market will reflect the economy and the economy will grow because people work hard, are smart, we invent new things and population increases, then picking the best of what's available is the normal value investors route, a la Buffet as the most famous practioner.

Burton Malkiel's "A Random Walk Down Wall St." is still, IMHO, the best summary of all the theory and practice of investing - excluding the quant funds like Renaissance. Maybe he has an update covering them? If it exists, it may be worth reading too. Don't know.

No idea if Renaissance times the market or what.

edit: punctuation and the Malkiel recommendation.


This is highly dependent on the model in question. If you look at the parameters of the Black-Merton-Scholes model, there are assumptions embedded in the model that aren't necessarily predictions.


Yes, it's overwhelmingly unlikely that the winning model will actually be a competitive trading strategy.

Kaggle encourages a domain agnostic approach to modeling, in the sense that participants use sophisticated machine learning and statistical methods but typically have no domain expertise in the underlying data. This kind of approach to finance has historically performed poorly. [1]

Good quantitative trading is usually backed by a strong fundamental thesis and an interpretable model, which is obtained by cross-pollinating sophisticated math and statistics with domain expertise in some part of finance. That domain expertise might be in different kinds of assets, liquidity or market microstructure, but it's there.

$100k is cheap for Jane Street. If nothing else they have a new recruiting pipeline of people with demonstrable machine learning skills.

______________

1. I would also say this is a poor way to approach statistical analysis in most domains, and usually leads to spurious or overfit results. But the idea that you can just run a model and find patterns in pricing data is especially attractive and insidious.


> Kaggle encourages a domain agnostic approach to modeling, in the sense that participants use sophisticated machine learning and statistical methods but typically have no domain expertise in the underlying data.

Yes this is accurate and put very well. This is so much the case that if you have a strong background understanding of the field, the ML part can actually be picked up quite quickly or contributed by someone else. There are a few notable users who are both domain and ML experts and they tend to absolutely clean up in their field. I'm thinking of a couple of med students in particular who are formidable in every medical imaging competition.


"have no domain expertise in the underlying data. This kind of approach to finance has historically performed poorly"

I recently read 'The man who solved the market', about Jim Simons and Renaissance Capital. The way the book tells it, looking for patterns without seeking domain expertise (e.g. ignoring fundamental valuation of equities) is exactly what Renaissance did, and it worked out very well.


I can see why someone would characterize RenTech that way but it's not really fair to do so. There is a lot of mythos about how Simons hired computer scientists, mathematicians, signal processing and NLP experts, etc. When Mercer came over from IBM, he definitely contributed a significant amount of analytical expertise that was probably nonexistent in financial trading at the time (with the possible exception of the Ed Thorp diaspora). The astrophysicists RenTech hires every year bring new insights in ways to model and understand vast amounts of data with absurd dimensionality.

But all of this has to be utilized in the context of the data. The reality is that you're not going to develop a sophisticated options trading strategy without a strong understanding of what an option (and more generally, a derivative) is. You can't develop a viable statistical arbitrage strategy just by treating market microstructure as a blackbox signal to be solved with e.g. Fourier analysis. You can certainly find an edge in using fundamentally superior methods of analysis, but you still need to know what that data represents in the context of the market.

Don't be fooled: people working at firms like RenTech have a strong understanding of the underlying finance. It's just that they learned it on the job, because the ethos at these firms is that learning fundamental theory in math and statistics is harder than learning fundamental theory in finance. You don't have to take my word for it though. Read about one of the few strategies of RenTech's which has been publicized: https://www.bloomberg.com/opinion/articles/2014-07-22/senate.... Deutsche and RenTech didn't team up on this strategy (to fantastic success) by treating basket options as some kind of blackbox abstraction devoid of delta, gamma, theta and vega.


That DB/RenTech basket option is a tax-avoidance scheme. It has nothing to do with options pricing and concepts like delta, gamma, etc.


Yeah, yeah. That controversy has been litigated on HN a dozen times already, I'm not going to rehash it. Do you dispute my primary point here? If so, why?

(Also, even if I agree it was purely intended for tax avoidance, I don't understand why you think that would obviate having to understand how the options work intimately well).


I don't really think you can conclude that they worked well, as these companies are market makers. This is all the more evident in that they were profitable 9 out of 10 times. It seems more that the options were 9X strike.


Yup.

https://www.rentec.com/Careers.action?computerProgrammer=tru...

They look for programmers with knowledge of Tax and Risk Management.


How much of their advantage do you think stems from having high-quality proprietary/alternative datasets?


Not much, if at all.

However, RenTech probably has the cleanest data warehouse of any firm. It was revealed that they have PhDs who do nothing but sort data into databases.


We should also remember that Jane Street is primarily an ETF market maker. Their main business isn't betting on prices of stocks or managing a portfolio.

I've only taken a quick look at the data, but the problem doesn't seem to be focused on their core competencies, but instead is much more general.


>I've only taken a quick look at the data, but the problem doesn't seem to be focused on their core competencies, but instead is much more general

How can you tell? All the features are completely anonymized.


They even say in the instructions:

Admittedly, this challenge far oversimplifies the depth of the quantitative problems Jane Streeters work on daily, and Jane Street is happy with the performance of its existing trading model for this particular question.


Can you comment on the specific setup of this kaggle competition? Versus other finance/trading related challenges?


Any model superior to what Jane Street is running is worth vastly more than the prizes they’re offering.

If you prove such a model out, get licensed (SEC, FINRA) and start soliciting to manage assets.

Disclaimer: Not investment advice. Not a lawyer, not your fiduciary.


>Jane Street has spent decades developing their own trading models and machine learning solutions to identify profitable opportunities and quickly decide whether to execute trades. These models help Jane Street trade thousands of financial products each day across 200 trading venues around the world.

>Admittedly, this challenge far oversimplifies the depth of the quantitative problems Jane Streeters work on daily, and Jane Street is happy with the performance of its existing trading model for this particular question. However, there’s nothing like a good puzzle, and this challenge will hopefully serve as a fun introduction to a type of data science problem that a Jane Streeter might tackle on a daily basis.

Sounds like it's just for fun/recruiting rather than trying to crowdsource new strategies -- I'm sure if they were looking to crowdsource strats they'd pay a whole lot more than $40k for first place.


This contest seems like the equivalent of the "inventor's hotline" infomercial. If it identifies one promising new approach that they can iterate on, it has probably paid for itself. It also serves as a good PR and recruiting tool. The prize is probably designed to bring in clever non-professionals. It's a win-win for Jane Street


The inventor's hotline infomercial is a scam where they get you to pay for expensive patent filing, consulting, and marketing packages. They never intend to actually use any of the inventions.


If someone has a good idea, you don't want the idea, you want the person. If you take the idea, at best you'll split the market with the person who had the idea. At worst they'll iterate and you'll get nothing. Far better to find people who have the skills to develop an idea.

Having said that, you also want to find the (vastly more numerous) people who can take someone else's idea and actually implement it.


I sincerely doubt they think they'll get actionable ideas. It seems like a fun recruiting play from a company that takes pride in hiring non-traditional talent.


Why would they need to pay more? The nature of smart people is to undervalue themselves and not negotiate, so as long as smart people keep doing that, other people can take advantage of it.


Because they would hire the person in hopes of new ideas.


True, and I don't think they expect models that are superior to their own, I'd look at this as a hiring / marketing tool. Plus, even if you had a model that from a pure engineering standpoint was able to match Jane Street's approach, the model would not work without the wealth of proprietary (and expensive) data sources that Jane Street is sure to ingest, so you still couldn't just go out and do it yourself without some serious upfront investment first to get the same data. That is assuming all data they use is even available commercially, which I doubt as well. There are probably data sources that only become available to you through personal relationships with the right folks at the right places.


To me, this seems more like a funnel for recruiting


With an engineer phone screen and three on-site interviews, that's 4 hours of engineer time. $150/hr compensation per engineer, so cost is roughly 2x, $300/hr. So $1.2k to run a candidate through the pipeline post-initial-qualification.

To get one candidate and come out superior, the acceptance rate should be 1% (i.e. 99 failures). But if there are 50 leads from the program and you convert a fifth, that's 10 candidates, for a cost per successful recruit of $10k, which means you need a 10% acceptance rate to break even.

Hmm, back of the envelope seems to do all right as a strat. Relatively cheap. I recall the last time we were hiring, we projected cost per hire at $35k with the bulk of that actually being the recruiter referral fee.
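For anyone who wants to poke at the envelope math, a sketch using the numbers above (which are themselves rough guesses):

  # Back-of-envelope recruiting cost, using the rough numbers above.
  hours_per_candidate = 4            # phone screen + three on-sites
  loaded_rate = 300                  # $/hr, ~2x the $150/hr comp
  cost_per_candidate = hours_per_candidate * loaded_rate  # $1,200

  benchmark_cost_per_hire = 35_000   # recruiter-driven pipeline, per above
  breakeven_accept = cost_per_candidate / benchmark_cost_per_hire
  print(f"${cost_per_candidate} per candidate; beats the benchmark at any "
        f"acceptance rate above {breakeven_accept:.1%}")  # ~3.4%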


I think you are significantly underestimating the cost per hour of Jane Street employees.


I have first party information, two years out of date. Do you have contradictory first party information? Yes/no will be sufficient for me to adjust my priors and I will be grateful.


Disclaimer: I have no first hand info.

According to https://news.efinancialcareers.com/ch-en/307393/jane-street-... "Last year, Jane Street's graduate hires straight from college were said to be paid a $200k annual base salary, plus a $100k sign-on bonus, plus a $100k-$150k guaranteed performance bonus."

According to random people in reddit https://www.reddit.com/r/cscareerquestions/comments/69k0ap/d... "somebody said they got an offer from JS for $150k + $50k/yr "performance"-based bonus"

Both may be true.


Am I the only one who thinks "$100k-$150k guaranteed performance bonus" is a weird concept?

I get how maybe "bonus" can count as "not salary" by being paid as lump sum. But "guaranteed" and "performance" going together? What's the deal here, even?!


>But "guaranteed" and "performance" going together? What's the deal here, even?!

After interviewing with a bunch of similar finance shops over the past year, I think whoever reported that info omitted one important thing: the "guaranteed" part of "guaranteed performance bonus" applies to the first year of employment only (and it isn't a secret, every single recruiter and hiring manager made that very explicit and clear).

With this context in mind, it makes perfect sense, because you don't know how things work at the new company, you will have lots of learning to do, you are not sure of how performance bonus is structured or what goes into it, you don't know the details of how to affect it in your favor, etc. With that in mind, it makes sense for those companies to reassure candidates that you have a nice headstart by guaranteeing your performance bonus for the first year, to allow you to focus on learning all of the things relevant to your position without having to stress much about the performance bonus.


Think of this guaranteed bonus as a retention tool (we see some churn after the bonus period). I also have 3 tiers/ratings for individual accomplishment and 3 for group. If you bag top tier for both, you are getting promoted, the bonus is bigger than your base, and you get a serious chunk in stock. It is nearly impossible to get top tier across the board, and in my team of nearly a thousand people, I had three who did it.


Thank you.


The correct metric is not what the employee's salary is, but the opportunity cost of their time. If all your engineers are working on urgent stuff, and each engineer brings in $1-2m of revenue a year, then the cost to your business of taking them off feature work to do recruiting is not $150/hr.


Yes, of course. Reasoning for not using that is as follows: if conservative estimates yield a yes, you don't need to assume more.

I know salary (2 years out of date). I don't know oppo cost.


Good point, fair.


Package for just-graduated SWEs is about $400k.


Thank you.


Such competitions might have two goals in mind: recruiting and signal diversification. The recruiting angle is obvious.

Any alpha that is not fully correlated to existing alpha is worth its weight in gold for an organization with the size, sophistication and complexity of JS. That's part of the reason why efforts such as 2Sigma's Alpha Studio exist: https://alphastudio.com/


That's a general problem with a lot of Kaggle-esque comps. But then, I don't discount the number of unemployed or intellectually curious, very intelligent people out there, even if they're doing the equivalent of pushing down the value of my wage/bargaining power by doing the data-science equivalent of working for reputation/recruitment.

Hell, the thing about us hackers is that I can KNOW it's a dud deal, yet part of me still wants to give it a go, because it's a problem and it's right there!


You don't have to (and certainly won't) beat all of Jane Street. The goal is to beat everyone else on Kaggle. A still difficult but much more achievable task.


Yeah, I’m arguing to not disclose the model. It’s worth far more held close.

If you want to work at Jane Street, go work for Jane Street. If you want to build your own models and run your own shop, the tools exist for you to do that without Jane Street (although there’s probably some amount of value learning the ropes there while they pay you, if that’s your thing).

My comments in thread are primarily around not having someone’s work exploited by sophisticated hedge/prop trading/investment professionals, which I’ve seen happen more than once, and for which you have no recourse.


My point is the winning model of the competition will not be worth anything beyond the prize. It will only be the best amongst other kagglers, almost none of whom are domain experts in finance. It will not stand a chance in "prod".


Assuming it passes some absolute measure(s) of quality. You could be the best on Kaggle and still have a mediocre model.


A backtested model is similar to a great startup idea. There is a huge amount of work to be done before it is worth much.


You are ignoring execution, infrastructure and real-market conditions. The model is just one part of the game.


+1 so much this

Jane Street is a high-frequency firm. In this domain, execution & infrastructure matter as much as, if not more than, your algo secret sauce.

What good are Kagglers' favorite giant boosted, bagged lightgbm/xgboost/neural ensembles, which take seconds to predict, when my FPGA or ASIC running in the same rack as the exchange's matching engine can make a million trades in the meantime with a much simpler strategy involving nothing beyond freshman math? (Take a cue from XTX Markets' firm name, for example.)

And on longer timeframes, the edge is more often in the data than in the training algo.
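(X'X is the matrix in the least-squares normal equations, which I take to be the nod in the firm name. A sketch of that freshman-math predictor, purely illustrative:)

  # Ordinary least squares via the normal equations: beta = (X'X)^-1 X'y.
  # Fitting and prediction are a handful of matrix ops -- microseconds,
  # not seconds. Synthetic data only.
  import numpy as np

  rng = np.random.default_rng(1)
  X = rng.normal(size=(100000, 10))             # cheap, fast features
  y = X @ rng.normal(size=10) + rng.normal(size=100000)

  beta = np.linalg.solve(X.T @ X, X.T @ y)      # the 'XTX' step
  y_hat = X @ beta                              # one dot product to predict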


Quantopian tried this. Didn't work out. Now they have been acquihired by Robinhood.


Talk about the blind leading the blind...


Robinhood knows exactly what it's doing. They are democratizing options trading, which allows market makers like Citadel and G1X to collect premiums on a larger universe of assets with high volatility and large spreads.

A portion of the profits these market makers collect are rebated to RH, which, alongside cash sweep and RH Gold, comprise the bulk of their revenue.

RH makes money by keeping trade volumes and user engagement high. They don't care if you make or lose money on their platform, as long as the trades keep happening.


Part of what makes trading hard, and especially quantitative trading, is the necessary infrastructure: not just the obvious stuff like execution, but also the infra to manage backtesting, risk sizing and management, etc. Big firms offer this plus lots of data, and allow researchers to focus on the small parts where they can become subject-domain experts.


'Admittedly, this challenge far oversimplifies the depth of the quantitative problems Jane Streeters work on daily, and Jane Street is happy with the performance of its existing trading model for this particular question'


Why do you think this competition will result in a model superior to what Jane Street is running?


The winning model will likely outperform anything that Jane Street could come up with on this 130-feature set. With 3000+ competitors, the top 10 will likely be superior to what Jane Street can do in-house. Then an ensemble of the top solutions will be the best possible model anyone can come up with.


On a frozen feature set and evaluation dataset, sure. 3000+ competitors vs a few employees at JS who prepped this data: in light of very noisy data, the chances are not really on JS's side here. But whether the best model would still be as useful on novel out-of-sample data is very questionable.

Market prediction is a low signal-to-noise problem; what's much more likely is that the contest will be an exercise in overfitting and the final shake-up will be yuuuge.


I'm just fairly skeptical of this comment.

I understand that there are more models being thrown at the solution space, but if you think about the people at Jane Street: while they don't have 3000 employees working on these, they've got a few hundred, and if you think about the _time_ they spend on their models (the real ones, not this simplified Kaggle-competition stuff), I think there's a good chance they'll have a significant advantage in man-hours (it's a full-time job).

Also, while there is a high standard among people doing Kaggle competitions, I think you can easily discard >90% of them as not being super competitive; then your 3000 models become 300, and those 300 have less than 1/3 of the time spent on them that Jane Streeters are spending on their models.

(I realise it's not an apples-to-apples comparison, as the models they're working on are different since this is a toy example, but you said "will outperform anything that Jane Street _could_ come up with".)


By a similar argument to https://danluu.com/sounds-easy/ , no one will beat Jane Street in a weekend.

Jane Street's hiring standards exceed FAANG's.

This is a hiring/branding strategy. Good luck to them.


> Jane Street's hiring standards exceed FAANG's.

As someone who has interviewed thrice and passed once, I disagree. Jane Street is extremely good at marketing themselves; before I ever worked there I received over $500 worth of "swag" (including a free iPad) just for attending talks and hiring events. The official recruiting strategy is grassroots, with recruiters having anonymous reddit/HN/Blind accounts to praise the company semi-anonymously. So you have a bunch of accounts posing as real employees or students talking up the company and the prestige of working there.

They sell exclusivity and mystery, but they definitely pay top-of-market salaries. I don't personally believe the actual interviews are any harder than at top FAANG companies, but they did seem to recruit exclusively from FAANG and/or top engineering schools.


Context: I worked at amazon and my buddy is at Jane street and used to practice some interview questions on me ages back (I think he mostly wanted to calibrate the questions on someone he knew fairly well, to know what kind of stuff he could expect).

I'd say the standard is arguably a lot higher, the things being tested are fairly different, more focussed on raw reasoning rather than specific techniques.

The main thing I'd say is that in general the standard of interviewing for FAANG isn't actually as high as people think it is.


Well the interview bars among FAANG aren't consistent with each other either. I would say Jane Street interviews are on par or perhaps easier compared to Google and Facebook, but it likely depends on your personal proficiencies.


Jane Street's interview was one of the funnest I've ever done. I failed but the questions were great!


What was so fun about it?


Not the parent comment you are replying to, and I cannot speak for full-time, but when I interviewed for an internship position with JS about 5-6 years ago, the questions were indeed fun.

Problems required knowledge of basic stats, general problem solving, critical thinking/reasoning, and all of it was wrapped in a nice package. Mind you, when I say "general problem solving", I don't mean brain teasers like "how many piano tuners there are in Manhattan" (I don't like that kind of problems at all).

I don't remember the details, but one problem was about estimating the optimal move in a poker game given a specific situation at the table. The rules were explained in a very relevant and simple way for those not familiar with poker, which I wasn't at the time, and I didn't find that my lack of poker knowledge affected my ability to solve that problem at all (which is already an impressive feat; major credit to those who wrote that specific problem that way).

Zero memory of the rest of the problems, but most of them were relying on just basic stats and reasoning skills/critical thinking, no bs, and they were written/designed in such a way that it was interesting. The kind of a problem you would randomly see online in the middle of some discussion and won't be able to resist the temptation to stay glued to your computer screen until you solve it.


I actually give it 48 hours before the top 3 equal what Jane Street can do in-house on this exact same dataset. A week before the reasonable plateau is reached, and a month or so before the absolute most information is squeezed out.


Yeah, I meant that Jane Street aren't doing this to find a solution that'll improve their own tech.

I expect in a constrained environment, someone on Kaggle will beat Jane Street's solution.


> no one will beat Jane Street in a weekend

They have high hiring standards, but I suspect their hundreds of smart engineers can't compete with the rest of the world.


Most of the engineers at Jane Street aren't touching anything similar to this challenge so you'd be right.


Isn't it pretty well known in the finance world that using stale public information to predict the market is a fool's errand?

Unless you have some kind of specialized non-public data (e.g. satellite images showing the number of cars parked outside shopping malls, or the number of cargo ships moving in and out of port), trying to predict the market with historical data does worse than "Just give me some monkeys, darts and a dart board".


Using purely historical price data it is harrowingly difficult. There are 130 anonymized features, so that's unlikely to be only price data. It could include information on the order book, correlated assets, fundamentals, vectorized/embedded text, etc.

Besides, I bet you can train monkeys to do (slightly) better than blindfolded random throwing. Even with public data (replace satellite images with YouTube mentions, or the number of links pointing to a company website) it is very possible to do better than average guessing on quite a lot of assets (especially smaller and newer markets).

Most hedge funds, even with specialized expensive non-public data, are not magical unicorns. Their quants really may just run a gradient boosting machine and leave it at that. Some hedge funds even prefer linear methods, because this lowers risk through lower variance. Such models can be beaten by experienced Kagglers for sure. For one, I did.


One thing we need to be clear about is that you're not aiming to be better than average. You're aiming to make a profit. There are probably hundreds of thousands of day traders; there are probably <100 market makers and trading firms (far fewer than that for some specific products), and you'll probably find 99% of the day traders aren't making systematic profits. There are lots of strategies that are much better than average and still worse than putting your cash in a bank.


You can aim for both. If you just aim for profit, then you can get lucky with just average, or even random, betting. If you find a weighted coinflip (which is not impossible), then provided you can flip that coin enough times, you will see steady systematic profits. Of course, the majority of day traders are getting owned by the big players, and they would do better doing more reasoned and long-term investments. Most day traders are not even using predictive models, though.
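The coinflip point is easy to check by simulation:

  # A 52/48 coin flipped 10,000 times per 'trader': nearly everyone
  # with the edge ends up profitable, by the law of large numbers.
  import numpy as np

  wins = np.random.default_rng(7).random((1000, 10000)) < 0.52
  pnl = np.where(wins, 1, -1).sum(axis=1)       # +1 per win, -1 per loss
  print(f"{(pnl > 0).mean():.1%} of simulated traders finish up")  # ~100%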


On that point - it's pretty clear that this Kaggle competition is highly likely to result in a decent number of submissions that make more money through luck than others make through strategy.


None of that information is non-public (you can find cargo ship data online for free), none of it is particularly valuable (you are looking for information, and it is hard to know how much information is in cargo ship movement... it depends), and most non-quant hedge funds have been doing stuff like this for decades (i.e. hiring people to stand outside a retailer's stores and count customers)... most of this stuff is less useful than people think (again, you need information: data with intent).

Also, most of this stuff isn't in the price. Lots of people are collecting new data; it is definitely becoming more widespread, but the actual synthesis is tricky (most quants do not understand fundamentals, and most fundamental analysts don't understand data... most firms are swirling in a perfect storm of ignorance).


There's actually still money to be made in small-scale strategies. Sophisticated funds are running billions. They can't focus on strategies that only work at $100-500k. This is where big returns can be made. Even Warren Buffett has said that if he were only managing $1 million, he would get 100% a year returns.


That doesn't make sense unless there are very few viable small scale strategies, at which point they'd probably be difficult to identify. Your assertion might have been true before computers were able to help someone manage many strategies simultaneously.


No, it makes a ton of sense. $100k is too small to have a researcher focus on full time. His compensation is probably $500k or more. Then add in fees, infra, the cost of regulations, etc., and small-time strategies aren't developed. Making money in the market is completely overrated from a difficulty perspective; the hard part is managing billions.


You're saying it'd require a salary of $500k or more to find someone competent enough to manage an investment opportunity worth $100k annually? That implies the investment opportunities are actually quite complicated. Obviously, opportunities exist for investment gains at all magnitudes. The question was whether they're easy enough for the average Jolanda. If Jolanda would require $500k annual salary to do the work, then, that easy opportunity doesn't really exist at all.

Alternately, if Jolanda doesn't need to make this opportunity her full-time job, then the investment company can pay her $500k to manage many of these small opportunities. After all, there must be many opportunities. Otherwise only 1 of the HN readers would be able to exploit it. In which case, again, it doesn't really exist at all.


Honestly, you seem like you're trying to reason this out with zero experience in the field. You can't just "run statistics to identify strategies". I'm not going to go into it further because you know nothing.


I'm not sure how you came to either of those assumptions. First, I'm an economist. Second, I didn't suggest any method for identifying strategies. The question is the number of potential strategies.

If it's relatively easy to make money investing on a small scale, that means there are many potential strategies that don't require someone's full attention. If there are many potential strategies that don't require full attention, then someone skilled can go after many of those potential strategies and transform the small-scale into a large-scale opportunity. Since we've already established that these large-scale opportunities are hard ...


15% return on $5 billion is much higher than 100% return on $1 million. It's economics. Secondly, economics has nothing to do with investing, and is worthless. The sum of managing 1000 small strategies is way more work than the number you would manage with billions.

And yes, there is easy money lying around if you know what you're doing. There are dollar bills lying on the ground for smaller-scale strats.


I don't see why those particular magnitude calculations are relevant to this analysis.

> economics has nothing to do with investing, and is worthless

Is it worthless because it has nothing to do with investing, or has nothing to do with investing because it's worthless? Further, why do you think the Federal Reserve hires so many economists if economics is worthless?

> And yes, there is easy money lying around if you know what you're doing. There are dollar bills lying on the ground for smaller scale strats. People just aren't talking about them

Isn't that like saying, "Winning chess tournaments is easy if you know what you're doing"? If something requires expertise, then it isn't easy. If finding these smaller-scale strategies requires expertise, then the expert probably has better things to do with their time. I suppose that applies to arguing with idiots on Hacker News, too.


I used to work on a mixed team of economics PhDs from Yale/Princeton/Harvard and PhD mathematicians. I know what economics is. All the best hedge funds are run by either math people or ruthless businessmen who cheat/activist-invest/have non-public information.

Also, the Fed is in the process of destroying the US dollar and hyper-inflating all the assets. Everyone but the capitalists is getting poorer.

The "Chief Economists" of many major US banks dont even hold economics degrees. People arent rational actors that you can plug into a math equation.

"If finding these smaller scale strategies requires expertise, then the expert probably has better things to do with their time. I suppose that holds to arguing with idiots on Hacker News."

There's a lot of friction in getting data to analyze, developing a strategy, executing the strategy, paying fees, managing risk, hedging. If you know how to do all that, you are probably already employed in that capacity or you run your own fund. But yes, if you're smart you can find an edge. There are a ton of small shops in NYC with $5 million in capital where everyone makes $600k-1M a year.


> friction ... if you're smart you can find an edge

> easy money lying around

There's a bit of dissonance between the way you've characterized this process in different comments. That's my point.

"Hey, there's a dollar lying on the ground!"

"Can't be. Someone would have picked it up."

> destroying the US dollar and hyper inflating all the assets

Sounds like a good strategy to bring manufacturing back to the US. They've said the new policy is high employment.


"Hey, there's a dollar lying on the ground!"

Found the efficient market hypothesis believer, classic failing of economic theory.

Theres no alpha in the stock market!!!11!! If there was the completely efficient allocation of human capital, access of information, opportunity cost would strip it out so fast!!!


Yep, indeed I am. The same way I think casinos don't mind if you try to count cards at blackjack.


He said it pretty clearly. It's easy money if you look at it and work at it. The issue is most people don't have enough capital to try, or don't have the balls to follow through. I've had many people try my system, and they can't stomach the hard times.


Isn't that a contradiction in terms, "It's easy if you work at it."?


It's not easy to be LeBron James even if you work at it. Also, maybe it's just having a hidden talent you didn't know you had, or a biological advantage.


No, not really. You can use public information to give you an edge. And I say this as a person who trades and develops models at a very successful market-making firm.

Alternative data no one else can get easily certainly has tremendous value though.

Of course, predicting one or two seconds into the future (my primary concern) is easier than days or years, so there's that.


Granted, a typical Kaggle metagame-that-is-technically-against-the-rules is to use data from outside the dataset, which is one of the reasons winners have to be validated.


From: kaggle.com/c/jane-street-market-prediction/overview/code-requirements

"Freely & publicly available external data is allowed, including pre-trained models"


Generally this is allowed if you publish the data you're bringing in. They even create a sponsored thread for it in most competitions.


>Unless you have some kind of specialized non-public data (e.g. satellite images showing the number of cars parked outside shopping malls, or the number of cargo ships moving in and out of port)

Planet Labs will sell you all of that data, in case people reading along here are curious.


I'm sure they've got more/better data in production. There seems to be some arbitrage that can be shaved off the edges for players with innovative enough strategies and good and timely enough data.


That’s not necessarily true.


I'm sorry, but if you could build a model to predict markets, why would you post it to Kaggle for a $40k prize instead of applying the model to your own brokerage account?


It is much harder to turn a model into a profitable trading strategy than people realize. Apart from transaction costs, risk management, and market impact, there are also a lot of small operational details which can make or break your execution. One example I vaguely recall was that the details of how a specific foreign exchange conducted its closing auction could make a substantial difference to a strategy that involved executing there alongside other trading venues.

The payoff for getting these operational details right or wrong is massively asymmetrical. If you get everything right, you'll only do as well as your model lets you. But if you get anything wrong, you run a real chance of losing far more money than you could have hoped to make!

Even just validating your strategy on historical data (i.e. backtesting) is harder than it sounds. If you make a mistake that leaks information to the code you're testing, you can end up with a much rosier return and risk profile than you really have. Another way to lose money when you put your model into action.
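A classic instance of that kind of leak, sketched on pure noise (toy data; the 'strategy' is just the sign of a rolling mean):

  # The 'signal' below is a rolling mean that includes the same bar it
  # is supposed to predict -- a lookahead bug. On pure noise it still
  # backtests as spuriously profitable; the shifted version does not.
  import numpy as np
  import pandas as pd

  r = pd.Series(np.random.default_rng(3).normal(0, 0.01, 2500))

  leaky = np.sign(r.rolling(5).mean()) * r             # sees today's return
  honest = np.sign(r.rolling(5).mean().shift(1)) * r   # known yesterday

  for name, s in (("leaky", leaky), ("honest", honest)):
      print(name, round(s.mean() / s.std() * 252 ** 0.5, 2))  # ann. Sharpe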

If you get over these challenges and run your strategy successfully for a while, other market participants are going to start adjusting against it and you have to adjust in turn. You can't just "set and forget".

I should note that I am far from an expert on any of this, though! I just know enough to not trade with serious money—my real savings are all in index funds I don't touch, thank you very much :).


I believe what you are referring to is the fix. Foreign exchange markets, as far as I am aware, do not have closing auctions.

I have heard of some quants trading foreign exchange markets, agreeing to trade at the fix with their counterparty, and not realising that traders often manipulate the fix, resulting in the quant's strategy appearing not to work. It is almost comical (I worked in finance, though not in FX; everyone knew this was going on for decades before the SEC started fining people) that someone who managed money was making this error.

You are 100% correct about all the other stuff. Lots of issues with "production"... that is why financial firms employ traders/risk people/etc. Most people who trade themselves tend to go for lower-frequency strategies that they can implement personally. I actually don't think there are huge barriers; smaller investors have a huge advantage (when you trade at scale, the market moves against you), but you have to work with what you have and realise that you will get crushed if you try to replicate what someone with more money is doing.

Also, data. Data is expensive, and a huge fixed cost.


I was half-remembering some story I heard a while ago about the work needed to arbitrage between some US ETF and some securities on a Brazilian exchange, or something to that effect. I don't remember the details, and I'm not even sure that specific example was real, but it stuck out as a great illustration of the complexities involved in executing a strategy vs just coming up with a model.

Nothing foreign-exchange-specific there although, now that you mention it, dealing with different currencies is another problem you can run into with strategies.


A “foreign exchange”, not a “foreign exchange market”.


Ah, same principle. I have heard of many similar stories.


Data quality is another real-world problem. There can be typos, deliberate biases, omissions that need to be guesstimated, sudden structural changes (e.g. a stock split event or a change in reporting cycles) etc.

There's also heterogeneous data sources to aggregate and consolidate, each with their own way of measuring things. e.g. You can see how different states and countries are tracking Covid related stats, they all have their own metrics and interpretations. Some even change the way they report overnight. Companies will similarly report their data in different ways.

Data cleansing is its own science and art for this reason, quite separately from developing any algos on it. It's a practical problem that's easy to overlook when you're just looking at ML transformations from input to output data sets.
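One mundane example of those structural changes, just to make it concrete (hypothetical numbers):

  # A 2:1 stock split makes raw prices gap down ~50% overnight; unless
  # you apply the split factor, a model 'learns' a fictitious crash.
  import pandas as pd

  px = pd.DataFrame({"close": [100.0, 102.0, 51.5, 52.0]},
                    index=pd.date_range("2020-11-02", periods=4))
  split_factor = pd.Series([1.0, 1.0, 2.0, 2.0], index=px.index)
  px["adj_close"] = px["close"] * split_factor  # continuous again: ~103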


This is basically why I’ve never seriously considered doing it myself. I had a neat idea in about 2005 which I tested on historical data, and it beat buy-and-forget on every share I tried except for Google, and it only had one free parameter.

But, even if I’d implemented it perfectly, and even if the algorithm has survived the financial crash, it would’ve only worked if I could trade for free, and other people copying the algorithm would probably have made it stop working.


Consider: you have some extra money; what do you do with it? Do you put it (or keep it) in a bank account? That is an investment choice you have. If you have more money than you need for living, then you are already investing it in some way, maybe in a bank account. And you make this investment decision following some algorithm in your head. So if your algorithm was so great, wouldn't it make sense to use it rather than put money into a low-interest bank account?

My point is when it comes to investing, inaction IS action too. Therefore an investment algorithm does not need to beat the market. It just needs to beat doing nothing.

Consider that if you have earnings you can put money into an IRA account and then trade with it almost for free.


> So if your algorithm was so great, wouldn't it make sense to use it rather than put money into a low-interest bank account?

It wasn’t great outside of the spreadsheet.

I forget the exact numbers, but imagine: making 0.1% per transaction sounds amazing until you find out the transaction fee is 1%.

> Consider that if you have earnings you can put money into an IRA account and then trade with it almost for free.

This was the UK in 2005 — no IRAs [0], and I received 5% on my current account around then.

[0] Actually it’s worse than that — if you went into a UK bank in the early 2000s and said “IRA”, I’d expect the armed response unit to be called.


But the transaction fee is not 1% if you are a professional. It’s now higher, but there was a time you could easily get closer to 0.05% with some effort and nontrivial but not huge volume.

IIRC, last I checked, BitMEX cost 0.075% to take liquidity, and paid 0.025% to provide liquidity (which, it should be noted, is more than a tick - so market making on BitMEX is free money in an unpredictable market - which has made it technically expensive).

INET had a similar incentive structure before they were bought by Nasdaq, as did BATS when it started - it’s a common way to jumpstart liquidity in a new exchange.

Robinhood and IB will let you trade for free today (with other hidden and hard to quantify costs related to their order flow transactions instead)

So there's no general solution - the details keep changing - but there's likely a way to make a nice profit of 0.1%.
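Plugging in the numbers above (as remembered; just arithmetic):

  # With a maker rebate, fee structure alone changes the sign of a trade.
  taker_fee = 0.00075      # 0.075% to take liquidity
  maker_rebate = 0.00025   # 0.025% paid for providing it
  gross_edge = 0.001       # a hypothetical 0.1% per-trade edge

  print(f"net if taking:  {gross_edge - taker_fee:+.4%}")     # +0.0250%
  print(f"net if making:  {gross_edge + maker_rebate:+.4%}")  # +0.1250%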


>if you could build a model to predict markets, why would you post it to Kaggle for a $40k prize instead of applying the model to your own brokerage account?

Because things aren't that simple. I find this argument very similar to that of devs who complain, "I wrote this piece of code that made my company $3 mil in revenue over the past year, but I only got paid a fraction of that; I am getting ripped off, oh woe, poor me." If you could do that on your own, you would have made that much money already.

Turns out, other people at the company are actually doing tons of work to make it possible to make that much revenue off your code. Same applies here. It isn't just as simple as having one good model at a single point in time to be able to make tons of money off it, there are a lot of other people doing their own work at those finance shops that make it possible for those models to bring in tons of money.


I'd be surprised if the data Jane Street provides isn't some form of high-frequency tick data. It's relatively easy to make accurate short-term predictions with such data; the challenge is being fast enough that behemoths like Jump and Citadel don't get all the good trades before you, leaving you with just the bad predictions. This requires a huge investment in infrastructure and connectivity, beyond the reach of individuals who aren't already quite wealthy.


This competition doesn't predict markets. It's a manufactured game made to resemble things JS does. And even if it were an accurate representation, the competition is run on 130 unknown features that you would have to discover for yourself. And trust me, >90% of the work is identifying the features.


Well, because quant trading isn't about "import xgboost"; you need sustainable infra to handle API failovers, bad data... not even going to mention risk management, which is 50% of what quant trading is about. The data provided is anonymized, but it would probably be a mix of lagging measurements (moving averages, RSI...) and maybe some flow data... Quant trading isn't really about finding "secret stuff"; most profitable strats you can deploy can be based on stat-arb, basis trading, or even just delta-neutral funding farming and such.


Mostly because it’s impossible to accurately predict the market - and this is just a competition to see who can build the best model.


HFT firms aren't trying to predict "the market" as a whole - just small eddies of it. A typical example of this is arbing names at the bottom of index fund rebalances. Speed is important mostly to make sure someone else doesn't hit the arb first.


It's a good question. The basic answer is that not everyone has the capital and risk tolerance, but they may have the time and intellect.


This would be much more interesting if the features weren't anonymized.

At this point it's just a contest of who can fit the widest/deepest neural net to some unknown bunch of features.


100% this. Anonymised features remove any hint of the lateral thinking that is the essence of science, and just focus on the pure numbers. This is dangerous. One of my data scientists once ran up a $2000 bill training a CNN on what was essentially a speed = distance/time calculation, and was really proud when he got 92% accuracy. A real facepalm moment.
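For what it's worth, the facepalm is easy to reproduce. A toy sketch with made-up data: knowing the formula makes prediction exact and free, while a generic learner fit to the raw inputs can only approximate the ratio:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    distance = rng.uniform(1, 100, 1_000)
    time = rng.uniform(1, 10, 1_000)
    speed = distance / time  # the domain knowledge: an exact, zero-cost "model"

    # A generic linear model on the raw inputs misses the nonlinearity:
    X = np.column_stack([distance, time])
    r2 = LinearRegression().fit(X, speed).score(X, speed)
    print(r2)  # noticeably below 1.0; a $2000 CNN buys back most, but not all, of the gap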


I was just about to post the same thing. I'm a statistician in my day job and the raw math is only one part of building a model. Human judgement (while often flawed) can be key to improving actual model performance.


Is this just a hiring competition under a different name? If you can beat their model, you obviously deserve a $100k signing bonus and a very generous compensation package.


Doing well on Kaggle is basically a resume builder and would probably get you into jobs you wouldn't otherwise get. Not much different from Topcoder in that sense. Some people can make a living just winning competitions.


You don't beat their model, just those of everyone else in the competition. A very different game.


You can't beat their model because they don't include all the data they use in the competition dataset. It's a highly sanitized and simplified toy problem used for hiring and marketing. They don't even tell you what the features are, so it's impossible to use domain expertise to constrain the fitting problem (a critical component of building profitable trading models, given the signal-to-noise ratio in the data).
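To make "constrain the fitting problem" concrete: with named features, gradient-boosting libraries let you assert a known directional relationship, which you have no basis for doing on 130 anonymous columns. A sketch, with the monotone directions invented purely for illustration:

    import xgboost as xgb

    # With named features you could assert, e.g., "the prediction must be
    # non-decreasing in column 0 and non-increasing in column 1".
    model = xgb.XGBRegressor(
        n_estimators=300,
        monotone_constraints="(1,-1)",  # +1 increasing, -1 decreasing, one entry per feature
    )
    # model.fit(X, y)  # X must have exactly 2 columns matching the constraint tuple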


So, if I parsed the training data correctly, one's output algorithm is completely agnostic to any actual market conditions. It's merely learned on the anonymized feature set of 130 variables, making it qualitatively no different than any other abstract ML forecasting problem. There are no considerations of market microstructure, news-driven events, leverage, etc.?


In their "code requirements" section they say:

> Freely & publicly available external data is allowed

So I presume it would be fair to fetch and leverage additional data on your own.


Jane Street is big in the OCaml and CompSci worlds, but the only guy I know who worked there did ETF arbitrage/redemption/creation, which has to be one of the most boring businesses on the street. Is it worth working there (aside from the salaries)?


It depends what you're looking for.

Best-in-class engineering and internet-scale problems? Nope, you aren't going to find that at any hedge fund. They are much more like small start-up cultures: speed and results are favored over a mature engineering culture and maintainable code.

Want to have the potential to make a large direct impact and make a crap load of money? Well then, a hedge fund may be a good fit.


This isn't really accurate of Jane Street. As the other commenter mentioned, it's not a hedge fund. Secondly, the engineering culture at JS is typically the opposite of what you described: a lot of time is spent making sure all code written is high quality, and things certainly move more slowly than at a company like Facebook.

> Best in class engineering and internet scale problems? Nope

This is mostly true apart from a few specific teams and projects. I think most passionate engineers would find the work uninspiring.


Sorry, I should have said "trading firms", but they are pretty similar culture-wise. Most hedge funds have internal funds as well and run them similarly to a prop trading firm.

I think in general, though, a small company (JS has 900 employees, so I'm guessing around 50-100 devs) simply can't hyper-optimize its entire engineering stack to the same extent a large FAANG can. It's far too wasteful. And I'm talking about tooling, infra, and overall process, not just the code. Code review is the minimum any competent engineering org should be doing.


There are significantly more devs than that (and the total is now well over 900; that figure is from 2018).

It depends what you mean by "optimizing their engineering stack". They certainly do put a lot of effort into tooling, by necessity since historically there hasn't been much available for OCaml. For an example of stuff going beyond the open source work to make OCaml usable for large projects, see https://blog.janestreet.com/putting-the-i-back-in-ide-toward...

Obviously there is a lot of infrastructure that you need at a FAANG and not at a company with a few hundred devs. But any trading company that doesn't want to pull a Knight Capital needs to make sure their software is correct and reliable (probably to a greater extent than most of a FAANG).


Jane Street is not a hedge fund and they're pretty well-known for having a huge emphasis on high quality/maintainable code: https://www.youtube.com/watch?v=MUqvXHEjmus


There's a spectrum of roles. The one you describe is that of a trader at the extreme end (little coding ability required, manual monitoring of strategies/opportunities). But there are many traders who spend the majority of their time doing data analysis and programming while monitoring mostly automated strategies out of the corner of their eye. There are researchers who focus purely on statistics and ML projects; some even get to spend a good portion of their time reading papers, expanding their knowledge, and doing basic research, not just applying existing knowledge to financial datasets. There are also devs: some work on ultra-low-latency systems (though this is not Jane Street's specialty), and some work on Jane Street's OCaml compiler.

Apart from the OCaml compiler, everything else is fairly typical of the spectrum of roles you can find at the very large high frequency firms. And mid-sized firms are similar yet again, minus the basic research. I would say it is definitely worth working in this industry if anything above sounds interesting to you.


>First place: $40,000

I'm not familiar with this area so I'm probably missing something obvious.

If you have a model that outperforms the market, why on Earth would you give it away for 40k rather than use it yourself and make millions?


Jane Street doesn’t tell you where their features come from. Some can be from data sources that cost millions per year. And you would still need to figure out what transformations they did to generate the features in the first place.


If I build a model that actually works well, I'm using it to get rich, patent it, and sell it to the company for a lot more than $100k.


Maybe, or maybe not. You may need to build up your trading infrastructure first, which entails, among other things, low-latency connectivity to different venues, negotiating good deals with brokers to get low trading fees, etc. If it were that simple, all the quants would be working for themselves. Trading is not just about having good predictions. Also, if you publish/share your algorithm, people will copy it and it will lose its edge.

If you are really good, so that where others can only predict 0.1s into the future you can predict 5s out with the same accuracy, then sure, you could trade from home over the Internet; and if you are much more accurate than others, especially when the market moves a lot, you don't need low fees to be competitive.


"Also if you publish/share your algorithm, people will copy it and it will lose its edge."

That's why I list patenting it after getting rich.

The quants don't work for themselves because they're number crunchers and need the financial knowledge that the trading/portfolio managers have. Either way, the main reasons they don't work for themselves are risk and access to capital.


IDK who would still buy a patent for a trading algorithm, knowing that it's publicly available and probably not so competitive anymore.


Example: see USAA licensing Vanguard's tax-savings technique.


> Non-Exclusive: You hereby grant and will grant to Competition Sponsor and its designees a worldwide, non-exclusive, sub-licensable, transferable, fully paid-up, royalty-free, perpetual, irrevocable right to use, reproduce, distribute, create derivative works of, publicly perform, publicly display, digitally perform, make, have made, sell, offer for sale and import your winning Submission and the source code used to generate the Submission, in any media now known or developed in the future, for any purpose whatsoever, commercial or otherwise, without further approval by or payment to you.

This is from Section A, subsection 1 of the competition rules, just for your information; the Competition Sponsor is obviously Jane Street. If you manage to build a model that generates returns sufficient for them, you'd rather trade with it yourself, or use it as proof-of-work in recruiting interviews.


Isn't that pretty standard throughout the internet? Otherwise the sites would be storing your copyrighted material on their domain without a license (IANAL, but that sounds like legal trouble for the site):

This is from HackerNews' legal page:

> By uploading any User Content you hereby grant and will grant Y Combinator and its affiliated companies a nonexclusive, worldwide, royalty free, fully paid up, transferable, sublicensable, perpetual, irrevocable license to copy, display, upload, perform, distribute, store, modify and otherwise use your User Content for any Y Combinator-related purpose in any form, medium or technology now known or later developed.


Just curious, how well-known is jane street outside of the OCaml world?


Well known to Ivy-League undergrads and FAANG engineers for having some of the highest paying jobs available.


Interestingly enough Jane Street has a fairly negative reputation within certain engineering schools that they love to recruit from. Hard to hire talented and passionate people if the only thing you can offer is a bit more money at the expense of everything else people look for in a job.


I interviewed there a few years ago. It wasn't bad, but I had a much more pleasant experience with FAANG (specifically Google and Facebook).


Jane Street has spent a lot of resources making sure they're well known as one of the "top employers". Their recruiters put a lot of effort into pseudo-anonymously promoting the company on websites like Reddit and Blind. It's pretty forward-thinking, actually, as younger tech-savvy candidates no longer pay attention to sites like Glassdoor and seek information through these discussion forums instead. The natural evolution of fake Glassdoor reviews is astroturfing, I guess.

Firm performance is excellent but IMHO the only reason to work there as an engineer is for marginally more money.


It's well known as a top-tier market-making firm. They're not as special as they are made out to be, though; they just have good marketing.


Often mentioned on Blind as one of the places top engineers can be paid handsomely.


I grind PHP daily, and am aware of them.


Since this is a competition, would someone mind explaining why people publicly post notebooks in the "Notebooks" section?

It seems counter-intuitive to give competitors free information, unless you are trying to throw them off.


Posting notebooks can get you upvotes, which contribute towards becoming a Kaggle (Grand)Master. It is also a good way to "win" some attention and goodwill, without spending months trying to actually win the competition itself. Publishing Notebooks also helps you improve your coding/presentation skills, for a popular notebook needs to be useful for a wide audience (or fairly competitive).

The best techniques, certainly coming from teams, are hardly ever published as Notebooks. But yes, many winning teams will eventually incorporate some of the information in the Notebooks, if only to hedge against the others doing the same.


Not a frequent Kaggle user, but from what I've seen, the ones posted in Notebooks are baseline examples: things anyone competitive enough to win has probably already thought of and dismissed, or could implement themselves in half an hour and iterate from.


As a current example: the highest-scoring entry that has an associated notebook at the moment is basically a clean example of how to apply XGBoost [1] to this dataset. XGBoost ends up being tried in nearly every Kaggle competition, so the person isn't giving away many secrets there.

[1] https://xgboost.readthedocs.io/en/latest/
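For context, that baseline really is only a few lines. A sketch assuming the competition's train.csv layout (feature_0..feature_129 plus a 'resp' return column; treat the exact column names as assumptions):

    import pandas as pd
    import xgboost as xgb

    train = pd.read_csv("train.csv")
    features = [c for c in train.columns if c.startswith("feature_")]

    X = train[features].fillna(0)
    y = (train["resp"] > 0).astype(int)  # binary target: act only when the return is positive

    model = xgb.XGBClassifier(
        n_estimators=500,
        max_depth=8,
        learning_rate=0.05,
        tree_method="hist",
    )
    model.fit(X, y)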


What's the over/under on someone entering the inverse of what WSB does?


Pump in 2020, dump in 2021.


It always makes me laugh when I see people calling a rebound from a crash a “pump”. I usually hear it from people who were too scared to get in low or are short. Be careful of what you allow your cognitive biases to convince you of. It’s the fastest way to lose money.


You didn't look closely enough, because "the market" has different sectors: some are pumped up into the stratosphere and others are still down. They average out to "slightly above ATH", but the pump is real.


I don't disagree with that at all. Tech, EVs, etc are all pumped. What you said seemed to be more of a generalization that the market as a whole is pumped. That, I disagree with. Money will rotate out of the overpriced sectors and back into the underpriced sectors as Covid is (hopefully) eradicated.


RemindMe! 12 months

Buying overpriced, loss-making investments is the fastest way to lose cash.


We are in "greed" territory.


Tesla has a P/E ratio of 1100


Late to the game.

https://numer.ai is an entire hedge fund built around an anonymous ML prediction tournament. They solicit predictions, trade on them, and reward the best-performing ones. IIRC they've paid millions in prizes over the last few years.

They also recently introduced Numerai Signals, where they pay for the performance of actual training data. So you can make money providing datasets that perform well.


Numerai’s intro video[1] makes pretty wild claims and seems to target a pretty specific group of people.

Personally I’m put off by the language used in the video.

Also

> Stake on your model to earn cryptocurrency

No thanks. I prefer dollars in my bank account.

1. https://youtu.be/GWeC2PK4yXQ


What claim do you take issue with?

Staking was introduced to reward actual market performance instead of performance on holdout data, and to prevent spam.

Cryptocurrency is the method of payment because it's a global competition, and that's the only practical way to pay everyone across countries. Originally rewards were in Bitcoin, but they switched to an Ethereum token to facilitate staking.


In a trading competition, the best strategy is the riskiest strategy. This maximizes likelihood of making a lot of money ... and losing a lot of money. But in a competition, the left tail doesn't matter. Second place and last place are identical.

Ideally, a trading competition should penalize risky investments. But this is hard to do retrospectively, especially when evaluating algorithms.
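One common retrospective fix is to score a Sharpe-style ratio of the daily P&L instead of the raw total, so volatility is penalized directly. A minimal sketch (the annualization constant is conventional, not from the competition):

    import numpy as np

    def risk_adjusted_score(daily_pnl: np.ndarray) -> float:
        # A lucky high-variance strategy scores worse than a steady one
        # with the same total P&L.
        std = daily_pnl.std()
        if std == 0:
            return 0.0
        return daily_pnl.mean() / std * np.sqrt(252)  # ~252 trading days per year

Even so, a single historical path can always be gamed, so the point stands: retrospective risk penalties are hard to get right.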



Market prediction with anonymized feature set, sounds like Numerai: https://numer.ai/


Join for the 100k, leave for the 10M model earnings you get from running it on the market.


Isn't this just an elaborate way to screen potential candidates who Jane Street might like to hire? A bit like GCHQ running its semi-regular "solve this puzzle" competition.


When I was working at an investment bank, I pitched an idea about this. It's something I've always wanted to explore. I'll try to make the time to participate in this one.


Just give me some monkeys, darts and a dart board


Stonks only go up what more do you need to know?


I had never heard of Jane Street. I visited the website and saw a bunch of people crammed into an open office like chickens in a breeding factory. That is not attractive to me. Competitions like this just seem like a cheap way to outsource problem solving without having to pay anyone: like bringing someone in for an interview, letting them solve your problem, and then telling them they are not a good fit while profiting off their work. Same idea, different package.


Knowing what/who Jane Street is massively changes perspective on this. They are definitely not a random shitty company that outsources their core business to some challenge.

They are quite well known, if only for the fact that they practically adopted OCaml, which is impressive considering their size. I highly recommend checking out some of their talks on YouTube, like this one: https://www.youtube.com/watch?v=gXdMFxGdako


Who cares if they use OCaml or punch cards? I give zero shits about their tech. I care about getting paid properly, having my own office where I can work quietly, having work/life balance, and generally having no daily drama.


Their pay is top tier, $300-400k+ for new grads.


The question is what the executives get paid, and what the traders get paid, in comparison to the engineers; and also what conditions you have to work in as an engineer.


It's privately owned; the firm trades the executives' (partners') capital. WLB depends on the role: as a quant you won't have a life at any firm; as an SWE, I hear it's decent for the pay.


Do they actually produce something valuable for society, or is it just the trading profit they are after?


How about being able to buy stocks/ETFs today paying nothing in commissions, with razor-thin spreads? A few decades ago you'd pay O($10-100) per trade, and then some in bid/ask.

This is all largely thanks to HFT. Robinhood is only viable because big HFT firms are willing to pay dearly for the privilege of serving retail order flow.


People who work there will tell you that they provide liquidity, which is a valuable thing to have in markets. I've always been a bit skeptical of how much value they provide there, though, given that they don't hold on to anything for very long; but I'm not particularly informed about it.

Mostly it seems like they scrape pennies out of the market to enrich the people who work there (they have no outside investors AFAIK), so IMO they are pretty neutral. Not a bad place to be; lots of companies are negative.


> given that they don't hold on to anything for very long

Why would they need to?


Facebook doesn't produce any value for society either.


Facebook's tendency to display content that reflects more extreme versions of the opinions a user already has is probably a sizeable force behind our society's growing division. They know this, but they also know that type of content stops people from clicking off their site and drives up engagement, giving them more opportunities to advertise to you. They are a net negative to society.


These are the LeetCode equivalent for data science and quant roles, so invest time only if you're interested.



