Nice work with the write-up, and thank you for sharing this. The post is interesting, but I think the problem with the YCRank approach currently is that the labelling appears to be based on subjective opinion, at least if I understand correctly.
Based on the post, you've trained the classifier by labelling a handful of pairs of company descriptions according to which one you liked better, based on subjective assessments like "harder to execute" or revenue growth that aren't part of the data you're running the classifier against.
If so, you've done a nice job of training a classifier to predict which companies you personally are more likely to be interested in. To improve this, you could use past YC batch company descriptions and success data, which would give you more useful and less subjective examples and labels to train the classifier on. That might produce some interesting predictions that are more generalizable (although I think you may need more data points than the description and basic metadata).
If I've misunderstood, it would be interesting to know a little more detail about how the data was labelled.
I've based this on the following: "To investigate this, I made a neural network, YCRank, trained it on a handful of hand-labeled pairwise comparisons, and then used the learned comparator to sort the companies in the most recent W’22 batch."
And then: "I biased my ranking towards what was “harder to execute” on" and "I also tended to rank favorably companies that were already making monthly recurring revenue with double-digit growth rates".
Those may or may not be good ranking criteria.
Based on that, this is essentially what you could call a "DudeRank Classifier" because as The Dude in the Big Lebowski says, "Yeah, well, that's just like, your opinion, man" :)
As I suggested above, it might be more interesting to label the example pairs and train the classifier based on the original company descriptions of known past successful and unsuccessful YC companies.
Possibly there is enough signal in the company descriptions and limited Demo Day metadata alone to predict the successful companies in a batch.
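Something like this rough sketch is what I have in mind, assuming you could assemble past Demo Day descriptions annotated with a later-observed outcome (all field names and records below are hypothetical placeholders):

```python
# Sketch: build less subjective training pairs from past batches, pairing companies
# whose outcomes are already known and using only the original Demo Day description.
import itertools

past_companies = [
    {"name": "ExampleCo", "batch": "W15", "description": "...", "succeeded": True},
    {"name": "OtherCo",   "batch": "W15", "description": "...", "succeeded": False},
    # ... many more, with "succeeded" defined however you like (exit, Series A, still alive, etc.)
]

def build_pairs(companies):
    """Yield (winner_description, loser_description) pairs within each batch."""
    by_batch = {}
    for c in companies:
        by_batch.setdefault(c["batch"], []).append(c)
    for members in by_batch.values():
        winners = [c for c in members if c["succeeded"]]
        losers = [c for c in members if not c["succeeded"]]
        for w, l in itertools.product(winners, losers):
            yield w["description"], l["description"]

pairs = list(build_pairs(past_companies))
```

Pairing within a batch also keeps the comparisons era-consistent, so a 2012 blurb isn't judged against 2022 buzzwords.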
Good luck!
Disclaimer: I am in the W22 batch. Our startup (Andi) ranks pretty well here. And this also is just, like, my opinion :)
[Edit: You could also test the classifier against historical batches to improve it, too!]
> Based on that, this is essentially what you could call a "DudeRank Classifier" because as The Dude in the Big Lebowski says, "Yeah, well, that's just like, your opinion, man" :)
Yes, but isn't human VC investing already just a big DudeRank classifier?
Yes, in the same way that institutional investing and society as a whole are one big "DudeRank" filter. That doesn't mean there's no structure; it means you're not looking in the right place.
The difference between a layman investing and a skilled top-10% VC is that the latter already stands in a very strong position within human and social capital networks, and moreover (if they're actually skilled) has experience materializing that capital into strong 0-to-1 outcomes. Or is just really good at survivorship bias.
I checked the top 6, and 5 of them appear to be vaporware without a real product. To be fair, they're working on hard products, but I wonder if the model is selecting for that. According to the announcement[0], 29% of the batch were accepted with only an idea, so I guess that's not surprising.
Slightly off topic, but I thought it was interesting that 29% of the batch had just an idea while 10% had more than $50k of monthly revenue when accepted. Given they all get the same deal, it would be wild to build out a business that's making $600k+ a year and get the same terms as two guys with an idea. It seems to go against everything I've read from YC about validation and product market fit, but I guess it's good for YC if they can get away with those terms. That is, of course, if you're doing a real business rather than the MoviePass model (selling $10 bills for $5).
Is there anywhere I can get a breakdown of the companies that fall into these categories?
That's because its only input is the company description, which is 80% non-informative common words and 20% buzzwords.
It's basically a buzzword detector.
I reckon you could reach the same results with TF-IDF and k-means.
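Something along these lines would be an easy baseline to compare against (a sketch assuming scikit-learn; the descriptions list and the choice of cluster count are placeholders):

```python
# Sketch of the suggested baseline: TF-IDF over the one-paragraph blurbs, k-means on top,
# then look at which terms dominate each cluster (i.e. which buzzwords it keys on).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

descriptions = [
    "AI-powered B2B SaaS platform for logistics",
    "Open-source developer tools for data pipelines",
    # ... one short blurb per company in the batch
]

tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
X = tfidf.fit_transform(descriptions)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

terms = tfidf.get_feature_names_out()
for i, center in enumerate(kmeans.cluster_centers_):
    top = center.argsort()[-5:][::-1]
    print(f"cluster {i}:", [terms[j] for j in top])
```

If the cluster a company lands in already tracks the neural ranking, that would support the buzzword-detector reading.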
Investors get yield by investing in unique products. When there is enough* money floating around, anything that can be built in a few weeks will have been built in a few weeks.
This means that in order to get return, investors have to be willing to invest in longer term efforts which are sufficiently difficult to execute that other investors will either refuse to fund the project - or their teams will fail to deliver.
When there is enough* money available, it stops making sense to launch product - you can keep pitching a bigger vision indefinitely, but once you launch you are beholden to real metrics. In fact, launching means that your competition suddenly looks ridiculously capital efficient in comparison.
From my experience, the two biggest pitfalls for startups are lack of product market fit and poor execution.
Product market fit is the hardest thing you have to do. Sometimes the current products are good enough, or no one really wants the product. You also have to consider whether it's economical. For instance, there may be demand for flying cars, but it's uneconomical compared to driving, so it doesn't really work as a startup.
The other problem is execution. Take the high- to medium-end electric car market that sprang up after Tesla proved product market fit. There are probably a dozen electric car manufacturers that don't actually manufacture electric cars for sale. That's because it's actually very hard to do. It's easy to create a pitch deck and even a prototype, but shipping cars is hard.
A startup with no product has proven neither of these. They don't have product market fit because their idea hasn't been tested in the market yet, and they certainly haven't executed yet.
Compare that to a company with $600k+ a year in revenue. They may not have proven the economics (they're probably losing money), and they may never prove them. But they have proven that someone will pay for their product and that they can create and deliver it (assuming the revenue isn't pre-sales). So if you compare a few guys with an idea to a company shipping a real product, I would think the one shipping is a lot further along and deserves a higher valuation in general.
Here is where TAM starts to be a consideration. A company building 3D-printed rocket engines can claim that the market is whatever size they want, since they aren't expected to prove it for 5+ years. A company making a mobile blood tester isn't expected to have a working demonstration for 5+ years.
Compare this to a company with $600k in revenue: if they are in a competitive or small market, this may mean a company that ultimately produces $100 million in revenue with $10-20 million in profit, whereas the two companies above can pitch that they will eventually produce $BigNumber revenue with $HighProfit margins.
This scheme really only works when there is effectively infinite money floating around, and the opportunity cost of parking the money in a bad idea is low. I wouldn't expect a MagicLeap or Theranos to occur in any other environment.
Thanks for the analysis. Can't say I understood much, but automating venture investment is an interesting idea. As a side question, what would you recommend a total beginner learn if they wanted to go from zero to being competent at neural networks, or machine learning in general? I'm currently dabbling with R in grad school (biology), and I know Python is big in the machine learning world.
Is there a pathway you would recommend (e.g., first learn Python until you're familiar with X, Y, Z, then make sure to learn the required mathematics, etc.)? Also, roughly how long would this process of learning take? I've been thinking about potentially changing fields after grad school, or maybe working at the intersection of ML and medicine.
I got started with Andrew Ng's Coursera ML course in 2012, and have been learning ML ever since. I think a very diligent student could catch up in half the time (about 5 years).
Maybe I'm missing something, but wouldn't it make more sense to use data from older companies that had a few years to develop instead of only data from this batch? This model seems to extrapolate based on how you judge the companies rather than on how well they may or may not do.
What do people think of BayesDB being used for the same thing?:
"BayesDB, which is open source and in use by organizations like the Bill & Melinda Gates Foundation and JPMorgan, lets users who lack statistics training understand the probable implications of data by writing queries in a simple, SQL-like language."[0]
Run it on old batches and see if you can predict which ones went on to raise a Series A, B, or C, or reach a liquidity event.
Note: you might have to adjust for year since a Seed round in 2020 looks like a Series A in 2010.
Edit: I realize you'd have to figure out which companies raised out of hundreds. As always, the data curation is harder than the algo. Not sure if TechCrunch allows scraping.
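For what it's worth, the evaluation side of that backtest is the easy part; here's a sketch (the scores, the outcome field, and the records below are all hypothetical stand-ins for the curated data):

```python
# Sketch: score an old batch with the model, then check how many of the top-k picks
# actually went on to raise a follow-on round (or exit), versus the batch base rate.
def precision_at_k(companies, k=10):
    """Fraction of the model's top-k picks that later raised or exited."""
    ranked = sorted(companies, key=lambda c: c["score"], reverse=True)
    top = ranked[:k]
    hits = sum(1 for c in top if c["raised_follow_on"])
    return hits / len(top)

w18_batch = [  # hypothetical records for an older batch, outcomes observed years later
    {"name": "CompanyA", "score": 0.91, "raised_follow_on": True},
    {"name": "CompanyB", "score": 0.87, "raised_follow_on": False},
    # ... the rest of the batch
]

base_rate = sum(c["raised_follow_on"] for c in w18_batch) / len(w18_batch)
print("precision@10:", precision_at_k(w18_batch), "vs batch base rate:", base_rate)
```

Per the note above, you'd want to run this per batch-year rather than pooling, so the 2020-vs-2010 funding-stage drift doesn't wash out the comparison.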
I didn't quite understand how the labeling part worked. So you went through a selection of startups and ranked them? And then the model took it from there?