
YC Fellowship optimal cut-off from 6500 applicants - DrNuke
Sam Altman said they got 6500 applications.

He expected 1000 applications for 20 fellowships (top 2%) out of, say, 50 interviews (top 5%), their usual rate of acceptance from interviews being about 40%.

What to do with 6500 applications in order not to miss value? Top 5% is 325 interviews and top 2% is 130 fellowships. Too many to handle; not happening, imho.

What would you take as a cut-off, then? How would you justify your numbers? Would top 3% interviews (195) and top 1% fellowships (65) be optimal?

Discuss.
======
gautamnarula
A good approach would be to use Elo ratings from chess, much as Mark
Zuckerberg did for Facemash when he was at Harvard.

Assign each application a rating of 1000. Randomly select two applications and
show them to a partner. The partner selects the better application, or
declares a tie. The Elo ratings are updated accordingly. Aggregate this over
many partners and many applications, and you now have a quantifiable measure
of YC's aggregate preferences for the applications. Further, you can now
calculate the probability that YC will prefer one application over the other
by comparing ratings, even if the two applications have never been directly
compared before.

Then, rank based on ratings and select the top N (N being the number they want
to interview).

The cool thing about this is it's very simple to implement (easily done in a
day, or even a few hours). It'd be a useful measure even for YC's regular
batches, which have more manageable numbers.
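
A minimal sketch of that update, assuming the standard chess Elo formula
with a K-factor of 32 (a common choice; the names here are illustrative,
not anything YC actually runs):

    # Standard Elo update: each application starts at 1000, and a
    # pairwise judgment by a partner shifts both ratings.
    K = 32  # assumed K-factor; 32 is a common choice in chess

    def expected(r_a, r_b):
        """Probability that A is preferred over B, given current ratings."""
        return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

    def update(r_a, r_b, score_a):
        """score_a: 1.0 if A won, 0.5 for a declared tie, 0.0 if B won."""
        e_a = expected(r_a, r_b)
        new_a = r_a + K * (score_a - e_a)
        new_b = r_b + K * ((1.0 - score_a) - (1.0 - e_a))
        return new_a, new_b

    # Example: a partner prefers application A (both start at 1000).
    print(update(1000, 1000, 1.0))  # (1016.0, 984.0)

Ranking is then just a sort by rating, keeping the top N.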

------
bmir-alum-007
Stanford BMIR takes 3% of applicants as students, FYI. A filtering funnel is
necessary, of course, but making something too exclusive just passes on solid
people because of nitpicky bikeshedding at interview and often-misleading snap
impressions, and it revs up the artificial pedigree factor. That forgoes
millions or potentially billions of dollars of value that could have been
created.

~~~
smeyer
Filtering from 3% down to 20 individuals doesn't have to mean they really
think they're getting the best 20 out of that 3%. It just means that they're
not interested in growing the size of the program past that. If adding more
fellows would spread resources thinner and reduce the impact on each fellow,
adding good people from the rest of that top 3% could be subtracting rather
than adding "millions or potentially billions".

~~~
bmir-alum-007
Contrarian doubt, climate change denial-style. Do you work for the coal and
tobacco industries? ;)

Cutting off one or two stellar founder teams to meet some artificially low
quota is a lose-lose proposition: awesome founders don't get to meet other
awesome founders, awesome founders don't get a chance to pitch awesome
investors, and so on. Limited flexibility is common sense; irrational rigidity
is value-destroying.

~~~
smeyer
The assumption is that Y Combinator thinks they're providing value through
resources like mentorship, and that if they spread those resources too thin,
they'll reduce the odds of producing a billion-dollar company (or some
similar metric). Sure, that limit may flex a bit one way or the other based
on the quality of the applicants, but it's neither necessarily irrational nor
an artificial prestige-bumping quota to limit the program to a couple of
dozen teams rather than taking the hundreds they think are good.

~~~
bmir-alum-007
Scaling to something ridiculous like 100x is obviously a nonstarter, but
that'll never happen anyway. One or two extra companies isn't spreading too
thin. In fact, one of Sama's purported goals is to scale up YC several times
over while retaining its core culture. That's hardish to do because
investing/mentorship is analogous to a lifestyle business model, but it's not
impossible. It won't be exactly the same, but it won't suck.

~~~
smeyer
Yeah, I agree, they obviously might end up adding a couple of extra companies
depending on who applies. Especially for a first pass at something, though, it
makes sense to aim smaller and grow from there. Ultimately, it's hard to know
what the optimal cutoff is, but I think we're on the same page here.

------
richardbrevig
> Sam Altman said they got 6500 applications.

Is this online anywhere?

As for the cut-off: if the major experiment here is working with teams
remotely, why consider increasing the 20 seats at all? The number of
applicants shouldn't affect that experiment as far as I know. If they're
using this as a means to increase their reach for YC batches, that's another
topic.

~~~
Yadi
Yup:

[https://twitter.com/sama/status/626523690533027840](https://twitter.com/sama/status/626523690533027840)

------
tima101
This time the fellowship is an experiment, so the number selected will be
low. The goal is to figure out how to scale advice and work with remote
teams. But I believe YC will eventually fund 1000 startups through this
program. That kills two birds with one stone: more founders get help, and YC
discovers more talented teams and big ideas.

------
TEMPsmalllab
If I were them I would pick the hungriest ones! I am indeed hungry!

------
1arity
There is no technological solution to this.

Any automated solution presumes you can program a query that detects signals
of desirability (whatever those are in this context). Yet if you could do
that, you would have no trouble no matter the number of applications.

Since no such transfer of expertise from the expert assessors to a query is
possible, you have to stick with humans.

Perhaps the large number suggests a different opportunity, however: a chance
to get more aggressive about quality. Raise the bar (as long as your signals
still hold when amplified), because now you have a bigger pool.

Previously looking for semi-coherent responses that piqued interest? Now look
for decisively coherent ones.

Previously looking for some evidence of doubt and unassuredness about the
idea? (Doubt indicates self-critical thinking and honesty, useful traits in
themselves. By the heuristic that if your answers were bulletproof you
wouldn't need funding or help, it also suggests desirability. And by the
secondary heuristic that investors like to invest in risk, a sure thing is
both uninteresting to them and unacceptable if it fails.) Now go for absolute
disclosure and transparency.

Then how do you gauge the effectiveness of this approach?

Try it on a small batch and see how well its ranking correlates with an
expert assessment of the same batch.

If you iterate like this, maybe you will find there is some technological
signal you can query: the frequency of green-flag and red-flag words;
clustering the texts by sentiment, topic, or bag-of-words representation and
seeing whether you can exclude whole clusters; or cosine similarity against
some canonical great and terrible prior applications to partition the pool (a
minimal sketch of this last one is below).
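
A sketch of that cosine-similarity screen, assuming each application is plain
text; the canonical "great" and "terrible" priors and all names here are
hypothetical, not anything from YC's actual process:

    import math
    from collections import Counter

    def bag_of_words(text):
        """Lowercase, split on whitespace, and count word frequencies."""
        return Counter(text.lower().split())

    def cosine_similarity(a, b):
        """Cosine similarity between two Counter word-frequency vectors."""
        common = set(a) & set(b)
        dot = sum(a[w] * b[w] for w in common)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        if norm_a == 0 or norm_b == 0:
            return 0.0
        return dot / (norm_a * norm_b)

    def partition(applications, great_priors, terrible_priors):
        """Split applications by whether each reads more like the
        canonical great priors or the canonical terrible ones."""
        great = bag_of_words(" ".join(great_priors))
        terrible = bag_of_words(" ".join(terrible_priors))
        keep, drop = [], []
        for app in applications:
            vec = bag_of_words(app)
            if cosine_similarity(vec, great) >= cosine_similarity(vec, terrible):
                keep.append(app)
            else:
                drop.append(app)
        return keep, drop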

Another idea: it's as if you have to mark 6500 test papers, so why not employ
someone like CrowdFlower (which shepherds human-intelligence tasks) to apply
your score sheet to each application, cross-checking a sample to validate
consistency, and give each application a number?

As much as possible, the score sheet should distill YC's subjective appraisals.

Finally, I think it is unreliable for people to choose the top 5% or 2% of a
large set of things. People, I feel, are far better at choosing the best 1
out of 3. So apply this iteratively by doing a "facemash" of 3 applications
side by side, making the selection into a game, and get the partners,
part-time partners, and associates to play until a stable ranking emerges. If
this takes too long, make it 1 out of 10; then each round of the game
produces 9 preference relations, since the selected application is better
than each of the other 9 (a sketch of one round is below).
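
A sketch of one round of that game, assuming applications are just IDs and a
judge callback picks the winner; all names here are hypothetical:

    import random

    def play_round(app_ids, pick_best, group_size=3):
        """Show `group_size` applications side by side, let the judge pick
        the best one, and record a (winner, loser) preference relation for
        each of the other group_size - 1 applications."""
        group = random.sample(app_ids, group_size)
        winner = pick_best(group)  # the partner's choice, e.g. via a web UI
        return [(winner, loser) for loser in group if loser != winner]

    # Example: a toy judge that always prefers the lowest ID.
    prefs = play_round(list(range(100)), min, group_size=10)
    print(len(prefs))  # 9 preference relations per round, as noted above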

A final, and perhaps the best, idea is to pre-compute a ranking based on
meta-signals from the application process itself, such as: the number of
revisions (suggesting a lack of confidence in the pitch and anxiety over a
small amount of money); the length of the video (longer suggesting
unfamiliarity with the idea, or unempathetic desperation); Benford's law over
the character (or word) count distribution of the response texts, to detect
anomalies; and "alignment": YC weights the application questions by
importance and computes the inner product (or cosine similarity) of that
weights vector with the vector of response lengths, to see how well an
applicant's weighting of importance aligns with YC's. Interesting outliers
(such as orthogonal perspectives) could also be tagged for a closer look. A
sketch of the alignment signal follows.
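
A minimal sketch of that alignment signal, assuming per-question importance
weights and response lengths as plain lists; the numbers are made up for
illustration:

    import math

    def alignment(weights, response_lengths):
        """Cosine similarity between YC's question-importance weights and
        an applicant's response lengths (one entry per question). Values
        near 1.0 mean effort was spent where YC cares most; low values
        flag orthogonal perspectives worth a closer look. A hypothetical
        sketch, not YC's actual process."""
        dot = sum(w * l for w, l in zip(weights, response_lengths))
        norm_w = math.sqrt(sum(w * w for w in weights))
        norm_l = math.sqrt(sum(l * l for l in response_lengths))
        if norm_w == 0 or norm_l == 0:
            return 0.0
        return dot / (norm_w * norm_l)

    # Example: YC weights question 1 highest; this applicant wrote the
    # most there too, so the score comes out close to 1.0.
    print(alignment([5, 3, 1], [400, 250, 90]))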

It seems that unless they have done this before, the workable procedure for
6500 applications is itself an experiment.

~~~
phlandis
TLDR: > People, I feel, are far better at choosing the best 1 out of 3. So
apply this iteratively by doing a "facemash" of 3 applications side by side,
making the selection into a game. Good idea.

------
phlandis
Pick those that post in this thread. urbit

