

Ask HN: How would you pick successful YC applications? - progga

Suppose you have been asked to pick 300 successful applications out of 6000 YC applications in 20 days.  You realize that the usual rules [0] are tough to apply given the time constraints.  How would you go about doing this?

[0] http://ycombinator.com/howtoapply.html
======
patio11
The first thing we always try in AI is reducing it to a solved problem, so I'd
be inclined to try that first.

Train a naive Bayesian classifier on 25% of the successful and non-successful
applications to date. Run it on the last batch's applications. Observe if
results look promising. If they do, run it against the 6,000 applications,
splitting them into three groups based on how promising they looked. Group A
gets the most attention, Group B gets middling attention, Group C gets
attention as resources permit.
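
As a rough illustration of the train-then-triage idea above, here is a minimal sketch of a bag-of-words naive Bayes scorer in plain Python. Everything in it is an assumption made for the example: the names `NaiveBayes`, `log_odds`, and `triage`, the add-one smoothing, and the 10%/30% cut-offs for Groups A and B. It is not anything YC is known to use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Two-class multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def train(self, text, accepted):
        """Add one labelled application (accepted is True or False)."""
        words = tokenize(text)
        self.word_counts[accepted].update(words)
        self.doc_counts[accepted] += 1
        self.vocab.update(words)

    def log_odds(self, text):
        """Return log P(accepted | text) - log P(rejected | text)."""
        # Smoothed prior, so an empty class does not cause log(0).
        score = math.log((self.doc_counts[True] + 1) /
                         (self.doc_counts[False] + 1))
        vocab_size = len(self.vocab)
        n_pos = sum(self.word_counts[True].values())
        n_neg = sum(self.word_counts[False].values())
        for word in tokenize(text):
            if word not in self.vocab:
                continue  # words never seen in training carry no evidence
            p_pos = (self.word_counts[True][word] + 1) / (n_pos + vocab_size)
            p_neg = (self.word_counts[False][word] + 1) / (n_neg + vocab_size)
            score += math.log(p_pos / p_neg)
        return score


def triage(model, applications, a_frac=0.1, b_frac=0.3):
    """Rank applications by score and split them into groups A, B, C.

    a_frac and b_frac are arbitrary illustrative cut-offs.
    """
    ranked = sorted(applications, key=model.log_odds, reverse=True)
    a_end = int(len(ranked) * a_frac)
    b_end = a_end + int(len(ranked) * b_frac)
    return ranked[:a_end], ranked[a_end:b_end], ranked[b_end:]
```

The "run it on the last batch" step would amount to calling `log_odds` on held-out applications whose outcome is already known and checking whether the accepted ones actually rank near the top before trusting the split.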

Alternatively, same deal but train with data from only the applications which
went on to be _successful_ YC companies.

There are any number of fairly obvious problems with this approach, but
scarily, in many, many fields dumb algorithms beat smart people because dumb
algorithms apply the meat-and-potatoes part of the classification successfully
every single time. (E.g., credit risk scoring roflstomps experienced credit
underwriters at making consumer credit decisions, partly because it is free at
the margin and partly because, if you think intangibles like an applicant's
character are more important than their credit history, statistically speaking
you are wrong.)

~~~
dirkdeman
Wow. I'm both impressed and a little scared by this answer! I had this
romantic yet slightly naive idea that there were a couple of people working
crazy hours reading all applications...

Now I'm wondering what criteria you'd run the naive Bayesian classifier on,
since most of the questions in the application form are free text and so hard
to compare, I guess.

~~~
patio11
I'm not associated with YC, and this is just my "what I'd try" answer from a
"natural language processing was my first love" angle. YC, to my
understanding, does read everything the ol' fashioned way.

------
dgunn
I think I would work backward from the way you phrased it. I suspect it's
easier to pick out bad applications so I would focus on finding them first.
After that I would apply filters to specific questions and prioritize the
remaining applications based on the results (e.g., teams with 2 or 3 founders
are given attention before those with any other number).
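
A minimal sketch of that "reject first, then prioritize" pass might look like the following. The `Application` fields, the `is_obviously_bad` rule (any blank answer), and the `priority` function are all hypothetical stand-ins; only the 2-or-3-founders preference comes from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    name: str
    founders: int
    answers: dict = field(default_factory=dict)  # question -> answer text


def is_obviously_bad(app):
    """Hypothetical rejection rule: any answer left blank."""
    return any(not answer.strip() for answer in app.answers.values())


def priority(app):
    """Lower sorts first: teams with 2 or 3 founders get read first."""
    return 0 if app.founders in (2, 3) else 1


def shortlist(apps):
    """Drop the obvious rejections, then order the rest by priority.

    sorted() is stable, so applications with equal priority keep
    their original relative order.
    """
    survivors = [app for app in apps if not is_obviously_bad(app)]
    return sorted(survivors, key=priority)
```

More filters can be layered on by returning a tuple from `priority`, so that each extra rule only breaks ties left by the previous one.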

------
ig1
My understanding is that there are around 2000 applications, and most of them
aren't very good (pg has said in the past that most applications are either a
definite yes or a definite no).

So in a quick first pass it's probably not too hard to weed out the no's,
which would take the applications down to a manageable number.

------
dirkdeman
They have a team of people sifting through the applications. Suppose there are
four of them; that makes 75 applications per person per day. A lot of them are
probably rejected pretty quickly, so they end up with a couple of dozen
applications, which will then be discussed.
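
The workload arithmetic can be checked with a few lines of Python. The 6000 applications and 20 days come from the question, and the four readers from the supposition above; the 8-hour reading day is purely an assumption added here to turn the count into minutes per application.

```python
def per_reader_per_day(applications, days, readers):
    """Applications each reader must get through per day."""
    return applications / (days * readers)


def minutes_per_application(load_per_day, hours_per_day=8):
    """Reading time available per application (8-hour day is assumed)."""
    return hours_per_day * 60 / load_per_day


load = per_reader_per_day(6000, 20, 4)
print(load)                           # 75.0
print(minutes_per_application(load))  # 6.4
```

With the 2000-application estimate mentioned elsewhere in the thread, the same call gives 25 applications per reader per day.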

------
Michael_K
You ask the founders from previously funded startups to help you out. PG
mentioned that most applications are a clear no. That should weed out 85%-90%.
The rest deserve a closer look.

