In the earlier thread, it seemed like some people were reaching different conclusions because they were using different definitions of "bias". I think my working definition would be something like "there existed in the actual applicant pool a subset of unfunded female founders who should have been statistically expected (given the information available to the VC's at the time of decision) to outperform an equal-sized subset of male founders who did in fact receive funding".
Alternatively (and I don't think equivalently?) one could reasonably take bias to mean "Given their prejudices, if the same VC's had been blinded to the sex of the applicants, they would have made funding choices resulting in higher total returns than the sex-aware choices they actually made." I'm sure there are many other ways of defining "bias". Could you define what would need to be true for your test to show that "the VC process is biased against female founders"?
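To make the first definition concrete, here's a minimal sketch of how one could check it on synthetic data, assuming (unrealistically) that we could observe each applicant's true decision-time EV; the pool size, the lognormal EV distribution, and the thresholds are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical applicant pool: each applicant's true expected return (EV, $M)
# as it would have looked given the information available at decision time.
n = 1000
sex = rng.choice(["F", "M"], size=n)
ev = rng.lognormal(mean=0.0, sigma=1.0, size=n)  # same EV distribution for both sexes

# A (possibly biased) funding rule: fund anyone above a threshold,
# but hold female founders to a higher hypothetical bar.
bar = {"M": 2.0, "F": 3.0}
funded = np.array([ev[i] > bar[sex[i]] for i in range(n)])

def biased_by_definition_1(ev, sex, funded):
    """True iff some subset of unfunded female founders has higher total EV than
    an equal-sized subset of funded male founders."""
    unfunded_f = np.sort(ev[(sex == "F") & ~funded])[::-1]  # best unfunded women first
    funded_m = np.sort(ev[(sex == "M") & funded])           # worst funded men first
    k = min(len(unfunded_f), len(funded_m))
    # It is enough to check the best-vs-worst pairing at every subset size.
    return any(unfunded_f[:j].sum() > funded_m[:j].sum() for j in range(1, k + 1))

print(biased_by_definition_1(ev, sex, funded))  # True under the biased rule above
```

The check reduces to comparing the best unfunded women against the worst funded men at each subset size; in practice, of course, nobody observes those decision-time EVs.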
I think that's equivalent (assuming there are no efficacy losses of blinding, of course). If you've got the two subsets in question, then they could make a choice with higher expected returns. If they can make a choice with higher expected returns, then there exist the subsets in question. (Let me skip the handwaving about continuity and measurability, please.)
Those definitions are good because they capture bias even in a scenario where VC's are less accurate at identifying the ability of one group than the other, even if both groups have equal distributions of "true observable ability", equal-sized applicant pools, and the VC's decisions still result in equal proportions of each group getting funded.
> This new test is also imperfect - it fails to handle noise in measuring inputs/outputs, for example.
This doesn't just make the test imperfect; it makes it completely useless when the noise is a wild scrambling function, and worse (and even more importantly), the scrambling function could be different for group A and group B. And we don't know what those functions are. In the case of investments, the noise is severe, and probably different between the groups too.
I didn't propose this test as a solution to PG's data. I have no idea how to handle the First Round Capital question, or whether FRC is biased, and I didn't claim otherwise.
I solved a toy problem - my goal was to understand one piece of the problem and solve it in isolation. In this case, I was handling the marginal vs mean problem. I.e., PG took step 1, I took step 2. (Both of us repeating work that Gary Becker did a while back.)
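Here's a minimal sketch of what I mean by the marginal-vs-mean distinction, on made-up, noiseless data (identical lognormal quality distributions, with group B held to a higher hypothetical bar):

```python
import numpy as np

rng = np.random.default_rng(1)

# Noiseless toy problem: identical quality distributions (EV of return, made up),
# but group B faces a higher funding bar. Comparing MEANS of funded companies
# (step 1) confounds bias with tail shape; comparing the MARGINAL funded company,
# i.e. the minimum (step 2), reads off the bar each group actually faced.
quality_a = rng.lognormal(0.0, 1.0, 50_000)
quality_b = rng.lognormal(0.0, 1.0, 50_000)

funded_a = quality_a[quality_a > 2.0]
funded_b = quality_b[quality_b > 3.0]   # biased: higher bar for group B

print("means: ", funded_a.mean(), funded_b.mean())  # differ, but a fat tail could explain that
print("minima:", funded_a.min(), funded_b.min())    # ~2.0 vs ~3.0: the bars themselves
```

With no noise, the minimum of each funded group reads off the bar that group faced, which is exactly the quantity a mean comparison muddles up with tail shape.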
Can you do step 3? Or is your sole point to sit on the sidelines saying "haha, you don't have the answer yet, stupid math geeks?"
I think you're wrong to even call PG's test wrong in comparison to this. His is executable, and both rest on an assumption about distributions: his assumes the distributions of group A and group B are the same, while yours (if we actually try to solve the noise problem) assumes their distributions of outcomes are the same.
His assumption is more plausible than yours, because yours breaks down merely if the variance of outcomes (given a prior EV) is different. It's too fragile a test to hold up. His breaks down if the variance and tails are different, but even then it still gives you something to look at as a year-over-year metric.
His can also still show certain conclusions. If group A outperforms B in his test, that means either there's bias against A or group A has a fatter tail than B.
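A rough sketch of those two branches, with made-up lognormal quality distributions and thresholds:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two ways funded group A can outperform funded group B on a mean-outcome test:
# (1) bias: same quality distribution, but a higher bar for A;
# (2) no bias: same bar, but A's quality distribution has a fatter right tail.
n = 200_000
same_dist = rng.lognormal(0.0, 1.0, n)
fat_tail  = rng.lognormal(0.0, 1.3, n)   # heavier right tail, no bias involved
group_b   = rng.lognormal(0.0, 1.0, n)

biased_a  = same_dist[same_dist > 3.0]   # case 1: bar of 3 instead of 2
fattail_a = fat_tail[fat_tail > 2.0]     # case 2: same bar of 2 as group B
funded_b  = group_b[group_b > 2.0]

print("case 1 (bias):    ", biased_a.mean(),  "vs", funded_b.mean())
print("case 2 (fat tail):", fattail_a.mean(), "vs", funded_b.mean())
```

Both cases produce the same surface result, which is why the conclusion only narrows to "bias against A or a fatter tail for A".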
I don't think you took step 2. You took step 1+i.
> Or is your sole point to sit on the sidelines saying "haha, you don't have the answer yet, stupid math geeks?"
I think it's perfectly reasonable to criticize the cargo cult application of mathematics.
Edit: And that's the crux of this. PG's test is targeted toward the real-life situation where there's noise. That's a mere footnote to you, which means you're solving a completely different problem.
If the distributions of A and B are the same, any significant inequality of outcomes must be caused by bias.
With mine, you need to replace the min estimator by some sort of low quantile estimator (which is far less sensitive to noise). Quantile estimators are, of course, invulnerable to variation on the right. I haven't worked out the details yet, but this is the pretty obvious next step.
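Roughly what I have in mind, as a sketch with made-up numbers (identical lognormal quality, additive observation noise, and a higher hypothetical bar for group A):

```python
import numpy as np

rng = np.random.default_rng(3)

def funded_observations(n, bar, noise_sd, rng):
    """Hypothetical model: true quality ~ lognormal, the funding bar is applied to
    true quality, but we only observe quality plus additive noise."""
    quality = rng.lognormal(0.0, 1.0, n)
    funded = quality[quality > bar]
    return funded + rng.normal(0.0, noise_sd, funded.size)

obs_a = funded_observations(100_000, bar=3.0, noise_sd=1.0, rng=rng)  # higher bar for A
obs_b = funded_observations(100_000, bar=2.0, noise_sd=1.0, rng=rng)

# The minimum (q = 0) is dominated by the single unluckiest noise draw, but a
# modest low quantile still sits roughly one "bar unit" higher for group A,
# and it never looks at the right tail at all.
for q in (0.0, 0.05, 0.10, 0.25):
    print(f"q={q:.2f}  A={np.quantile(obs_a, q):6.2f}  B={np.quantile(obs_b, q):6.2f}")
```

The minimum gets dragged around by whichever single observation caught the worst noise; a 10th-percentile estimator barely notices.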
I really don't get why you say this is "cargo cult application of mathematics". What sort of mathematics would you NOT consider "cargo cult"?
Mathematics that makes progress towards a solution. Looking at the minimum of non-noisy distributions is just not the problem at hand; heck, with no noise you could look at a few graphs and the bias would be obvious.
Let's say you have Group A and Group B with distributions where people have an EV for the return on a $1M investment ranging from $0 (lose all your money) to $1B (maybe it gets real thin around $10M), and investors try to invest in anybody with EV > $1M.
But maybe they're biased.
Oh and by the way, the only possible exits are $0 and you sell the company for $1B.
An attempt to make progress towards a solution would be better off targeting this model problem. Then you'd be attacking the noise head-on.
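For concreteness, a minimal Monte Carlo sketch of that model problem (the lognormal EV distribution, the pool sizes, and the higher bar for group B are all made-up assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Model problem: each founder has an EV for the return on a $1M investment, but
# the only possible exits are $0 and $1B, so a founder with EV = e exits at $1B
# with probability e / 1e9. Investors try to fund anyone with EV > $1M, except
# that group B is (hypothetically) held to a higher bar -- the bias to detect.
def simulate(n, bar, rng):
    ev = np.exp(rng.normal(13.0, 2.0, n))                  # EV in dollars (made up)
    funded = ev > bar
    hit = rng.random(n) < np.minimum(ev / 1e9, 1.0)        # binary exit: $0 or $1B
    return funded, hit

funded_a, hit_a = simulate(500_000, bar=1e6, rng=rng)      # unbiased bar for group A
funded_b, hit_b = simulate(500_000, bar=2e6, rng=rng)      # higher bar for group B

# Every individual outcome is $0 or $1B, so outcome quantiles carry no signal at
# all; the comparable statistic is the hit rate among funded companies, which
# ends up higher for the group that faced the higher bar.
print("hit rate, funded A:", hit_a[funded_a].mean())
print("hit rate, funded B:", hit_b[funded_b].mean())
```

With binary exits every individual outcome is pure noise, so about the only signal left is the hit rate among funded companies, and you need a lot of them before that difference is distinguishable from luck.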