These things are in no way similar.
The article seems to presuppose that an algorithm cannot be biased. The truth is, if the algorithm is trained on past deals then it can easily encode bias. More than this, it can give plausible deniability to biased/prejudiced behavior because “the algorithm did it”.
I have personally heard: "Not enthusiastic enough" / "Too enthusiastic" (irony: two different people sitting at the same presentation), "Too little experience" / "Too much experience", "Not enough generalists" / "Too many generalists". It goes on and on ...
You learn to ignore the excuse and move on--the excuse is irrelevant.
"No" is "no". Move on.
"Maybe" is "no". Move on.
"Yes" is no until you cash the check and it clears.
That's actually a great attitude for almost everything in life: recognize that some things are best modeled as random variables, and that if you fail there might not be a causal explanation; you just have to keep trying.
The downside is that you might be given valuable information (feedback) and you might dismiss it as noise.
So telling signal from noise is a real skill.
The point is that doing the equivalent of blind auditions might solve it for everyone. And it is similar enough to the blind-auditions example to make the “in no way similar” argument clearly wrong.
This is not a blind audition... it’s something quite different.
With an algorithm, you show the inputs, show the outputs and say “no bias, it’s all automated”.
Look, part of the problem might be that at seed stage the optimal short-term strategy is to be biased. If Series A and B investors are sexist or racist, then that’s going to make it harder for those companies to progress. An algorithm might pick up on that, particularly if it can find features correlated with sex, race, or some other factor that investors may be biased against.
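To make that concrete, here is a toy sketch (all data synthetic, all names and numbers invented): true quality is independent of group, but downstream investors apply a higher funding bar to one group. A model scoring on a feature that merely correlates with group then "predicts" outcomes better than chance, because it has learned the investors' bias rather than startup quality.

```python
import random

random.seed(0)

# Synthetic, hypothetical data: quality is independent of group, but
# biased Series A/B investors make group "B" founders clear a higher
# bar to raise a round.
rows = []
for _ in range(1000):
    group = random.choice(["A", "B"])
    # a proxy feature that merely correlates with group (e.g. a network signal)
    proxy = random.gauss(1.0 if group == "A" else 0.0, 0.5)
    quality = random.gauss(0.0, 1.0)
    bar = -0.5 if group == "A" else 0.5          # biased funding bar
    raised = quality > bar
    rows.append((proxy, quality, raised))

def accuracy(score):
    """Fraction of rows where thresholding `score` at 0.5 predicts `raised`."""
    return sum((score(r) > 0.5) == r[2] for r in rows) / len(rows)

# The proxy alone beats coin-flipping at predicting who raised -- the
# model has learned the investors' bias, not the startups' quality.
print("proxy-only model:  ", round(accuracy(lambda r: r[0]), 2))
print("quality-only model:", round(accuracy(lambda r: r[1]), 2))
```

Swap in any real feature correlated with a protected attribute and the same thing happens, without the attribute ever appearing as an input.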
As to your point regarding the average VC: it’s possible that selecting companies at random would do better than the average VC. I suspect that if we did, the societal outcomes might be better.
There could be a real world bias.
They specifically talk about this risk: the current data models only use business data, and they discuss the dangers of using personal data. It seems to me that data like customer loyalty and cash on hand are perfectly fine to use and don’t come with any direct gender or ethnicity bias.
The following article discusses this in the context of policing for example:
There are a number of parallels with the time when I was trading fixed income at a hedge fund. We had a senior guy looking at the output of various opportunity scanners, and deciding what to do.
There are several problems with this approach.
- The human is always out to prove himself. If you don't override the system now and again, what's the point of you? This means the humans are always on the lookout for some special one-off condition they can claim.
- The algo dev stops short of where he could go with it. You ought to be fully automating it, but you don't, because you need to leave something on the table. There are a number of data problems that you just don't get around to solving, because it's tedious and you aren't going to use it anyway.
- The VC guys have a much worse data problem, by the looks of it. Not every startup will fill out the form. If they don't need your money, no form. If they crash early, no form. After they fill out the form, how do you track what happened to them? Seems like a big problem. Also if you're going to use ML you need a fairly large number of rows. Not just filled out forms, but also labels for how things turned out. And the more features you collect, the more labelled rows you'll want.
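A toy illustration of that selection problem (numbers invented, purely for shape): if the hottest startups skip the form and early crashes never get labelled, the success rate in your training set is skewed before you've fit anything.

```python
import random

random.seed(1)

# Invented numbers, purely illustrative: a population of startups where
# you only observe rows that (a) filled out your form and (b) survived
# long enough to be labelled.
population = []
for _ in range(10_000):
    quality = random.gauss(0.0, 1.0)
    success = quality + random.gauss(0.0, 1.0) > 1.0   # true outcome
    fills_form = quality < 1.5                         # the hottest skip you
    labelled = success or random.random() < 0.5        # early crashes vanish
    population.append((success, fills_form, labelled))

true_rate = sum(s for s, _, _ in population) / len(population)
sample = [s for s, f, l in population if f and l]
sample_rate = sum(sample) / len(sample)

print(f"true success rate:         {true_rate:.2f}")
print(f"rate in your training set: {sample_rate:.2f}")
```

Any model fit on `sample` inherits that skew, which is exactly the pseudo-systematic hole described below: conclusions that look data-driven but mostly reflect who you got to observe.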
So there's a real risk of falling into the pseudo-systematic hole here. You take the data that you have and make conclusions that are very close to your initial priors. Basically you end up with stylized "facts" that aren't necessarily true, just believed.
Seems like they've thought about these things though; will be interesting to see what happens.
The investment type with enough rows of data to enable truly systematic decision making is securities markets. Whether you are using ML or a human analyst with a theory, the economic conclusion is the same: a securities market has lots of data, and that lets traders price systematically. Pricing, investing, and trading are the same thing in a securities market.
Startup investing is not like that, generally. A person using their subjective faculties is heavily involved, biases and all. This persists because humans' subjective cognitive abilities are not just delusion; they are a real cognitive ability, even if flawed^.
Systematic and non-systematic approaches each have strengths and weaknesses. Human bias is the big weakness on the non-systematic side. "Searching where the lights (data) are" is the big systematic one. A pseudo-systematic system is basically a compromise. If the weaknesses of a non-systematic system are a major problem (the whole premise here is that they are), then it is not necessarily a stupid idea.
It doesn't even have to be all that sophisticated. Manual overrides are not necessarily a bad system. They make it clear where subjective judgement was used, and how often. At least you are aware that it has taken place.
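Even a crude version helps. A minimal sketch (all names hypothetical) of what "make overrides visible" could look like: the model's call is the default, and any human reversal is recorded so you can see how often subjective judgement crept in.

```python
from dataclasses import dataclass, field

@dataclass
class OverrideLog:
    """Record every decision where a human reversed the model's call."""
    entries: list = field(default_factory=list)

    def decide(self, deal_id, model_score, human_call=None, threshold=0.5):
        model_call = model_score > threshold
        final = model_call if human_call is None else human_call
        if final != model_call:
            self.entries.append((deal_id, model_call, final))
        return final

    def override_rate(self, total_decisions):
        return len(self.entries) / total_decisions

log = OverrideLog()
log.decide("deal-1", 0.8)                    # model says yes, no override
log.decide("deal-2", 0.9, human_call=False)  # human overrides to no
print(log.override_rate(2))                  # fraction of decisions overridden
```

A climbing override rate is itself a signal: either the model is missing something, or the humans are proving themselves.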
I think the blind orchestra is a good metaphor. It identifies the kind of human bias you are trying to minimize. If one orchestra looks like a Viennese orchestra and the other looks like a hillbilly high school, you want to hide this from human judges. You want human judges to focus their subjective brains elsewhere. You don't want to create an objective test of "good music" and "bad music," because the definition you would be forced to use would suck. At least, the definition would be different, like a sport with rules. So, compromise.
I don't think it's impossible to come up with pseudo-systematic investment strategies based on a similar compromise.
^ For best results, use skin in the game.
Still, I agree the weight of personal relationships and human-powered-decision-making guiding the 'tech' industry is a bit ironic.