Replacing Judgment with Algorithms (schneier.com)
79 points by henrik_w 743 days ago | 62 comments



Schneier makes a startling claim:

It's already illegal for these algorithms to have discriminatory outcomes, even if they're not deliberately designed in. This concept needs to be expanded. We as a society need to understand what we expect out of the algorithms that automatically judge us and ensure that those expectations are met.

Schneier is discussing an unpleasant fact; unbiased algorithms often discover that things we previously attributed to bias were actually unbiased predictors. But we don't like to admit this, so we instead make silly statements like the above.

If we want to mandate the outcome of our decision process, let's just do that openly and honestly. E.g. for credit, just impose a quota of credit that must be issued to each protected class.

If we want fair and unbiased decisions then we need to instead examine the internals of our process and accept the outcomes, whatever they may be.

But we need to stop pretending we can do both.


>> It's already illegal for these algorithms to have discriminatory outcomes, even if they're not deliberately designed in.

> unbiased algorithms often discover that things we previously attributed to bias were actually unbiased predictors

That may be true, but Schneier is speaking on a different topic -- redlining. There's a long history of people discriminating based on race, sex, class, etc. and pretending they're using an unbiased algorithm instead. The word redlining comes from the practice of drawing red lines on maps around neighborhoods and using geography to make lending decisions. It's obvious to most people that geography is highly correlated with race and class. Therefore, even if geography itself carries no intrinsic bias, to use geography as a variable in evaluating credit-worthiness is discriminatory.

https://en.wikipedia.org/wiki/Redlining


> Schneier is discussing an unpleasant fact; unbiased algorithms often discover that things we previously attributed to bias were actually unbiased predictors.

That's an interesting interpretation. To me it looks more like he's discussing that bias can (probably will) be baked into algorithms even unintentionally and often subtly. And that highlights a significant difficulty with the "unbiased predictors" thing you say: can you distinguish between an unbiased algorithm discovering that something that looks like bias isn't and bias being subtly baked into the algorithm? I think that's where the "we need to understand what we expect out of the algorithms and ensure the expectations are met" bit comes in, at least in part.


Definitely - one makes more money than the other. If a bank is biased against some group they are turning away profitable customers. This is also purely a statistical problem; once some quant discovers they can make more money by fixing the bias, they'll do it.

Understanding what to expect out of the algorithms is absolutely the wrong way to determine this. The fact is that we simply don't know a priori the optimal way to allocate credit. That's why we need an algorithm in the first place.


Or, having realized that algorithms making predictions based on past data are politically conservative (pretty much by definition), if we really want to improve the economy, we should take political action to change the situation. This is how it's always been done, and it can work; sometimes rather well.


If you want to force banks to make suboptimal credit allocation decisions for political reasons, go ahead and openly advocate for it. And be sure to openly accept blame for it if those loans go bad and we have another banking crisis.

Just don't try to claim you are attempting to eliminate bias. Eliminating bias is a job for statisticians, not politicians.


> If you want to force banks to make suboptimal credit allocation decisions for political reasons, go ahead and openly advocate for it

You and I both want to force banks to make suboptimal credit allocations for political reasons (or we both don't). Leaving the local optimum can only be done with coordinated action, but that doesn't mean that your algorithm (that keeps us there) is any less political. By not calling for social action, you are forcing banks to stay in their Nash equilibrium, which is suboptimal for them.

What I want is for us to acknowledge that we have (possibly unintentionally) programmed our algorithms to be conservative, and then make the conscious decision whether we want conservative algorithms or progressive algorithms. Sadly, neutral algorithms are impossible[1] if based on their output we take actions that may affect the distribution of power.

I'm a progressive and I want a progressive agenda. You're a conservative and want a conservative agenda. Both are perfectly fine, but you must understand that there's nothing neutral in your stance. I believe political action for social change often works and can be very helpful, so I wish to encourage it; you believe the opposite, and therefore wish to discourage it. But neither of us is neutral, and your position is not based on math; your math simply serves your position. Your own equations deliberately omit feedback that we know for a fact (through the study of history) to exist. You make assumptions that reflect your conservative values just as I make assumptions reflecting my liberal values.

> Eliminating bias is a job for statisticians, not politicians.

You think that when people talk of social bias they mean statistical bias? This may sometimes happen to be (also) true, but in general, when people talk of social bias they mean any human behavior that creates a dynamic of present and future unfair distributions of power. Sadly, statisticians can't eliminate that. Or rather, they can't eliminate it alone, because politics is not (just) what politicians do. Politics[2] is what we all do when we make decisions that affect the distribution of power in society.

Interpreting statistical snapshots as predictors of future human behavior is not statistics but politics. Statistics tells us what is (or sometimes how things will likely be if we change nothing); certainly not what we should do.

------

[1]: Well, it's possible to pretend some are neutral if we artificially (and incorrectly) reduce the domain of our value functions. In an interconnected system like human society, it is disingenuous to direct our action based on its outcome for a very limited sector (like a bank), without saying what we want the action to achieve for all other sectors. So, for example, if you ask me if I want bankers to be happy, then the answer is, of course I do. But that doesn't mean I'll favor any action that makes them happy if it also makes other people less happy. My point is that our value functions must be total; we can't be lazy and assign a value to a narrow look at an outcome, but we must assign a value for a complete outcome (i.e. for all of humanity).

[2]: https://en.wikipedia.org/wiki/Politics : Politics... is the practice and theory of influencing other people... Furthermore, politics is the study or practice of the distribution of power and resources within a given community... as well as the interrelationship(s) between communities.


According to you, your vague power "theories" predict that, in the past, "the pattern is long periods of stability". In fact, that's the only concrete prediction you've ever been willing to state.

https://news.ycombinator.com/item?id=9987011

It's "conservative" to take as an assumption a prediction from your "theory"? ("Theory" is in quotes, since you work so hard to avoid testable predictions.)


> It's "conservative" to take as an assumption a prediction from your "theory"?

It's not a "theory" any more than the big bang is a "theory". It is the result of decades (nay, centuries) of research, and I don't see why you'd scoff at it so smugly (and ignorantly). Second, of course it is conservative. As I explained to someone else here, I am not expecting individuals to change their behavior without coordination. Nash equilibria can't be escaped that way anyway. The responsibility is on society. A society that acts to ensure things stay as they are is carrying out a conservative agenda.

I don't think I understand your question, though. Working to keep things as they are is a conservative ideology. Periods of relative stability are a historical fact. Those periods have been no doubt assisted by conservative ideology (although ideology is far from being the only social force), which is nothing new either. Where do you see the contradiction? People, with their opinions and needs and fears and desires and hysteria and ideology and actions and follies are the ones making the system. They are the observers, object and subjects of the system's behavior all at the same time. In a dynamical system with feedback you could be predicting a result, making it, and directing it based on your agenda at the same time. There's perfect coherence here. Look at election polls; they are somewhat similar: they both predict the result and make it, and there's a lot of political agenda involved (or see my examples of some dynamical systems here: https://news.ycombinator.com/item?id=10874683).

How people behave and how they should behave are two perhaps equally interesting but completely different questions. If it is your opinion that people should behave in a way that simply mirrors how other people behave, then yes, your answer to the second question is "according to conservative ideology". If you think that my opinion is that on occasion people should behave in a way that isn't in accordance with their immediate, narrow material self-interest, then you'd be correct. If, however, you think that I think that people often behave this way, then you'd be wrong.

Finally, it's not "conservative" but conservative, just as you are not a "conservative" but a conservative.


A theory is a machine for making testable predictions (i.e., positive claims that are a reason for rejecting the theory should they fail to be true). You go to great effort to NOT make any testable predictions.

My point is that if your one testable prediction of slow change is true, then using the past to predict the future will be highly accurate. By "accurate", I mean that if the predictor says group X will have a 40% default rate, the empirical default rate of group X will be close to 40%. (I'm assuming a good model is built, etc.) Once in a blue moon (the rapid changes) this will fail, then everyone can re-adjust their models and go back to using the slow change assumption.

This is strictly a positive (i.e., value-free) claim about the world. It's based on your positive belief.

As for your dynamical systems verbiage, feel free to state a testable prediction.


> You go to great effort to NOT make any testable predictions.

How do you figure? I don't make testable predictions when I don't have the tools to make them. I will just point out that 1. perfectly predictive models could be completely wrong in a dynamical system with feedback, and 2. that human society has been known to be such a system.

> This is strictly a positive (i.e., value-free) claim about the world. It's based on your positive belief.

You are absolutely right. Not only have I said the exact same thing, but I even went a step further and showed how making that prediction could be right even ad infinitum. That does not, however, mean that there isn't another correct prediction and that the choice between the two isn't ideological. Systems can (and do) have more than one stable state. I will say this again: your predictions both predict the outcome and determine it (https://news.ycombinator.com/item?id=10874683).

I can give you an example that may make the point clearer: if society insisted that humans can't fly, it would have been correct for a very long time. If it had insisted that pursuing such goals is wasteful, it would have been correct to this day. This isn't mere speculation. In 1485, the sultan of the Ottoman empire forbade the printing of Arabic books (because the Arabic script is holy). That decision helped turn Arab society from the most advanced in the West in the middle ages, to a rather underdeveloped one.

> As for your dynamical systems verbiage, feel free to state a testable prediction.

That I've shown your math to be wrong does not mean that I have a better formula. I don't. I don't know how to predict society's behavior with any useful accuracy, but I do know how to spot the errors in your predictions. This is made easy by two facts: 1/ that identical predictions could have been made in various periods throughout history and they would have been wrong then, and 2/ that your biases are so strong, and you so insist on remaining blind to their existence, that you keep making the same wrong assumptions over and over.

I am truly sorry: we don't yet know how to mathematically model society well. That, however, does not mean that we can't study it and make some useful qualitative observations. When I studied non-linear differential equations (only ODEs, though), I was dismayed that the same could be said of much simpler models than human society, but there you have it.

But if you insist on something, here you go: if society decides to continue to bet 1, the result will be 1 until unforeseen forces change it; if, however, society decides to bet -1, the result will rather quickly converge to -1.


> I don't make testable predictions when I don't have the tools to make them.

You just admitted you don't have a theory (aka, the tool to make testable predictions).

Also, you haven't shown the math to be wrong. All you've done is speculate that if we start making bad loans to groups that are disproportionately likely to be deadbeats, they'll suddenly start paying back their loans. If you really believe that (I don't think you do) then go forth and earn your billions.


[flagged]


[flagged]


[flagged]


Would you two please stop going at it in the most tedious way every time you both show up in a thread?


> Schneier is discussing an unpleasant fact; unbiased algorithms often discover that things we previously attributed to bias were actually unbiased predictors. But we don't like to admit this, so we instead make silly statements like the above.

There is a legitimate question as to whether an algorithm is actually unbiased. It could be quite accurate or there could be a bias, e.g. because the algorithm penalizes drug convictions and drug convictions have an unjustified bias due to racist or sexist enforcement.

The problem is it isn't a simple question to answer. And people are going to want to choose their answer based on their politics.

But that can't work. We can't let a black box take discriminatory inputs and expect it to produce a just result. We also can't just assume that every unequal outcome had an unjust cause, or we'll end up e.g. causing loans to be given to uncreditworthy people and wrecking the economy (see also housing crash).

Which means that Schneier is wrong about quotas but exactly right that what we need is transparency. Because if there is unjustified bias then we need to understand why so we can prevent it, and if there isn't then we need to understand why so we can accept it.


Consider your drug conviction example, and let's say the algorithm is linear regression so I can give a very simple explanation.

The model will then be FICO = stuff + (-1) x (drug conviction).

I.e., your FICO score is lowered by 1 point if you've been convicted of drug use.

If you are correct, some enterprising quant can take your hypothesis. He can then rerun his regression with (drug conviction, black) and (drug conviction, white) as variables. If your hypothesis is correct the result will be:

FICO = stuff + (-0.5) x (drug conviction, black) + (-1.5) x (drug conviction, white).

This is because the bias in drug convictions makes it less predictive for black people.

If you are wrong the coefficients will be equal [1]. If you are right he just made millions of dollars for the bank he works for and probably added $100k to his bonus.

What you are discussing is strictly a statistics problem.

[1] For examples of this type of analysis on social indicators (I don't know of any on credit allocation), see these papers: https://www.rand.org/content/dam/rand/www/external/labor/sem... http://egov.ufsc.br/portal/sites/default/files/anexos/33027-... http://ftp.iza.org/dp8733.pdf
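
To make this concrete, here is a rough Python sketch of the kind of regression check described above. It is only illustrative -- the column names (fico, income, drug_conviction, black) and the data file are entirely made up, not anyone's production model:

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("loan_history.csv")  # hypothetical historical loan data
    df["conv_black"] = df["drug_conviction"] * df["black"]
    df["conv_white"] = df["drug_conviction"] * (1 - df["black"])

    # If biased enforcement makes convictions less informative for black
    # borrowers, these two coefficients should differ; otherwise they
    # should come out roughly equal.
    fit = smf.ols("fico ~ income + conv_black + conv_white", data=df).fit()
    print(fit.params[["conv_black", "conv_white"]])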


You're essentially making an argument from the efficient market hypothesis:

https://en.wikipedia.org/wiki/Efficient-market_hypothesis

In other words, the algorithm won't be biased because there is a buck to be made by removing the bias.

But the efficient market hypothesis is weird. It's kind of like Schrodinger's Cat. If you can find some specific instance where it's wrong and publish it then the market corrects itself, so if you observe a bias then it ceases to exist. So now on one hand, we know that biases can exist because people have discovered them in the past; it's just that those biases have already been published and are being taken into account. We can speculate that more exist but we don't know what they are. On the other hand, maybe now we've finally found them all (or the remaining ones are negligible) and the efficient market hypothesis becomes true.

But notice that the thing to do if you believe the efficient market hypothesis is wrong in a particular case is clear. Prove it, because you'll be able to make a lot of money by making the world more just.

Which is why we need transparency. Because the market participants can't find the bias if they don't have the information, and then the efficient market hypothesis will certainly be false.


I made no claims about the EMH. My only claim is that if bias like what you describe exists, it's something that statisticians rather than politicians need to identify and fix, and that the statisticians in the position to fix it have perfect incentives to do so.

Politicians have neither the ability nor incentive.

Similarly, if someone were claiming that we need political oversight of UI/UX choices, and that American web pages need more red/white/blue to improve conversion rates, I'd suggest that this is a job for web designers rather than politicians. I'd also say that if you just want red/white/blue designs for patriotic purposes, you should openly advocate for that and not talk about conversion rate.


The point is that the FICO score you assign someone -- any score -- is a strong predictor of their future scores, and of the scores of those dependent on them. If the current scores are unfairly distributed in society due to other historical effects, the computation will actively work to maintain that distribution.

A quant doesn't care how they make their money, and they certainly have no desire to leave their Nash equilibrium.

Again, if you say you want to enact a conservative policy, say so. It's perfectly OK. But don't program conservatism into an algorithm and then claim "it's only math". It isn't.

Of course, being in a Nash equilibrium makes conservative algorithms much easier, as they don't require bargaining with other players, but that doesn't make them any less conservative.


Suppose we have some algorithms which are trying to predict which team will win a game. People are going to bet money based on what the algorithms say, and teams benefit when more money is bet on them.

The red team plays dirty.

First they spread false rumours about the blue team. Most of the algorithms respond by reducing the blue team's chances of winning against the red team, but algorithm #5 successfully predicts that the rumours were false, and people who bet using algorithm #5 make a lot of money. Now lots of people use variants of algorithm #5 and thereafter the red team has no luck getting anyone to believe their false rumours.

In other words, if the algorithms are incorrect then we're all better off to fix them.

Next the red team successfully lobbies the government to raise the blue team's taxes (even though the red team makes more money). Now the blue team has less money for training and equipment, which has actually reduced their chances of winning the game. Most of the algorithms successfully predict this, and then the blue team loses the next game.

Your argument seems to be that the red team should be punished for this by having your algorithm stop preferring them. But now you're screwing over everyone else relying on your algorithm, who weren't responsible for the behaviour you're trying to punish. And thereafter no one trusts you to provide true information.

If the red team is playing dirty and the math accurately predicts the result, the problem is not the math.


Your analogy is incorrect because when it comes to social systems, how people bet actually changes the outcome, at least in the long run. So the dynamics are more like this: the team that wins is the one that the most people bet would win.

In general, the problem is not the math but our interpretation of the result. All the result says is that if we change nothing, this is the likely outcome. By interpreting the results to mean merely "this is the likely outcome", we are depriving people of the opportunity of getting out of local optima. Deciding to coordinate or deciding to have society maintain its Nash equilibria are both political choices -- not mathematical outcomes.


> Your analogy is incorrect because when it comes to social systems, how people bet actually changes the outcome, at least in the long run.

That stipulation doesn't make the analogy incorrect. It's analogously an argument that we should bet on the team likely to lose tomorrow because more people betting on them will make them more likely to win next year. That is not a successful betting strategy even if it's true. It's just charity. And there is nothing wrong with charity, but then why use such a convoluted mechanism? If charity is what you want then just give your money to the blue team.

> In general, the problem is not the math but our interpretation of the result. All the result says is that if we change nothing, this is the likely outcome.

Which is what we need to know unless we're specifically trying to change something. And if we can find some particular unjustified bias in the algorithm then we are, and then we know what to do because we can adjust the algorithm. But if we can't find any such thing to adjust, if the algorithm is unbiased and accurate so far as we can tell, then what is it we're supposed to be trying to change?

Just looking at e.g. the racial composition of the output doesn't actually tell you if anything is wrong. It could be (and often will be) that the algorithm is measuring other things that are correlated with race.


> That is not a successful betting strategy even if it's true.

If we can determine the outcome, why is it not a successful betting strategy?

> It's just charity.

For something to be charity, you need to determine that the receiver does not rightfully deserve it, and the giver does. That distinction relies on values, and every person may come up with a different labeling of something as charity or not. I studied history and my views are largely shaped by that. I believe that the current distribution of power and resources in society is largely a result of what you may call "charity" to the people who are now rich (and that's putting it very kindly).

> Which is what we need to know unless we're specifically trying to change something.

As what we do or don't do determines the outcome, any decision is a political choice. Saying that we're specifically trying to change something is no different from saying that we're specifically trying to keep something the same. Today's bet is tomorrow's outcome, and you must place a bet.

> Just looking at e.g. the racial composition of the output doesn't actually tell you if anything is wrong.

We liberals have an assumption. It is no more arbitrary than conservative assumptions. The assumption is that -- unless proven otherwise -- no group of people wishes to yield power over themselves to others, and that different groups have similar capacity for "power-gaining" achievement. Therefore, if we look at the racial or sexual makeup of a certain source of social power and we find a gross disparity, we want to balance it.

> It could be (and often will be) that the algorithm is measuring other things that are correlated with race.

Absolutely. The problem is not what the algorithm tells you, but what you do with the information. Because what determines the future outcome is not the output of the algorithm, but your decision on how to act.


> If we can determine the outcome, why is it not a successful betting strategy?

Because the bet is for what happens today but what the bet changes is what happens the next year or the next generation. It's giving someone a car loan when you know they're likely to default and then pay pennies on the dollar in collections. They will then have paid half the market price for their car and you will have paid the other half. That is clearly not profitable for you so doing it on purpose is charity.

> For something to be charity, you need to determine that the receiver does not rightfully deserve it, and the giver does.

Gibberish. Charity means giving to the less fortunate. The robber barons were terrible but you can't arbitrarily redefine words in order to claim they didn't give to charity.

And you're missing the point. I don't care what you call it, if your goal is to have the government take from the rich and give to the poor then just do that. Collect simple taxes and then give the money away. Don't come up with weird complicated economically distorting contortions with large and hard to predict negative externalities.

> As what we do or don't do determine the outcome, any decision is a political choice. Saying that we're specifically trying to change something is no different from saying that we're specifically trying to keep something the same.

Of course it's a political choice, but that doesn't tell you anything about what should be done or not done. And saying you're specifically trying to change something tells people what you're specifically trying to change.

> Therefore, if we look at the racial or sexual makeup of a certain source of social power and we find a gross disparity, we want to balance it.

But your categories are arbitrary. Race is a thing idiots made up. It isn't a real thing, it's a social construct. What justifies groups to be defined as they are? Why don't we also care about the economic disparities (which actually exist) between Irish Americans and English Americans? Or short and tall? Ugly and pretty? Consider whether the reason is because those lines don't map with existing political coalitions.

Everyone who is not born rich is so because of some historical misfortune not shared by their rich-born contemporaries. That's the line that sums up all the other lines. If you want to help the poor, help the poor. If you find a specific instance of racial discrimination (as in causation not correlation) then stamp it out. But Mercedes is not doing something wrong or obligated to do something different just because there is a racial disparity in who can afford their cars.


> Don't come up with weird complicated economically distorting contortions with large and hard to predict negative externalities.

I am not trying to. I am simply pointing out that treating past data as a future predictor and acting accordingly on the assumption that the social structure doesn't change as a result of your actions is a conservative political action; not a neutral one, and certainly not an objective one justified by "math".

> It isn't a real thing, it's a social construct.

A social construct is often as real as it gets.

> What justifies groups to be defined as they are?

You're looking at it the other way around. If someone arbitrarily makes up a group (say, race), and based on that arbitrary bias creates a social structure where that group is discriminated against and marginalized from power, fixing this bias is correcting an unfairness. We're not trying to make up groups in order to fix the situation; the groups had already been made up in the process of doing the wrong.

> But Mercedes is not doing something wrong or obligated to do something different just because there is a racial disparity in who can afford their cars.

I never said that what they're doing is wrong. But a society that is not telling Mercedes to treat data differently is carrying out a conservative policy. I am not saying that changing that particular behavior is where we should best direct our efforts at change, but I am saying that we shouldn't pretend there's anything neutral about acting in this way.

I am also not saying that people (or companies) have a responsibility to individually act against their self interest; that's not how Nash equilibria are escaped anyway, so that's just not going to work. Social action is best done through consensus and compromise. Just as political systems now cooperate on not changing things, they can cooperate on changing them.


> I am simply pointing out that treating past data as a future predictor and acting accordingly on the assumption that the social structure doesn't change as a result of your actions is a conservative political action; not a neutral one, and certainly not an objective one justified by "math".

That doesn't actually mean anything. All action and inaction has consequences. There is no neutral.

An algorithm is objective because it's falsifiable. If you're trying to predict e.g. whether someone will pay back a loan then every time the algorithm tells us to make the loan we can see if it was right. We can also take a random statistical sample from the times it says not to make the loan and do it anyway to find out what happens then. And if the algorithm made bad predictions then we can improve it.
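
As a toy sketch of that audit (all names, thresholds and rates here are made up, not a real lending policy), the lender approves a small random slice of would-be rejections and later compares their realized default rate with what the model predicted:

    import numpy as np

    rng = np.random.default_rng(0)

    def decide(p_default, threshold=0.2, audit_rate=0.02):
        """Approve low-risk applicants; also approve ~2% of rejections at
        random so the model's predictions on that group stay falsifiable."""
        if p_default < threshold:
            return "approve"
        return "approve_for_audit" if rng.random() < audit_rate else "reject"

    def realized_default_rate(outcomes):
        """outcomes: 0/1 defaults logged for the randomly audited approvals."""
        return sum(outcomes) / len(outcomes) if outcomes else float("nan")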

Now suppose we have a nice algorithm. Best available information. Prediction accuracy very good. It says the risk of John not paying back the loan is higher than the value of any reasonable interest the lender could charge.

There is now an obvious objectively correct answer to the question of whether the lender should typically loan John the money, if the lender wants to stay in business. It isn't "neutral" because nothing is. But it is unaffected by John's race.

> A social construct is often as real as it gets.

We made it, we can unmake it.

The low income housing people proved that concentrating poverty is dangerous. You put all the low income housing together and it becomes a slum. You put one or two low income units in each of several middle class neighborhoods and they don't.

Grouping a vulnerable population together has the same danger. It draws an imaginary line around them, separating them from everyone else, concentrating their shared troubles and isolating people who need help from people who could provide it.

Grouping humans by "race" is poison, even if you're trying to help.

> You're looking at it the other way around. If someone arbitrarily makes up a group (say, race), and based on that arbitrary bias creates a social structure where that group is discriminated against and marginalized from power, fixing this bias is correcting an unfairness.

You didn't create the groups but you're choosing which ones to care about. There are so many different groups that meet those criteria that you'll never meet someone who isn't in at least one of them. And they correlate and half-correlate and complement each other. You can't balance that.

But you also can't balance it because humans are not a fungible commodity. They're not equivalent. If you have fifty white coal miners, ten black doctors and three white CEOs, and you just average their salaries and say "white people have an advantage" because the CEOs each make a billion dollars, the coal miners are going to disagree with you and have a legitimate point.

Let's even pretend we can balance humans to see how quickly absurdity follows. So the problem we want to solve is that there are a higher proportion of low income black Americans than white Americans. OK, so we either need fewer low income black Americans or more low income white Americans. Possibility: Eject some low income black Americans. Nope, violates our principles and the US constitution. (But the fact that it would otherwise be effective should make you suspicious.) Next possibility: Get more low income white Americans. There are millions of low income people in eastern Europe who would like to be US citizens, so we let them. Hurray! Racial disparity in America solved!

If what we really care about is "balance" then that is an actual solution, but it's also so obviously ridiculous that it demonstrates how that can't possibly be the real problem.

And if it is to be a real problem then it also demonstrates why you can't solve it. Because the opposite of that is what has been happening. A large proportion of existing white American families immigrated here after the abolition of slavery. They, at a minimum, had enough wealth and education to afford passage (in the early days) and gain American citizenship (in modern times). So they've been bringing up the average the whole time. You're trying to balance an open system.


> An algorithm is objective because it's falsifiable. If you're trying to predict e.g. whether someone will pay back a loan then every time the algorithm tells us to make the loan we can see if it was right.

I'm sorry, but that's just mathematically wrong in the presence of feedback: https://news.ycombinator.com/item?id=10874683 When you have a dynamical system with feedback, you can have a perfectly predictive model that is both wrong and unobjective.

> Grouping humans by "race" is poison, even if you're trying to help.

I agree. But I am not trying to group them. I am trying to undo the damage of the grouping.

> You didn't create the groups but you're choosing which ones to care about. There are so many different groups that meet those criteria that you'll never meet someone who isn't in at least one of them. And they correlate and half-correlate and complement each other. You can't balance that.

That's true in theory. Which is why you go to historians and anthropologists who seriously study this stuff and ask them. It turns out racial minorities and women are pretty much the big ones, not just in the West but in most societies (though not all).

> If what we really care about is "balance" then that is an actual solution, but it's also so obviously ridiculous that it demonstrates how that can't possibly be the real problem.

Let's say you have a headache. One solution is for me to kill you. Hurray! Problem solved! Headache gone! This shows that your headache couldn't have possibly been a real problem. WAT?

> And if it is to be a real problem then it also demonstrates why you can't solve it.

I studied history in grad school. And one of the constant things in history is that people will always explain why the problem can't be solved. You can go online and read some of the extremely elaborate, pseudo-intellectual explanations why letting women vote would be disastrous, why slavery is good for blacks, and why women should never be allowed to practice law or medicine. Yet social action solved all three. I could try to explain why your reasoning is wrong but it would take too long. In short, you think that the solution must necessarily solve the proximate cause without studying the ultimate cause, and you're wrong about what constitutes the main effect. The "open system" accounts for a very small portion of the problem.

One of the advantages to studying history is the perspective you get about things that today seem natural and immutable to us, and then you learn that things haven't always been like that. The social structure in society is constantly changing. The only question is whether we want to help direct the change as much as we can, or act like incurious beings that just let things happen to them without understanding them. I always find it curious (though not really; it is a known and familiar phenomenon, which happens over and over) how the people who are otherwise the most curious about nature and technology become primitive and unquestioning when it comes to human society. All of a sudden the model is either too simple, or too complex, and we certainly cannot change it. Which is funny, because people had never flown (on wings) before the invention of the airplane, yet people have completely changed the social structure over and over, through political activism.


do you mean red team / blue team as in attack / defense or as in red tribe / blue tribe?

or just names for teams?


> Which means that Schneier is wrong about quotas

I'm not a big fan of quotas in general, don't think they are the end-all solution to discrimination problems, and know they can look like discrimination themselves.

But there is an interesting point of view to consider. When we use these kinds of algorithms to optimize some wanted result, we are always applying some form of reinforcement learning [1]. Reinforcement learning always needs to balance exploration and exploitation of our current knowledge, and quotas are one socially justifiable way to increase exploration - and not get stuck in a self-reinforcing local maximum with a positive feedback cycle.

So, even from a purely utilitarian point of view, a certain amount of reshuffling of the cards (eg with quotas, but not limited to those) can be useful.

[1] https://en.wikipedia.org/wiki/Reinforcement_learning
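
As a toy illustration of that trade-off (a two-armed bandit with made-up numbers, nothing to do with any real credit model): with epsilon = 0 the second arm is never tried and the initial estimates become self-fulfilling, while a little forced exploration -- the role a quota would play -- lets the better arm be discovered:

    import numpy as np

    rng = np.random.default_rng(0)
    true_payoff = np.array([0.5, 0.6])  # arm 1 is actually better
    estimates = np.array([0.55, 0.0])   # but the starting data favors arm 0
    counts = np.array([1, 0])

    def pick(epsilon):
        if rng.random() < epsilon:        # forced exploration
            return int(rng.integers(len(estimates)))
        return int(np.argmax(estimates))  # pure exploitation

    for _ in range(10_000):
        arm = pick(epsilon=0.05)
        reward = float(rng.random() < true_payoff[arm])
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    print(estimates)  # with epsilon=0, arm 1 would never have been tried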


The problem is not so much about whether predictions are biased. It's a problem of whether the algorithm is reinforcing pre-existing biases. In other words, it's about taking short-term losses in return for long-term gains.

Suppose a certain group of people have experienced some kind of historical discrimination, and as a result people in that group are starting from a disadvantaged position (wealth, etc) compared to the rest of the population. It may well be an unbiased prediction that such people will be a greater credit risk, etc. However, by making decisions solely on the basis of this, although the bank saves some money in the short-term, it reinforces the cycle of discrimination.

If your goal is to improve society in the long-term, making decisions based on these short-term predictors is not a desirable thing to do.


This is going to get really hairy. Who gets to decide the protected classes? The classic ones grouped by race, religion, gender, etc will be too crude when computers are judging. Imagine some weird class like people who are old but don't wear prescription glasses - perhaps indicating that they're too irresponsible with money to be able to pay for them. Nevermind that they're an exception whose eyesight is OK if that data isn't visible to the computer. There could be all kinds of unexpected special cases like this that lead to individuals getting generally bad treatment throughout life.


Protected classes aren't about justice, they're about identity politics. If a group can deliver votes or violence as a bloc and demand group representation then they'll get it. But invisible groups like left-handers or citizens who are not native born have immense difficulty coordinating and mostly don't.


This sounds reasonable but you're missing a piece here. In addition to algorithms and output, there's a 3rd piece: input.

We don't want to mandate the outcome of our decision process. We want a fair and unbiased decision process where if the input is fair, the output will be fair.

We want fair and unbiased decisions, but in addition to the internals of our process, we need to ensure the input is fair before we'll accept the outcomes.

The real problem is people pretending the input is fair, that the effects of historical discrimination have evaporated, against protected classes or otherwise (the queer community comes to mind).


We don't need fair and unbiased inputs. In the real world these don't exist, and the field of statistics has devoted extensive effort to the problem of getting good answers out of bad inputs.

E.g., for an unbiased measurement (m = x+g, g ~ N(0,\sigma^2)), you use the likelihood function G(x-m,\sigma^2). For a biased measurement g ~ N(b, \sigma^2), you use the likelihood function G(x-m+b, \sigma^2). Here G(x,\sigma^2) is the pdf of the distribution the noise g is drawn from.

But this is mainly just something that can be determined by studying the process. Mandating specific outcomes is just a way of ensuring that the predictor can't discover certain facts.


> and the field of statistics has devoted extensive effort to the problem of getting good answers out of bad inputs.

You are talking about measurement bias. Social bias is not measurement bias. Social bias may be a real statistical distribution; it is called "bias" because it is caused (in the sense of physical causality) and reinforced by biased human perception and the actions that follow. When we talk about "fixing bias" we don't mean "offset measurement errors", but change the actual distribution through action.

> Mandating specific outcomes is just a way of ensuring that the predictor can't discover certain facts.

Exactly. And neglecting feedback from the equations mandates a specific (conservative) outcome. See my reply to barry-cotter for examples of dynamical systems where conservative assumptions can lead to empirically valid yet mathematically wrong results, by failing to detect other stable states: https://news.ycombinator.com/item?id=10874683

If your dynamic system contains feedback, it is quite possible (and very likely) to come up with a model that may be completely predictive, yet quite wrong, and then use that mistake to justify conservatism (what's worse, it is ironically a mathematical error that helps paint a political view as a neutral, "mathematical" one).


I'm not sure what your example is supposed to prove. Besides the fact that your likelihood function isn't an "unbiased algorithm" like your original comment was talking about, it assumes the bias of the input is known and accepted rather than heavily contested.

We certainly shouldn't mandate specific outcomes. We should analyze processes, outcomes, and inputs alike for bias.


The algorithm would be Bayesian inference using the aforementioned likelihood.

The example is a simple illustration of how to fix bias. If the bias is unknown then treat it as another unknown variable and use Bayesian inference to identify it. If this is unfamiliar, I'm actually publishing a blog post about it tomorrow (how to find the bias in your phone's compass) which you might find helpful.
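
For instance, here is a minimal grid-based sketch (not the blog post's code; every number here is invented) of treating a constant unknown bias as just another variable and inferring it from repeated measurements of a known reference value:

    import numpy as np

    rng = np.random.default_rng(1)
    x_true, b_true, sigma = 10.0, 1.5, 0.5
    m = x_true + b_true + sigma * rng.standard_normal(50)  # biased, noisy data

    b_grid = np.linspace(-5, 5, 1001)
    log_prior = -0.5 * (b_grid / 3.0) ** 2                 # N(0, 3^2) prior on b
    log_like = -0.5 * ((m[None, :] - x_true - b_grid[:, None]) / sigma) ** 2
    log_post = log_prior + log_like.sum(axis=1)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    print("posterior mean of the bias:", (b_grid * post).sum())  # close to 1.5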


Yes, I used that when I worked on sensor fusion algorithms and we had to overcome both random and bias errors in measurements provided by radars. In this case, however, your math is wrong. Radars and compasses don't change the actual position of aircraft or the north, but social predictions really do change future real outcomes (we know that thanks to research in history, social psychology and sociology). The system is dynamic, not static, and has a very strong feedback[1]. Neglecting that feedback makes you miss other stable states; or, in other words: gives you a wrong answer.

Just to show you how tricky that feedback can be, I can tell you that in our tracking algorithms we had to carefully track the error's growth with time, as well as allow for unpredictable change in the model (how the aircraft behaves). But if your system has self-reinforcing bias -- i.e. the feedback is stabilizing (as in my example of the {-1, 1} dynamic system) -- you can neglect either of those things and your correct predictions would lead you to the conclusion that your model is actually simpler than the radar case, when, in fact it is far more complicated.

Our mistakes are rarely arithmetic errors when applying formulas. They are almost always in the hidden (or not) assumptions we make about the model, and in how we interpret the results[2]. In the case of 99.9% of social statistics, the results mean, "this is the likely outcome provided nothing changes, including not using this result to take any action". Interpreting the results any differently would be making assumptions completely unjustified by math.

----

[1]: Trying to correct biases in sensor measurement also has a feedback effect on future measurements, but it is a predictable and simple feedback that we can usually fix (with gradient techniques as we know the derivatives, though drift may still occur) but that is still not at all the kind of feedback we have in social systems.

[2]: Of course, it is often the same thing: how we interpret the result of any mathematical statement depends on the assumptions we've made to arrive at it. In order to apply a proven theorem you must make sure that its assumptions are held in your case. If you've ever used any mechanical provers like proof assistants, you know how unforgiving they are in that regard; you must spend as much care and effort on the assumptions as on the proof.


Published the aforementioned post, which illustrates exactly how biased measurements are just a stats problem: https://www.chrisstucchio.com/blog/2016/bayesian_calibration...


It's not just about bias. Even if we know there is predictive value, there's a notion of whether it's fair to include it. Regardless of what it correlates with, it seems unfair to make someone more/less likely to get a loan because of the color of their skin (or anything else they cannot change). There are some attributes that we should turn a blind eye to.


Why should fairness be a goal though? If men really are more dangerous drivers, why shouldn't they pay more in car insurance? Or teenagers? It's not like they can control their age, but they are objectively more likely to get into an accident. So are people with eyesight problems, etc.

Algorithms should make the most accurate predictions possible. If somehow we had a time machine and could just see the future, we would do that. People who are more likely to get in accidents should pay higher insurance, or bank loans, or whatever. If you are less of a risk, then you should pay less.

Omitting statistically relevant features is silly. It's not like the algorithm has human biases. It doesn't hate people of a different race, it's not sexist. It just looks at the cold hard facts and makes the best prediction possible.


Schneier's point is not only that those attributes should be ignored, but there should be no correlation between those attributes and the end result of the algorithm. You can quite easily find examples of algorithms which do not include race or gender as an input, yet have a disparate impact. Schneier is saying that algorithms with disparate impact should be avoided without regard to procedural fairness, proportionality of outcome, or correctness.


> unbiased algorithms often discover that things we previously attributed to bias were actually unbiased predictors

Absolutely. Or the exact opposite: that the algorithms simply reflect results caused by bias, a far more likely relationship given what we know of history.

The reason we're against this is not because we think our biases are not supported by real data; we know that some of them most certainly are. But we also know that a snapshot of current society is often little more than a reflection of the social forces that have made society what it is at this particular point in time, and treating them as immutable (or simply placing faith in their persistence) is nothing more than a political policy that helps perpetuate them. Treating past correlations as future predictors in a dynamic system such as society is programming conservatism into your algorithm. Acting on them is a political policy which strives to maintain the status quo -- there's nothing wrong with that per se, if that's what you want, but we need to be aware of this fact. Alternatively, we could program the algorithms to predict or even assist with social change.

The worst thing we can do is accidentally program politics into algorithms, and then fall for our own ruse and start believing that the algorithms are apolitical. That takes away our agency in making political choices.


> Or the exact opposite: that the algorithms simply reflect results caused by bias, a far more likely relationship given what we know of history.

Unsupported assertion

> But we also know that a snapshot of current society is often little more than a reflection of social forces that has made the society what it is at this particular point in time, and treating them as immutable (or simply placing faith in their persistence) is nothing more than a political policy that helps perpetuate them.

This is anti-inductive. Knowledge can only be contingent based on past experience. This is John Stuart Mill stuff, empiricism.

> Treating past correlations as future predictors in a dynamic system such as society is programming conservatism into your algorithm.

Yes. All anyone has to go on in making predictions is past data. If something changes radically in the environment then the model may break. This does not mean we should just make shit up for some of our sample data/training set.

> Acting on them is a political policy which strives to maintain the status quo -- there's nothing wrong about it per se, if that's what you want, but we need to be aware of this fact.

A model of the world can be more or less accurate. That is apolitical. It can be biased. This can be a political matter. So reduce or eliminate the bias.

> Alternatively, we could program the algorithms to predict or even assist with social change.

If you want to enact social change go for it but stop pretending that this has anything to do with statistics, modeling, prediction or math, just go straight to the politics.


> Unsupported assertion

It is well supported. History is full of examples of social change resulting from abandoning biases (sometimes freely, though usually due to "encouragement" by other forces, often economic or demographic). From the "Struggle of the Orders" in the Roman republic to the rise of the middle class in Europe, if you are of European descent (and probably even if you are not), then your own education and social standing are likely a result of such change.

> This is anti-inductive. Knowledge can only be contingent based on past experience. This is John Stuart Mill stuff, empiricism.

Quite the contrary. Suppose the dynamics of a system are such that it behaves like this: -1, 1, -1, 1, -1, etc. If your previous observation was 1, your prediction should empirically be -1. If you are presupposing that the state of the system is unchanging -- and therefore predicting 1 -- that is anti-inductive.

But consider a more interesting system. It can obtain the values 1 and -1, and its initial state is 1, but its future state is identical to your previous prediction. What should your prediction be, then? You cannot make an inductive argument for favoring either prediction, and the correct answer is "whatever you want the result to be".

Finally, consider a more interesting system yet. The first three states are 1, and thereafter the state is the majority of your last three predictions (another interesting next-state formula is: -1, if the previous three predictions are all -1, or 1 otherwise). What should your predictions be, then? The correct answer is still "whatever you want the result to be" (although you may miss a few), but the difference is that this time some people will (incorrectly) argue that predicting 1 is just the "mathematically" right thing to do. It is not. It is simply a system that makes it easy to mistake a conservative policy for a neutral one.
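
A toy simulation of that third system (purely illustrative code): the state follows the majority of the last three predictions, so a predictor that always says 1 looks perfectly accurate, and a predictor that always says -1 soon looks just as accurate -- the prediction is choosing the outcome:

    def run(predict, steps=20):
        states, preds = [1, 1, 1], []      # the first three states are 1
        for _ in range(steps):
            p = predict(states)
            preds.append(p)
            if len(preds) >= 3:            # majority of the last three predictions
                nxt = 1 if sum(preds[-3:]) > 0 else -1
            else:
                nxt = 1
            states.append(nxt)
        accuracy = sum(p == s for p, s in zip(preds, states[3:])) / steps
        return states, accuracy

    print(run(lambda s: 1))    # stays at 1 forever; predictions look perfect
    print(run(lambda s: -1))   # flips to -1 within a few steps; then looks right too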

> This does not mean we should just make shit up for some of our sample data/training set.

No, but it does mean we should study society in depth rather than arbitrarily decide that what was is what will continue to be, especially if by making that decision we are forcing that outcome. I propose that we stop making shit up when modeling society by treating past behaviors as future predictions, and instead try to study social dynamics. Once we understand those -- more or less -- we should decide what future outcome we want. Once we know that, we can decide how to factor past data into the system's dynamics so that we can get the outcome we want.

> If you want to enact social change go for it but stop pretending that this has anything to do with statistics, modeling, prediction or math, just go straight to the politics.

I am. I'm just pointing out that no view is neutral. Interpreting past behavior as future behavior is a conservative political action.


Paraphrasing the Sussman-Minsky story [1], just because you rely on judgement does not mean there is no algorithm; it's just that you don't know what the algorithm is. In a country like China with rampant corruption, algorithms are a welcome relief. Schneier, having never faced any kind of corruption at the individual level, completely ignores this giant problem. While the exact details of the algorithm might be debatable, having an algorithm as opposed to corrupt bureaucrats making arbitrary decisions is a welcome step.

[1] "So Sussman began working on a program. Not long after, this odd-looking bald guy came over. Sussman figured the guy was going to boot him out, but instead the man sat down, asking, "Hey, what are you doing?" Sussman talked over his program with the man, Marvin Minsky. At one point in the discussion, Sussman told Minsky that he was using a certain randomizing technique in his program because he didn't want the machine to have any preconceived notions. Minsky said, "Well, it has them, it's just that you don't know what they are." It was the most profound thing Gerry Sussman had ever heard. And Minsky continued, telling him that the world is built a certain way, and the most important thing we can do with the world is avoid randomness, and figure out ways by which things can be planned. Wisdom like this has its effect on seventeen-year-old freshmen, and from then on Sussman was hooked."


Regarding the Sussman story, it made it into a Hacker Koan

https://en.wikipedia.org/wiki/Hacker_koan#Uncarved_block


This article is mainly about algorithms used by the government for surveillance and control. Bad stuff, of course.

But in general, I think algorithms are something to embrace rather than fear:

Even very simple algorithms reliably outperform experts by a large margin, in certain domains: http://lesswrong.com/lw/3gv/statistical_prediction_rules_out...

Despite this, humans are heavily biased against algorithms. Even when we know they perform much better than humans. We also tend to vastly overestimate human performance: http://lesswrong.com/lw/lsc/link_algorithm_aversion/

This has a lot of consequences. Humans are terribly biased, and often aren't even aware of our biases. Not just stuff like race, but even people that are ugly are heavily discriminated against: http://lesswrong.com/lw/lj/the_halo_effect/ We discriminate against people of different political parties even more than we do based on race: http://slatestarcodex.com/2014/09/30/i-can-tolerate-anything... Or just random factors we can't predict. Nothing about being judged by a human is "fair".

And beyond fairness, they just give better predictions. Humans just aren't very good at prediction, even when we think we are. And in some domains that's extremely important, like in medicine. The first link referenced several places where statistical methods did better at diagnosis than doctors who specialized in it. Yet there has always been massive resistance to using algorithms in diagnosis.


> I think algorithms are something to embrace rather than fear

He has the same sentiment in his article.


Replacing judgment with algorithms has always been part of bureaucracy.

Once encoded as programs, though, they have a chance -- however unlikely, and still not without flaws -- of being audited either by the public or at least by independent auditors.


Why do you suppose that programs written in a programming language will be audited more effectively than programs written down on paper?


A large part of our lives is already devoted to gaming the human judgement system. The clothes we wear, the way we speak, the way we walk. Who we associate with - "Associate with deadbeats, and you're more likely to be judged as one." - has always happened in real life too. Women wear makeup to work to game the judgement system and men try to show off their confidence. All just to try to influence how other people see them.

So adding computers isn't really a big deal. It's already a major part of our daily lives and efforts.


I think we can probably get transparency and ethics built into the algorithms. But I think we'll have a much harder time controlling how pervasive and intrusive the algorithms are. And as a result, our behavior will be more effectively controlled by corporations and governments.


If any such algorithms are going to be built, they had better be open source or able to explain to me how they arrive at their judgments. I am not going to trust some black box on this.



calling attention to a story without upvotes or comments isn't very useful.


There's already very active mathematical work in this area. See fatml.org. I appreciate Jeremy Kun calling this out in Schneier's comments.


>The first step is to make these algorithms public. Companies and governments both balk at this, fearing that people will deliberately try to game them, but the alternative is much worse.

This needs a lot more than the one sentence given to justify it. How do we know that an open source security algorithm will be better? The justification isn't powerful enough for the claim.


It seems to me that the whole article is about proving this point.

Relevant quotes:

> The secrecy of the algorithms further pushes people toward conformity. If you are worried that the US government will classify you as a potential terrorist, you're less likely to friend Muslims on Facebook.

> If you know that your Sesame Credit score is partly based on your not buying "subversive" products or being friends with dissidents, you're more likely to overcompensate by not buying anything but the most innocuous books or corresponding with the most boring people.

> Uber is an example of how this works. Passengers rate drivers and drivers rate passengers; both risk getting booted out of the system if their rankings get too low.

> Many have documented a chilling effect among American Muslims, with them avoiding certain discussion topics lest they be taken the wrong way.


Those are trying to establish a loss from having these algorithms. They don't show whether the benefits of open source outweigh the costs. That part only seems to be handwaved in the sentence I quoted.


I think you're being unfair. The author has given us many clear issues with closed source algorithms that would not exist with open source algorithms.

Showing that the benefits outweigh the costs would require a model of the impact of the (closed or open) algorithmic judgments on society, and thus a model of society itself.

Such a model would inevitably be subject to even more subjectivity and discussion.

If you're not buying it, that's your choice, and you may even be right. But that doesn't make the whole argument inadmissible. Instead, you should mention the benefits of closed source algorithms for society; they're not that obvious to me.


"the alternative is much worse."

That's a definite statement, and shouldn't be made without such a model. If you only have handwavey models, you can't claim one is "much worse" than the others. The level of confidence isn't justified. I'm fine with a list of possible bad outcomes, but don't claim way A is much worse than way B without stronger evidence.

>Instead, you should mention the benefits of closed source algorithms for the society, they're not that obvious to me.

One is mentioned in the part I quoted: that they're harder to circumvent. This isn't the same as regular security software, where the algorithms are theoretically secure but practically have holes, so there are gains from opening them up and letting people report the holes. There would be no theoretical security model here; it would only be probabilistic. There may not be a "patch" for these kinds of holes (I'll give an example soon). So security by obscurity might make more sense here than in general.

Example: suppose our data shows that people that drive car X have a greater security risk. If this is kept secret, we are able to more effectively target people (probably, many different factors will combine until someone is an extreme risk, but this is simplified.) The fact that X is weak evidence of risk is information that becomes useless relatively soon after it is publicized (because potential terrorists just put "don't buy X" on their checklists.) So publicizing the algorithm would concretely lead to worse results.

Realistically, car model probably isn't too predictive, but possibly the union of all behaviors that are easy to change when known predicts risk reasonably well.


It's not necessarily better, but you can study it, understand it, criticise it and suggest improvements. All of which are difficult/impossible if the algorithm stays secret.



