(Perhaps she does somewhere; however, I will freely admit that my Bayesian priors on that probability suggest it is not worth my time to try to find it.)
I'm not demanding this to the nth degree, obviously; I don't expect her to submit a working patch to Google's results engine. But what amounts to a vague wave in a direction that can't even be nailed down beyond "society" is not helpful to anyone, just an incendiary attack.
A more insidious explanation is that society as a whole is to blame. If Google’s Adsense service learns which ad combinations are more effective, it would first serve the arrest-related ads to all names at random. But this would change if it were to discover that click-throughs are more likely when these ads are served against a black-identifying name. In other words, the results merely reflect the discriminatory pattern of clicks from ordinary people.
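The mechanism in that passage can be sketched as a toy simulation. This is only an illustration, not how AdSense actually works: the epsilon-greedy strategy, the two name groups, the two ads, and every click probability below are assumptions made up for the example. The point is just that a server which knows nothing about race ends up serving different ads to different groups of names once the simulated audience clicks differently.

```python
import random


def simulate_ad_server(true_ctr, rounds=50_000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy ad picker, learning separately for each name group.

    true_ctr maps (name_group, ad) -> click probability of the simulated
    audience. Returns, per name group, the share of impressions each ad got.
    """
    rng = random.Random(seed)
    groups = sorted({group for group, _ in true_ctr})
    ads = sorted({ad for _, ad in true_ctr})
    shown = {key: 0 for key in true_ctr}
    clicked = {key: 0 for key in true_ctr}

    def observed_rate(group, ad):
        # Untried combinations get priority so every ad is shown at least once.
        if shown[(group, ad)] == 0:
            return float("inf")
        return clicked[(group, ad)] / shown[(group, ad)]

    for _ in range(rounds):
        group = rng.choice(groups)  # both groups are searched equally often
        if rng.random() < epsilon:
            ad = rng.choice(ads)  # explore: serve an ad at random
        else:
            ad = max(ads, key=lambda a: observed_rate(group, a))  # exploit
        shown[(group, ad)] += 1
        if rng.random() < true_ctr[(group, ad)]:
            clicked[(group, ad)] += 1

    shares = {}
    for group in groups:
        total = sum(shown[(group, ad)] for ad in ads)
        shares[group] = {ad: shown[(group, ad)] / total for ad in ads}
    return shares


# Hypothetical click probabilities: the audience, not the server, is biased.
ASSUMED_CTR = {
    ("group_a", "arrest_ad"): 0.10,
    ("group_a", "neutral_ad"): 0.02,
    ("group_b", "arrest_ad"): 0.02,
    ("group_b", "neutral_ad"): 0.10,
}

shares = simulate_ad_server(ASSUMED_CTR)
for group, by_ad in shares.items():
    print(group, {ad: round(share, 2) for ad, share in by_ad.items()})
```

With these made-up numbers the arrest ad ends up dominating for one group and the neutral ad for the other, even though the selection code never looks at anything but click counts.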
Keep in mind that the professor did not write this article, which buries her methods and conclusions. In her actual paper (http://arxiv.org/abs/1301.6822) she has possible solutions in the conclusions.
Would the politically enlightened consider this increased association of men with arrest records (vs women) to be unfair bias against men? Would there be a call for a followup to determine whether it was a biased algorithm, a discriminatory corporation, or our entire anti-male sexist society that was to blame? Might they shrug it off, placing most of the blame, not on the ad system, but on the statistical behavior of men themselves?
If there is, as I suspect, a similar "bias" against men in this ad system, then any discussion of how to "fix" it should not address the question of black/white names. It ought to deal with the larger question of what to do when the data show that the probability of X given name 'A' is greater than the probability of X given name 'B'.
Let me go ahead and spell out what I mean. Should the Google search results give the exact same % of results for each race? Should it accurately reflect the % of searches? Should it accurately reflect the % of ads put in? (Those are all 3 very different things.) If so, should Google tweak the percentage of those ads? If so, in what manner? If a combination, which combination?
The implication in the conclusion is the first one, but in that case, why does that one override the others? Might it in fact be racist to hide the incidence of these searches, based on race? A case could be made for that, after all.
These are rich and interesting questions, and it does not simply go without saying what the desired outcome is.
Methods are spelled out beginning on page 10; observations and how they compare to specific expectations are spelled out beginning on page 20. I can only infer that you didn't bother to read the paper.
That makes no sense whatsoever.
It is you who is 'just flinging around wild accusations'. Read the paper and critique it on its merits, by all means. As it is, your little rant tells us far more about you than its subject.
The intro handwaves a very general definition, but never brings it back to the question at hand.
Just off the top of my head, why not go make a Google Ads account and bid on keywords that sound like black names, and then bid on keywords that sound like white names (according to the professor's criteria)? Wouldn't the price differences show that there is greater demand for one set or another?
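If someone did run that experiment, one way to check whether a price gap is more than noise is a permutation test on the two sets of bid prices. This is a rough sketch under assumptions: it presumes you can actually collect one comparable price per keyword, and the prices below are invented placeholders, not real bids. The function and variable names are mine.

```python
import random
from statistics import mean


def permutation_test(prices_a, prices_b, n_permutations=10_000, seed=0):
    """Return (difference in mean price, two-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = mean(prices_a) - mean(prices_b)
    pooled = list(prices_a) + list(prices_b)
    n_a = len(prices_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels carry no information
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_permutations


# Placeholder prices in dollars, invented purely to exercise the function.
placeholder_set_one = [1.00, 1.10, 1.20, 1.30, 1.40, 1.50]
placeholder_set_two = [0.40, 0.50, 0.60, 0.70, 0.80, 0.90]

difference, p_value = permutation_test(placeholder_set_one, placeholder_set_two)
print(f"mean difference: {difference:.2f}, p-value: {p_value:.4f}")
```

A small p-value would only say the two keyword sets are priced differently. It would not by itself say who is bidding or why.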
My hypothesis would be that the discrimination happening here is simply happening in the ad marketplace, and that while it may reveal something unpleasant about our culture which we would like to see changed, it doesn't lend itself to prescriptive intervention-type remedies. I'd like to believe that it may only be revealing that when a bunch of white marketing guys buy adwords in bulk they aren't cognizant of their list of "common names" being essentially "white names," and the arrest record websites caught on to this fact first.
Treating racism as akin to a charge of wrongdoing is counter-productive.
It should be dealt with rationally and clinically, with the idea that everyone is probably affected by it, and probably embodies some of it. Treating racism as something wrong is like treating lust as something wrong. It makes it impossible to talk about things that would be awkward to talk about even without such strong emotional reactions attached to them. Just as lust is simply a consequence of our biology, racism is an unfortunate consequence of our psychology and our history.
We are all creatures "of our time." Our only hope of transcending that is to admit of the possibility and freely question ourselves.
She identifies possible mechanisms for the discrimination; presumably the desired remedy would differ based on which mechanism is actually responsible. This paper is step one: tell people who may not know that the discrimination is occurring.
Isn't that part of the strategy?
Hitting people with an inflammatory stick would be calling out a smoke-filled racist room in the Googleplex; she doesn't do that.
All that said, my point is that if one bought ads with keywords for 100% of the unique first name-last name pairs found among prisoners/arrestees, "stereotypically black" names would be over-represented relative to the total population and relative to the internet-using population.
In other words, in my view, this is a symptom of the criminal justice system, not a racist policy choice by Google's algorithm. Google's algorithm sees only one color: green.
It's funny ("funny", as in, it was a confusing coincidence, not as in, the study is suspicious) that the OP mentions the ad-delivery on Reuters.com.
I tried it out for myself using the professor's name and got this massive correction note to a December Reuters story involving her study (the correction was so major that the story has been removed from the Reuters archive):
(Reuters) - Please be advised that a November 25 article reporting that Instantcheckmate.com's advertising relies on racial profiling has been withdrawn. The story, "Professor finds profiling in ads for personal data website," contains errors.
The headline of the article and the article itself incorrectly assert that Harvard Professor Latanya Sweeney's research showed that Instantcheckmate.com, an online background research website, had engaged in racial profiling in its advertisements.
Sweeney says the preliminary results of the research found "significant discrimination" in Instantcheckmate.com's online ad search results, but were insufficient for the article's assertion of deliberate racial profiling by Instant Checkmate. Her research is ongoing. Instant Checkmate denies any such activity, which it describes as being at odds with the company's values. The company says further that it hasn't seen Sweeney's research.
There will be no substitute story.
This doesn't have any bearing on the legitimacy of the OP's summation of Ms. Sweeney's work, just that her work has been written about before, and apparently, easily misinterpreted by the media.
edit: If you want to read the pulled-Reuters story, this appears to be a copy of it:
The Reuters story focused more on Instantcheckmate.com's practices and apparently drew too strong a conclusion. Strangely, the author of the piece, a Reuters correspondent, is also a Harvard fellow who is collaborating with Sweeney on a book.
(I had the same reaction you did when first reading the post, but when I re-read it, I realized what it was saying. The post could have been clearer though.)
If anything, this post is directed at commenters who have reflexively dismissed the study because they dislike the OP's summation. Just because the article summarizing the study may jump to a contentious conclusion does not mean that the study itself did. The fact that Reuters apparently messed up here is just an example of how a study with possible controversial implications can be misread by someone trying to write about it.
Do these scam sites even check public records? "The Cesspool of Online Ads is Poisoning Online Ad Delivery" might be a better title.
Sadly, as the researcher purchased that information, I think it's working.
They're professional liars, so take it with a grain of salt.
This is precisely why witch-hunt attitudes towards racism are counter-productive. Racism isn't an evil or a personal shortcoming; it's a consequence of unfortunate history combined with shortcomings in the way our minds process social information. Treating racism as a kind of evil makes communicating rationally about it impossible. And since it's a once-huge social factor that is asymptotically decaying, it's going to be all around us. It would be much better for us as a society to be able to talk about it rationally. That's not what the current social climate is conducive to, however.
Basically, everyone's attitude towards stuff like racism and sexism should be somewhat like the stance "Everyone Poops" takes towards, well, poop. It's not the most pleasant thing in the world. It's just a consequence of where we came from. The only difference is that there is hope that eventually we will overcome it. (Well, maybe when people's minds are uploaded into computers, we won't poop or judge people overwhelmingly on external morphology.)
EDIT: Harboring some racist attitudes or ideation is perfectly understandable, but if you go and perpetrate some sort of crime or act of cruelty as a result, this is certainly wrong. We all experience hate and negative emotions. This doesn't excuse you from acting like a civilized human being.
> Ignoring it doesn't make it go away.
How is advocating a less tense attitude towards the whole notion of racism to achieve a calmer, more rational discussion of it, "ignoring?" Are you emotionally invested in the notion that it should be punished? Or, did you read some sort of meta-witch-hunt accusation into what I wrote?
EDIT: my point 1 and 2 are dumb and clearly explained in the paper.
3) As I understand it the US has very many black people in prison. I've heard a variety of stats; 1 in 3 black men are either in prison, on probation or on parole. Wikipedia says that the US Bureau of Justice Statistics says that 39% of the prison population is non-hispanic black (while the black including hispanic population is just 13% of the US population.) That suggests that people with a black name will need legal services more than someone with a white name. The algorithm hasn't been tweaked by racists; the algorithm is just responding to a racist society.
This post is not meant to bash the professor's work! I haven't read the paper yet. I'm about to give it a read.
Given that, those long tail names are going to be cheaper on Google ads. In my experience, the headwords are always more expensive. Thus, if this website is scooping up cheap traffic, that will tend to be biased toward "black sounding" names.
Google isn't doing anything other than selling keywords.
That is, how would one qualify a name as "black-sounding" or "white-sounding"? I hope it was not based only on intuition as that may give misleading results.
Is there a public data set somewhere that correlates given names with ethnicity?
Edit: On second thoughts, it might be more interesting to perform similar google search experiments with both intuitive and empirical name/ethnicity pairings and see if the results differ.
You see this with Amazon and eBay too - if you search Google for something weird which no one has bid on, you'll often see "Find [bananaphone] on eBay/Amazon". These "people search" results are similar: extremely long-tail, very generic ad.
So, I think it's probably algorithmic rather than intentional. Could be ML that's learned racism from the internet itself.
Who knows though! Your explanation would certainly make sense. It's a fascinating problem.
This was my first thought.
Google is involved to the extent that they serve ads in response to specific keywords. Instant Checkmate is involved only to the extent that they ignore the source of their referrals. Someone with a little more free time could probably trace HTTP traffic to find the specific affiliate responsible.
I've worked with the people who optimize things like this (although they never thought of this particular one, probably a result of being Canadian).
They can and do reason about the legal and social consequences. They don't give a shit as long as it makes money, and if there are legal consequences then they hide behind anonymous proxies and VPSes paid for with prepaid credit cards (which hides them from Google as well). Plenty of them make jokes about how shady and dishonest their business is and then go to church every weekend with their families and think nothing of it.
If they were forced to, they would justify things like this by pointing out that they are just optimizing keywords: if society is racist and the data algorithms detect that, then so be it. Mostly they would just not care and continue to worry about a bad daily fluctuation wiping out a month's profits, or Google catching on to the blackhat tactics they use sometimes and shutting their AdWords accounts down.
There is literally zero chance of convincing people doing ad network arbitrage to consider social consequences, and IANAL, but what are the legal consequences of using automated keyword optimization tools that would, by definition, reflect any biases of society at large?
But if that's the case, why is society more interested in the arrest-records of people with black-sounding names? Perhaps it's because in America, blacks are disproportionately likely to have criminal records. (Some quick Googling - in 2010, according to the census, blacks were 13.6% of the population, but in 2009, according to the FBI, 28.3% of arrests were of a black person.)
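To put a number on that disparity, the two quoted figures can be divided directly. This is back-of-the-envelope arithmetic only: the figures are the ones quoted above, they come from different years and sources, and the variable names are mine.

```python
# Figures as quoted in the comment above; note they come from different years.
population_share_pct = 13.6  # share of US population, 2010 census
arrest_share_pct = 28.3      # share of arrests, 2009 FBI figure

representation_ratio = arrest_share_pct / population_share_pct
print(f"{representation_ratio:.2f}")  # prints 2.08
```

So, taking those figures at face value, the share of arrests is roughly twice the share of the population.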
I'm not making any judgements about black people by quoting these statistics - perhaps these arrest rates reflect institutional racism, or disproportionate levels of poverty, or lack of access to opportunity. But I'd rather the professor focus her time on correcting that disparity instead of trying to make Google's AdWords algorithm correspond to something other than the interests of the public.
But it's all circular. If society associates "black-sounding" names with criminal records, that leads to employment discrimination (http://www.nber.org/papers/w9873) which results in "disproportionate levels of poverty, or lack of access to opportunity." It's all related and it all needs to be at least understood.
Arrested 7389208 3027153 10416361
Not Arrested 216164057 35902166 252066223
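Taking the two rows above at face value, the arrest share within each column can be worked out directly. The column labels are not given in the text, so this sketch refers to the first two columns by position only and treats the third as a total, which the sums confirm.

```python
# Counts copied from the two rows above. The column labels are not given in
# the text, so the first two columns are referred to by position only.
arrested = (7_389_208, 3_027_153, 10_416_361)
not_arrested = (216_164_057, 35_902_166, 252_066_223)

# The third column is the sum of the first two in both rows.
assert arrested[0] + arrested[1] == arrested[2]
assert not_arrested[0] + not_arrested[1] == not_arrested[2]

# Share of each column that falls in the "Arrested" row.
arrest_rates = [a / (a + n) for a, n in zip(arrested, not_arrested)]
rate_ratio = arrest_rates[1] / arrest_rates[0]

for position, rate in enumerate(arrest_rates, start=1):
    print(f"column {position}: {rate:.2%}")
print(f"column 2 rate / column 1 rate: {rate_ratio:.2f}")
```

This is the "probability of X given A versus probability of X given B" comparison discussed elsewhere in the thread, computed for whatever two groups these columns represent.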
Targeted advertising is nothing new. Do you think you will see the same advertising on BET as you will on TCM? They are going to use what data is on hand to focus ads. Most ads don't really accurately target most people, but enough do to keep them advertising. If it did not work then they would not do it.
Most ads on Facebook want me to get an MBA online.... I already have one, but they have to work with the data they have available.
Of course media sites always overhype research with incredible titles, but I can't see that this tells us much about online ad delivery in general ... though it perhaps raises interesting (unanswered) questions about Google's ad delivery....
What?!? Does not follow!
That takes a leap that is based upon correlations: that there is a correlation between someone searching for "Latanya Sweeney" and them being black, based upon historical birth records. Of course, it's this same type of blind correlation-based thinking that results in racism. Swap out "birth name" with some other less appealing attribute and "black" with your race, ethnic group, or other group of choice and you have a textbook example of racist thinking. It doesn't exactly serve their point well to use the same mechanism which brings about racism as a means to make an argument.
So sure, should there be mechanisms in place that can prevent "libel" in Google because of correlations between names and less-than-savory behavior (regardless of race)? Yes, of course. If my last name is Manson it is probably a bit unfair to me as a person if Google is advertising stuff to people searching for me as if I am associated with a famous serial murderer, too. But the implication that this is all about race and is some systemic racist thing is just data mining and likely the authors projecting their own biases. I'm sure you could find all kinds of interesting clusters of names that have certain negative advertising, and only a subset of them would be clusters that highly corresponded to race or ethnic groups.
On the topic of *-sounding names...
I've worked at least one technical job where having a "black" or even just generally "American" name would send a job application straight to the trash.
However, I'd expect any established employer to have an automatic background check / verification system in place, so a possibly suggestive Google search wouldn't be particularly relevant.
I'm thinking prospective dates are more likely to be Googling names than employers.
Relevant: "Are Emily and Brendan More Employable than Lakisha and Jamal?" http://www.chicagobooth.edu/pdf/bertrand.pdf
edit: I see martey beat me to it.
So this is probably just a reflection of black-sounding names being statistically more likely to be shared with criminals, in the same way that googling for "teen girls" has a high likelihood of returning porn.
I'm sorry, but I look at you people like you people look at young earth creationists.
It's not clear from your comment what you consider to be an accurate descriptor of reality, so perhaps you'd do better to offer your own position rather than leading off with a pronouncement that everyone else ('you people') is wrong.
Google's AdSense is nothing more than a pattern matcher, and it is (of course) fundamentally simplistic and stupid when compared to a human. To ask it not to be racist is to ask it to be smarter than ourselves. No doubt some of Dr. Sweeney's colleagues in the Computer Science department at Harvard are working on that very thing -- she should take it up with them.
As we do more stuff algorithmically and globally, this approach does not scale at all. This can cause technology to reinforce our biases, no matter how subtle.