
I read the Bolukbasi paper and I now see what you are getting at. I think the problem is not one of bias but rather that word vectors embed the whole meaning regardless of context, and that debiasing is a stop-gap solution, as there is an endless list of biases to correct depending on how the model is used.

You may have fixed PROGRAMMER - MAN + WOMAN = HOMEMAKER, but you'll still get SOLDIER - AMERICAN + ARAB = TERRORIST; the list is endless.
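(For concreteness, this is roughly the analogy arithmetic being discussed, sketched with gensim and the pretrained GoogleNews word2vec vectors, which is the embedding the Bolukbasi paper studied; the file path is wherever you saved the release locally:

    from gensim.models import KeyedVectors

    # Pretrained word2vec vectors (the GoogleNews release); the path is
    # a local placeholder.
    vectors = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)

    # programmer - man + woman: most_similar adds the "positive" vectors,
    # subtracts the "negative" ones, and returns the nearest neighbors.
    print(vectors.most_similar(positive=["programmer", "woman"],
                               negative=["man"], topn=5))

The analogies in the comment above are what this query returns for different choices of positive and negative words.)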



Yep, so there's a lot of work to do. Bolukbasi, Chang, et al. will straightforwardly admit that their model only applies to the one axis of gender bias. Fortunately there are other people working on this too, and I am one of them.

The baked-in assumption that Arabs or Muslims are terrorists, or that terrorists are Arabs or Muslims, is something that the de-biasing process in ConceptNet Numberbatch (which I make) attempts to mitigate at the same time as gender and racial bias. And of course there is much more to do.
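(For readers unfamiliar with the technique: the core step of the "hard debiasing" in Bolukbasi et al. is projecting the bias direction out of each word vector. A minimal numpy sketch, assuming `vectors` is a word-to-array lookup like the one loaded above; the real method derives the axis via PCA over several definitional pairs and adds an equalization step, so the single she/he pair here is a simplification:

    import numpy as np

    def project_out(v, direction):
        # Remove the component of v along the (normalized) bias direction.
        d = direction / np.linalg.norm(direction)
        return v - np.dot(v, d) * d

    # Simplified bias axis from a single definitional pair; the paper
    # averages over several pairs (she/he, woman/man, ...) via PCA.
    gender_axis = vectors["she"] - vectors["he"]
    doctor_neutral = project_out(vectors["doctor"], gender_axis)

The paper also restricts this neutralization to words that should be gender-neutral, rather than applying it to the whole vocabulary.)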

It's a fascinating and productive field of research. Why did you have such a negative initial reaction to it?


Because I find AI's candid responses to the questions we ask a fascinating opportunity to understand humanity better. When we force those answers to conform to our preferred biases, we learn less from them. But now I understand that such aspects are a subset of the larger problem of getting our models to answer in a specific context.

Ultimately the model would need to detect and learn the contexts by itself.

And more to the point: I see you are publishing data sets after correcting them for fairness. I think the unfair results can sometimes be more useful. I once prototyped a short-story tagging & recommendation system based on word2vec, and I would not be surprised if the raw vectors gave better results. People's taste in literature (especially romance) is heavily dependent on gender stereotypes.
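(The commenter doesn't describe the prototype, but a common minimal construction for this kind of system is to embed each story as the mean of its word vectors and rank by cosine similarity; `library`, `story_vector`, and `recommend` are illustrative names, and tokenization is assumed to happen elsewhere:

    import numpy as np

    def story_vector(tokens, vectors):
        # Mean of the word vectors for in-vocabulary tokens: a crude
        # but common way to embed a whole document.
        return np.mean([vectors[t] for t in tokens if t in vectors], axis=0)

    def recommend(liked_tokens, library, vectors, topn=5):
        # library: {title: token list}. Rank stories by cosine similarity
        # to the mean vector of a story the user liked.
        q = story_vector(liked_tokens, vectors)
        q = q / np.linalg.norm(q)
        scored = []
        for title, tokens in library.items():
            s = story_vector(tokens, vectors)
            scored.append((title, float(np.dot(q, s / np.linalg.norm(s)))))
        return sorted(scored, key=lambda kv: -kv[1])[:topn]

Whether raw or debiased vectors work better here would come down to whether the stereotypical associations actually predict readers' preferences, which is the commenter's point.)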



