Semantics derived automatically from language corpora necessarily contain human biases

Here is a draft of the paper I promised last month:

Aylin Caliskan-Islam, Joanna J. Bryson, & Arvind Narayanan, Semantics derived automatically from language corpora necessarily contain human biases.

This draft was submitted to arXiv on 24 August and released there on 26 August; we are working on a journal submission as well. (This blogpost has since been updated and is kind of a hash, sorry.)

Our work builds on the distributional semantics hypothesis: that a word's meaning really is no more or less than how that word is used. We show that all kinds of associations, not just dictionary definitions but even visceral reactions, are part of a word's meaning if we define meaning this way. AI developed from large-corpus linguistics will absorb all these aspects of meaning, including prejudice. We have demonstrated this empirically by developing two tests:

The Word Embedding Association Test (WEAT), which matches statistics over collections of word embeddings (representations of a word's use) to the Implicit Association Test (IAT), a well-established test of human implicit bias.
The Word Embedding Factual Association Test (WEFAT), where we check whether the relative nearness of one word embedding to two baskets of word embeddings, each basket representing a concept, correlates with real-world data. For example, we show that how female or male the word embedding for a job name is comes out 90% correlated with the proportion of people holding that job who are female or male.
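
To make the two tests a bit more concrete, here is a minimal sketch of the kind of statistic each one computes. It assumes cosine similarity as the measure of nearness, uses the sample standard deviation for normalisation, and runs on made-up two-dimensional toy vectors rather than real embeddings. The function names and the data are hypothetical illustrations, not the code used for the paper, so check the paper itself for the exact definitions.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two vectors (assumed nearness measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """Mean similarity of w to basket A minus its mean similarity to basket B."""
    return mean(cosine(w, a) for a in attrs_a) - mean(cosine(w, b) for b in attrs_b)


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """WEAT-style effect size: difference in mean association between two
    target sets, normalised by the spread of associations over all targets."""
    sx = [association(x, attrs_a, attrs_b) for x in targets_x]
    sy = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)


def wefat_score(w, attrs_a, attrs_b):
    """WEFAT-style score for a single word: normalised difference in nearness
    to the two baskets. Scores for many words (e.g. job names) could then be
    correlated against real-world figures."""
    sims_a = [cosine(w, a) for a in attrs_a]
    sims_b = [cosine(w, b) for b in attrs_b]
    return (mean(sims_a) - mean(sims_b)) / stdev(sims_a + sims_b)


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
basket_a = [(1.0, 0.0), (0.9, 0.1)]
basket_b = [(0.0, 1.0), (0.1, 0.9)]
targets_x = [(0.8, 0.2), (0.7, 0.3)]
targets_y = [(0.2, 0.8), (0.3, 0.7)]

print("WEAT-style effect size:", weat_effect_size(targets_x, targets_y, basket_a, basket_b))
print("WEFAT-style score:", wefat_score((0.8, 0.2), basket_a, basket_b))
```

A positive effect size here means the first target set sits nearer basket A than the second target set does; a positive WEFAT-style score means the single word sits nearer basket A than basket B.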

This work is an extension of my research programme into semantics, which originally derived from my interest in the origins and utility of natural cognition. Now, with help from the awesome researchers at Princeton, I've merged this with my AI ethics work, and also managed to pitch for cognitive systems approaches to AI.

Arvind has written a nice, clear blogpost about all of our results.
See also the discussion of this work we had last month on this blog, Should we let someone use AI to delete human bias? Would we know what we were saying?
April 2017: I had to take this post down for the few weeks while this work was actually in press in Science. The Science article is way better and more accurate than the arXiv article! I have learned so much about both our theory and about writing from the experience of working with the great reviewers and editors at Science. I think we are all slightly mortified by bits of the arXiv paper now. This really proves to me how science is a collaborative process, and the value of full, formal peer review and paid full-time editors (both content and copy). I have a new, fuller blog post on the paper now too.

Update: This paper is now in Science.

Other AiNI blogposts on this work:

Should we let someone use AI to delete human bias? Would we know what we were saying? 28 July 2016
Semantics derived automatically from language corpora necessarily contain human biases 24 August 2016
FAQ for our Semantics paper 13 April 2017
We Didn't Prove Prejudice Is True (A Role for Consciousness) 13 April 2017
