Aylin Caliskan-Islam, Joanna J. Bryson, & Arvind Narayanan, Semantics derived automatically from language corpora necessarily contain human biases.

Our work builds on the distributional semantics hypothesis: that meaning really is no more or less than how a word is used.  We show that, if meaning is defined this way, all kinds of associations are part of a word's meaning, not just dictionary definitions but even visceral responses.  AI developed from large-corpus linguistics will absorb all these aspects of meaning, including prejudice.  We have demonstrated this empirically by developing:
  • The Word Embedding Association Test (WEAT), which matches statistics over collections of word embeddings (representations of a word's use) to the Implicit Association Test (IAT), a well-established test of human implicit bias.
  • The Word Embedding Factual Association Test (WEFAT), which checks whether one word embedding's relative nearness to two baskets of word embeddings, each basket representing a concept, correlates with real-world data.  For example, we show that how female or male the word embedding for a job name is correlates at 90% with the proportion of people holding that job who are female or male.
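
The two tests above can be sketched roughly as follows.  This is a minimal illustration, not the code used in the paper.  It assumes cosine similarity as the measure of nearness, a standardized difference of means as the WEAT-style statistic, and Pearson correlation for the WEFAT-style check; none of those details is stated above.  All function names are hypothetical, and every vector and job proportion is an invented toy value, not real data.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, basket_a, basket_b):
    """How much nearer w is to basket A than to basket B (assumed measure)."""
    near_a = mean(cosine(w, a) for a in basket_a)
    near_b = mean(cosine(w, b) for b in basket_b)
    return near_a - near_b


def weat_effect_size(targets_x, targets_y, basket_a, basket_b):
    """WEAT-style statistic: standardized gap between two target sets."""
    sx = [association(x, basket_a, basket_b) for x in targets_x]
    sy = [association(y, basket_a, basket_b) for y in targets_y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy 2-D "embeddings" (invented for illustration only).
female = [(1.0, 0.1), (0.9, 0.2)]
male = [(0.1, 1.0), (0.2, 0.9)]

# WEAT-style: two sets of target words compared against the two baskets.
targets_x = [(1.0, 0.2), (0.8, 0.1)]
targets_y = [(0.2, 1.0), (0.1, 0.8)]
effect = weat_effect_size(targets_x, targets_y, female, male)
print("WEAT-style effect size:", effect)

# WEFAT-style: per-word association compared with (made-up) real-world shares.
jobs = {"nurse": (0.9, 0.3), "teacher": (0.7, 0.5), "engineer": (0.2, 0.8)}
share_female = {"nurse": 0.9, "teacher": 0.6, "engineer": 0.2}  # hypothetical
scores = [association(vec, female, male) for vec in jobs.values()]
shares = [share_female[name] for name in jobs]
print("WEFAT-style correlation:", pearson(scores, shares))
```

With real embeddings, the toy tuples would be replaced by vectors looked up for each word, and the job shares by actual occupational statistics.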
This work extends my research programme into semantics, which originally derived from my interest in the origins and utility of natural cognition.  With help from the awesome researchers at Princeton, I have now merged it with my AI ethics work, and also managed to make a pitch for cognitive systems approaches to AI.
