Issue #23 - Unbiased Neural MT
Introduction
A recent topic of conversation and interest in the area of Neural MT - and Artificial Intelligence in general - is gender bias. Neural models are trained on large text corpora which inherently contain social biases and stereotypes, and as a consequence, translation models inherit these biases. In this article, we'll try to understand how gender bias affects translation quality, and discuss a few techniques to reduce or eliminate its impact in Neural MT.
Machine Bias
Recently, there has been growing concern in the AI research community regarding "machine bias", where trained statistical/data-driven models grow to reflect the gender and racial biases present in their training data. A significant number of AI tools have recently been reported to be biased, for example with respect to gender and minority groups, and there have been a number of high-profile faux pas.
Although a systematic study of such biases can be difficult, Prates et al. (2018) exploited machine translation through gender-neutral languages (languages that do not explicitly mark the gender of the subject) to analyze the phenomenon of gender bias in AI. They prepared sentences covering a comprehensive list of jobs, using constructions like "He/She is an Engineer" (where Engineer is replaced by the job position of interest). They showed that Google Translate exhibits a strong tendency towards male defaults, in particular for fields typically associated with unbalanced gender distributions or stereotypes, such as Science, Technology, Engineering and Mathematics (STEM). However, in late 2018, Google announced on their developers blog that efforts are underway to provide gender-specific translations in Google Translate.
Using the current Google API (English-Hindi), we translated a sentence with a construct similar to that used by Prates et al. (2018). The example "She works in a hospital, my friend is a doctor." is translated as "वह एक अस्पताल में काम करती है, मेरा (masculine possessive pronoun) दोस्त एक डॉक्टर है।". In this example, Google Translate defaults to the masculine form for the job position "doctor". The correct translation for this example would be "वह एक अस्पताल में काम करती है, मेरी (feminine possessive pronoun) दोस्त एक डॉक्टर है।".
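This kind of probe is easy to script. Below is a minimal sketch in Python; `translate_en_hi` is a hypothetical placeholder for whichever English-Hindi MT API is being tested, and the check simply looks for the masculine (मेरा) or feminine (मेरी) possessive pronoun in the output, as in the example above.

```python
from typing import Callable

# Job titles to probe, in the spirit of Prates et al. (2018).
JOBS = ["doctor", "engineer", "nurse", "teacher", "scientist"]

def probe_gender_default(job: str, translate_en_hi: Callable[[str], str]) -> str:
    """Translate a female-referent sentence and report which gender the
    MT system assigned to the possessive pronoun for 'my friend'."""
    source = f"She works in a hospital, my friend is a {job}."
    target = translate_en_hi(source)
    if "मेरा" in target:   # masculine possessive pronoun
        return "male"
    if "मेरी" in target:   # feminine possessive pronoun
        return "female"
    return "unknown"

# Usage: plug in any English-Hindi translation function wrapped as
# translate_en_hi(text) -> str, then tally the defaults:
# for job in JOBS:
#     print(job, probe_gender_default(job, translate_en_hi))
```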
Unbiased Neural MT
Bolukbasi et al. (2016) reported that stereotypical analogies are present in word embeddings, both for gender and race. Zhao et al. (2018a) showed that the sexism present in a coreference resolution system is due to its word embedding component. Applications that use these embeddings, such as curriculum vitae (CV) filtering, may discriminate against candidates because of their gender.
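These analogies are easy to reproduce with off-the-shelf tools. As a small sketch, assuming a local copy of pretrained word2vec-format vectors (the file name below is a placeholder), gensim's vector arithmetic surfaces the stereotyped completion highlighted in the title of Bolukbasi et al. (2016): "man is to computer programmer as woman is to homemaker".

```python
# Sketch: reproducing a stereotyped analogy in word embeddings with gensim.
# The vector file is a placeholder; any pretrained word2vec-format embedding
# (e.g. the GoogleNews vectors used in the paper) will do.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

# "man : computer_programmer :: woman : ?" -- vector arithmetic:
# computer_programmer - man + woman, then nearest neighbours.
print(vectors.most_similar(positive=["woman", "computer_programmer"],
                           negative=["man"], topn=3))
```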
Recently, there has been active research into reducing social biases in AI systems. In particular, techniques that operate on word embeddings have proven effective at reducing gender bias in Natural Language Processing (NLP) tools. Font and Costa-jussà (2019) exploited the fact that word embeddings are used in Neural MT to propose the first debiased neural machine translation system. They experimented with the following two techniques:
- Debiaswe (Bolukbasi et al., 2016) is a post-processing method that debiases the word embeddings. The process has two steps. First, the directions in the embedding space along which gender bias is present are identified. Second, gender-neutral words are neutralized by zeroing their component along these directions, and defined word sets (e.g. grandmother/grandfather) are equalized so that each gender-neutral word is equidistant from every word in the set. The disadvantage of this approach is that it can remove valuable information for words with several meanings that are not related to the bias under treatment. A minimal sketch of these operations follows this list.
- GN-GloVe (Zhao et al., 2018b) is an algorithm for learning gender-neutral word embeddings. The algorithm confines gender information to certain dimensions of the vector, keeping the remaining dimensions free of it. A set of seed male and female words is used to define the metrics in the training objective, and the gender direction is restricted using another set of gender-neutral words. A sketch of how such vectors are consumed downstream also follows below.
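To make the Debiaswe steps concrete, here is a minimal NumPy sketch of the neutralize and equalize operations. It is an illustration under simplifying assumptions, not the authors' implementation: the gender direction g is approximated from a single she/he pair, whereas the paper derives it via PCA over several gendered pairs, and embeddings are assumed to be unit-normalized.

```python
import numpy as np

def neutralize(w: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Zero the component of embedding w along the gender direction g."""
    g = g / np.linalg.norm(g)
    return w - np.dot(w, g) * g

def equalize(w_a: np.ndarray, w_b: np.ndarray, g: np.ndarray):
    """Make a gendered pair (e.g. grandmother/grandfather) differ only
    along g, so every neutralized word is equidistant from both."""
    g = g / np.linalg.norm(g)
    mu = (w_a + w_b) / 2                  # midpoint of the pair
    mu_orth = mu - np.dot(mu, g) * g      # shared, gender-free component
    scale = np.sqrt(max(1.0 - np.linalg.norm(mu_orth) ** 2, 0.0))
    w_a_eq = mu_orth + scale * np.sign(np.dot(w_a - mu, g)) * g
    w_b_eq = mu_orth + scale * np.sign(np.dot(w_b - mu, g)) * g
    return w_a_eq, w_b_eq

# Toy usage with random stand-ins for trained, unit-normalized embeddings.
rng = np.random.default_rng(0)
she, he, doctor = (v / np.linalg.norm(v) for v in rng.standard_normal((3, 300)))
g = she - he                     # crude gender direction from one pair
doctor = neutralize(doctor, g)   # 'doctor' should carry no gender component
```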
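GN-GloVe, by contrast, changes the training objective itself, so reproducing it means retraining GloVe; what is simple to show is how a downstream system such as Neural MT consumes the result. The trained vectors take the form [neutral part ; gender part], so a gender-neutral representation is obtained by dropping the trailing gender dimensions. The value of k below is an assumption for illustration; check the released vectors for the actual split.

```python
import numpy as np

K_GENDER_DIMS = 1  # assumed number of trailing dimensions reserved for gender

def gender_neutral_part(w: np.ndarray) -> np.ndarray:
    """Drop the trailing gender dimensions of a GN-GloVe-style vector."""
    return w[:-K_GENDER_DIMS]

w = np.random.randn(300)             # stand-in for a trained GN-GloVe vector
w_neutral = gender_neutral_part(w)   # what a debiased NMT system would embed
```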