From Google Translate to Amazon Recruiting: When Good AI Goes Bad
The following is an adapted chapter from the book, “Real World AI: A Practical Guide for Responsible Machine Learning,” by Alyssa Simpson Rochwerger and Wilson Pang.
We all interact with AI almost daily—when we deposit checks through an app, when we talk to Siri and Alexa, when we scroll through our personalized social media feeds.
When AI works well, it makes our lives easier. But when AI goes bad, it can have far-reaching negative effects. Biases in race, gender, class and other markers have made their way through to the output of even the most well-intentioned and thought-out AI systems.
If you are involved in AI, whether as a data scientist building the model or a business decision-maker defining the model’s objectives, you need to understand the ways in which AI can go bad, because the harmful effects are felt not just by the business but by our society. Here are two examples of this phenomenon.
Google Translation: Addressing Gender Bias
Google Translate is arguably the most impressive translation tool we have today, but it is not perfect. The translation software runs on a deep learning model called neural machine translation (NMT). The model learns from hundreds of millions of already-translated pieces of text across hundreds of languages. The problem is that this approach inadvertently leads to gender bias.
The issue arises because different languages have different treatments for masculine and feminine word forms. This ultimately taught the model to deliver a masculine translation when fed words like “strong” or “doctor” and to deliver a feminine translation when fed words like “nurse” or “beautiful.”
When fed long strings of sentence constructions following the template "[gender-neutral pronoun] is [adjective]," the results are worrying. For example, the adjective "hardworking" produces "he is hardworking," while "lazy" produces "she is lazy."
Google has articulated every correct intention in promoting fairness and avoiding bias in its translation tool, but because its training data is the vast corpus of human language, the model’s lessons are heavily influenced by gender conventions around the world. Arguably, it was not Google’s fault that the AI model was biased; it was the fault of our languages. But because Google is committed to promoting fairness, they wanted to correct this bias anyway.
Because Google cannot alter the input, to fix this issue they focused on curating the output.
In 2018, they launched a targeted initiative to reduce gender bias in the translation software. Their fix was to have the translator respond to any gender-neutral input given in English with results showing both forms, i.e., "he is ___" and "she is ___." It was a small change, but one with a big impact.
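Conceptually, the fix is output curation: when the source phrase is gender-neutral, surface every gendered rendering instead of choosing one. A minimal sketch of that idea, assuming a toy lookup table in place of a real NMT model (the Hungarian phrases and translations here are illustrative, not Google's actual system):

```python
# Toy sketch of "show both forms" output curation for gender-neutral
# source phrases. The translation table is invented for illustration;
# a real system would generate both renderings from an NMT model.

TRANSLATIONS = {
    # Hungarian "ő" is a gender-neutral third-person pronoun, so a
    # single-output translator would be forced to pick "he" or "she."
    "ő szorgalmas": ("he is hardworking", "she is hardworking"),
    "ő orvos": ("he is a doctor", "she is a doctor"),
}

def translate(phrase: str) -> list[str]:
    """Return every gendered rendering of a gender-neutral phrase."""
    masculine, feminine = TRANSLATIONS[phrase]
    return [masculine, feminine]

print(translate("ő szorgalmas"))
```

The key design choice is that the bias is not removed from the model itself; the output layer simply refuses to hide one of the two equally valid readings.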
Amazon Recruiting: Abandoning a Biased Tool
Amazon began a project in 2014 that automated job applicant reviews using AI to score candidates on a scale of 1 to 5. Given the high volume of applicants and the resources required to evaluate them, demand for such an automation tool was high.
After a year of work, Amazon realized there was a problem with its system. The model was trained to evaluate candidates by learning from patterns in résumés submitted over the past ten years. But gender diversity in the tech industry was a relatively recent development, so most of those résumés had been submitted by men.
As a result, the model learned that male candidates were preferable and penalized any applications that included the word “women’s,” such as mentions of women’s sports teams or extracurricular activities.
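This failure mode is easy to reproduce in miniature. The sketch below, which is a deliberately crude frequency-based scorer and not Amazon's actual system, learns word weights from a skewed set of historical résumés; because "women's" rarely appears in that history, any résumé containing it scores lower:

```python
from collections import Counter

# Toy illustration of bias inherited from skewed training data.
# The résumé snippets are invented; in a real pipeline the "history"
# would be years of résumés, most of them from male applicants.
historical_resumes = [
    "captain chess club", "software engineer", "chess club member",
    "software engineer lead", "women's chess club captain",
]

word_counts = Counter(w for r in historical_resumes for w in r.split())
total = sum(word_counts.values())

def score(resume: str) -> float:
    """Average historical frequency of the résumé's words."""
    words = resume.split()
    return sum(word_counts[w] / total for w in words) / len(words)

print(score("chess club captain"))          # common words -> higher score
print(score("women's chess club captain"))  # "women's" is rare -> lower score
```

No one wrote a rule penalizing "women's"; the penalty emerges purely from the imbalance in the historical data, which is exactly why collecting more of the same data cannot fix it.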
By 2017, Amazon had to abandon the tool. There was no data they could use to train the model that would not result in a gender-biased outcome. Instead, the company pivoted to a different solution: they developed an AI tool that spotted current candidates worth recruiting across the internet. To train the model, they fed it past candidate résumés and taught it to recognize certain career- and skill-related terms.
But even after that step, the bias persisted. Because the model was trained mostly on men's résumés, it learned to favor words more commonly used by men to describe their skills and responsibilities. The candidates returned by the web crawler were overwhelmingly men.
Amazon made the ethical decision to shut down the project.
Sometimes it is simply not possible to overcome bias in an AI project. In those instances, it is best to abandon the prejudiced tool, even if there is great demand for it.
With Great Power Comes Great Responsibility
AI represents the largest technological shift many of us will see in our lifetimes. It is transforming the world on every level, from moment-to-moment interactions people have with devices in their homes to large-scale decisions made by global organizations that affect millions of people.
With such widespread power inherent to the technology, it is the responsibility of those creating AI applications and systems to ensure that their AI is ethical, safe and in service to the world—essentially, that it makes the world a better place, not a worse one.
Therefore, it is important to monitor the results of your AI models and adjust as needed, whether that means altering outputs, abandoning biased models entirely, factoring race and other markers into the model or restricting a model’s usage only to its designed setting.
About the Authors:
Co-author Alyssa Simpson Rochwerger is the director of product at Blue Shield of California and formerly served as vice president of AI and data at Appen. She also worked as vice president of product at Figure Eight and was director of product for IBM Watson. She earned a BA in American studies from Trinity College.
Co-author Wilson Pang joined machine learning data training company Appen in November 2018 as CTO and is responsible for the company's products and technology. He previously was the chief data officer of Ctrip in China, the second-largest online travel agency company in the world, and was senior director of engineering at eBay. He earned his master's and bachelor's degrees in electrical engineering from Zhejiang University in China.
Their book, “Real World AI: A Practical Guide for Responsible Machine Learning,” is available from Amazon in print or as an ebook.