AI is spreading old stereotypes to new languages and cultures


Margaret Mitchell is a pioneer when it comes to testing generative AI tools for bias. She founded the Ethical AI team at Google, alongside another well-known researcher, Timnit Gebru, before the two were fired from the company. She now works as AI ethics lead at Hugging Face, a software startup focused on open source tools.

We spoke about a new dataset she helped create that tests how AI models continue to perpetuate stereotypes. Unlike most bias-mitigation efforts that prioritize English, this dataset is malleable, with human translations for testing a wide range of languages and cultures. You probably already know that AI often presents a flattened view of humans, but you might not realize how these issues can be made even more extreme when the outputs are no longer generated in English.

My conversation with Mitchell has been edited for length and clarity.

Reece Rogers: What is this new dataset, called SHADES, designed to do, and how did it come together?

Margaret Mitchell: It is designed to help with evaluation and analysis, and it came out of the BigScience project. About four years ago, there was this massive international effort, where researchers all over the world came together to train the first open large language model. By fully open, I mean the training data is open as well as the model.

Hugging Face played a key role in keeping it moving forward and providing things like compute. Institutions around the world were also paying people while they worked on parts of this project. The model we put out was called BLOOM, and it was really the dawn of this idea of "open science."

We had a bunch of working groups focused on different aspects, and one of the working groups I was tangentially involved with looked at evaluation. It turned out that doing social impact evaluations was extremely complicated, more complicated than training the model.

We had this idea of an evaluation dataset called SHADES, inspired by Gender Shades, where you could have things that are exactly comparable except for the change in some characteristic. Gender Shades looked at gender and skin tone. Our work looks at different kinds of bias and swapping among identity characteristics, like different genders or nations.

There are a lot of English resources and evaluations for English. While there are some multilingual resources relevant to bias, they are often based on machine translation as opposed to real translations from people who speak the language, are embedded in the culture, and can understand the kinds of biases at play. They can put together the most relevant translations for what we're trying to do.
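To make the contrastive idea Mitchell describes concrete, here is a minimal sketch, not the SHADES implementation itself: the model, template sentence, and identity terms below are illustrative assumptions. It compares how a language model scores two sentences that are identical except for the identity term being swapped.

```python
# Minimal sketch of a contrastive bias probe (illustrative only, not SHADES).
# Assumes the `transformers` and `torch` packages are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small public model, used purely for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Hypothetical stereotype template with a swappable identity slot.
template = "People from {group} are bad drivers."
groups = ["Italy", "Japan"]

def avg_log_likelihood(text: str) -> float:
    """Average log-likelihood the model assigns to the sentence (higher = more plausible to the model)."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    return -outputs.loss.item()

for group in groups:
    sentence = template.format(group=group)
    print(f"{sentence!r}: {avg_log_likelihood(sentence):.3f}")
```

A large gap between the two scores would suggest the model treats the stereotype as more "natural" for one group than the other; the actual SHADES dataset extends this kind of comparison with human-written translations across many languages and cultures.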

Much of the work around mitigating AI bias focuses only on English and on stereotypes found in a few select cultures. Why is it important to broaden this perspective to more languages and cultures?

These models are being deployed across languages and cultures, so mitigating English biases, even translated English biases, doesn't correspond to mitigating the biases that are relevant in the different cultures where they're deployed. This means you risk deploying a model that propagates really problematic stereotypes within a given region, because the models are trained on those different languages.
