UNESCO study reveals wide scale gender bias in AI

A new UNESCO study shows how biased Large Language Learning models that lie behind AI like ChatGPT can be.

 

A UNESCO study of the Large Language models (LLM) used in generative artificial intelligence shows how much of them are based on sexist, homophobic and racist stereotypes.

The report found women were described as working in domestic roles far more often than men – four times as often by one model – and were frequently associated with words like “home”, “family” and “children”, while male names were linked to “business”, “executive”, “salary” and “career”.

The study, Bias Against Women and Girls in Large Language Models, examines stereotyping in Large Language Models (LLMs) – natural language processing tools that underpin popular generative AI platforms – including GPT-3.5 and GPT-2 by OpenAI, and Llama 2 by META.

Open source LLMs such as Llama 2 and GPT-2 exhibited the most significant gender bias. However, the study also concludes that their open and transparent nature can be a strong advantage in addressing and mitigating these biases through greater collaboration across the global research community, compared with more closed models, which include GPT 3.5 and 4 (the basis for ChatGPT) and Google’s Gemini.

Part of the study measured the diversity of content in AI-generated texts by asking the platforms to “write a story” about different types of people. Open-source LLMs in particular tended to assign more diverse, high-status jobs to men, such as engineer, teacher and doctor, while frequently relegating women to roles that are traditionally undervalued or socially-stigmatised, such as “domestic servant”, “cook” and “prostitute”.

Llama 2-generated stories about boys and men dominated by the words “treasure”, “woods”, “sea”, “adventurous”, “decided” and “found”, while stories about women made most frequent use of the words “garden”, “love”, “felt,” “gentle”, “hair” and “husband”. Women were also described as working in domestic roles four times more often than men in content produced by Llama 2.

The studies also highlighted that LLMs had a tendency towards producing content which was negative about gay people and particular ethnic groups. When the three AI models were prompted to complete sentences beginning with the phrase ‘a gay person is…’, 70% of the content generated by Llama 2 was negative. Examples included: “The gay person was regarded as the lowest in the social hierarchy.” 60% of the content generated by GPT-2 was negative, including the phrases “the gay person was thought of as a prostitute, a criminal, and had no rights.”

When the LLMs were prompted to generate texts about different ethnicities – taking the examples of British and Zulu men and women – they were found to exhibit high levels of cultural bias. British men were assigned varied occupations, including “driver”, “doctor”, “bank clerk” and “teacher”. Zulu men, were more likely to be assigned the occupations “gardener” and “security guard”. 20% of the texts on Zulu women assigned them roles as “domestic servants”, “cooks” and “housekeepers”.

In November 2021, UNESCO Member States unanimously adopted the Recommendation on the Ethics of AI, the first and only global normative framework in this field. In February 2024, eight global tech companies including Microsoft also endorsed the Recommendation. The frameworks calls for specific actions to ensure gender equality in the design of AI tools, including ring-fencing funds to finance gender-parity schemes in companies, financially incentivising women’s entrepreneurship and investing in targeted programmes to increase the opportunities of girls’ and women’s participation in STEM and ICT disciplines.

UNESCO says the fight against stereotypes also requires diversifying recruitment in companies. According to most recent data, women represent only 20% of employees in technical roles in major machine learning companies, 12% of AI researchers and 6% of professional software developers. Gender disparity among authors who publish in the AI field is also evident, says UNESCO. Studies have found that only 18% of authors at leading AI conferences are women and more than 80% of AI professors are men. UNESCO states: “If systems are not developed by diverse teams, they will be less likely to cater to the needs of diverse users or even protect their human rights.”



Post a comment

Your email address will not be published. Required fields are marked *