Stable Diffusion and DALL-E display bias when prompted for artwork of 'African workers' versus 'European workers'

A set of AI-generated images from the prompt "African workers" by Craiyon, a free tool powered by DALL-E. Thomas Maxwell/Insider
  • AI-generated imagery has exploded in popularity as tools become more sophisticated.
  • But the technology is fraught with concerns regarding intellectual property, bias, and disinformation.

AI models that generate images, like Stable Diffusion and OpenAI's DALL-E, create images of "African workers" that display clear biases compared to images created from a prompt for "European workers."

Many of the images of African workers reflect a harmful stereotype of Africans living in extreme poverty, showing people with gaunt faces using crude tools to perform basic subsistence labor. Images representing European workers are much brighter, showing happy people who appear more affluent, dressed in full work outfits, smiling and standing beside other happy, largely Caucasian individuals.

Because of the way generative AI works, the output differs each time you use the tools, but a significant share of the results for "African workers" skews toward images that reinforce preconceived notions of Africans as impoverished and unsophisticated.

Stable Diffusion's output when prompted to generate images of "African workers." Thomas Maxwell/Insider

Generative AI is artificial intelligence that can generate entirely new content, and it has become wildly popular in 2023 following the launch of OpenAI's ChatGPT. While that tool can generate paragraphs of text that look like they were written by a human, other tools like Stable Diffusion exist for generating images. They accomplish this by studying hundreds of millions, if not billions, of image samples, from which they learn to essentially mimic humans. After Stable Diffusion has seen enough images of a dog, for instance, the AI learns how to create an entirely new picture of a dog.

But that ability to mimic humans means that generative AI comes with all the baggage that humans do, like pre-existing biases. A computer program is shaped by the decisions of the people who build it and by the data they train it on. Because these AI programs learn from existing images from around the web, they reflect what the broader public thinks of as the typical "African worker."

Stable Diffusion's output for "European workers." Thomas Maxwell/Insider

Stability AI and OpenAI did not respond to a request for comment. Stability AI is the developer of Stable Diffusion.

Critics have warned that the tech industry is moving into generative AI with the same move-fast-and-break-things mentality it did at the launch of social media platforms; get the technology out there first, and deal with the consequences later. Proponents for their part argue that the technology needs to be tested out in the open in order to improve and that the benefits of AI to do things like improve productivity ultimately outweigh the negatives.

Last year, the AI avatar app Lensa faced scrutiny for generating images of women that were heavily sexualized — meanwhile, avatars created of men depicted them as astronauts and other PG-friendly characters.

Stable Diffusion is trained on LAION-5B, a large open-source dataset of images scraped from the web. Because the images that Stable Diffusion is trained on are available for anyone to view, it's possible to trace back why the image generation tool thinks that "African workers" look a certain way. But even a simple Google search for similar terms returns similar images.

Several images generated by Bing's AI-powered image creator for the prompt "African workers." Thomas Maxwell/Insider

An academic who spoke with Insider, who requested anonymity because they did not have permission from their employer to speak to the media, said that model developers should collect better training data that wouldn't perpetuate problematic stereotypes. But manual data collection at the scale required to train a model would be prohibitively expensive. Scraping large amounts of data from the open web and existing training sets is much more efficient.

Sasha Luccioni, a researcher at Hugging Face with a PhD in artificial intelligence, told Insider that AI tools will output a lot of bias depending on how they are used. "If we are talking about AI-assisted tools for artists, which involves a lot of human-in-the-loop interaction between the artist and model, then the harms can be controlled and contained." Models that generate stock imagery based on a prompt, on the other hand, should include safety mechanisms or disclaimers to inform users that they may reproduce stereotypes.

Luccioni mentioned how Disney recently added disclaimers to older films on its Disney+ streaming service which say the films, like Aladdin, may feature cultural stereotypes that do not reflect the company's views. She thinks that text-to-image models like DALL-E or Stable Diffusion should feature something similar.

Some model developers work to "steer" their products away from producing certain outputs, such as by ensuring text models don't refer to a doctor with "he/him" pronouns but rather more gender-neutral terms. More training could help prevent image models from producing harmful output.

Stability AI uses a system called CLIP (Contrastive Language-Image Pre-training) to help it generate images. CLIP learns how to match images in a data set to descriptive text. Researchers have found that CLIP exhibits gender and racial bias, with women being associated with sexual content while men, for instance, are associated with career-related content.
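
As a rough illustration of what "matching images to descriptive text" means, the minimal Python sketch below scores a picture against candidate captions by comparing numeric vectors (embeddings) and picks the closest caption. It is not CLIP's actual code: the function names and the three-number vectors are made up for illustration, and a real model learns much larger vectors from millions of image-caption pairs. That learning step is where skewed pairings in the training data can carry over into the model.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_caption(image_embedding, caption_embeddings):
    """Pick the caption whose embedding is most similar to the image's."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine_similarity(
            image_embedding, caption_embeddings[caption]
        ),
    )

# Hypothetical toy embeddings; a trained model would produce these itself.
image_embedding = [0.9, 0.1, 0.2]
caption_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.3],
}

print(best_caption(image_embedding, caption_embeddings))
```

In this toy setup the image vector sits closest to the "dog" caption, so that caption is chosen. If the captions scraped from the web consistently pair a phrase with one kind of picture, the learned vectors will encode that pairing in the same way.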

Several images generated by Bing's AI-powered image creator for the prompt "European workers." Thomas Maxwell/Insider

Bias has long been an issue in the field of artificial intelligence. For instance, there are numerous cases of Black people being erroneously taken into custody after law enforcement used facial recognition technology to identify suspects. Facial recognition tools have been found to misidentify people of color at disproportionately higher rates compared to white people.

A ProPublica report from 2016 found that judges in some courts across the United States were making bail decisions based on an algorithm's estimate of whether a defendant would re-offend, a calculation weighted in part by crime rates in the person's zip code.

Luccioni said that when it comes to high-risk settings such as justice or health, "there should clearly be a hard line beyond which AI models cannot be used" because they could have a direct impact on people's lives.

Stable Diffusion and DALL-E, as well as chatbot tools like ChatGPT, Google's Bard, and Character.ai, are also facing scrutiny over copyright concerns. Stability AI, the company that develops Stable Diffusion, was sued in February by Getty Images, which alleges that the company used millions of its images to train its AI models.
