Keeping up with an industry as fast-paced as artificial intelligence is a tall order. So, until an AI can do it for you, here’s a helpful roundup of recent stories in the world of machine learning, along with notable research and experiments we didn’t cover on our own.
This week in AI, Google paused its Gemini AI chatbot's ability to generate images of people after a segment of users complained about historical inaccuracies. Asked to depict "a Roman legion," for example, Gemini would show an anachronistic, cartoonish group of racially diverse foot soldiers while rendering "Zulu warriors" as Black.
It appears that Google – like some other AI vendors, including OpenAI – had implemented clumsy hardcoding under the hood to try to "correct" for biases in its model. In response to prompts such as "show me images of only women" or "show me images of only men," Gemini would refuse, asserting that such images could "contribute to the exclusion and marginalization of other genders." Gemini was also loath to generate images of people identified solely by their race – e.g. "white people" or "Black people" – out of an apparent concern for "reducing individuals to their physical characteristics."
The right wing has latched onto the bugs as evidence of a "woke" agenda being perpetuated by the tech elite. But you don't need Occam's razor to see the less nefarious truth: Google has been burned by its tools' biases before (see: classifying Black men as gorillas, mistaking thermal guns in Black people's hands for weapons, etc.), and is so desperate to avoid history repeating itself that it is manifesting a less biased world in its image-generating models, however flawed.
In her bestselling book "White Fragility," anti-racist educator Robin DiAngelo writes about how the erasure of race ("colorblindness," by another phrase) contributes to systemic racial power imbalances rather than mitigating or alleviating them. By purporting to "not see color," or by reinforcing the notion that simply acknowledging the struggle of people of other races is enough to call oneself "woke," people perpetuate harm by avoiding any substantive conversation on the subject, DiAngelo says.
Google's ginger treatment of race-based prompts in Gemini didn't avoid the problem, per se; it disingenuously attempted to conceal the model's worst biases. One could argue (and many have) that these biases shouldn't be ignored or glossed over, but addressed in the broader context of the training data from which they emerge, that is, society on the world wide web.
Yes, the datasets used to train image generators generally contain more white people than Black people, and yes, the images of Black people in those datasets reinforce negative stereotypes. That's why image generators sexualize certain women of color, depict white men in positions of authority and generally favor wealthy Western perspectives.
Some might argue that there's no winning for AI vendors: whether they tackle their models' biases or choose not to, they'll be criticized. And that's true. But I'd posit that, either way, these models are lacking in explanation, packaged in a way that minimizes the ways in which their biases manifest.
If AI vendors addressed their models' shortcomings head on, in humble and transparent language, it would go a lot further than haphazard attempts to "fix" what is essentially unfixable bias. The truth is, we all have bias, and we don't treat people the same as a result. Nor do the models we build. We'd do well to acknowledge that.
Here are some other notable AI stories from the past few days:
- Women in Artificial Intelligence: TechCrunch has launched a series highlighting remarkable women in AI. Read the list here.
- Stable Diffusion v3: Stability AI announced Stable Diffusion 3, the latest and most powerful version of the company’s image-generating AI model, based on a new architecture.
- Chrome gets GenAI: Google’s new Gemini-powered tool in Chrome lets users rewrite existing text on the web — or create something entirely new.
- Blacker than ChatGPT: Advertising agency McKinney developed a quiz game, Are You Blacker than ChatGPT?, to shed light on AI bias.
- Call for Laws: Hundreds of AI luminaries signed a public letter earlier this week calling for anti-deepfake legislation in the U.S.
- AI Matching: OpenAI has a new customer in Match Group, the owner of apps including Hinge, Tinder and Match, whose employees will use OpenAI’s AI technology to complete work-related tasks.
- DeepMind Safety: DeepMind, Google’s AI research arm, has created a new organization, AI Safety and Alignment, made up of existing teams working on AI safety but also expanded to include new, specialized teams of GenAI researchers and engineers.
- Open models: Just a week after releasing the latest iteration of its Gemini models, Google released Gemma, a new family of lightweight, open-weight models.
- House Working Group: The U.S. House of Representatives has established a task force on artificial intelligence that, as Devin writes, feels like a punt after years of indecision that show no sign of ending.
More machine learning
AI models seem to know a lot, but what do they actually know? Well, the answer is nothing. But if you phrase the question slightly differently… they seem to have internalized some “meanings” that are similar to what humans know. Although no artificial intelligence really understands what a cat or a dog is, could it have some sense of similarity encoded in the embeddings of these two words that is different from, say, cat and bottle? Amazon researchers think so.
Their research compared the “trajectories” of similar but distinct sentences, such as “the dog barked at the burglar” and “the burglar made the dog bark,” with those of grammatically similar but unrelated sentences, such as “a cat sleeps all day” and “a girl jogs all afternoon.” They found that sentences humans would judge similar in meaning were indeed treated as more similar internally despite being grammatically different, and vice versa for the grammatically similar ones. OK, I feel like this paragraph was a bit confusing, but suffice it to say that the meanings encoded in LLMs appear to be more robust and holistic than expected, not totally naive.
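To make the idea concrete, here is a minimal sketch (not the Amazon team's actual method) of how embedding similarity can track meaning rather than surface grammar. The sentence-transformers library and the all-MiniLM-L6-v2 model are my own illustrative choices, not anything cited in the paper:

```python
# Minimal sketch (not the Amazon paper's setup): compare sentence embeddings
# to see whether meaning, rather than grammar, drives similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

sentences = [
    "the dog barked at the burglar",   # A: same meaning as B, different grammar
    "the burglar made the dog bark",   # B
    "a cat sleeps all day",            # C: similar grammar to D, unrelated meaning
    "a girl jogs all afternoon",       # D
]
emb = model.encode(sentences, convert_to_tensor=True)

# If embeddings capture meaning, A-B similarity should exceed C-D similarity.
print("A vs B:", util.cos_sim(emb[0], emb[1]).item())
print("C vs D:", util.cos_sim(emb[2], emb[3]).item())
```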
Neural encoding is proving useful for artificial vision, Swiss researchers at EPFL have found. Artificial retinas and other ways of replacing parts of the human visual system generally have very limited resolution due to the limitations of microelectrode arrays. So no matter how detailed the incoming image is, it must be transmitted at very low fidelity. But there are different ways of downsampling, and this team found that machine learning does a great job at it.
“We found that if we applied a learning-based approach, we got improved results in terms of optimized sensory encoding. But more surprising was that when we used an unconstrained neural network, it learned to mimic aspects of retinal processing on its own,” said Diego Ghezzi in a press release. It does perceptual compression, basically. They tested it on mouse retinas, so it isn’t just theoretical.
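For intuition only, here is a rough sketch, under my own assumptions rather than the EPFL setup, of the difference between fixed average pooling and a learnable downsampler that reduces a high-resolution image to a coarse "electrode" grid. The sizes and architecture are purely illustrative:

```python
# Sketch (not the EPFL model): a learnable downsampler that maps a high-res
# image to a coarse electrode-like grid, vs. naive average pooling.
import torch
import torch.nn as nn

HIGH_RES, ELECTRODES = 128, 16  # hypothetical input size and electrode grid

class LearnedDownsampler(nn.Module):
    def __init__(self):
        super().__init__()
        # Strided convolutions can learn *which* features to keep while
        # reducing a 128x128 image to a 16x16 activation map.
        self.encode = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=4, stride=2, padding=1),  # 128 -> 64
            nn.ReLU(),
            nn.Conv2d(8, 8, kernel_size=4, stride=2, padding=1),  # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(8, 1, kernel_size=4, stride=2, padding=1),  # 32 -> 16
        )

    def forward(self, x):
        return self.encode(x)

# Fixed baseline: plain average pooling discards detail uniformly.
naive = nn.AvgPool2d(kernel_size=HIGH_RES // ELECTRODES)

x = torch.rand(1, 1, HIGH_RES, HIGH_RES)
print(LearnedDownsampler()(x).shape, naive(x).shape)  # both (1, 1, 16, 16)
```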
An interesting application of computer vision by Stanford researchers hints at a mystery in how children develop their drawing skills. The team solicited and analyzed 37,000 drawings by kids of various objects and animals, and also assessed (based on kids’ responses) how recognizable each drawing was. Interestingly, it wasn’t just the inclusion of signature features like a rabbit’s ears that made drawings more recognizable to other kids.
“The kinds of features that lead older children’s drawings to be recognizable do not appear to be determined by a single feature that all older children learn to include in their drawings. It’s something much more complex that these machine learning systems are picking up,” said lead researcher Judith Fan.
Chemists (also at EPFL) found that LLMs are also surprisingly adept at helping with their work after minimal training. It’s not a matter of doing chemistry directly, but rather of being fine-tuned on a body of literature that no individual chemist could possibly know all of. For example, across thousands of papers there may be a few hundred statements about whether a high-entropy alloy is single-phase or multi-phase (you don’t need to know what that means – they do). The system (based on GPT-3) can be trained on this type of yes/no question and answer, and is soon able to extrapolate from it.
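To illustrate the general idea (this is not the EPFL pipeline), a fine-tuning dataset for this kind of yes/no task can be as simple as literature statements rewritten into prompt/completion pairs. The field names and example alloys below are hypothetical:

```python
# Sketch of the idea, not the paper's code: turn literature statements into
# yes/no fine-tuning examples in the classic prompt/completion JSONL format.
import json

examples = [
    {"alloy": "CoCrFeNi",   "single_phase": True},   # illustrative entries
    {"alloy": "AlCoCrFeNi", "single_phase": False},
]

with open("alloy_phase_finetune.jsonl", "w") as f:
    for ex in examples:
        record = {
            "prompt": f"Is {ex['alloy']} a single-phase high-entropy alloy?\n\nAnswer:",
            "completion": " yes" if ex["single_phase"] else " no",
        }
        f.write(json.dumps(record) + "\n")
```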
It’s not a huge advance, just more evidence that LLMs are a useful tool in this sense. “The point is that this is as easy as doing a literature search, which works for many chemical problems,” said researcher Berend Smit. “Querying a foundational model may become a common way to bootstrap a project.”
Last, a word of caution from Berkeley researchers, though now that I’m rereading the post, I see that EPFL was involved in this one as well. Go Lausanne! The team found that imagery found via Google search was much more likely to enforce gender stereotypes for certain jobs and words than text mentioning the same thing. And there were also far more men represented in both cases.
Not only that, but in one experiment, they found that people who saw pictures instead of reading text when researching a role associated those roles with a gender more reliably, even days later. “It’s not just about the incidence of gender bias online,” said researcher Douglas Guilbeault. “Part of the story here is that there’s something very sticky, very powerful about representing images of people that text just doesn’t have.”
With things like Google’s image generator diversity fracas going on, it’s easy to lose sight of the well-established and frequently verified fact that the source data for many AI models is severely biased, and that bias has real effects on people.