Ahead of the holidays, Microsoft said it had revamped the AI model behind Bing Image Creator, the AI-powered image editor built into the company’s Bing search engine. Microsoft promised that the new model, the latest version of OpenAI’s DALL-E 3 codenamed PR16, would allow users to create images “twice as much as before” with “higher quality”.
But it didn’t work. Complaints quickly flooded X and Reddit.
“The DALL-E we loved is gone forever,” said one Redditor. “I use ChatGPT now because Bing has become useless to me,” wrote another.
The backlash was such that Microsoft said it would roll Bing Image Creator back to the previous model until it could address the issues.
bring back the old dalle 3! the picture quality is much better on the old model. like these images for example. the image the new model creates sucks 🙁 pic.twitter.com/BjIM8MS4ng
— ze ᡣ𐭩ྀིྀི (@riegrowl) December 28, 2024
“We were able to [reproduce] some of the issues mentioned and plan to return [to DALL-E 3] PR13 until we can fix them,” Jordi Ribas, head of search at Microsoft, said in a post on X on Tuesday afternoon. “The development process is very slow unfortunately. It started a week ago and will take another 2-3 weeks to reach 100%.”
So what went wrong?
It is difficult to compare model results from anecdotal reports, particularly when the prompts are not standardized. But many users said PR16 tended to make images look less realistic. Mayank Parmar, writing for Windows Latest, noted that the images produced by PR16 lacked detail and polish and appeared strangely cartoonish and “lifeless”.
I don’t know who you think you’re kidding with that. DALL-E is objectively worse than ever after this “update” and you are outclassed by other companies like Google. It’s absolutely night and day comparing the image quality now to just a few months ago pic.twitter.com/EdSdk7aign
— out (@roccynoxy) December 19, 2024
This is not the first time that an image model that supposedly passed internal checks has been poorly received by the public. In February, Google was forced to pause its AI chatbot Gemini’s ability to create images of people after users complained about historical inaccuracies.
The missteps show how difficult it can be to measure model improvements in the real world. According to Ribas, Microsoft’s benchmarking found PR16’s quality to be “slightly better on average” compared to the previous Bing Image Creator model.
Whatever internal metric the company used, it seems clear that it doesn’t align with most people’s preferences.
