Microsoft created a fake market to test AI agents – they failed in surprising ways

By techtost.com | 6 November 2025 | 2 Mins Read

On Wednesday, Microsoft researchers released a new simulation environment designed to test AI agents, along with new research showing that current agentic models may be vulnerable to manipulation. Conducted in collaboration with Arizona State University, the research raises new questions about how well AI agents will perform when working unsupervised, and how quickly AI companies can deliver on promises of an agentic future.

The simulation environment, which Microsoft calls the Magentic Marketplace, is a synthetic platform for experimenting on the behavior of AI agents. A typical experiment might involve a customer agent trying to order dinner according to a user’s instructions, while agents representing various restaurants compete to win the order.

The team’s initial experiments involved 100 separate customer-side agents interacting with 300 business-side agents. Because the marketplace’s code is open source, it should be straightforward for other groups to adopt the code to run new experiments or replicate the findings.

Ece Kamar, managing director of Microsoft Research’s AI Frontiers Lab, says this kind of research will be critical to understanding the capabilities of AI agents. “There’s really a question about how the world is going to change by having these agents working together and talking to each other and negotiating,” Kamar said. “We want to understand these things deeply.”

The initial research looked at a mix of leading models, including GPT-4o, GPT-5, and Gemini-2.5-Flash, and found some surprising weaknesses. In particular, the researchers found several techniques that businesses could use to manipulate customer agents into buying their products. They also observed a marked drop in efficiency as a customer agent was given more options to choose from, overwhelming the agent’s attention.

“We want these agents to help us process a lot of options,” says Kamar. “And we’re seeing the current models really get overwhelmed by having too many options.”

The agents also ran into trouble when asked to collaborate toward a common goal, apparently unsure which agent should play which role in the collaboration. Performance improved when the models were given more explicit instructions on how to work together, but the researchers still saw the models’ inherent capabilities as needing improvement.

“We can guide the models – as we can tell them, step by step,” Kamar said. “But if we’re testing their collaborative capabilities natively, I would expect those models to have those capabilities by default.”
