People are using Super Mario to benchmark AI now

By techtost.com | 4 March 2025 | 2 Mins Read

Thought Pokémon was a tough benchmark for AI? A team of researchers argues that Super Mario Bros. is even tougher.

Hao AI Lab, a research org at the University of California, San Diego, threw AI into live Super Mario Bros. games on Friday. Anthropic’s Claude 3.7 performed the best, followed by Claude 3.5. Google’s Gemini 1.5 Pro and OpenAI’s GPT-4 struggled.

It was not quite the same version of Super Mario Bros. as the original 1985 release, to be clear. The game ran in an emulator and was integrated with a framework, GamingAgent, to give the AIs control over Mario.

Image credits: Hao Lab

GamingAgent, which Hao developed in-house, fed the AI basic instructions, such as, “If an obstacle or enemy is near, move/jump left to dodge,” along with in-game screenshots. The AI then generated inputs in the form of Python code to control Mario.

Still, Hao says that the game forced every model to “learn” to plan complex maneuvers and develop gameplay strategies. Interestingly, the lab found that reasoning models such as OpenAI’s o1, which “think” through problems step by step to reach solutions, performed worse than “non-reasoning” models, despite being generally stronger on most benchmarks.

One of the main reasons reasoning models struggle with real-time games like this is that they take a while, usually seconds, to decide on actions, according to the researchers. In Super Mario Bros., timing is everything. One second can mean the difference between a jump safely cleared and a fall to your death.
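To put that timing problem in rough numbers, here is a minimal Python sketch of how many game frames go by while a model is still deciding. The 60 frames-per-second rate and the two latency figures are illustrative assumptions, not measurements from Hao AI Lab, and `frames_elapsed` is a hypothetical helper, not part of GamingAgent.

```python
# Illustrative sketch only: the frame rate and latencies below are assumptions,
# not figures reported by the researchers.
ASSUMED_FPS = 60  # assumed emulator frame rate


def frames_elapsed(latency_ms: int, fps: int = ASSUMED_FPS) -> int:
    """Return the number of whole frames that pass during latency_ms milliseconds."""
    return latency_ms * fps // 1000


# Hypothetical decision latencies for two kinds of model.
examples = {
    "quick non-reasoning model": 200,   # assumed 0.2 seconds
    "slower reasoning model": 3000,     # assumed 3 seconds
}

for label, latency_ms in examples.items():
    frames = frames_elapsed(latency_ms)
    print(f"{label}: {latency_ms} ms -> {frames} frames pass before Mario acts")
```

Under these assumed numbers, the slower model lets 180 frames go by before its input lands, against 12 for the quicker one, which is plenty of time for a Goomba to arrive.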

Games have been used to benchmark AI for decades. But some experts have questioned the wisdom of drawing links between AI’s gaming skills and technological progress. Unlike the real world, games tend to be abstract and relatively simple, and they provide a theoretically infinite amount of data to train AI.

The recent flashy gaming benchmarks point to what Andrej Karpathy, a researcher and founding member of OpenAI, called an “evaluation crisis.”

“I don’t really know what [AI] metrics to look at right now,” he wrote in a post on X. “My reaction is that I don’t know how good these models are right now.”

At least we can watch AI play Mario.
