Close Menu
TechTost
  • AI
  • Apps
  • Crypto
  • Fintech
  • Hardware
  • Media & Entertainment
  • Security
  • Startups
  • Transportation
  • Venture
  • Recommended Essentials
What's Hot

Fika Jobs Raises $4M to Build Video-First Recruiting Platform Where AI Agents Interview Candidates

Ribbie turns real-time baseball stats into arcade-like, pixel-art shows

4 days left to save up to $190 on Founder Summit 2026

Facebook X (Twitter) Instagram
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms and Conditions
  • Disclaimer
Facebook X (Twitter) Instagram
TechTost
Subscribe Now
  • AI

    Fika Jobs Raises $4M to Build Video-First Recruiting Platform Where AI Agents Interview Candidates

    23 June 2026

    Founder Summit success rates increase on June 26

    22 June 2026

    US says ASML’s top chip tool may be in China, but how?

    22 June 2026

    When the Trump administration hits Anthropic, who benefits?

    21 June 2026

    In the Weights is your new AI-centric vanity quest

    21 June 2026
  • Apps

    Ribbie turns real-time baseball stats into arcade-like, pixel-art shows

    23 June 2026

    Amazon is testing Alexa+ in India with Hindi support

    23 June 2026

    WhatsApp gets new head as Meta taps CRED India founder Kunal Shah, invests $900 million in startup

    22 June 2026

    Adobe adds AI assistant to Premiere, Illustrator and InDesign

    22 June 2026

    Beyond Siri: Here are the handy AI features coming to your iPhone in iOS 27

    21 June 2026
  • Crypto

    Startup Battlefield 200 applications close today

    27 May 2026

    5 days left: Save up to $410 on Disrupt 2026 passes

    25 May 2026

    As crypto cools, a16z crypto raises $2.2 billion in capital

    6 May 2026

    Coinbase to lay off 14% of staff as part of broader restructuring

    5 May 2026

    British cryptographer Adam Back denies NYT report that he is Bitcoin creator Satoshi Nakamoto

    9 April 2026
  • Fintech

    4 days left to save up to $190 on Founder Summit 2026

    23 June 2026

    Robinhood’s note on 10% layoffs shows that blaming AI doesn’t cut it

    17 June 2026

    Anthropic’s latest spat with the Trump administration may actually help it, sales figures suggest

    17 June 2026

    Ramp raises $750M at $44B valuation as investors thirst for fintechs with AI history

    5 June 2026

    Last 24 hours to save up to $410 on your Disrupt 2026 ticket

    29 May 2026
  • Hardware

    AI chipmaker Groq confirms $650m raise and staff shakeup after Nvidia’s $20bn rent-free deal

    23 June 2026

    Aura’s stunning e-ink frame doesn’t even look digital

    20 June 2026

    AI hurts Apple in more ways than one: It could force iPhone price hikes

    18 June 2026

    Snap is finally debuting its long-awaited AR glasses, the specs, and, ugh, they’re not cheap

    17 June 2026

    Qualcomm wants to be the chip in everything that replaces your smartphone, and it just announced two products to that end

    17 June 2026
  • Media & Entertainment

    Instagram looks set to take on streaming services with a longer, episodic and live format for its TV app

    22 June 2026

    Spotify’s reserved ticket sales to music superfans are now live

    18 June 2026

    Google is betting on Gemini to reinvent the smart home speaker

    18 June 2026

    Mastodon is looking for newsletters to help revive the open social web

    17 June 2026

    60 percent of US consumers say ‘artificial intelligence’ in brand messaging is a turnoff, survey finds

    16 June 2026
  • Security

    A new unpatched flaw in Apple’s chips opens the door to an iPhone jailbreak

    23 June 2026

    Tata Electronics, a major technology supplier to Apple and Tesla, confirms the data breach

    22 June 2026

    Cybercriminals reportedly hacked tens of thousands of Fortinet firewalls used by major companies around the world

    17 June 2026

    Apple is planning to change the Hide My Email privacy feature that could make it less effective

    17 June 2026

    The US government’s ban on Anthropic models was never about an AI jailbreak

    16 June 2026
  • Startups

    Ethan Thornton tries to do everything at once

    22 June 2026

    Founders Fund’s extreme bet on humanely killed fish

    21 June 2026

    DeepL acquires Mixhalo for live audio streaming and translation

    20 June 2026

    It made the free video player work smoothly. Now he does this for robots.

    20 June 2026

    Pixi’s new iOS app turns text messages into interactive AR experiences

    19 June 2026
  • Transportation

    Tesla brings back Autopilot narrative after fatal Texas crash

    23 June 2026

    Lucid Motors’ new CEO cuts 18% of staff to ‘simplify the company’

    22 June 2026

    TechCrunch Mobility: A new robotaxi scorecard shows China’s dominance

    21 June 2026

    Rivian owners file lawsuit alleging false promises about self-driving features

    19 June 2026

    Waymo recalls nearly 4,000 robotaxis to stop them from driving in highway construction zones

    18 June 2026
  • Venture

    Seedcamp Raises $320M for New Fund to Expand US Footprint

    22 June 2026

    The 11 startups that stood out from YC’s demo day, according to VCs

    19 June 2026

    Roelof Botha joins SpaceX board of directors

    18 June 2026

    Chi-Hua Chien saw Facebook coming – now he says the real AI winners won’t sell AI

    18 June 2026

    PayPal Ventures is shutting down as the company continues to restructure

    17 June 2026
  • Recommended Essentials
TechTost
You are at:Home»AI»Silicon Valley bets big in ‘environments’ to train agents AI
AI

Silicon Valley bets big in ‘environments’ to train agents AI

techtost.comBy techtost.com22 September 202509 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Email
Silicon Valley Bets Big In 'environments' To Train Agents Ai
Share
Facebook Twitter LinkedIn Pinterest Email

For years, Big Tech CEOs have inaugurated AI agents who can use autonomous software applications to complete people. But take today’s AI Agents for a rotation, be it the Openai Chatgpt agent or the Perplexity comet and you will quickly realize how limited the technology is. Making AI agents can get a new set of techniques that the industry is still discovering.

One of these techniques carefully simulates the workplaces where agents can be trained in multiple-step duties-known as reinforcement environments (RL). Similarly in the way the data sets supplied the last wave of AI, the RL environments begin to look like a critical element in the development of factors.

Researchers, founders and investors AI tell TechCrunch that top AI laboratories now require more RL environments and there is no lack of newly formed businesses that hope to supply them.

“All the big AI laboratories build RL environments at home,” said Jennifer Li, a collaborator at Andreessen Horowitz, in an interview with TechCrunch. “But as you can imagine, creating these sets of data is very complicated. So AI laboratories also consider third -party suppliers who can create high quality environments and ratings.

The push for the RL environments has crying a new category of well -intentioned newly established businesses, such as engineering and primary intellect, aiming to drive the space. Meanwhile, large data labeling companies such as Mercor and Surge say they are investing more in RL environments to keep up with industry shifts from static data sets in interactive simulations. Big workshops are also thinking of investing in large $ 1 billion in RL environments Next year.

The hope for investors and founders is that one of these newly established companies emerges as a “AI scale for environments”, referring to the $ 29 billion Powerhouse data marking.

The question is whether the RL environments will really push the borders of AI progress.

TechCrunch event

Francisco
|
27-29 October 2025

What is a RL environment?

At their core, the RL environments are educational reasons that simulate what an AI agent would do in a real software application. A founder described their construction recent interview “Like the creation of a very boring video game.”

For example, an environment could simulate a Chrome browser and work an AI agent with the purchase of a pair of socks on Amazon. The agent is scored by his performance and sent a reward signal when he succeeds (in this case, buying a worthy pair of socks).

While such work sounds relatively simple, there are many places where an AI agent could escape. Navigation in the developing menus of the website may be lost or buy too many socks. And because developers cannot predict exactly what a mistake will turn an agent, the environment itself must be durable enough to capture any unexpected behavior and deliver useful comments. This makes the construction environments much more complex than a static data set.

Some environments are quite complex, allowing AI agents to use tools, internet access, or use various software applications to complete a given task. Others are closer, with the aim of helping an agent learn specific tasks in Enterprise software applications.

While RL environments are the hot thing in Silicon Valley at the moment, there is a lot precedent for using this technique. One of Openai’s first projects in 2016 was the construction ”Gyms rl“Which were quite similar to the modern perception of the environments. The same year, Google Deepmind’s Alpha The AI ​​system struck a world champion in the board game, Go. He also used RL techniques in a simulated environment.

What is unique to today’s environments is that researchers are trying to create AI agents using computers with large transformer models. Unlike Alphago, which was a specialized AI system that works in a closed environment, today’s AI agents are trained to have more general opportunities. AI researchers today have a stronger starting point, but also a complex goal where more can go wrong.

A full of field

AI data labeling companies such as Scale AI, Surge and Mercor are trying to meet the moment and create RL environments. These companies have more resources than many newly established businesses in the field, as well as deep relationships with AI Labs.

Surge Edwin Chen CEO tells TechCrunch that he has recently seen a “significant increase” in demand for RL environments within AI laboratories. Surge – which he created reportedly Revenue of $ 1.2 billion Last year from collaboration with AI Labs such as Openai, Google, Anthropic and Meta – recently turned a new internal organization specially tasked with building RL environmental, he said.

The closure behind the Surge is Mercor, a startup of $ 10 billion, which has also worked with Openai, Meta and Anthropic. Mercor puts investors for RL Business Building environments for specific tasks, such as coding, healthcare and law, according to the marketing material observed by TechCrunch.

Mercor CEO Brendan Foody told TechCrunch in an interview that “few understand how big the opportunity around the RL environments is.”

The AI ​​scale has used to dominate the data label, but has lost ground since Meta invested $ 14 billion and hired its CEO. Since then, Google and Openai have fallen on the AI ​​scale as a data provider and even the start is facing competition for work with data labeling in the Meta. But still, the scale is trying to meet the moment and build environments.

‘This is just the nature of the business [Scale AI] It is means, “said Chetan Rane, the scale of AI’s product for agents and RL environments.” The scale has proven its ability to adapt quickly. We did this in the early days of autonomous vehicles, our first business unit. When Chatgpt came out, the AI ​​scale adapted to it. And now, once again, we are adapting to new border venues such as agents and environments. ”

Some younger players focus exclusively on environments from the beginning. Among them is engineering, a starting start about six months ago with the bold target of “automation of all jobs”. However, co -founder Matthew Barnett tells Techcrunch that his business starts with RL environments for AI encoding agents.

Mechanize aims to provide AI laboratories with a small number of powerful RL environments, Barnett says, instead of larger data companies that create a wide range of simple RL surrounding. At this point, boot offers software engineers $ 500,000 For the construction of an environment of RL – much higher than an hourly contractor could earn work on a AI or Surge scale.

Mechanize has already worked with humanity in RL environments, two sources familiar with the issue told TechCrunch. Mechanize and Anthropic refused to comment on the partnership.

Other newly established companies bet that RL environments will have an influence outside AI laboratories. Prime Intellect – a boot supported by researcher AI Andrej Karpathy, Founders Fund and Menlo Ventures – aims at smaller RL environments.

Last month, Prime Intellect started a Rl hub surroundings, aimed to be a “hugged person for RL surroundings.” The idea is to give open source developers to access the same resources that the large AI laboratories have and sell these developers access to computing resources in the process.

Training generally capable factors in RL environments can be more computing than previous AI training techniques, according to Prime Intellect Will Brown. Along with the newly established companies that create RL environments, there is another opportunity for GPU providers that can supply the process.

“The RL environments will be too big to dominate any company,” Brown said in an interview. “Part of what we do is just try to build good open source infrastructure around it.

Will it score?

The open question around the RL environments is whether the technique will escalate like previous AI training methods.

Aid learning has powered some of the biggest jumps in AI in the past year, including models such as Openai’s O1 and OPENAI’s Claude Opus 4, are particularly important discoveries, because the methods previously used to improve AI models now show reduced release.

The environments are part of Ai Labs’ largest stake in RL, which many believe will continue to lead to progress as they add more data and computational resources to the process. Some of the Openai researchers behind O1 told TechCrunch that the company initially invested in AI reasoning models that were created through RL and Compute Time-Time-because they thought it would be fine.

The best way for the RL scale remains unclear, but the environments look like a promising candidate. Instead of simply rewarding chatbots for text answers, they let agents work in simulations with tools and computers available. This is much more intense, but possibly more rewarding.

Some are skeptical that all these RL environments will get rid of. Ross Taylor, a former AI researcher with Meta who co -founder of general reasoning, tells Techcrunch that RL environments are prone to rewarding hacking. This is a process in which AI models cheat to get a reward, without really doing the work.

“I think people underestimate how difficult it is to escalate the environments,” Taylor said. ‘Even the best available to the public [RL environments] They usually do not work without serious modification. ”

The head of OpenAi engineering for API business, Sherwin Wu, told a recent podcast That was “short” in the newly established RL Environmental Businesses. Wu noted that it is a very competitive space, but also that the AI ​​research is evolving so quickly that it is difficult to serve AI’s laboratories well.

Karpathy, a primary intellect investor called RL environments a possible discovery, has also expressed attention to the RL area wider. To one Post in xHe raised concerns about how much the progress of AI can be squeezed by RL.

“I am swollen in environments and techniques of interactions, but I am a Bearish in enhancing learning in particular,” Karpathy said.

UPDATE: A previous version of this article refers to mechanical work as mechanical work. Has been informed to reflect the official name of the company.

agent agents bets big environments Human learning open Research Rl Scale ai Silicon train Valley
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleTrump says Lachlan and Rupert Murdoch could invest in Tiktok deal
Next Article Fueled by India’s small businesses, UK Fintech Tide becomes a Unicorn supported by TPG
bhanuprakash.cg
techtost.com
  • Website

Related Posts

Fika Jobs Raises $4M to Build Video-First Recruiting Platform Where AI Agents Interview Candidates

23 June 2026

Founder Summit success rates increase on June 26

22 June 2026

US says ASML’s top chip tool may be in China, but how?

22 June 2026
Add A Comment

Leave A Reply Cancel Reply

Don't Miss

Fika Jobs Raises $4M to Build Video-First Recruiting Platform Where AI Agents Interview Candidates

23 June 2026

Ribbie turns real-time baseball stats into arcade-like, pixel-art shows

23 June 2026

4 days left to save up to $190 on Founder Summit 2026

23 June 2026
Stay In Touch
  • Facebook
  • YouTube
  • TikTok
  • WhatsApp
  • Twitter
  • Instagram
Fintech

4 days left to save up to $190 on Founder Summit 2026

23 June 2026

Robinhood’s note on 10% layoffs shows that blaming AI doesn’t cut it

17 June 2026

Anthropic’s latest spat with the Trump administration may actually help it, sales figures suggest

17 June 2026
Startups

Ethan Thornton tries to do everything at once

Founders Fund’s extreme bet on humanely killed fish

DeepL acquires Mixhalo for live audio streaming and translation

© 2026 TechTost. All Rights Reserved
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms and Conditions
  • Disclaimer

Type above and press Enter to search. Press Esc to cancel.