Cloud infrastructure has long been designed around people searching, clicking, scrolling and streaming in a consistent and predictable manner. AI agents behave differently. They can unleash a flurry of activity, spinning up multiple sub-agents that query hundreds of databases, look up documents and call APIs in seconds, then disappear as quickly as they arrived.
On that premise, Amazon is redesigning a key piece of its cloud infrastructure. On Thursday, AWS introduced the next generation of OpenSearch Serverless, a fully managed search and vector database (essentially an information storage and retrieval system at scale) designed specifically for agent workloads. AWS says the new system can scale instantly when agents fire up jobs and scale back to zero when idle.
The release reflects a growing realization across the tech industry: The infrastructure originally designed for a human-centric Internet isn’t working so well in a world increasingly populated by agents.
While AI agents still represent a relatively small portion of Internet activity, machine-generated traffic is already significant and poised to grow. Cloudflare says bots accounted for 31% of all HTTP traffic over the past six months. AI crawlers, search engines and assistants made up about a quarter of all bot requests during this period.
“Non-human traffic will overtake human traffic sometime in the first half of 2027,” Lai Yi Ohlsen, senior product manager at Cloudflare, told TechCrunch.
At Google’s I/O developer conference last week, the company said users will be able to start delegating tasks to AI systems, such as shopping research, booking travel, browsing the web and interacting with apps. But the shift doesn’t stop at consumer-focused AI agents. Businesses are increasingly deploying agents internally and for their customers, creating new kinds of machine-generated traffic behind the scenes.
As a result, cloud providers and infrastructure companies are figuring out how to adapt systems built for humans to a world of agents that continuously and autonomously retrieve information, invoke tools, and create machine-to-machine traffic.
That’s where AWS’s new OpenSearch Serverless comes in.
“The timing is simple. Agents are moving from experimentation to production and creating traffic patterns that the previous infrastructure simply wasn’t designed for,” Tia White, general manager for Amazon OpenSearch Service, told TechCrunch. “They grow without warning, idle without warning, and the business needs search that continues without paying for empty or idle compute.”
The key technical change in this new generation is that it decouples compute from storage, allowing compute to ramp up in seconds to respond to bursts of agent traffic and ramp down to zero, so customers pay $0 when agents are idle.
“Previously, even in the previous serverless version, you had to have at least one instance up and running because storage and compute were tied together,” White said. “You couldn’t just automatically spin up [compute] at the rate we needed, so you always had idle compute reserved for your workload, whether you were using it or not.”
Think of the old model as paying for a reserved parking space even when you don’t use it. With AWS’s upgraded OpenSearch Serverless, it’s more like paying at a meter only while you’re parked.
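To make the pricing difference concrete, here is a rough sketch in Python. The hourly rate, the shape of the workload and the function names are all hypothetical, chosen only for illustration; they are not AWS’s published prices or billing rules.

```python
# Hypothetical numbers for illustration only; this is not AWS pricing.
RATE_PER_COMPUTE_HOUR = 0.24  # assumed price of one compute unit for one hour


def always_on_cost(hourly_units, min_units=1):
    """Cost when at least `min_units` of compute stays reserved every hour."""
    billed_units = sum(max(units, min_units) for units in hourly_units)
    return billed_units * RATE_PER_COMPUTE_HOUR


def scale_to_zero_cost(hourly_units):
    """Cost when compute is billed only for the hours it is actually used."""
    return sum(hourly_units) * RATE_PER_COMPUTE_HOUR


# One day of bursty agent traffic: idle for 22 hours, then 8 units for 2 hours.
day = [0] * 22 + [8, 8]

print(f"always-on:     ${always_on_cost(day):.2f}")
print(f"scale-to-zero: ${scale_to_zero_cost(day):.2f}")
```

Under these assumed numbers, the entire gap between the two totals comes from the 22 idle hours, which the always-on model still bills at the minimum of one unit.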
At launch, OpenSearch Serverless will integrate natively with AI development platforms such as Vercel and Kiro, so developers can build production-ready search and vector backends for agents without managing infrastructure.
The same shift is playing out across the cloud industry. Databricks and Snowflake are being repositioned as AI memory and retrieval systems for enterprise data. Microsoft has released updates to Azure designed to handle bursts of AI agent activity and share memory between agents. Cloudflare, in a similar spirit to Amazon, last month introduced infrastructure that aims to provide agents with persistent environments and immediate scalability.
The more companies deploy AI agents, the more pressure there will be to redesign the infrastructure around machine-generated workloads, which in turn could make agents cheaper and easier to deploy at scale.
