Cloudflare has just given the AI industry a new deadline to separate web crawlers used for traditional search purposes, such as Google Search, from those used for AI agents and training. Starting September 15, 2026, Cloudflare’s default settings will block “mixed-use” crawlers from any pages that host ads, the company announced Wednesday.
This means that crawlers that combine search, agentic use, and training will be blocked from crawling these sites by default unless the site owner changes the settings. The new defaults will apply to new Cloudflare customers, new sites created by existing customers, and all existing free customers, the company says.
The move could affect how AI model providers access web content for training purposes and for powering their agentic services.
Cloudflare points out that most website owners want their content discoverable through search and often through artificial intelligence services, but want protection against free distribution of their intellectual property.
Cloudflare specifically calls out the “world’s largest search engine” (clearly a Google reference!) as having access to roughly “2x more information” than other AI companies, because the search giant makes it difficult for customers to remain discoverable without also having their content used for AI.
Google has pushed back against this characterization in the past, noting that it provides a control called Google-Extended, which allows website owners to opt out of having their content used for training and for AI products and services such as Gemini Apps and Vertex AI API. Using it does not affect a website’s inclusion in Google Search. However, the tech giant’s flagship Googlebot crawls for Search, including AI features like AI Overviews and AI Mode.
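To illustrate how such an opt-out is typically expressed, here is a minimal sketch using Python's standard-library robots.txt parser. The robots.txt content, the example.com URL, and the `allowed` helper are assumptions made for illustration; they are not taken from Cloudflare's or Google's announcements. The check only shows how the rules match a user-agent token and says nothing about how Google applies the token internally.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended while leaving
# every other crawler, including Googlebot, free to fetch the site.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/some-story"  # placeholder URL
    for agent in ("Googlebot", "Google-Extended"):
        print(agent, allowed(agent, page))
```

Under these assumed rules the page stays open to Googlebot while the Google-Extended token is disallowed. That is the trade-off the article describes: the opt-out leaves Search inclusion alone, but Googlebot's own crawl still feeds Search's AI features.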
“Now that the majority of Internet traffic is non-human, we must go further and act faster so that a sustainable ecosystem can emerge,” Cloudflare co-founder and CEO Matthew Prince said in his announcement of the news, referring to the recent milestone in which bots overtook human traffic on the internet for the first time. That change wasn’t expected to happen until next year.
“Cloudflare’s new tools and partnerships offer website owners increased visibility and commercial opportunities and benefit AI companies that have bots with clear and transparent intent. We hope the proposed default changes encourage mixed-use crawlers to separate search from agent use and training,” Prince said.
While Cloudflare offers a range of products to help users launch their own AI systems, the company has also released a series of tools to give publishers more control over their content in the age of artificial intelligence. In recent years, Cloudflare has released tools to combat AI bots, including a marketplace called Pay Per Crawl that allows websites to charge AI bots for scraping.
Pay Per Crawl is now also evolving into “Pay Per Use,” the company said, which will allow publishers to charge AI companies when their content creates value, not just when it’s fetched.
The change could also help conserve bandwidth for publishers and compute resources for AI model providers, as Cloudflare data suggests that over 50% of crawl traffic from AI crawlers is spent re-fetching unchanged pages.
To make this happen, Cloudflare is initially working with two partners, Ceramic.ai and You.com. When a publisher participates, they get paid when their content appears in Ceramic’s AI search results or when You.com accesses a portion of their premium content.
Other AI companies can adapt this model to how they work, Cloudflare says.
