
Internet infrastructure provider Cloudflare is now blocking all AI scrapers from accessing content by default, in an industry first.
The move has been backed by more than a dozen major news and media publishers including the Associated Press, The Atlantic, Buzzfeed, Conde Nast, DMGT, Dotdash Meredith, Fortune, Gannett, The Independent, Sky News, Time and Ziff Davis.
Cloudflare said its default setting for new domains is now to block AI crawlers that don’t have permission or provide compensation, giving website owners more control.
Website owners can then decide if they want to give AI crawlers access to their content and how they can use it if they do.
This means, for example, that if a publisher did a deal with OpenAI they could specifically choose to allow access to only its GPTBot and no others. Or publishers could continue to exclude everything if they don’t feel they are getting any value back from scraping.
The AI crawlers are now able to state their purpose – for example whether they are used for training, inference or search – to help website owners decide whether to allow them to scrape.
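The block-by-default logic described above can be pictured with a minimal sketch. This is an illustration only, not Cloudflare's implementation: the `ALLOWED` table, the `is_allowed` function and the purpose labels are hypothetical names chosen for the example, and the set of purposes permitted for GPTBot is an assumption.

```python
# Hypothetical policy table: crawler name -> purposes the publisher permits.
# Any crawler (or purpose) not listed is blocked, mirroring a block-by-default setting.
ALLOWED = {
    "GPTBot": {"training", "search"},  # assumed terms of a publisher's deal
}


def is_allowed(bot_name: str, purpose: str, allowed: dict = ALLOWED) -> bool:
    """Return True only if this crawler is listed and its declared purpose is permitted."""
    return purpose in allowed.get(bot_name, set())


if __name__ == "__main__":
    print(is_allowed("GPTBot", "training"))     # listed crawler, permitted purpose
    print(is_allowed("GPTBot", "inference"))    # listed crawler, purpose not permitted
    print(is_allowed("ClaudeBot", "training"))  # unlisted crawler, blocked by default
```

A publisher that wanted to exclude everything would simply leave the table empty. A real system would also have to verify that a crawler is who it claims to be, which this sketch does not attempt.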
Cloudflare co-founder and chief executive Matthew Prince said: “If the internet is going to survive the age of AI, we need to give publishers the control they deserve and build a new economic model that works for everyone – creators, consumers, tomorrow’s AI founders, and the future of the web itself.
“Original content is what makes the internet one of the greatest inventions in the last century, and it’s essential that creators continue making it. AI crawlers have been scraping content without limits.
“Our goal is to put the power back in the hands of creators, while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant internet with a new model that works for everyone.”
Cloudflare is one of the world’s largest internet networks and its customers are said to account for 20% of traffic on the world wide web.
Previously, in September last year, it had added the ability to block AI crawlers in one click, which has since been adopted by more than one million customers.
Cloudflare published analysis showing which AI-only crawlers were doing the most scraping. OpenAI’s GPTBot had the biggest share at 30% in May (up from 5% in May 2024) followed by Anthropic’s ClaudeBot (21%, down from 27% last year).
Meta’s bot Meta-ExternalAgent took a 19% share of the scraping, Amazonbot was on 11% and Bytespider, linked to training models like Ernie and TikTok-related AI, was on 7.2% (down from a much bigger 42% in 2024).
However, Googlebot, which indexes content for Google Search, made up a 50% share of all AI and search web crawlers – up from 30% last year.
Cloudflare is also experimenting with “pay per crawl”, currently in private beta testing, which would let website owners charge a flat, per-request price to crawlers.
If a crawler does not have a billing relationship with Cloudflare, meaning it can’t be charged, it would be blocked but would come away with the message that a relationship with that website could be possible in future.
Publishers can currently set a flat price for all crawlers accessing their site, but Cloudflare said they can also allow certain scrapers to bypass charges, for example if they already have an arrangement.
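The pay-per-crawl rules as reported here can be sketched as a simple decision function. This is a hedged illustration based only on the description above, not Cloudflare's actual API: `decide`, its parameters and the returned labels are all hypothetical, and the price is expressed in cents purely for convenience.

```python
from typing import Optional, Tuple


def decide(
    bot_name: str,
    has_billing_relationship: bool,
    flat_price_cents: int,
    exempt_bots: frozenset = frozenset(),
) -> Tuple[str, Optional[int]]:
    """Return an (action, charge_in_cents) pair for a single crawl request.

    Assumed rules, taken from the description above:
    - crawlers with an existing arrangement bypass charges;
    - crawlers that cannot be billed are blocked;
    - everyone else pays the publisher's flat per-request price.
    """
    if bot_name in exempt_bots:
        return ("allow", 0)
    if not has_billing_relationship:
        return ("block", None)
    return ("charge", flat_price_cents)


if __name__ == "__main__":
    exempt = frozenset({"GPTBot"})  # hypothetical: publisher already has a deal
    print(decide("GPTBot", True, 5, exempt))
    print(decide("ClaudeBot", True, 5, exempt))
    print(decide("UnknownBot", False, 5, exempt))
```

The exemption check comes first so that a crawler with an existing arrangement is never charged, whatever its billing status.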
Publishers react to AI scrapers blocking move
Rich Caccappolo, vice chairman at DMG Media, which owns the Daily Mail, Metro and The i Paper, said: “DMGT welcomes Cloudflare’s initiative to help prevent the current epidemic of unauthorised scraping of websites by AI companies and their proxies, who are using copyrighted content for commercial purposes without paying for it.
“We support any innovation that creates a structured and transparent relationship between content creators and AI platforms and respects fundamental property rights.
“It will require a concerted effort by regulators, politicians, legislators, technology providers and content creators to build a new economic model for the AI era. We commend every step taken in pursuit of that goal.”
DMGT has previously financially backed Prorata.ai, which is building a model through which it attributes the source of content generated by AI chatbots to certain publishers and pays them accordingly.
The publishers speaking out in support of Cloudflare’s move are a mix of those who have done AI content deals with the likes of OpenAI and those who have not.
Conde Nast chief executive Roger Lynch said the new approach was a “game-changer for publishers and sets a new standard for how content is respected online. When AI companies can no longer take anything they want for free, it opens the door to sustainable innovation built on permission and partnership.”
Dotdash Meredith chief executive Neil Vogel said: “We have long said that AI platforms must fairly compensate publishers and creators to use our content. We can now limit access to our content to those AI partners willing to engage in fair arrangements.”
Renn Turiano, chief consumer and product officer of Gannett, which owns USA Today and more than 200 US local publications, said that “blocking unauthorised scraping and the use of our original content without fair compensation is critically important” and that the company believes the new tech “will help combat the theft of valuable IP”.
And Sky News executive chairman David Rhodes said: “This permission-based model will help secure the future of quality digital journalism, which is our commitment. Sky News is all about providing ‘the full story, first’ – so we wanted to be among the first to join Cloudflare’s framework for setting fair terms of trade in news.
“We’ll help design the future of these services as video becomes an ever-larger part of both crawling and publishing.”
The change was also backed by tech bosses at the likes of Reddit and Pinterest. Reddit’s chief executive Steve Huffman said the “whole ecosystem of creators, platforms, web users and crawlers will be better when crawling is more transparent and controlled”.