
August 30, 2023

News publishers divided over whether to block ChatGPT

The PPA has written to Rishi Sunak asking for AI companies to be forced into being transparent.

By Charlotte Tobitt

Publishers pushing back against AI companies are not doing so because they are an “archaic industry that just doesn’t want any change” but because they want a level playing field, an industry leader has said.

And although ChatGPT owner OpenAI has this month given websites the ability to opt out of having their content scraped for training purposes, many publishers would prefer to work with the tech platforms to “get the value that they want”.

Sajeeda Merali, chief executive of the Professional Publishers Association, which represents large and small specialist media businesses, told Press Gazette that her member publishers are “very concerned” about a lack of transparency from AI companies about how they use publishers’ content to train their tools.

“It’s certainly not good for business. If there is no government intervention, then this is going to have quite a serious adverse impact on the sustainability of the industry, so the health of it, the prosperity of it. Content creators will have no means of monetising their content.”

Merali noted that PPA members, who include Bauer Media, Conde Nast, Hello!, Future, the London Review of Books, Mark Allen Group and New Scientist, produce “really deep insights for specialist communities” and “there’s a lot of investment that goes into that quality journalism”.

Leading news publishers block ChatGPT

According to Originality.AI, an AI detector for publishers, Reuters was the first of the top 100 websites in the world to block OpenAI’s GPTBot in its website’s robots.txt file.

Other publishers to have opted out appear to include: The New York Times – which has also changed its terms of service to ban the training of AI models using its content – and sister title The Athletic, CNN, Bloomberg, Insider, The Verge, PC Mag, Vulture, Mashable, Times of India, New York Magazine, The Atlantic, Bustle, Vox, Lonely Planet, Hello!, Axios, France 24, and the New York Daily News.
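For readers who want to see what such an opt-out looks like in practice, the following is a minimal sketch using Python's standard-library robots.txt parser. The robots.txt content and the example.com URL are illustrative assumptions, not taken from any publisher named above; the two-line "User-agent: GPTBot" and "Disallow: /" rule is the form OpenAI has described for blocking its crawler across a whole site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site (not a real publisher's file).
# It blocks GPTBot everywhere while leaving all other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Illustrative URL; example.com is a placeholder domain.
ARTICLE_URL = "https://example.com/news/story"

gptbot_blocked = is_blocked(ROBOTS_TXT, "GPTBot", ARTICLE_URL)
googlebot_blocked = is_blocked(ROBOTS_TXT, "Googlebot", ARTICLE_URL)

print("GPTBot blocked:", gptbot_blocked)
print("Googlebot blocked:", googlebot_blocked)
```

A rule like this relies on the crawler choosing to honour robots.txt. It is a request to well-behaved bots rather than a technical barrier, and it says nothing about content that was collected before the rule was added.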

Originality.AI explained the potential downside of blocking GPTBot, comparing it to blocking crawlers used to index sites on Google: “The most significant concern for websites considering blocking GPTBot is the potential missed opportunity. As ChatGPT evolves and integrates with the internet more intimately, it could serve a role similar to that of a search engine.

“By providing users with direct links or references from web sources, ChatGPT can direct significant traffic to those sites. If GPTBot is blocked, that site’s content may not be among the recommended sources, essentially sidelining potential visitors.

“In essence, just as blocking Google would prevent a website from appearing in one of the world’s most popular search engines, blocking GPTBot might mean missing out on a burgeoning channel of web traffic.”

Similarly, Merali told Press Gazette: “[If] ChatGPT is to continue to grow and become an entry point for digital information in the same way that Google search is at the moment then opt-out isn’t really a viable option.

“What we don’t want to do is create barriers to negotiating the right terms with ChatGPT and we certainly don’t want them to be able to say that ultimately publishers can choose to do what they want.

“It’s trying to find that sweet spot in the middle where we can continue to innovate and use these solutions and the industry works in the right way where its content is valued in the right way as well.”

She did point out, however, that publishers originally allowed Google to crawl their content because they would get a commercial reward: “Readers would be directed onto publisher websites where they could serve advertising or they could convert people to be subscribers, or they could tell them about an event they’re doing or any other affiliate partnerships they have and generate affiliate revenues.

“There was no agreement for that content to then be used to train AI models.”

AI and transparency: ‘It’s all a bit of a black hole’

The problem with negotiations, Merali said, is a lack of transparency around how publishers’ content is being used.

“It’s really hard to determine what the value of that content is until it’s understood how that content has been used. But it’s all a bit of a black hole at the moment.”

This is why the PPA joined together with the European Publishers Council, Publishers’ Licensing Services and Association of Learned and Professional Society Publishers to write to Prime Minister Rishi Sunak last week, calling for the implementation of a legal footing for transparency provisions “to ensure that owners of AI systems declare how they have used publishers’ content so that compensation issues can be identified and addressed”.

The letter said: “It is already clear that publishers’ content is being used to train AI tools without permission, or any form of payment. There are already documented cases of AI systems using publishers’ works without citing or crediting them which in some cases has led to lengthy and costly litigation.

“Without transparency enforcement provisions, rightsholders are unable to see how their content is used, which creates challenges to any ability to agree terms, fees and limits on ensuing use through a negotiated licence. Therefore, the Government must take swift action to put the right regulatory mechanisms in place.”

The letter also called for the Government to ensure the upcoming Digital Markets, Competition and Consumers Bill equips the Competition and Markets Authority to “address market power obtained by large tech companies who own AI systems”.

The letter noted: “Large technology companies who own the AI systems are already quickly obtaining power over operation of the digital market and have strategic influence on the publishing sector.”

Both the letter and Merali stressed that publishers are not resistant to change.

Sunak was told: “Traditionally, the publishing industry has always adapted to developments in technology. Digital technologies have allowed our sector to continue to evolve and find new and innovative ways to connect with readers. But the two issues we raise are the result of unprecedented and, as yet unregulated, shifts in market power.”

Merali, meanwhile, told Press Gazette that publishers have been testing AI tools, and sharing the results of their experiments, that can “potentially help to improve their workflow or could potentially help them to understand their reader better so they can serve even more relevant content”.

“So it’s not that we’re this archaic industry that just doesn’t want any change. I think there’s been a lot of innovation in terms of how it is being used, but it’s about fair play and it’s actually about there is a legal framework in place that protects copyright and why should that actually be any different in this case?”


