Generative AI and The Guardian: ‘What we do can’t be reproduced synthetically’

How The Guardian is experimenting with gen AI tools - and why it has not deployed any yet.

By Charlotte Tobitt

“It is, at its best, really dazzling. But it also comes with some downsides.”

In one sentence, Chris Moran, head of editorial innovation at The Guardian, neatly summarised how most journalists, editors and news executives feel about the new technology that has emerged since the arrival of ChatGPT in November.

Those downsides are plentiful and pose “particular dangers” for journalism, he told Press Gazette’s Future of Media Explained podcast. Yet his experimentation with generative AI so far has left him feeling “quite optimistic about what value journalism brings to the world”.

“Everybody is very worried about it, rightly, and there are difficult questions,” he explained. “But there is something about looking at this technology which reminds you why what we do is incredibly important and not entirely reproducible synthetically.”

Hallucinations: Journalists ‘can’t ignore the fact it will make stuff up’

One of those downsides is the production of “hallucinations” by ChatGPT and other AI text generators – meaning they make up information or, in the case of being asked to summarise something, they introduce “facts” that were not in the original source material.

Scandinavian publisher Schibsted has been experimenting with AI-written summaries based on its titles’ own articles, but 10% of the time the summaries included things that were not in the source material, whether accurate or not.

Moran told Press Gazette: “…even if you think the technology is really exciting, which I do, as journalists you really can’t ignore the fact that it will make stuff up”.

He has written about how The Guardian has been contacted by researchers and students asking for help finding an article cited by ChatGPT. The chatbot had referenced articles with headlines written by named journalists on the correct beats, but after trawling through the archives and even lists of deleted stories The Guardian found that the stories had never existed.

Moran pointed out that in Google’s beta test for Bard, its generative AI chatbot that is expected to be integrated into search, the company includes a warning about its results: “Bard may display inaccurate or offensive information that doesn’t represent Google’s views.”

“I think if you are a quality news organisation, that has to make you stop and ask some very big questions about whether you can actually deploy it as it stands now,” he said.

This issue may be insurmountable for many news publishers. Debate continues over whether these hallucinations will be ironed out as the technology develops, or whether they are an inevitable feature of generative AI that will stop it reaching a high enough accuracy rate to be truly trusted as a source.

Generative AI platforms also pose a threat to news publishers via their appearance in search results. Moran explained: “Where it starts to get really worrying, and where you start to see this problem space open up into an even bigger area, is when Microsoft and Google and others build these tools into places like search, where one might expect a definitive and accurate answer. To a certain extent, it may be generating nonsense.

“And it may be ascribing pieces to The Guardian or The Times or The New York Times which never existed in the first place. And I think that has some fairly obvious repercussions for the information ecosystem.”

The search engines and chatbots could also end up “potentially undercutting” trust in news brands by attributing hallucinated content to them, he added.

Moran also addressed the “massive” ramifications for news publishers’ business models of AI-generated search results giving people so much information in response to their queries that they will no longer need to click through to the original articles.

He said: “You effectively have Google’s AI intermediating between your content and the user, and bypassing your website… What’s interesting is if that intermediation also carries a health warning which says none of this may be true, you start to ask interesting questions, I think, about whether or not gen AI is a good technology to apply right now in a search environment in particular.”

Moran noted that The Guardian has operated an API for several years through which people can pay to get a stream of its content, but no AI companies have been in touch to ask to use that. The FT reported in June that publishers including News Corp, Axel Springer, The New York Times and The Guardian were in discussions with tech companies about copyright and payment for the use of their content in chatbots.

Moran said The Guardian is “thinking about what the value of our content is in this environment” and added that “one thing’s for sure, is news media features very significantly in a large part of these common crawl sets”.

The Guardian’s AI principles

The copyright issue makes up one of The Guardian’s three “principles” on generative AI published in June, which said: “A guiding principle for the tools and models we consider using will be the degree to which they have considered key issues such as permissions, transparency and fair reward.”

The principles also say that before they use generative AI in any way, a Guardian journalist must make the case for why it benefits readers, have it signed off by an editor, and ensure it is transparently signalled to the user. Any use must also only take place if it improves the quality of work and supports The Guardian’s mission.

Despite putting these principles in place, The Guardian has not yet added generative AI to its production process in any way.

Moran explained: “We do have a commitment that we don’t just want people adopting this stuff because it is new… One of the interesting things about efficiency and generative AI in a news organisation right now is if you cannot automate it without a human in the loop, then the possibilities for efficiency drop if you’re having to recheck it.

“And, crucially, what we don’t want to do is remove one of the things that most protects us from the potential of a huge wave of synthetic content about to engulf us which is we are trained journalists and editors and we are human. So we are looking for use cases which are explicitly around what can this do that we cannot do? What can this do that removes more mundane tasks from experts so that they are freed up to do their job? And so on, and so forth.”

Moran, whose job frequently sees him act as a bridge between the newsroom and the product and engineering teams, said although generative AI tools have not yet been deployed they have been tested by The Guardian’s journalists to try to find out what might be genuinely useful.

Moran said the tools can “pretty successfully” write headlines, especially if they follow a particular format: for example, an interview headline will say who the person is, usually with a quote and some broader context.

However he added: “Again, to be clear, we could never automate that. Even with very carefully engineered prompts, it will still occasionally make up quotes, even when you ask it for a verbatim one from a given body of text. So the question then becomes, is this useful?”

Moran also said The Guardian’s live bloggers are experimenting behind the scenes with AI-generated summaries to see if they can save them from trawling back through previous entries to sum up what has happened every few hours.

He also gave the example of a bot that could, for example, tell a writer on a technical beat like environment or science when they file their copy that they need to simplify their language or explain certain things better for a general audience.

Guardian staff ‘not under pressure’ to deploy AI tools

Asked about when these tools might have a place in the Guardian newsroom, Moran said: “I think for us to choose to deploy one of these models in a productionised environment, especially with the background of the IP conversation, that’s a big call. And I think to do that we would have to have some use cases which weren’t just nice, but which are generally transformative.

“So right now no, we are not under pressure to deploy this stuff at all. I’m really proud of the fact that we’re not. I think it is the right position to say how does this stuff work? What is the benefit to the reader? Can you isolate that? And is it worth everything else?”

Something else that needs to be considered before staff can deploy generative AI tools is educating them on its potential dangers in their normal working behaviours, for example when using buttons like “make this more concise”.

“Mostly reporters should not be touching this stuff for research,” Moran said as another example. “If you go to Google, and you can’t find the thing you’re looking for, going to ChatGPT because you can’t find it there is almost certainly an extremely bad move.”

He revealed The Guardian has already given demonstrations to 270 people in the UK newsroom and to the whole of the US and Australian newsrooms, and that it will eventually produce guidance covering this area similar to its staff social media guidelines.

“Education in terms of normal working behaviours is almost more important here than what’s the one thing you’re going to build,” Moran said.

Publishers also have a responsibility to journalists and others, such as designers and photographers, whose jobs may be affected by the arrival of generative AI, Moran said.

“Right now it feels to me that there will be people who move on efficiency and I personally would say it’s too early to do that, I don’t believe that you can rely on it well enough,” he said, warning against making any major decisions yet.

But he added: “I think the fact is, though, right now, we don’t know how these things are going to impact us. We don’t. And I think it would take a much braver person than me to say everything’s going to be fine, or everything’s going to be a disaster.

“One thing we do know is it will change jobs at some point if the technology does improve.”

To hear the full interview with Chris Moran, listen to Press Gazette’s Future of Media Explained podcast.
