
October 24, 2022, updated 02 Mar 2023 9:14am

Dos and don’ts of newsroom automation – and why ‘robot journalism’ isn’t the right term

By Bron Maher

Use newsroom automation to get journalists doing more “robust reporting” – but “do not automate a bad process”, publishers and experts in the use of newsroom AI have said.

Attendees at Press Gazette and United Robots’ webinar “News automation: Winning robot journalism strategies for 2023” on 12 October heard from leaders in the field that the key to automation success is knowing in advance where you need it most – and acknowledging that AI can only take you 80% of the way with certain stories.

But once the automation is off the ground, journalists, rather than losing their jobs, are freed up to do more interesting and informative work.

There were, however, some differing views on how open publishers need to be about automation and some opposition to the phrase “robot journalism”.

‘I don’t call it robot journalism’

Press Gazette editor-in-chief Dominic Ponsford asked panellist Thomas Sundgren, the chief commercial officer at United Robots, what we mean when we talk about AI, robots and automation.

Sundgren explained that the sort of AI popularised by science fiction – or, latterly, by the language model GPT-3 – isn’t necessarily useful to publishers.

“Most newsrooms and most suppliers don’t want to apply that pure machine learning tech into automated content for a newsroom,” he said. “And the reason is, basically, that machine learning is meant to constantly change and learn, and change output based on learning…

“Sometimes that could produce a radically different output format, when it comes to editorial texts, because that’s what it’s meant to do. And most readers and newsrooms actually want a predictable format.”

Cynthia DuBose, the vice president for audience growth and content monetisation at US local media group McClatchy, echoed Sundgren’s point.

“I don’t call it robot journalism at all. I think when it comes to local news, I define AI for local news as information. It’s data that our readers want to know – whether it’s weather, high school sports, real estate prices. And it allows us to add a layer of information back to what we’re offering our community.”

McClatchy has been using automation to handle parts of its real estate reporting, for example doing the grunt work processing reams of property sales and pricing data.

DuBose said: “For us, we use the term ‘AI articles’. We don’t use robot journalism. Our journalists are continuing to create journalism. And that’s something that I think McClatchy has been very clear about – we don’t believe that AI or the robots can do what our reporters do.”

What newsrooms can do with automation

Aimee Rinehart, who leads the local news AI initiative at Associated Press, said AP had used a similar approach with earnings reports and that its journalists “can now do more robust reporting and leave the data to the machines”.

She said: “Since 2014, we pioneered natural language generation with earnings reports – so taking data in a spreadsheet from quarterly earnings and creating articles based on that. And so each cell has an ‘if-then’ – so if the price went up, this is the language [you tell it to use]…

“And we went from doing 300 earnings reports every quarter to 3,000. And the journalists that we had reporting on this said that they [had] felt like robots, actually, by doing 300 earnings reports, and they are able to now do analysis.”
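The “if-then” approach Rinehart describes can be sketched in a few lines of Python. This is an illustration only, not AP's system: the field names, the wording rules and the example company are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class EarningsRow:
    """One spreadsheet row. Every field name here is hypothetical."""
    company: str
    quarter: str
    eps: float        # earnings per share this quarter
    eps_prior: float  # earnings per share in the same quarter a year earlier


def describe_change(current: float, prior: float) -> str:
    """The 'if-then' rule: choose fixed wording from the direction of change."""
    if current > prior:
        return f"up from ${prior:.2f} a year earlier"
    if current < prior:
        return f"down from ${prior:.2f} a year earlier"
    return "unchanged from a year earlier"


def write_earnings_sentence(row: EarningsRow) -> str:
    """Fill a fixed sentence template so the output format stays predictable."""
    change = describe_change(row.eps, row.eps_prior)
    return (
        f"{row.company} reported {row.quarter} earnings per share "
        f"of ${row.eps:.2f}, {change}."
    )


if __name__ == "__main__":
    # Made-up figures for a made-up company.
    example = EarningsRow("Example Corp", "third-quarter", 1.25, 1.10)
    print(write_earnings_sentence(example))
```

Because the wording comes from fixed rules rather than a learning model, the same data always produces the same sentence, which is the predictability Sundgren says newsrooms want.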

Rinehart told the webinar: “No one lost their jobs during that [implementation]. In fact, people said that their jobs became better, more interesting. And they were able to write analysis pieces around what those earnings reports collectively or by sector meant.”

Pete Clifton, the editor-in-chief at PA Media, said his organisation had used automation to localise data reporting.

When the Office for National Statistics publishes a major data release, Clifton said automation allows PA “to scale up, to provide 300 localised versions on a story like that.

“Whereas we would previously have done one national view, we might have done a couple of breakouts on a particular talking point in one region, but we would never have been able to provide that level of detail for that many different local authorities or towns or health authorities and so on.”

How to make the most of automation

DuBose explained that when McClatchy was figuring out how to apply automation to its work, the company “identified areas where we had an audience demand for basic information. We recognised in many of our markets, pretty much all of them, there’s a big appetite for timely and accurate real estate information.

“Then we looked for opportunities to produce content at scale… We looked for structured data that was available across all regions, which also made the implementation much more feasible. And in this case, that was real estate transactions.”

Asked by Ponsford whether the implementation had driven an increase in revenue, DuBose said: “We are seeing an increase in audience reach.”

Another recurring theme was the importance of humans in making automation work.

Rinehart from AP said: “When we are talking with local newsrooms, and we’re talking about implementing solutions for things that don’t always have that reliable structured data, we say that AI can take you 80% of the way there. And then that other 20% has to be a human.”

Despite that, she confirmed to Ponsford that AP is so confident in its models that its automated earnings release digests go straight onto the wire without human checking.

Clifton emphasised that the capable team of data journalists at PA had been crucial to the company’s implementation of automation. On part-automated PA articles he said: “Often it’s bylined by one of the data journalists who’s overseen the end-to-end process, which has included plenty of automation along the way, but they’ve put in a fair amount of the grunt work as well.”

How open should you be about your use of automation?

That point on bylines formed one of the few differences of opinion among the panellists. Whereas PA does not typically announce the involvement of automation in its articles, McClatchy is open about it.

Clifton said: “I don’t think there are many examples where publishers are racing to put up stuff [where] they’re proudly saying ‘this is completely automatically generated, and there’s been no intervention at all’… The formats might change, and it might really unnerve the readership.”

He said he has had discussions with customers over automating sports content, but found clients preferred the idea of a human writing it. Nonetheless, he felt “you need to hold the hand of customers to let them know what it is they’re actually getting, and to know that it’s being done responsibly… but the net result is something that they’ll feel comfortable with”.

At McClatchy, DuBose explained: “We have bylines that show the stories were generated by a bot… And then we have a note that really explains how the information came to be, where the information came from.

“And we have an email address that we use for feedback. We really wanted to be transparent.”

DuBose suspected McClatchy’s openness had helped its sites perform better on Google.

“We have not seen any penalisation… Google wants [automated content] to be identified, which we do, and we feel we do very well – with the bot byline, with the footer that we have on the bottom, and also [making sure it’s] not repetitive.”

Keeping on Google’s good side

Ponsford asked the panellists how Google would take to widespread automation, given its recent emphasis on boosting original content.


DuBose said: “Last week we had a hurricane – we were [covering] the weather. That story was number one in one of our sites through Google. And it was just a predictive [article] – it was a report of ‘This state could be in the path’…

“So I think there are a few things that when you are setting up your template, you want to make sure that you are looking to have not every article be exactly the same, you want to make sure that you’re being transparent about who is writing and how this was written.”

Sundgren, of United Robots, added: “What Google does to penalise automated or bot-generated content, it’s exactly what Cynthia says. It is repetitive, short text snippets meant to just be put out there in volumes just to attract SEO.

“And Google are smart enough of course and good enough to differ that kind of dirty bot content from automated editorial content as we do it.”

Tips for success: Define the gaps in your reporting

Asked for any other advice for newsrooms hoping to implement automation, DuBose said: “Make sure you’re doing that in a topic in an area that is going to appeal to your audience. It doesn’t make sense to spend the time to automate to have additional articles, if that is not a topic that resonates with your communities.”

Rinehart advised that tech wasn’t necessarily always the answer.

“You don’t always need a tech solution. But I will say as you think about what could solve a problem, you will probably discover you have just a bad process. So do not automate a bad process.”

And Sundgren said: “It is about defining the gaps in reporting that you think you have and that you need to fill to be able to satisfy [your] audience, give them better information, cover new stuff.”


