September 24, 2025

ChatGPT is ‘ghost’ of what publishers provide directly

O’Reilly Media showcases in-house AI tool ‘Answers’.

By Alice Brooker

AI companies can grind up content and serve it as a hamburger but cannot steal the “steak” being served directly on publisher websites.

This was the analogy used by Lucky Gunasekara, founder of Miso Technologies, speaking at Press Gazette’s Future of Media Technology Conference on 11 September.

Together with Julie Baron, chief product officer at US technology publisher O’Reilly Media, he showcased “Answers” – an in-house generative AI platform. O’Reilly is a book publisher and digital learning platform and offers Answers as part of its $49 per month subscription.

Gunasekara said Miso, founded in 2017, came about after discovering a “deep learning model called BERT, which is basically like the great, great grandmother of [Chat]GPT”.

After showing this to O’Reilly, he was told “this might actually solve a lot of problems”.

The partners came up with the Answers tool in 2020, which creates “a solution capable of reliably answering questions for learners, crediting the sources it used to generate its answers, and then paying royalties to those sources for their contributions”, according to the O’Reilly Media website.

Answers allows a user to “just tap the Answers widget, ask [a] question and get the most relevant source in the context of that content experience”, Baron explained. The tool can also explain text that a user chooses to highlight, and respond in different languages.

Answers is built on data “that basically takes print and digital and makes it all promptable internally”, said Gunasekara.

Answers was created before ChatGPT, which Baron described as a “ghost” compared to the original content it is based upon.

The most valuable aspect of Answers is its “easy opt-in system” for publishers who partner with O’Reilly, allowing them to consent to their content being used, she added, as well as the platform’s ability to maintain “references and sources”, which is “pretty unique on the market”.

Baron added that a weakness of ChatGPT and other LLMs is hallucination, whereas “what we’ve built with Answers is that there’s direct citations and quotations”.

“ChatGPT can’t do that at all,” she said. “They can’t quote from a book or provide the exact image or provide exact specific content. And that’s a leg up – that’s the steak.”

O’Reilly has seen “higher rates of engagement” with Answers, and “more titles accessed across the entire ecosystem”, in an attention economy where users don’t want to browse anymore, said Baron.

Baron, who previously worked at The Globe and NPR, said the tool allows publishers to continue to make money from old articles.

O’Reilly described its partnership with Miso as “really lucrative”, adding that Answers is available to its 2.5 million paying subscribers.

O’Reilly Media chief product officer Julie Baron at Press Gazette’s Future of Media Technology Conference. Picture: ASV Photography for Press Gazette

Content no longer ‘published for human consumption’

“We are very quickly moving out of an era where content is published for human consumption,” Gunasekara said.

“Now, we’re publishing content not just for humans to consume, but for machines to consume on behalf of humans [or]…on behalf of other machines.

“There’s over 2,200 unique bots targeting publishers right now, and that was like 1,100 at the beginning of this year. The rate of growth and the appetite for premium content by these scrapers has never been higher.”

Baron added that, now that LLMs are a source of referrals, there needs to be a push for those referrals to become “more of a handshake as opposed to kind of free content”.

Robots management needs to be a 24/7 effort for publishers

The scraping “field” is “moving really quickly”, said Gunasekara, adding that of the 11,500 publishing sites Miso monitors, “47% of them don’t have robots.txt”.

A robots.txt file is a plain text file placed at the root of a website that tells compliant crawlers, whether search engine bots or AI scrapers, which parts of the site they are and are not allowed to crawl.

Of the sites that do have robots.txt, “well over 80% are out of date”, said Gunasekara.
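The mechanics are simple enough to sketch. As a hedged illustration (the bot name below is invented, not one of the scrapers the article mentions), a robots.txt policy blocking a single AI crawler can be written and checked with Python’s standard-library parser:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named AI scraper, allow everyone else.
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The named bot is disallowed everywhere; other user agents may crawl.
blocked = parser.can_fetch("ExampleAIBot", "/articles/some-story")  # False
allowed = parser.can_fetch("OtherBot", "/articles/some-story")      # True
```

Keeping such a file “up to date”, in Gunasekara’s sense, means maintaining a current list of user-agent strings for the bots a publisher wants to exclude, since compliant crawlers only obey rules that name them.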

“The robots management is not like once a month or once a quarter thing. It’s a 24/7 effort now.

“You need to get evidence of violations, which is making sure you [are] crawling and scraping them for evidence that they have your content. And you should be screenshotting it and sending it to your lawyers. That is one form of pushback we’re just not seeing enough of right now.”
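One way to automate part of that evidence-gathering is to look for long verbatim word runs from an article inside an AI system’s output. This is a hypothetical sketch, not a tool described in the article; the function name and the eight-word threshold are assumptions:

```python
# Hypothetical helper: flag verbatim reuse of a publisher's text inside an
# AI-generated answer. A long exact run of shared words is the kind of
# reproduction evidence the article suggests collecting for lawyers.

def longest_shared_run(article: str, answer: str, min_words: int = 8) -> str:
    """Return the longest run of at least `min_words` consecutive words
    from `article` that appears verbatim in `answer` ("" if none)."""
    words = article.split()
    best = ""
    for i in range(len(words)):
        for j in range(i + min_words, len(words) + 1):
            run = " ".join(words[i:j])
            if run in answer and len(run) > len(best):
                best = run
    return best
```

A match returned by a check like this is a starting point for the screenshots-to-lawyers workflow Gunasekara describes, not legal proof in itself.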

Gunasekara referred to “the hamburger law” – scrapers are “stealing publisher steak and grinding it into hamburger meat”.

“Once [an AI model] crosses that line of reproducing your images [and] reproducing quotations and they don’t have an agreement with you… that’s a ‘got you’ moment that they’re very aware of. So, the result is that [if] they’re serving hamburgers, you should be serving Michelin star meals, right? That’s really been our mindset as we’ve proceeded.”

CEO at AI engine Miso Technologies Lucky Gunasekara at Press Gazette’s Future of Media Technology Conference. Picture: ASV Photography for Press Gazette
