Chatbots based on publishers’ own content archives offer “a lot of revenue opportunities”, according to the chief digital and information officer at Forbes.
Forbes is among the publishers trying out generative AI tools trained on their own archives, having launched AI-powered search tool Adelaide at the end of October.
Both Google and ChatGPT owner OpenAI have this year made technology available to allow people to create their own generative AI tools using their proprietary content and data, mitigating the copyright issues – although hallucinations and other inaccuracies may still pose a risk.
OpenAI began rolling out customisable versions of ChatGPT for paying ChatGPT Plus and Enterprise users in November, promising that no coding was required.
Google had already launched Enterprise Search on Generative AI App Builder, which lets organisations easily create custom chatbots and semantic search applications. It said in a blogpost in June: “Customers can combine their internal data with the power of Google’s search technologies and generative foundation models, delivering relevant, personalized search experiences for enterprise applications or consumer-facing websites.”
Forbes used Google’s Enterprise Search feature to build Adelaide, and chief digital and information officer Vadim Supitskiy told Press Gazette it took just two weeks to create a prototype and two months to release it.
Adelaide, named after the wife of Forbes founder B.C. Forbes, allows people to search for Forbes content. Answers to specific questions are provided in a more conversational way than a traditional website search bar, with relevant links signposted underneath. Currently the tool is trained on only 12 months of Forbes content, but the plan is to eventually extend this to the whole archive.
Supitskiy said they would not have created Adelaide if Google’s tool had not allowed them to exclusively use Forbes content rather than bringing in a whole database of information from across the internet.
“That was the number one concern for us so we wouldn’t have released it until that kind of functionality was readily available,” Supitskiy said. “We only want to provide our users with our trusted journalism, right? We don’t want any outside content.”
Forbes ‘didn’t want to jump in without any safeguards’
Supitskiy explained that Forbes “usually in general try to innovate as much as possible” – for example in the metaverse and with NFTs – and that generative AI has been no exception.
“We’ve done a lot with AI in the past. AI is core to what we do in general, but gen AI is new and it really was exciting for us to look at all the capabilities that were available. But we also wanted to make sure that when we started using it, it was safe, it was trained and grounded on our content only.
“So we didn’t want to just jump in without any safeguards… we worked with Google to collaborate on that and make sure that the product’s created in a specific way that it’s only our data, it’s trained on our content, and so forth.”
Forbes then looked at which areas of its website could benefit from the addition of the tool, potentially driving engagement and loyalty.
“And one that came up kind of right away was search because traditional search on publishers’ websites has been the same for so many years,” he said. “We always kind of looked at it and thought okay, how can we innovate on that particular product… but in general, it’s pretty straightforward and not the most engaging experience. So we thought gen AI could be a good solution here.
“Not only does it become interactive for the user – they can engage, talk to that particular experience, create a conversation – but also get very relevant content to dive deeper into it.”
Since 26 October, Adelaide has been in beta testing, with 5% of the Forbes audience seeing its answers when they search for something on the website. In addition, anyone can go to the dedicated Adelaide page.
Speaking at the end of November, Supitskiy said: “So far what we’ve seen is good engagement when people come to that page, so actually people are engaging really well there and are coming back to it. They like the experience. What we’re working on right now is how to get more people into the experience…”
Supitskiy said the aim is ultimately to boost engagement but that monetisation could come down the line, suggesting paying users could get an exclusive element.
“I think there are a lot of revenue opportunities in the future but I think it starts with a good product. A product that people like, that they will come back to and want to use more.
“Of course, you can monetise it with display advertising, but also we’re thinking about what experience would be valuable to our members as well, right? How can we have potentially different tiers of something that’s exclusive to our members…”
The main tech challenge posed so far has been the model not always identifying the most recent information, but Supitskiy said “we figured out how to optimise some of that”.
Other publishers have raised concerns around chatbots stating false information as if it were true, a phenomenon known as hallucinating, even when the tech has been trained on their own content.
But Supitskiy said: “We’ve seen it in the beginning, and this is still kind of a concern in general, but I think so far Google has optimised it and grounded it to the point that it’s doing very well.”
Money Saving Expert has created its own “MSE ChatGPT” module in its app for users to ask questions and receive answers based only on the outlet’s own archive of hundreds of thousands of articles.
Editor-in-chief Marcus Herbert told Press Gazette in September the bot was “giving some really, really interesting results” but that they were not yet “sufficiently accurate”, especially when attributing and recommending relevant MSE articles. He was also concerned about the risk to trust from hallucinations, if they mislead people about MSE content.
‘The key is to build something that’s valuable for your audience’
Supitskiy expects more gen AI-powered products to be launched by news publishers soon.
“I think a lot of people will have something with gen AI because it’s such a powerful technology, right?” he said.
“I think the key is to build something that’s valuable for your audience and, you know, it could be different for different publishers, they have different angles. So focusing on your main goal is important in my opinion, but because the technology is so powerful, I do think that more and more publishers will get into the game and will create some experiences, either internal or external.”
Metaverse industry professional Tom Ffiske has seen similar pros and cons with a chatbot he built for his tech newsletter and website, Immersive Wire.
Ffiske, who prefers the term “chat companion” as he feels “chatbot” carries stigma from bad customer service experiences, created the Immersive Wire tool last month after OpenAI launched custom GPTs.
The Immersive Wire Chat Companion is based on a mixture of OpenAI’s latest natural language model GPT-4, which contains information from the wider internet up to April 2023, and Ffiske’s own content archive of several years. It is currently only available for paying ChatGPT Plus users.
Ffiske told Press Gazette using his own archive means it is “pre-curated… it’s not shit information. It’s not just, like, 1,000 press releases which are unvetted. It’s stuff which I think is important, curated by me, and it only pulls the relevant information.”
Ffiske acknowledged that all of the data he feeds into the GPT via a spreadsheet of his content will at some point become part of OpenAI’s training dataset, when the company carries out its next major model update (although it is possible to switch this off). But he said this would only bother him if OpenAI managed to make its model totally up to the minute.
“Since it’s using information which is out of date, I become more relevant because people come to me for the literal latest.”
He has seen similar problems to Forbes so far, with the tech struggling to work out what the latest update is.
He did find a workaround in his spreadsheet, but said that initially “even when I used the word ‘latest’ and even when I was categorising data on dates, it’s still coming up with information which is out of date”.
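The kind of date-based workaround Ffiske describes can be pictured with a short sketch. It is illustrative only: the field names, the sample rows and the `latest_entries` helper are hypothetical, and it does not reflect how his spreadsheet or OpenAI’s custom GPTs actually work. The idea is simply to sort entries by an explicit publication date before anything is passed to the model, so that “latest” is settled by the data rather than left to the chatbot.

```python
from datetime import date

# Hypothetical archive rows standing in for a spreadsheet of stories.
# Titles and dates are placeholders, not real articles.
ARCHIVE = [
    {"title": "Headset review (older)", "published": date(2022, 3, 1)},
    {"title": "Headset launch update", "published": date(2023, 11, 20)},
    {"title": "Headset price change", "published": date(2023, 6, 5)},
    {"title": "Metaverse funding roundup", "published": date(2023, 10, 2)},
]


def latest_entries(rows, topic, limit=2):
    """Return up to `limit` rows whose title mentions `topic`, newest first."""
    matches = [row for row in rows if topic.lower() in row["title"].lower()]
    # Sort on the explicit date column so recency does not depend on the model.
    return sorted(matches, key=lambda row: row["published"], reverse=True)[:limit]


if __name__ == "__main__":
    for row in latest_entries(ARCHIVE, "headset"):
        print(row["published"].isoformat(), row["title"])
```

Pre-sorting in this way is only one possible approach. It assumes every row carries a reliable date, which is the sort of categorisation work Ffiske says he has had to learn.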
Chatbots can help news publishers ‘become a hub of relevant and good information’
He has also had to learn how to best categorise and tag the data and stories so that the right type of information is surfaced in queries – for example so it knows whether to share news or a recommendation.
“The challenge publishers will have with all their news stories is what are they tagging each article with so that it meets the majority of users using the tool?” Ffiske said. “Based on my own website data I’ve got an understanding that people are interested in what Apple’s been up to… which is why I’ve categorised stories in a particular way to try to meet some of those people. But I know that there may be inquiries coming up which I can’t predict.”
Overall, however, Ffiske believes such tools can be a significant opportunity for publishers to “become a hub of relevant and good information which people can seek in their own time”.
He added: “I think for news websites, it’s less relevant because when you go on the homepage, the user intention is to find the latest news on what’s happening that day. I think this is more of an opportunity for websites that provide curated information or reviews and such like because it’s more evergreen, you’re not going there each day, you’re going in there to get some expert opinions. So Which? might be a really good website to do this for example, just to source relevant things.”
He also suggested that one revenue opportunity could be offering brands the opportunity to sponsor certain types of inquiry and receive a link to their product in the results.
Future’s tech brand Tom’s Hardware was one of the earliest to create a chatbot, dubbed the “Hammerbot”, based on its content archive. It allows users to ask for recommendations and advice and get answers in the form of text and links to relevant articles on the website.
Other tech websites have since created their own versions, including Macworld, PCWorld, Tech Advisor and Tech Hive, all owned by Foundry and all using a tool trained on their content portfolio called Smart Answers.
Future chief executive Jon Steinberg told Press Gazette’s Future of Media Technology Conference in September that the Tom’s Hardware Hammerbot has “no hallucination, there’s no false data. It’s read expert content and it’s coming up with a result from that.”
Indeed, a post in the Tom’s Hardware forum in September about the “known issues” with the bot since its May launch mentioned several problems, but not hallucinations or inaccuracies.
It said answers were sometimes out of date, recommending older products, similar to the issues seen at Forbes and Immersive Wire.
In addition, the top link in the search result was not always the one most directly related to the answer given by the Hammerbot, most of the answers were “fairly short” and “a bit terse”, and they “may express opinions that aren’t necessarily those of Tom’s Hardware”.