November 28, 2024

Why publishers deserve more than 50/50 generative AI revenue share

Publisher bargaining power is the highest it's been for a long time, David Buttle writes.

By David Buttle

Daily Mail owner DMG Media is investing in Prorata, the AI start-up that has built technology to attribute value to media owners when content is used in real-time by an AI application.

As well as taking an equity stake as part of the deal, DMG will join the likes of the FT, Guardian, Axel Springer and Fortune in signing content deals for Prorata’s forthcoming AI search engine.

Whilst Prorata’s technology is potentially promising – both for what it means for the licensing market and for what it does – publishers are ceding too much value with 50-50 revenue share deals. This risks setting a precedent for more value erosion from the sector.

We should applaud DMG Media for their proactive approach to AI and risk. Too much focus in the industry has been on deployment of this technology. Whilst potentially delivering short-lived arbitrage gains, these considerations are largely a distraction from the far more consequential questions – around intermediation and damage to business models – that will arise from changes in the way consumers retrieve information.

We should also celebrate the fact that serious capital is going into businesses built on the promise of a market that will see publishers paid when their intellectual property is used in real-time to inform the response of an AI system via a process called retrieval augmented generation (RAG).

Together with Tollbit’s recent $24m series A round, Prorata’s success in raising capital is further evidence that smart money expects this market to materialise. Note, however, that use of content for AI training is a distinct market and, due to the potential financial liabilities for historic usage, one that is likely to remain throttled until legal cases have run their course.

Prorata’s technology is intended to facilitate the content access market for RAG by “analyz[ing] a generative response and fairly attribut[ing] each source of contributing content”. The idea is that if a particular publisher proportionately contributed more to a response, then it would receive commensurately more revenue from the AI developer.
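To make the proportional idea concrete, here is a minimal sketch in Python. It is not Prorata’s method, which is not described here in any detail. The attribution scores, the publisher names and the `split_revenue` function are all hypothetical, and the 50% publisher share is simply the headline figure from the deals discussed in this piece.

```python
from typing import Dict


def split_revenue(
    attribution: Dict[str, float],
    revenue: float,
    publisher_share: float = 0.5,
) -> Dict[str, float]:
    """Divide the publishers' pool in proportion to attribution scores.

    All inputs are illustrative assumptions: `attribution` maps a publisher
    name to a non-negative score for one response, `revenue` is the money
    earned on that response, and `publisher_share` is the fraction of that
    revenue paid out to publishers as a group.
    """
    if not 0.0 <= publisher_share <= 1.0:
        raise ValueError("publisher_share must be between 0 and 1")
    total = sum(attribution.values())
    if total <= 0:
        raise ValueError("attribution scores must sum to a positive number")
    pool = revenue * publisher_share  # the part of revenue publishers share
    return {name: pool * score / total for name, score in attribution.items()}


# Hypothetical scores for a single response; not real data.
scores = {"Publisher A": 3.0, "Publisher B": 1.0}
payouts = split_revenue(scores, revenue=1.00)
print(payouts)
```

Raising `publisher_share` enlarges the pool without changing how it is divided: the split between publishers depends only on the scores, while the headline percentage decides how much there is to split.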

The start-up’s long-run ambition is to offer ‘attribution as a service’ to AI developers: the technology can be plugged into a third-party AI application and, for a small fee, whenever content is accessed to inform an output, its creator is rewarded on the basis of the value of that content to the response.

Prorata is developing a paid-for AI search engine, both to demonstrate its technology and to take a slice of this newly competitive market (although I’m not sure how much of a threat it will be to Google with just $25m of funding). The engine will only access – and provide answers on the basis of – licensed content from its publishing partners (the foundational LLM, Meta’s Llama, was obviously trained on a far larger corpus of data). Publishers will receive 50% of the revenues. There are no minimums or guarantees.

In a world where ever more of their revenue needs to come from licensing content as an input to AI-powered user-facing services, publishers are ceding too much value with these 50-50 revenue share deals.

Whereas the markets for search and social media are highly concentrated with powerful network effects which entrench dominant players, this is not the case for consumer AI applications. The foundational technology is available off-the-shelf and search indexes are now public. It seems likely there will be a multitude of AI information-retrieval services; some will look very much like Google, others will take different forms and flavours.

For media owners this means that their relative bargaining power is greater than it has been since the early days of digital news delivery. Negotiating with Google around search has been a fruitless endeavour: it has been in a position to set take-it-or-leave-it terms. Publishers have been reliant on the flow of traffic, meaning there is no walkaway point. Doing a deal with Prorata has radically different characteristics.

The attribution start-up has no consumer-facing product and is therefore delivering no users, no branding and, as far as I am aware, no technology credits (of the kind offered by e.g. OpenAI) to help publishers. Its product – which provides fully-formed answers to news queries – is potentially substitutional to engagement on media outlets’ owned and operated platforms.

Furthermore, it is hard to overstate how important these partnerships are for Prorata. Were its technology to be deemed unacceptable to content creators, then it would be valued at far less than the $130m figure established at its last funding round. These partnerships are giving immediate and material value to Prorata in return for vague promises of future reward.

It seems clear that publishers have a stronger hand here than a deal ceding 50% of the value of their content would suggest. To give a comparator, Spotify pays out roughly 70% of revenue to rightsholders, and it is the market leader with a third of the global music streaming market, bringing vast audiences and revenue potential to artists and labels.

That publishers have signed on these terms is particularly concerning in that they may begin to establish a precedent for the relative value of the content inputs versus the delivery platform in the AI information-retrieval age. After decades of value loss at the hands of social and search platforms, we should be pushing to retain more of the economic value of our content within our businesses in this next platform era.

Whether Prorata’s model for attributing value is ultimately the standard approach adopted for RAG licensing, I’m not sure. On the plus side, assigning credit and weighting payment to the content creators contributing the most to an AI response is valuable. However, the Prorata approach to attribution is not based upon what is actually going on inside an AI system as it formulates its response (which is, in effect, unknowable).

Instead it is a post-hoc assessment based upon the response and the available corpus of data. This disconnect creates potential pitfalls and opportunities to game the technology. Is this a preferable basis for payment versus a simple per-access model? Perhaps. But perhaps not.

What is certain to me is that we should be more ambitious than 50-50 revenue share deals for new AI search products. Publishers need to become aware of the value of their content in this new platform era. It is the highest it has been for a long time.


