
I love tech. Much of my journalistic career has happily been spent at the intersection of media and technology, embracing the wizardry of digital tools and platforms to pioneer new storytelling techniques and new business models.
Every tech development has seen me experiment with, and evangelise, the latest exciting shiny toy.
Right now, that’s generative AI, artificial intelligence capable of producing content – text, images, audio and video – in response to a prompt.
Gen AI has some amazing qualities, magical even. The creative in me would love to be dabbling with the latest models, showcasing and celebrating what’s possible – particularly in the burgeoning field of generative video.
But I can’t. There’s something holding me back. And that’s my fundamental objection to the way the AI developers have trained their models, and their callous disregard of the creative industries on which they depend.
A quick step back. Large language models (LLMs) are the engines behind the consumer-facing chatbots such as ChatGPT, Claude, Copilot and Gemini. Those LLMs have been trained on data – words, images, videos that you and I have carefully crafted as news stories, books, scripts, photos, animations, TV shows, movies and more. That content is protected by copyright. But the hi-techs have ignored that and taken it without asking.
That’s prompted publishers and creators to pursue claims of copyright infringement, saying their works have been used without consent to create new products that then compete with them.
Most lawsuits have been filed in the US, where the hi-techs point to something called ‘fair use’, a doctrine in US copyright law that gives them an exception, a ‘get out of jail free card’ – so long as courts agree, on a case-by-case basis, that the use satisfies certain factors, including that it doesn’t harm “the existing or future market for the copyright owner’s original work”.
The US lawsuits – the most prominent of which is The New York Times versus Microsoft and OpenAI – are currently in the discovery stage. Not that you’d know it from the utterances of several AI developers, who’d have you believe firstly that fair use has already been settled (it clearly has not), and secondly that the fair use doctrine applies globally (it doesn’t – particularly here in the UK, where it has no legal meaning).
The confidence and ease with which fair use is invoked to absolve the hi-techs of any wrongdoing is an example of what I’m terming ‘copyright denialism’. I’ve loosely modelled the term on climate denialism: like it, copyright denialism is an act of defiance, of rebellion.
Copyright denialism is an example of technological exceptionalism, the belief that tech companies don’t need to abide by normal rules, that they’re too big and too important to regulate, that nothing should be allowed to challenge their dominance – and, in the US right now, their supremacy.
The hi-techs and those who worship at their altar regard generative AI as a gift that ‘democratises creativity’. No need to learn how to play guitar, paint a picture, write a poem, or train to become a journalist – just enter a prompt and the wave of AI’s wand will instantly afford you the status of a creator.
The hi-techs don’t mention that actual creators and their actual art – the result of painstaking practice and skill development – are powering these plagiarised outputs. The hi-techs imply that creators sit atop gilded pedestals, jealously guarding their creative powers that now need to be dispersed among the populace.
The means by which the hi-techs are carrying out this popular revolution involves the denial of creators’ intellectual property. Their copyrighted works have been copied then chopped up by machines which spit out derivative versions to the seeming delight of the masses. No consent. No compensation. No admission from the hi-techs that what they’ve done is wrong. How can it be? They’re the good guys, breaking the creators’ selfish grip on a precious commodity that others less fortunate have a right to share.
AI copyright deniers follow climate denial playbook
Climate deniers reject scientific evidence. Copyright deniers are denying a centuries-old legal convention that first afforded protections to publishers in the mid-1500s and then to authors in the groundbreaking Copyright Act of 1710, better known as the Statute of Anne.
Climate deniers bend the truth. So do copyright deniers, twisting logic and language as they seek to defend the indefensible.
Last March, Mira Murati, then chief technology officer at OpenAI, was asked how its video generator was trained. In an uncomfortable interview with The Wall Street Journal’s Joanna Stern, Murati replied: “We used publicly available data and licensed data”. Stern: “So videos on YouTube?” Murati: “I’m actually not sure about that.”
Stern: “Videos from Facebook? Instagram?” Murati: “If they were publicly available, publicly available to use, there might be the data, but I’m not sure. I’m not confident about it.” Publicly available does not mean legally available.
Last June Microsoft AI chief Mustafa Suleyman claimed that since the 1990s, a “social contract” had existed giving anyone a “fair use” defence under copyright law to copy, recreate or reproduce content on the open web. “That has been freeware, if you like,” he said. No such contract has ever existed, and fair use does not apply in the UK.
The notion that content on the web is ‘freeware’ contradicts Microsoft’s own terms of use which state “you may not modify, copy, distribute, transmit, display, perform, reproduce, publish, license, create derivative works from, transfer, or sell any information, software, products or services”.
More recently Suleyman’s boss, Microsoft CEO Satya Nadella, dreamt up a new defence to justify unauthorised scraping of copyrighted content. In October he told The Times that fair use needed to be defined. Fair enough, but once again, fair use doesn’t exist in the UK.
What he said next was significant: “If I read a set of textbooks and I create new knowledge, is that fair use? If everything is just copyright, then I shouldn’t be reading textbooks and learning because that would be copyright infringement.” Er, no. Reading and learning from a book has nothing to do with the industrial scale copying of copyrighted content for commercial gain.
UK copyright law is clear but ‘needs to be enforced’
The tragedy of all this copyright denialism is that it has influenced and polluted minds in government. Ministers are currently consulting on plans to water down the UK’s gold standard copyright regime, making it easier for the AI companies to scrape rightsholders’ content without permission or compensation.
Publishers and creators will be able to opt out, but that depends on their being able to identify the scrapers (which they can’t) and to stop their works from being used (no proven, reliable technology for doing so yet exists). The consultation document is based on a false premise: that our copyright laws need to be reformed because a “lack of clarity” and “legal uncertainty is undermining investment in and adoption of AI technology”.
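To see why the opt-out is so fragile, consider the main mechanism available to publishers today: a robots.txt file listing the crawlers’ self-declared user agents. The sketch below uses real, documented crawler names (GPTBot from OpenAI, ClaudeBot from Anthropic, Google-Extended for Google’s AI training, and Common Crawl’s CCBot), but the list is illustrative, not exhaustive:

```
# Illustrative robots.txt entries asking known AI training crawlers
# to stay away from the whole site. Compliance is entirely voluntary.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```

The weakness is plain: this only deters crawlers that both identify themselves truthfully and choose to honour the file, and it says nothing about content already scraped.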
We shouldn’t have to say this but copyright law in the UK is absolutely clear. It is also completely certain. It just needs to be enforced.
That our government – a Labour government, which surely is meant to defend the rights of workers – would use copyright-denying terminology and side with Big Tech is utterly baffling. As is the absence of any reference to the unjust theft of copyrighted content and to the historic harms already done.
The consultation closes on Tuesday, February 25 at 11.59pm. Please respond and use your voice to fight back against the copyright deniers.
I’m hopeful that one day we’ll see the emergence of smaller models, ethically trained on clean data.
Generative AI’s inherent weaknesses will still be with us: outputs designed for plausibility rather than accuracy; bias; hallucination (we really should say it makes things up); an inability to think, to reason beyond simple pattern-spotting, or to create; no senses, no feelings, no emotions, no sense of humour, no experience of walking in our footsteps and seeing how things in the real world work and interact with each other.
But at least I’ll be able to experiment with a clean conscience.
Graham Lovelace was the keynote speaker at the TravMedia UK Summit in London on 17 February.
Graham charts the impacts of generative AI on human-made media in his global newsletter, Charting Gen AI, and advises journalists and media firms on the responsible and commercially safe adoption of generative tools. He can be contacted via: graham@lovelace.co.uk
Email pged@pressgazette.co.uk to point out mistakes, provide story tips or send in a letter for publication on our "Letters Page" blog