Legal experts say OpenAI has ‘case to answer’ in showdown with New York Times

Why the case could put OpenAI “on the horns of a dilemma” in what it concedes.

Webpages of the New York Times, Common Crawl, OpenAI, and Microsoft are seen on a computer. Picture: Shutterstock/Tada Images

The New York Times claim against OpenAI appears to give the tech company “a case to answer” and puts it on the “horns of a dilemma” in its negotiations with other publishers, IP experts have told Press Gazette.

The New York Times is suing OpenAI and its partner Microsoft for copyright infringement, arguing they “disproportionately” used its content archive to train ChatGPT and asking for all large language models that have been trained on its copyrighted work to be destroyed.

OpenAI has hit back by claiming The New York Times “intentionally manipulated prompts” to force AI tools to reproduce its content almost word-for-word to back up its case. It argued using content to train AI tools is “fair use” but said it allows websites to opt out “because it’s the right thing to do”.

Five IP lawyers gave Press Gazette their initial assessments on the key points of the New York Times case and, given it is being brought in the US, what implications it could have in the UK.

Mark Nichols, senior associate – IP solicitor at Potter Clarkson, believes the case has merit and suggested The New York Times has bolstered its case by adding further arguments alongside its main copyright claim.

“On top of The New York Times’ principal allegations of copyright infringement, it’s interesting that the publication also argues that OpenAI has removed copyright management information from its articles, and creates AI ‘hallucinations’ which are falsely attributed to the newspaper, diluting their trademarks and reputation,” he said.

“This gives the New York Times some strong fallback positions if the copyright claim fails, and arguably puts OpenAI on the horns of a dilemma.

“Either they accept that they output works which are very similar to the New York Times works from which they derive, or they accept that they are not similar but are at risk of introducing falsehoods.

“Given the assertions and evidence given by The New York Times, it’s likely that the case has merit. However, there will be some technical points of copyright law which have probably not yet been considered by the courts in this context, on which a lot may turn.”

‘It seems that there is a case for OpenAI to answer’

Several of the lawyers noted the case is hard to predict because it touches on untested parts of copyright law.

Nick Eziefula, partner and AI and intellectual property specialist at Simkins, said: “Generally, copyright legislation was not written with current AI scenarios in mind. Much will therefore depend on the court’s chosen interpretation. Given the inherent uncertainty on the issues, it seems that there is a case for OpenAI to answer.”

Eziefula believed there are “some noticeable omissions” in how OpenAI has responded so far. “For example, when setting out its position in relation to the US concept of ‘fair use’, OpenAI alludes to the right under EU law that permits Text and Data Mining (TDM). However, OpenAI does not consider the limitations to that right, notably that it only applies to ‘lawfully accessible works’.

“In light of this, it is important to note that The New York Times has a paywall to access articles. OpenAI also doesn’t state whether, when scraping training data from the internet, it considers opt-outs that are not expressed in its required technical format.

“It is also not clear how OpenAI treats information that may have been made available online by a third party unlawfully (e.g. when a third-party website unlawfully publishes a copy of a New York Times article), which ChatGPT is then trained on. This was a key issue raised by The Authors Guild in a separate lawsuit brought against OpenAI earlier in 2023.”

Eziefula also warned that OpenAI was at “risk of sending mixed messaging” by highlighting its ongoing discussions with news organisations, and previous negotiations with The New York Times. Bloomberg reported on Wednesday that OpenAI is in talks with CNN, Fox and Time to license their content, while it has already signed deals with Axel Springer and the Associated Press.

“On the one hand OpenAI says it does not need the permission of the news organisations to train on their news articles (on the basis of alleged ‘fair use’),” Eziefula said, “whilst on the other hand it enters discussions to obtain those permissions.”

Working against The New York Times in its bid for damages, he said, is the idea that some of the chatbot responses “regurgitating” extracts of its articles only came because of very specific prompts. Eziefula said this raises “the question of what (if any) damage The New York Times has suffered, rather than whether OpenAI infringed their IP or harmed their reputation in the first place.

“The less people have seen or accessed the ‘harmful’ content, the less likely it is that The New York Times could recover substantial damages for the infringement.”

Josh Little, partner at Marriott Harrison who specialises in IP law, said the New York Times case “goes a step further” than other claims filed against generative AI models to date. Fiction and non-fiction authors have also launched lawsuits against OpenAI and Microsoft in the US, while Getty Images is pursuing image generator Stability AI in the UK.

“Other claims focus on the copying of source data for the system to learn and generate new content,” Little said. “The New York Times alleges that ChatGPT has copied source data and then repeated it verbatim in its output.

“OpenAI has publicly responded by saying it believes that training AI models using publicly available internet materials constitutes fair use and therefore isn’t an infringing act under US law.

“Whether or not that is the case is only part of the issue here and fails to address the claim that ChatGPT is copying and regurgitating source data verbatim. This case could be an interesting test of the limits of the concept of fair use under US copyright law.

“Alternatively, it may just be tactical positioning as part of the negotiations between the parties…”

The New York Times first got in touch with OpenAI “to raise intellectual property concerns and explore the possibility of an amicable resolution” in April last year. OpenAI said it believed the discussions were “progressing constructively” until their last communication on 19 December – 12 days before The New York Times filed its case.

Significance for IP beyond the US?

The New York Times complaint refers to US copyright law, under which the “fair use” defence can cover “transformative” uses that add something new and do not substitute the original work.

Eziefula of Simkins noted: “Although this US case would not set a direct legal precedent to be applied in other jurisdictions, given the US-centric nature of the AI industry, and the global prominence of the parties to this dispute, a US decision on the issues will likely have a significant influence on the approach to the regulation of AI systems internationally.”

Simon Barker, partner and head of intellectual property at Freeths, noted that the fair use defence is “more flexible in the US than it is in the UK”.

“However,” he added, “The New York Times says this undermines its ability to provide its own articles and so OpenAI may struggle to contend that the use is fair.”

In the UK, the “fair dealing” exception is different and can cover news reporting, criticism or review, and research – although there is no statutory definition and “it will always be a matter of fact, degree and impression in each case,” as the Government explains.

The guidance continues: “The question to be asked is: how would a fair-minded and honest person have dealt with the work?” Courts have previously assessed whether using the work affected the market for the original work, causing the owner to lose revenue, and whether the amount taken was “reasonable and appropriate”.

The New York Times lawsuit states that OpenAI and Microsoft’s actions have had a detrimental impact on several revenue streams: subscriptions, advertising, licensing, and affiliates.

Pete Konieczko-Hansom, head of IP law at Blacks Solicitors, said that the main question to ask under English law would be: “Is OpenAI permitted to use New York Times copyrighted material to teach its AI?”

“The owner of copyright has full control over what is done with a piece of copyrighted work. Normally, when an article is published, it is done so subject to a licence that as the reader or subscriber you can read and you use the article within the restrictions of the licence/the law. It seems unlikely that this licence extends to using the copyrighted work to ‘teach’ an AI system.

“Even if it could be argued that it is permitted to use the material to teach the AI, any output is, unless correctly attributed and not a substantial copy, likely to be an infringement of copyright.

“Assuming (and it is just an assumption) therefore that OpenAI has, by using New York Times material to teach its AI, infringed NY Times copyright, does OpenAI have a valid defence?

“English law currently permits the use of copyrighted material without permission in a very specific set of circumstances. These are usually for non-commercial purposes and the rules around them are very strict. These are most commonly around criticism/review, parody, education as well, and this is perhaps the most relevant one here, under the concept of ‘fair dealing’.

“Fair dealing is probably the exception that OpenAI would most likely hang its hat on. This is because there is not a set legal definition of what counts as fair dealing and the legislation leaves it for the court to decide. There are various factors that the court would need to take into account, not least whether there is any commercial benefit to OpenAI or commercial detriment to The New York Times.

“Under the circumstances, it certainly looks more likely than not that if this case were to be heard in the UK, The New York Times would be successful in their claim than OpenAI in defending it.”

A ‘fair use’ precedent could create ‘significant challenges’ for publishers

Nichols of Potter Clarkson said OpenAI’s public comments since the complaint was filed show it sees “political lobbying to be their best option, instead of going through the courts. They’ve already pitched the copyright position as an existential threat to them, ignoring that they are capable of – and have already – licensed copyright works from various rightsholders.

“Whilst I don’t think that it sets a precedent per se, it does suggest that negotiating licences is less commercially appealing than political lobbying, which further compromises the relationship between tech and traditional rights holders, like media corporations.”

Jen Clements, global publishing director at Smartframe Technologies, which enables the distribution of copyrighted images to publishers, told Press Gazette an OpenAI win would create difficulties for publishers and they should get ahead of it by assessing their revenue strategies.

“If OpenAI successfully argues for ‘fair use’ of copyrighted materials in training AI models, it could set a precedent that allows other companies to use publishers’ content without permission, creating significant challenges to protecting their intellectual property rights,” Clements said.

“However, with many companies beginning to publish policies around the responsible use of AI, and the current string of lawsuits from publishers only set to continue, AI companies will need to build more transparency into their models and address building concerns from the industry.”

Clements added: “Whatever the outcome, this lawsuit and the prominence of AI should be a catalyst for publishers to review their monetisation strategies, not only looking at the mix of content they publish, but also whether their existing advertising strategies are working, particularly with the deprecation of third-party cookies.”