Can Copyright Law Save Journalism From A.I.?

Artificial intelligence chatbots are a booming business these days. So are copyright lawsuits against the companies that build them. A.I. developers have already been sued by artists and publishers over the use of books, photographs, and artworks to “train” their products, with mixed results so far.

Now journalists are joining the fray. On Tuesday, eight newspapers owned by Alden Global Capital sued OpenAI and Microsoft, two of the largest players in the burgeoning A.I. industry, for allegedly violating the papers’ copyrights by using their articles to develop chatbots. Among the plaintiffs are the New York Daily News, the Orlando Sentinel, and the Chicago Tribune.

“Microsoft and OpenAI simply take the work product of reporters, journalists, editorial writers, editors and others who contribute to the work of local newspapers—all without any regard for the efforts, much less the legal rights, of those who create and publish the news on which local communities rely,” the lawsuit alleged.

The lawsuit is only the latest chapter in print journalism’s long struggle to survive the internet era. It may be a crucial one. Some of Silicon Valley’s tech barons are openly hostile to journalism, dreaming of the day when it can be “disrupted” or rendered obsolete. To get there, however, they may have to get through copyright law first.

To understand the problem, one must understand the technology. “Artificial intelligence” is something of a misnomer where ChatGPT and other popular A.I. tools are concerned. Even the most sophisticated programs are not actually capable of abstract reasoning, creativity, critical thinking, or other hallmarks of human intelligence. They instead simulate what a user might think intelligence looks like.

ChatGPT, for example, was “trained” on a large corpus of English-language text. (The precise contents of that training data are supposed to be a secret, but we’ll come back to that later.) The resulting program, known as a “large language model,” or LLM, then regurgitates what its algorithms calculate to be the statistically likeliest answer to a query. These chatbots can be pretty convincing, as anyone who has ever used one can attest. But they also have obvious limitations.
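
For a rough sense of those mechanics, here is a toy sketch in Python: a tiny “bigram” model that learns which word tends to follow which in its training text, then strings together a statistically likely continuation. It is a drastic simplification of a real LLM, and the training corpus is invented for illustration, but the core move, predicting the likeliest next word, is the same.

    import random
    from collections import Counter, defaultdict

    # Toy bigram "language model": count which word tends to follow
    # which in the training text, then generate new text by repeatedly
    # sampling a statistically likely next word.
    corpus = "the court ruled that the law applies and the court agreed".split()

    follows = defaultdict(Counter)
    for word, nxt in zip(corpus, corpus[1:]):
        follows[word][nxt] += 1

    def generate(start, length=8):
        words = [start]
        for _ in range(length):
            options = follows.get(words[-1])
            if not options:
                break
            candidates = list(options)
            weights = [options[w] for w in candidates]
            words.append(random.choices(candidates, weights=weights)[0])
        return " ".join(words)

    print(generate("the"))  # e.g. "the court ruled that the law applies and"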

In the last year or so, for example, a small but growing number of lawyers have been sharply criticized by courts for filing A.I.-written legal briefs that cited cases that do not exist. When asked a legal question, a human lawyer would research prior cases to get a clearer understanding of precedent. ChatGPT, by contrast, cannot look up anything that was absent from its training data. Instead, it simply generates what its algorithms deem the likeliest-sounding answer, with statistically assembled fake case names to go along with it.
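
To see why such fabrications look so plausible, consider a sketch of the same failure mode: a program that has learned only the shape of a legal citation can recombine familiar pieces into a case that never existed. Every name and number below is invented for illustration.

    import random

    # A model that has learned only the *pattern* of a legal citation
    # can assemble convincing-looking cases that never existed.
    # All party names, volumes, and years here are made up.
    plaintiffs = ["Smith", "Martinez", "United States", "Johnson"]
    defendants = ["Jones", "Acme Corp.", "State of Ohio", "Williams"]

    def fake_citation():
        return (f"{random.choice(plaintiffs)} v. {random.choice(defendants)}, "
                f"{random.randint(400, 599)} F.3d {random.randint(1, 1400)} "
                f"({random.randint(1995, 2020)})")

    print(fake_citation())  # e.g. "Martinez v. Acme Corp., 512 F.3d 877 (2008)"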

This “training” process is opaque in many of the newest chatbots. While OpenAI’s earliest GPT models were released as open source, the company has declined to disclose how it trained its most recent ones. OpenAI has argued that the secrecy is necessary given the competitive nature of the A.I. industry these days. The newspapers argued, however, that it also serves as a convenient way to cover up what they describe as “mass copyright infringement.”

If the training data is a closely held secret, you might ask, how can the newspapers possibly know whether ChatGPT is drawing on their work? They said in their complaint that they were able to get ChatGPT to produce “near-verbatim copies of significant portions of the publishers’ works when prompted to do so.” These chatbots cannot simply Google things and copy-paste them into their answers; the original source material must have been part of their training data.

To support that allegation, the newspapers included portions of articles written by them that were regurgitated upon request by ChatGPT and other A.I. chatbots trained on OpenAI’s GPT model. The newspapers also alleged that programs built by Microsoft using that model would produce large portions of their articles upon request, far beyond what a normal search engine would offer when queried.
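
The complaint does not spell out its testing method, but a probe along these lines is easy to sketch: feed a model the opening of a published article and measure how closely its continuation matches the rest. The query_model function below is a hypothetical stand-in for whatever chatbot interface is being tested; this illustrates the general idea, not the plaintiffs’ actual procedure.

    import difflib

    # Prompt a model with the opening of an article and score how much
    # of the remainder it reproduces. A ratio near 1.0 suggests the
    # text was memorized during training, since the model cannot look
    # it up on the fly. `query_model` is a placeholder, not a real API.
    def similarity(generated, original):
        return difflib.SequenceMatcher(None, generated, original).ratio()

    def probe(article_text, query_model, prompt_chars=200):
        prompt, rest = article_text[:prompt_chars], article_text[prompt_chars:]
        continuation = query_model("Continue this article: " + prompt)
        return similarity(continuation[:len(rest)], rest)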

“In this way, synthetic search results divert important traffic away from copyright holders like the publishers,” the newspapers claimed, referring to A.I.-generated summaries for search engines that can include extensive paraphrasing. “A user who has already read the latest news, even—or especially—with attribution to the publishers, has less reason to visit the original source.” To make matters worse, the newspapers pointed to custom GPTs in OpenAI’s store that allow users to circumvent the newspapers’ paywalls.

One of the most troubling allegations is that the GPT models also fabricate information and attribute it to the newspapers themselves. This process is akin to the incidents I mentioned earlier, where ChatGPT would invent fake cases to cite when answering legal questions. “In AI parlance, this is called a ‘hallucination,’” the newspapers argued. “In plain English, it’s misinformation.”

The newspapers included instances where users were able to get ChatGPT to falsely say that the Daily News endorsed the idea that injecting bleach could treat Covid-19, that the Tribune had endorsed a now-recalled baby lounger linked to infant deaths, and that The Denver Post had reported that smoking could be a treatment for asthma. (New Republic readers: Please do not use bleach to treat Covid, put your infant in a recalled baby lounger, or smoke to treat asthma.)

These alleged copyright infringements and reputational harms have been part of a highly lucrative business model for the defendants. “As of February 2024, OpenAI was on pace to generate more than $4 billion in revenue in 2025—over $333 million in revenue per month,” the newspapers noted. Microsoft, too, is reaping the rewards of its early push into A.I.: The tech giant saw a 20 percent increase in profit in early 2024, in large part because of its GPT-driven products.

And while Silicon Valley is turning a profit, the newspaper industry from which it is harvesting content is in dire straits. But the newspapers said that their problem with the A.I. companies’ behavior was not strictly about dollars and cents. “This issue is not just a business problem for a handful of newspapers or the newspaper industry at large,” they argued. “It is a critical issue for civic life in America. Indeed, local news is the bedrock of democracy and its continued existence is put at risk by [the] Defendants’ actions.”

Neither OpenAI nor Microsoft has filed a response to the lawsuit so far. In similar cases, however, they have argued that their actions are protected by fair use, a doctrine in copyright law that allows for unauthorized uses in some circumstances. The New York Times sued both companies last December for similarly large-scale copyright infringement, alleging that it had first tried without success to reach an “amicable resolution” on commercial licensing.

In its motion to dismiss the case in March, Microsoft said that fair use also applied to any alleged use of Times articles. “Despite The Times’s contentions, copyright law is no more an obstacle to the LLM than it was to the VCR (or the player piano, copy machine, personal computer, internet, or search engine),” the tech giant claimed. “Content used to train LLMs does not supplant the market for the works, it teaches the models language.”

And in its own motion to dismiss, OpenAI described the alleged infringements cited by the lawsuits—regurgitated training texts and “hallucinations”—as “uncommon and unintended phenomena” for its A.I. models. The company complained that the Times had not reported these issues to it for its own review. “Rather, the Times kept these results to itself, apparently to set up this lawsuit,” OpenAI claimed.

The emergence of the internet has been helpful for journalism and public discourse in some ways. But it has also hollowed out the advertising markets that once allowed newspapers to stay afloat and keep their communities informed and civically engaged. As those vital outlets decline, sludgy Facebook posts and cheap Google hits have filled the void. Even so, A.I.-generated news might be the most disturbing development yet. It is one thing to replace newspapers; it is another to loot them to train deeply flawed replacements.