Top Authors Join Lawsuit Against OpenAI Over “Mass-Scale Copyright Infringement” of Novels

  • Oops!
    Something went wrong.
    Please try again later.
  • Oops!
    Something went wrong.
    Please try again later.
  • Oops!
    Something went wrong.
    Please try again later.

The leading trade group for authors has entered a legal battle against OpenAI over its human-mimicking chatbot in a case that could decide the legality of using copyrighted works to train AI systems.

The Authors Guild — led by prominent fiction authors including George R.R. Martin, Jonathan Franzen and John Grisham — on Tuesday sued OpenAI, accusing the company of engaging “in a systematic course of mass-scale copyright infringement” to “power their lucrative commercial endeavor.” The proposed class action filed in New York federal court builds upon arguments from creators who have already initiated lawsuits against AI firms that generative AI illegally produces infringing works that directly compete with their creations.

More from The Hollywood Reporter

The legal action is at least the third against OpenAI over the company using copyrighted books to train its system. OpenAI is facing a proposed class action from author Paul Tremblay, in addition to another filed by Sarah Silverman, which also names Meta. Artists have similarly sued AI art generators Stability AI, Midjourney and DeviantArt for copyright infringement.

OpenAI has said that it trains its model on “large, publicly available datasets that include copyrighted works.” While it remains unknown which works are included in the dataset, the authors point to ChatGPT generating summaries and in-depth analyses of the themes in their novels as proof that OpenAI used their books.

“For example, when prompted, ChatGPT accurately generated summaries of several of the Franzen Infringed Works, including summaries for The Corrections, Purity, and Freedom,” states the complaint.

These summaries, the authors say, constitute infringing works that interfere with their economic prospects to profit off of their books. They take issue with OpenAI choosing not to license their novels.

AI companies thus far have resisted striking licensing deals with authors, opting to obtain the data from internet-based book collections. Authors have argued in several lawsuits that the datasets were obtained from illegal shadow library sites, including Library Genesis, Z-Library and Bibliotik.

Stressing that ChatGPT undermines the market for authors’ works, the lawsuit says that the tool facilitates the creation of infringing derivative productions. It cites businesses that sell prompts allowing users to create what are essentially works of fan fiction. When prompted, ChatGPT generated the “next purported installment of While the Patient Slept, one of the Authors Guild Infringed Works, and titled the infringing and unauthorized derivative ‘Shadows Over Federie House,'” according to the complaint. A derivative is a work based upon a preexisting, copyright-protected work.

“For example, a business called Socialdraft offers long prompts that lead ChatGPT to engage in ‘conversations’ with popular fiction authors like Plaintiff Grisham, Plaintiff Martin, Margaret Atwood, Dan Brown, and others about their works, as well as prompts that promise to help customers ‘Craft Bestselling Books with AI,'” the lawsuit states.

Works of fan fiction are considered infringing creations. Authors, however, typically don’t take action unless the works are monetized.

“The prompting and output of AI that creates a summary or a new storyline using the same characters or the same themes would be very possibly be considered a derivative and infringing work,” says Ed Klaris, an intellectual property lawyer and professor at Columbia Law School.

Courts could look to A&M Records v. Napster for direction on whether OpenAI is facilitating the infringement of authors’ novels through ChatGPT. A federal judge in that case found the peer-to-peer sharing service liable for contributory and vicarious copyright infringement, rejecting a fair use defense. The court found that Napster had a duty to control, or at least restrict, users infringing on artists’ copyrights through digital audio file distribution.

The courts will also wrestle with two Supreme Court cases that legal experts say could decide the issue of whether OpenAI can avail itself of a fair use defense. AI companies will likely point to precedent greenlighting the copying of works to generate noninfringing text responses from when the Authors Guild in 2005 sued Google for digitizing millions of books to create a search function for the works. A federal judge in that case rejected copyright infringement claims, finding the company’s copying of the books to amounts to fair use. Central to the ruling was that Google only allowed users to view snippets of text without providing the full book.

The authors, meanwhile, will likely cite a recent decision from the Supreme Court in Andy Warhol Foundation for the Visual Arts v. Goldsmith. In that case, the justices effectively reined in the scope of fair use. The justices stressed that the defense is likely to be rejected when an original work and derivative share the “same or highly similar purpose,” and they’re competing in the same market.

Klaris says the courts will likely reject fair use based on the Supreme Court’s decision in Warhol. “This is a huge intellectual property grab that the AI companies are doing,” he notes. “It’s going to get reined in.”

OpenAI in August moved to dismiss most claims in the copyright lawsuit led by Silverman. It argued that copyright law “does not protect ideas, fact or language,” according to the filing.

“Copyright protects the particular way an author expresses an idea—not the underlying idea itself, facts embodied within the author’s articulated message, or other building blocks of creative expression,” it stated.

Best of The Hollywood Reporter

Click here to read the full article.