Anthropic Partially Wins Copyright Lawsuit from Authors

Wed 25th Jun, 2025

In a notable legal development, Anthropic has won a partial victory in a copyright infringement case over its use of unauthorized book copies to train large language models (LLMs). A U.S. federal district court granted Anthropic's motion for summary judgment in part: it permitted the use of certain copies for AI training but held that downloading e-books from illegal sources was unlawful.

This case is part of a broader wave of lawsuits filed in the U.S. against AI operators for alleged copyright violations. The complaint was brought by three authors, Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who accused Anthropic of building a digital library containing a vast array of books without obtaining proper licenses. The lawsuit, filed in the U.S. District Court for the Northern District of California, centers on several actions attributed to Anthropic:

  • Anthropic reportedly downloaded and stored over seven million e-books from illegal online sources.
  • The company also purchased used physical books, digitized them using optical character recognition (OCR) technology, and then disposed of the physical copies.
  • Numerous additional digital copies were made from both the downloaded e-books and the scanned books, and these copies were then used to train various LLMs.
  • Furthermore, Anthropic generated other copies for different purposes, though these were not distributed outside the company.

Importantly, the lawsuit does not accuse Anthropic of delivering copyrighted texts to end users through its LLMs, as the company has implemented filtering software to prevent such output.

In its motion, Anthropic argued that all of the uses challenged in the lawsuit should be covered by the Fair Use exemption. The Fair Use doctrine allows for the limited use of copyrighted material without permission from the rights holders, provided the use meets certain criteria that promote the advancement of knowledge and creativity.

The law does not define precisely what counts as Fair Use, which makes each case a matter of judicial interpretation. Instead, courts weigh four factors, which the court analyzed in this case:

  1. The purpose and character of the use, including whether it is commercial or for non-commercial educational purposes.
  2. The nature of the copyrighted work.
  3. The amount of the work used in relation to the entire work.
  4. The effect of the use on the market value of the original work.

The court divided the case into three parts. For the use of the copies in LLM training, it weighed the four factors and arrived at the following conclusions:

  • Concerning the use of unauthorized copies for LLM training, the court found that the transformative nature of the usage favored Fair Use. Anthropic's objective was not to replace the original works but to generate new texts using artificial intelligence.
  • Regarding the nature of the works involved, this factor weighed slightly against Fair Use, regardless of whether the books were fiction or non-fiction.
  • When evaluating the extent of material copied, Anthropic conceded that it had used entire books, which was regarded as not strictly required for LLM training. The court nevertheless determined that the amount copied was reasonably necessary for that purpose.
  • Lastly, in terms of market impact, the court ruled that Anthropic's training practices did not diminish demand for the original works. Although the unlicensed use might hinder the development of a market for licensing works for LLM training, the court held that copyright law does not protect this economic interest.

In a surprising conclusion, the judge noted that despite the complete copying of books, the analysis of the Fair Use factors leaned in favor of Anthropic.

This ruling sets a significant precedent amid ongoing debates about copyright and artificial intelligence, as numerous cases involving similar allegations are currently pending in the U.S.

