
AI Training Data Lawsuits: Where 2026 Landed

Key Takeaways

  • Bartz v. Anthropic received final fairness approval at the May 14 2026 hearing: $1.5B fund, ~$2,931 per work, 91.3% claim rate against ~482K eligible works. It is the largest copyright settlement in US history and sets the de facto floor price for piracy-based training claims [1].
  • Andersen v. Stability AI / Midjourney / DeviantArt / Runway went to trial on September 8 2026 (N.D. Cal.). The first US jury verdict on whether AI training is copyright infringement is the year's most consequential expected outcome [2].
  • Kadrey v. Meta (Chhabria, June 2025) is the most-cited 2026 precedent: training on books, including pirated copies, is fair use on the record those plaintiffs developed. Chhabria explicitly invited better-pled successor cases on a market-harm theory. The Turow et al. third-amended complaint is testing exactly that [3].
  • Reddit v. Perplexity / SerpApi / Oxylabs / AWMProxy (S.D.N.Y., filed October 2025) became the first major test of the scraper-intermediary liability question. Discovery is active, and the alleged scraping volumes (~2B Google search results via SerpApi, ~781M via Oxylabs, ~482M via AWMProxy) are now in the public docket [4].
  • Robots.txt is not a §1201 technological measure. Two 2025 rulings (Ziff Davis, Reddit) held it is “more akin to a sign than a barrier.” RFC 9309 is cited as evidence of lack of authorization, not as a circumvention barrier. AIPREF has been referenced in amicus filings, but no court has yet relied on it [5].

The Year in One Paragraph

Two events defined the AI training-data docket in 2026: Bartz v. Anthropic closed as the first ten-figure settlement on training data, and Andersen v. Stability AI went to a jury. Between those bookends, Kadrey v. Meta stood as the most-cited fair-use precedent, Reddit v. Perplexity opened the scraper-intermediary front, and the regulatory side (EU AI Act GPAI enforcement, California AB 2013, Texas TRAIGA) added a non-litigation flank. This post tracks all of it.

The Comprehensive Case Table

Status as of late 2026. Dates and projections were anchored in May 2026 research and re-verified before publication. Anything that moved between then and December 1 is noted in the Year-End column.

| # | Case | Court / Docket | Filed | Status | Year-End |
|---|------|----------------|-------|--------|----------|
| 1 | NYT v. OpenAI / Microsoft | SDNY 1:23-cv-11195 | Dec 27 2023 | Summary-judgment briefing; expert discovery closed | SJ ruling expected Q1 2027; trial 2027 |
| 2 | Authors Guild v. OpenAI | SDNY (MDL 3143) | Sep 2023 | Consolidated under Judge Sidney Stein | SJ track alongside NYT |
| 3 | Bartz v. Anthropic | N.D. Cal. | Aug 2024 | Final approval May 14 2026; second installment paid | Active claim distribution; third installment Sep 2026 paid |
| 4 | Kadrey v. Meta | N.D. Cal. 3:23-cv-03417 | Jul 2023 | June 2025 fair-use ruling for Meta survived; Turow plaintiffs proceeding on market-harm theory | Expanded SJ briefing on the new record |
| 5 | Reddit v. Perplexity / SerpApi / Oxylabs / AWMProxy | SDNY 1:25-cv-08736 | Oct 22 2025 | First Amended Complaint Feb 9 2026; motion-to-dismiss decided Q3 2026 | Discovery opening; first major scraper-intermediary case |
| 6 | Getty Images v. Stability AI (UK) | EWHC | 2023 | Judgment Nov 4 2025: copyright/database claims rejected; limited trademark wins | Resolved (UK); narrow appeal possible |
| 7 | Getty Images v. Stability AI (US) | D. Del. | Feb 2023 | Discovery; SJ briefing late 2026 | SJ ruling 2027 |
| 8 | Concord / UMG / ABKCO v. Anthropic (Round 1) | N.D. Cal. 5:24-cv-03811 | Oct 2023 | Interim relief denied; guardrails stipulation | Trial scheduling Q4 2026 |
| 9 | UMG / Concord / ABKCO v. Anthropic (Round 2) | TBD | Jan 28 2026 | $3B sought across 20,000+ songs; uses Bartz torrent record | Pleadings stage |
| 10 | Tremblay / Silverman / Chabon v. OpenAI | MDL 3143 | 2023 | Member action; narrowed to direct infringement + UCL | Folded into consolidated complaint |
| 11 | Encyclopedia Britannica & Merriam-Webster v. OpenAI | SDNY | Mar 13 2026 | ~100,000 articles; copyright + Lanham Act | Motion-to-dismiss stage |
| 12 | Petryazhna v. OpenAI (YouTube) | MDL 3143 | Aug 2024 (amended) | Stayed pending class cert | No SJ before late 2026 |
| 13 | Millette v. Google / OpenAI | MDL 3143 | Aug 2024 | OpenAI motion to dismiss filed | Class cert briefing |
| 14 | Doe v. GitHub / OpenAI / Microsoft (Copilot) | N.D. Cal. | Nov 2022 | 22 → 2 claims; 9th Cir. interlocutory appeal pending | Appellate ruling expected; trial track 2027 |
| 15 | Andersen v. Stability AI / Midjourney / DeviantArt / Runway | N.D. Cal. 3:23-cv-00201 | Jan 13 2023 | Trial began Sep 8 2026 | First US jury verdict on AI training likely Q4 2026 |
| 16 | Disney / NBCU / DreamWorks v. Midjourney | C.D. Cal. 2:25-cv-05275 | Jun 11 2025 | Post-Mediation Status Conf Aug 31 2026 | Either settlement or contested SJ briefing |
| 17 | Warner Bros. Discovery v. Midjourney | C.D. Cal. | Sep 4 2025 | Pleadings | Likely consolidated with Disney action |
| 18 | Ziff Davis v. OpenAI | MDL 3143 | Apr 24 2025 | Tag-along to MDL; copyright claims advanced | Coordinated discovery |
| 19 | Daily News / Tribune / MediaNews Group v. OpenAI | MDL 3143 | Apr 2024 | Consolidated | Coordinated SJ |
| 20 | Raw Story / Intercept v. OpenAI | SDNY | 2024 | DMCA §1202(b) claims survived in part | Discovery |

This is not exhaustive. The AI Lawsuit Tracker indexes 166+ cases as of late 2026 [6]; the table above covers the cases most consequential for AI training data specifically.

Three Rulings That Defined the Year

Kadrey v. Meta (Chhabria, N.D. Cal., June 25 2025) is the most-cited 2026 ruling. The court held that training LLMs on books (including pirated copies) was fair use on the record the plaintiffs developed. Chhabria warned that the ruling “stands only for the proposition that these plaintiffs made the wrong arguments.” Future plaintiffs who develop a market-harm record (lost licensing revenue, output substitution) may win on identical underlying facts. Every subsequent complaint front-loads market-harm theories as a result. This is why the Turow et al. third-amended complaint matters: it is the closest test of whether a better-pled record changes the outcome [3].

Getty Images v. Stability AI (UK High Court, November 4 2025) was the first Western judgment on whether AI model weights are “copies” within the meaning of national copyright law. The English court held they are not: training did not occur in the UK, and weights do not store the training works as copies under the CDPA. Getty secured only limited trademark wins, on watermarks reproduced in outputs. The judgment does not bind US courts, but it carries political weight as the first developed-jurisdiction ruling that AI weights are not copies [7].

Bartz v. Anthropic (Alsup, N.D. Cal.; June 2025 ruling, September 2025 preliminary settlement, May 14 2026 final approval) drew a clear line. Training on legitimately acquired books is fair use. Downloading from LibGen and Pirate Library Mirror is not. The $1.5B settlement is denominated against the piracy-acquisition track, not the training-as-fair-use track. The implied legal architecture: acquisition method matters; what a lab does with legally acquired data is fair use; what it does with stolen data is infringement [1].

These three rulings define the 2026 boundary conditions. They also explain why every 2026 complaint after Kadrey emphasizes both market-harm theory and acquisition-track piracy evidence.

[Figure: 2026 AI training-data litigation at a glance. Bartz $1.5B settlement final May 14; Andersen trial begins Sep 8; Kadrey survives as the most-cited fair-use precedent; Reddit v. Perplexity is the first scraper-intermediary test; robots.txt is not a §1201 technological measure.]

The Settlements Ledger

| Case | Amount | Date | Notes |
|------|--------|------|-------|
| Bartz v. Anthropic | $1.5B | Sep 2025 prelim → May 14 2026 final | Largest copyright settlement in US history; ~$2,931/work; 91.3% claim rate |
| Warner Music v. Udio | Undisclosed | April 2026 | First major music-AI settlement |
| Concord / UMG / ABKCO v. Anthropic (Round 1) | Stipulated guardrails (no $) | 2025 | Maintains Claude’s lyric-output refusal |

No NYT, Authors Guild, or image-AI case settled through 2026.

Claims were filed against the Bartz fund for 91.3% of eligible works. That is extraordinary for a class action. The settlement functions as a floor price: subsequent complaints, including UMG / Concord / ABKCO Round 2 ($3B sought), explicitly anchor against the Bartz-implied per-work value.
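The anchoring point is easier to weigh with the arithmetic written out. The sketch below uses only figures quoted in this post; the constant and function names are illustrative and do not come from any filing. Because the Round 2 complaint covers "20,000+" songs, the per-song figure it produces is a ceiling, not an exact value.

```python
# Figures as quoted in this post; all names here are illustrative.
BARTZ_PER_WORK_USD = 2_931          # approximate per-work payout
BARTZ_ELIGIBLE_WORKS = 482_000      # approximate eligible works
BARTZ_CLAIM_RATE = 0.913            # share of eligible works with a claim
ROUND2_DEMAND_USD = 3_000_000_000   # UMG / Concord / ABKCO Round 2 demand
ROUND2_MIN_SONGS = 20_000           # "20,000+" in the complaint: a lower bound


def claimed_works(eligible: int, claim_rate: float) -> int:
    """Approximate number of eligible works with a claim filed."""
    return round(eligible * claim_rate)


def per_work_ceiling(demand_usd: int, min_works: int) -> float:
    """Demand divided by the minimum work count: an upper bound per work."""
    return demand_usd / min_works


if __name__ == "__main__":
    works = claimed_works(BARTZ_ELIGIBLE_WORKS, BARTZ_CLAIM_RATE)
    ceiling = per_work_ceiling(ROUND2_DEMAND_USD, ROUND2_MIN_SONGS)
    multiple = ceiling / BARTZ_PER_WORK_USD
    print(f"Bartz works claimed: ~{works:,}")
    print(f"Round 2 demand per song: at most ${ceiling:,.0f}")
    print(f"Multiple of the Bartz per-work figure: ~{multiple:.0f}x")
```

On these numbers the Round 2 demand comes to at most about $150,000 per song, roughly 51 times the Bartz per-work figure, which is consistent with reading Bartz as a floor, not a ceiling.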

What 2026 Filed New

Two filings reshape the trajectory.

UMG / Concord / ABKCO v. Anthropic Round 2 (January 28 2026) seeks $3B across 20,000+ songs. The complaint imports torrent and shadow-library evidence developed in the Bartz discovery and applies it to the music corpus. In the original 2023 case the plaintiffs were denied interim relief and the dispute over outputs ended in stipulated guardrails; Round 2 is a substantively different theory built on the Bartz-Alsup acquisition-track logic.

Encyclopedia Britannica & Merriam-Webster v. OpenAI (March 13 2026, S.D.N.Y.) covers approximately 100,000 articles and pleads both copyright and Lanham Act claims. The Lanham Act claim is the more interesting novelty: it argues that AI outputs that substitute for the licensed reference product (definitions, encyclopedic facts) are commercial use of the publisher’s trademark. The motion-to-dismiss outcome will indicate whether this theory survives.

The MDL 3143 consolidation continues to absorb tag-along complaints monthly. Publisher-side filings have outpaced AI-lab counterfilings two-to-one through 2026.

The Robots.txt Ruling That Matters

The standards-related holding that mattered most in 2026 is a loss for publishers.

Two 2025 rulings (Ziff Davis v. OpenAI and Reddit v. various) explicitly held that robots.txt is not a “technological measure that effectively controls access” under DMCA §1201. The Ziff Davis order’s framing: robots.txt is “more akin to a sign than a barrier.” RFC 9309 is cited in briefing but does not change the analysis: the standard documents the protocol and does not convert it into a technical access-control measure [5].
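The “sign, not a barrier” framing has a simple technical basis: robots.txt is evaluated by the crawler, not enforced by the server. A minimal sketch using Python's standard-library `urllib.robotparser` illustrates this; the bot name `ExampleAIBot` and the `publisher.example` URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI crawler is disallowed, everyone else allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what robots.txt *asks* of this user agent for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://publisher.example/articles/1"
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "disallowed"
        print(f"{agent}: {verdict}")
```

The check runs entirely on the client and returns a boolean that a well-behaved crawler chooses to honor. Nothing in the protocol stops a client from skipping the check and requesting the URL anyway, which fits the “sign” characterization.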

This matters because it forecloses one of the legal theories publishers might have used to attach §1201 anti-circumvention claims to robots.txt-violating crawlers. The remaining publisher options are conventional copyright infringement, breach of terms-of-service, the Lanham Act (if AI outputs substitute for the publisher’s product), and the developing scraper-intermediary theory in Reddit v. Perplexity.

AIPREF has been referenced in amicus filings but no court has yet relied on it. As AIPREF moves toward IESG submission, the question of whether AIPREF signaling has different §1201 treatment from robots.txt is open. Today, the answer is “not under the existing precedent,” and that is a non-trivial constraint on what standardized signaling can buy publishers in litigation.

For background on AIPREF’s standards trajectory, see AIPREF After Toronto and GPAI After Six Weeks.

The Regulatory Flank

The non-litigation track in 2026 added enforceable obligations that run parallel to the docket.

California AB 2013 (effective January 1 2026) requires generative AI providers to publish summaries of training data on the provider’s website. This is the US analog to EU AI Act Article 53(1)(d), with narrower scope and California-specific applicability.

Texas TRAIGA (effective January 1 2026) provides AG-only enforcement with a 60-day cure period and a restricted-purposes regime. Less aggressive than CA AB 2013 on disclosure but covers a broader risk surface.

EU AI Act GPAI enforcement (Articles 88-94 enforcement powers, with fining authority effective August 2 2026) is the most consequential regulatory milestone of 2026 for AI training data. Fines for GPAI obligation breaches run up to €15M or 3% of global annual turnover, whichever is higher. Covered separately at GPAI After Six Weeks.
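The fine ceiling reduces to a one-line formula. The sketch below assumes the cap is the higher of the two figures, which is how Article 101 of the AI Act words the GPAI fine; the function name and the turnover values are illustrative.

```python
FIXED_CAP_EUR = 15_000_000     # fixed limb of the cap
TURNOVER_RATE_PERCENT = 3      # percentage limb, applied to worldwide annual turnover


def gpai_fine_cap_eur(global_turnover_eur: int) -> int:
    """Maximum fine in whole euros: the higher of EUR 15M or 3% of turnover."""
    turnover_based = global_turnover_eur * TURNOVER_RATE_PERCENT // 100
    return max(FIXED_CAP_EUR, turnover_based)


if __name__ == "__main__":
    for turnover in (100_000_000, 500_000_000, 2_000_000_000):
        print(f"turnover EUR {turnover:,} -> cap EUR {gpai_fine_cap_eur(turnover):,}")
```

The two limbs meet at €500M of turnover. Below that the fixed €15M figure is the ceiling; above it the percentage governs.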

The Trump Executive Order of December 11 2025 purported to preempt inconsistent state AI laws. Without congressional action, the EO cannot override state law; legal challenges are expected.

US Copyright Office Part 3 (Generative AI Training, pre-publication May 9 2025) concluded that fair-use analysis for AI training is case-by-case with no bright-line rule. The final version was not yet published as of late 2026.

What to Watch in 2027

Four procedural milestones determine whether 2026’s legal architecture holds.

Andersen v. Stability AI verdict. If the jury returns infringement, the image-AI track and the text-AI track converge on a damages-driven settlement posture. If the jury returns no infringement, the legal architecture becomes “labs win on training when they get a jury.”

NYT v. OpenAI summary judgment ruling. The expert-discovery record on regurgitation is closed. The SJ ruling is the first signal on whether output-substitution arguments break fair use at scale.

Reddit v. Perplexity discovery outcomes. If the alleged scraping volumes hold up under discovery, the scraper-intermediary theory becomes viable as a routine cause of action. Every proxy provider’s exposure changes if Reddit wins on the §1201 or contributory infringement track.

Kadrey successor (Turow) summary judgment. Tests whether a better-pled market-harm record changes Chhabria’s underlying ruling. If yes, the Bartz-style settlement posture spreads; if no, the fair-use architecture for training holds.

The 2027 docket will inherit roughly 30 active cases, two flagship trial outcomes pending, and a regulatory-enforcement layer that did not exist a year ago. For the unit economics that explain why scraping continues alongside both litigation and licensing, see How Much Does It Cost to Scrape the Web at Scale? and Where AI Training Data Actually Comes From in 2026.


Last updated: December 2026

References

  1. Bartz v. Anthropic Settlement. Settlement website with claim form, schedule, and orders. https://www.anthropiccopyrightsettlement.com/
  2. Andersen v. Stability AI / Midjourney / DeviantArt / Runway, N.D. Cal. 3:23-cv-00201. CourtListener docket and Mishcon GenAI tracker. https://www.mishcon.com/generative-ai-intellectual-property-cases-and-policy-tracker
  3. Kadrey v. Meta, Chhabria order June 25 2025. FisherBroyles client alert. https://fisherbroyles.com/news/client-alert-summary-and-strategic-analysis-of-judge-chhabrias-fair-use-ruling-in-kadrey-v-meta/
  4. Reddit v. Perplexity, SerpApi, Oxylabs, AWMProxy, S.D.N.Y. 1:25-cv-08736. CourtListener docket. https://www.courtlistener.com/docket/71720563/reddit-inc-v-serpapi-llc/
  5. Ziff Davis v. OpenAI and Reddit v. various, on robots.txt and DMCA §1201. RFC 9309 cited in briefing. https://www.rfc-editor.org/rfc/rfc9309.html
  6. AI Lawsuit Tracker (166+ cases, weekly updates). https://ailawsuittracker.com/
  7. Getty Images v. Stability AI (UK High Court, Nov 4 2025). Latham & Watkins client alert. https://www.lw.com/en/insights/getty-images-v-stability-ai-english-high-court-rejects-secondary-copyright-claim
  8. Authors Guild AI class-action tracker. https://authorsguild.org/news/ai-class-action-lawsuits/
  9. McKool Smith AI Infringement Case Updates. https://www.mckoolsmith.com/newsroom-ailitigation-53
  10. BakerHostetler Newspaper Cases tracker. https://www.bakerlaw.com/new-york-times-v-microsoft/
  11. US Copyright Office, Part 3: Generative AI Training (pre-publication, May 9 2025). https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf
  12. Texas Responsible AI Governance Act (TRAIGA), Norton Rose Fulbright analysis. https://www.nortonrosefulbright.com/en/knowledge/publications/c6c60e0c/the-texas-responsible-ai-governance-act
  13. ChatGPT Is Eating the World — case coverage tracker. https://chatgptiseatingtheworld.com/