Encyclopedia Britannica Sues OpenAI Over Alleged Use of Articles to Train ChatGPT


The growing legal battle between traditional knowledge publishers and artificial intelligence companies has taken another major turn. Encyclopedia Britannica, along with its Merriam-Webster subsidiary, has filed a lawsuit against OpenAI, accusing the AI developer of using its reference materials without permission to train ChatGPT.

The lawsuit was filed in Manhattan federal court and claims that OpenAI copied thousands of Britannica’s articles and dictionary entries to build its large language models. According to the complaint, the AI company used these materials to teach its chatbot how to answer user questions, ultimately competing with the very websites that produced the original information.

Britannica Claims AI Is “Cannibalizing” Its Traffic

In the filing, Britannica argues that ChatGPT generates summaries that replicate information from its encyclopedia and dictionary content, reducing the need for users to visit the original websites.

The publisher says ChatGPT can sometimes produce responses that closely resemble its original entries, which it claims diverts web traffic and undermines its business model.

Britannica alleges that OpenAI copied nearly 100,000 articles from its database during the training process for GPT models.

The company argues that this use of copyrighted material occurred without permission or licensing agreements, raising serious concerns about how AI companies obtain training data.

OpenAI Responds to the Lawsuit

OpenAI has rejected the accusations. In response to the lawsuit, a company spokesperson said its AI systems are trained on publicly available information and that this use is protected by fair use.

“Our models empower innovation and are trained on publicly available data and grounded in fair use,” the spokesperson said.

The company maintains that its models transform information into new forms rather than simply reproducing existing content.

Trademark and “Hallucination” Concerns

Beyond copyright issues, Britannica also accuses OpenAI of trademark misuse. The lawsuit claims ChatGPT sometimes references Britannica as a source in answers, which could lead users to believe the company has authorized the AI system to use its content.

Britannica further alleges that some of these citations appear in AI-generated errors or so-called “hallucinations,” potentially damaging the publisher’s reputation if incorrect information is attributed to its name.

On these grounds, the lawsuit adds claims of trademark infringement and misleading attribution to its copyright claims.

Part of a Larger Legal Battle Over AI Training Data

The case is part of a broader wave of legal disputes emerging across the publishing and technology industries. Over the past year, authors, media companies, and publishers have increasingly challenged AI firms over how their content is used to train machine learning models.

Several news organizations and writers have already filed similar lawsuits against major AI developers, arguing that the technology relies heavily on copyrighted material gathered from the internet.

Britannica itself filed a separate lawsuit against AI search startup Perplexity AI in 2025; that case is ongoing.

What Britannica Wants From the Court

In its filing, Britannica is seeking monetary damages from OpenAI, though the exact amount has not been specified.

The publisher is also asking the court to issue an order preventing OpenAI from continuing to use its copyrighted materials in AI training systems.

If the court rules in Britannica’s favor, the decision could change how AI models are trained and potentially force companies to license training data from publishers.

Final Words

The lawsuit highlights the growing tension between artificial intelligence innovation and traditional content ownership. As AI systems become more powerful and widely used, the question of where their training data comes from, and who should be compensated for it, is becoming one of the most important legal debates in the tech industry.

With major publishers now challenging AI developers in court, the outcome of cases like Britannica’s could shape the future relationship between technology companies and the creators whose work fuels the digital knowledge ecosystem.

Anubhav Chauhan

Anubhav Chauhan is a technology writer at NewzTechy.com, where he covers emerging technologies, gadgets, and digital trends.