New York Times takes Perplexity AI to court over ‘illegal’ copying of content

By Shalini Oraon

The New York Times Takes Perplexity AI to Court: A Landmark Battle Over News, AI, and the Future of the Web

In a move that could define the legal and ethical boundaries of the artificial intelligence era, The New York Times has filed a federal lawsuit against the AI search startup Perplexity, alleging “willful, illegal copying” of its copyrighted journalistic content. This lawsuit is not merely a corporate dispute; it is a direct shot across the bow of the entire generative AI industry, challenging both the large-scale web scraping that has fueled the rise of models like ChatGPT and the “fair use” defense invoked to justify it. At stake are the future of news publishing, the economics of content creation, and the very structure of the internet.

The Core of the Allegation: Beyond Scraping to “Theft”

The Times’ lawsuit, filed in the Southern District of New York, goes beyond the typical complaints about AI companies using content for training their large language models (LLMs). It presents a detailed, damning portrait of what it calls Perplexity’s “egregious” business model, built on a three-part system of infringement:

1. Massive, Unauthorized Scraping: The suit alleges Perplexity’s crawlers systematically copied millions of Times articles, bypassing the newspaper’s robots.txt file—the standard protocol websites use to communicate with web crawlers about what can and cannot be accessed. This alleged bypass is a critical element, moving the activity from potentially contentious but common scraping to what The Times frames as outright trespass and hacking.
2. Near-Verbatim Reproduction in Outputs: The lawsuit provides numerous examples where Perplexity’s AI, responding to user queries, generated summaries of recent Times investigative reports, feature stories, and reviews that contained “memorized” and near-identical language from the original articles. Crucially, these outputs often omitted proper attribution or only provided a token, generic citation like “source: The New York Times” without a link, thereby depriving The Times of both traffic and licensing revenue.
3. The “Illusion” of Legitimacy: The Times argues that Perplexity’s entire value proposition—an AI that provides instant, accurate, and sourced answers—is predicated on this unauthorized ingestion of premium content. By summarizing and repackaging Times journalism without permission or payment, Perplexity is allegedly creating a direct market substitute, siphoning users who might otherwise click through to the original site.
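The robots.txt mechanism at the center of the first allegation can be illustrated with a short sketch. It shows how a well-behaved crawler might check a site's rules before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt text, the `ExampleAIBot` and `OtherBot` user agents, and the `news.example.com` URLs are all hypothetical; none of them is taken from The Times' actual file or Perplexity's actual crawler. Note that robots.txt is advisory: it states the site's wishes, and nothing in the protocol technically prevents a crawler from ignoring them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked from the whole site,
# while every other crawler is only kept out of /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://news.example.com/2024/investigation.html"
    for agent in ("ExampleAIBot", "OtherBot"):
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, agent, article) else "disallowed"
        print(f"{agent}: {verdict}")
```

A crawler that honors the protocol would skip any URL for which `is_allowed` returns `False`. The complaint's allegation, as described above, is that Perplexity's crawlers fetched content despite such rules.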

“Defendants seek to free-ride on The Times’s massive investment in its journalism,” the complaint states, accusing Perplexity of building “an advertising-supported business that is valued at more than $1 billion” on the back of stolen intellectual property.

Perplexity’s Defense and the “Fair Use” Fault Line

Perplexity’s CEO, Aravind Srinivas, has publicly defended the company, stating that it respects robots.txt and that its systems are designed to summarize the web with proper attribution. The company’s likely legal defense will hinge on the doctrine of “fair use.”

AI companies broadly argue that training models on publicly available internet data is transformative—it doesn’t republish the articles but learns statistical patterns from them to generate new, original text. They compare it to a human reading thousands of articles and then writing in their own words. Furthermore, they claim that providing short summaries with citations is a public good that drives traffic back to publishers, not away from them.

The Times’ lawsuit aggressively counters this framing. It argues that Perplexity’s outputs are not transformative but derivative substitutes that serve the same purpose as the original news story, and that the alleged bypassing of robots.txt undermines any good-faith “fair use” claim. The case will force courts to examine a nuanced question: at what point does an AI’s learning and output cross the line from “transformation” to “unfair exploitation”?

A Broader Industry Reckoning

This lawsuit is the most significant escalation in a simmering conflict. The Times previously sued OpenAI and Microsoft in a similar, landmark case that is still ongoing. Other media giants, including the Financial Times, CNN, and Reuters, have struck licensing deals with OpenAI. This creates a two-tier landscape: those who can negotiate payouts for their content, and those whose content is used without permission.

The Perplexity case is particularly potent because of the startup’s pure-play “AI search” model. Unlike OpenAI, which has multiple revenue streams, Perplexity’s core product is fundamentally about digesting and regurgitating the modern web. If The Times succeeds, it could impose a crippling licensing cost on an entire class of AI companies, potentially reshaping or even dismantling the current “scrape everything” paradigm.

Implications for the Future of the Web

The outcome of this lawsuit will send ripples far beyond the newsroom.

· For Publishers: A win for The Times would be a powerful vindication, giving all content creators leverage to demand licensing fees from AI companies. It could lead to a new, structured ecosystem where AI firms pay to access high-quality, reliable information—a potential financial lifeline for a struggling news industry. A loss could further entrench the feeling that Big Tech can freely monetize creative work without compensation.
· For AI Companies: A ruling against Perplexity could force a fundamental technological and business model pivot. AI firms might have to invest heavily in curated, licensed datasets or develop more sophisticated real-time licensing agreements. It could slow innovation, raise costs, and advantage well-funded incumbents who can afford to pay for content.
· For Users and the Public: There is a risk that a more restrictive legal environment could wall off valuable information from AI systems, reducing the breadth and depth of knowledge these tools can access. Conversely, it could lead to a healthier internet ecosystem where creators are incentivized to produce quality work, and AI tools are forced to be more transparent and ethical in their sourcing, potentially reducing “hallucinations” and misinformation.

Conclusion: A Defining Battle for the Digital Age

The New York Times v. Perplexity is more than a copyright case. It is a foundational battle over value, ownership, and ethics in the age of intelligent machines. It asks: Who benefits from the AI revolution? Is the open web a free training buffet, or is it a library where the works must be respected and their creators compensated?

The lawsuit places a stark choice before the courts: uphold the rights of content creators whose work forms the essential feedstock of AI, or sanction a new technological paradigm that may inevitably override old copyright norms in the name of progress. The verdict will not only determine the fate of one AI startup but will also write a crucial early chapter in the rulebook for our AI-driven future. The discovery process alone, likely to reveal the inner workings of Perplexity’s data practices, will provide unprecedented insight into the “black box” of AI training, making this a case the entire tech and media world will watch with profound anxiety and anticipation.


Discover more from AMERICA NEWS WORLD
