OpenAI, Anthropic Allege Large-Scale AI Data Scraping by Chinese Competitors

Sapatar / Updated: Feb 24, 2026, 17:08 IST

OpenAI and Anthropic have reportedly accused several Chinese artificial intelligence firms of conducting extensive data scraping operations to harvest proprietary content for training advanced AI models. According to statements from the US-based AI companies, the alleged activities involve extracting vast volumes of publicly available and restricted data from platforms, developer tools, and application programming interfaces (APIs) without proper authorization.

The companies claim that such practices may violate terms-of-service agreements and undermine the intellectual property protections that form the backbone of AI development.


Concerns Over Model Distillation and Replication

A central concern raised by OpenAI and Anthropic is “model distillation,” a technique in which outputs from an advanced AI system are used to train a rival model. Industry experts say that by querying leading AI models at scale and collecting their responses, competitors can replicate capabilities without incurring the same research and infrastructure costs.
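
To illustrate the general idea, the toy sketch below fits a small "student" model to the outputs of a "teacher" function. Everything in it is an assumption made for illustration: the `teacher` function is a hypothetical stand-in for a large model, the student is a one-feature logistic model, and the names `distill`, `steps`, and `lr` do not come from any company's system. Real distillation of language models works on text and is far more involved.

```python
import math
import random


def teacher(x: float) -> float:
    """Hypothetical stand-in for a large model: returns P(label = 1 | x)."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x - 1.0)))


def distill(queries, steps=2000, lr=0.5):
    """Fit a one-feature logistic 'student' to the teacher's soft outputs."""
    # Step 1: query the teacher and record its responses.
    targets = [teacher(x) for x in queries]

    # Step 2: train the student to reproduce those responses.
    w, b = 0.0, 0.0
    n = len(queries)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, t in zip(queries, targets):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            # Gradient of the cross-entropy between teacher output t
            # and student output p, with respect to w and b.
            grad_w += (p - t) * x
            grad_b += p - t
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


random.seed(0)
queries = [random.uniform(-2.0, 2.0) for _ in range(200)]
w, b = distill(queries)
```

In this toy setting the student never sees the teacher's internal parameters (3.0 and -1.0), only its answers to the 200 queries, yet gradient descent should bring `w` and `b` close to those values. That is the economic concern in miniature: the cost of the queries is small compared with the cost of building the teacher.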

Executives from both firms suggest that certain overseas entities may be leveraging this approach to accelerate development of competing large language models.


Cybersecurity and Infrastructure Safeguards

In response to the alleged data harvesting, OpenAI and Anthropic have reportedly strengthened security controls, tightened API rate limits, and enhanced monitoring systems to detect suspicious or automated bulk access. The companies are also said to be collaborating with cloud providers and cybersecurity teams to mitigate potential vulnerabilities.
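
Neither company has published details of its controls, so the following is only a generic sketch of one common rate-limiting approach, a token bucket, which permits short bursts but caps sustained request volume. The class name `TokenBucket` and its parameters `capacity` and `refill_rate` are hypothetical, and timestamps are passed in explicitly to keep the example deterministic; a real service would typically read a monotonic clock instead.

```python
class TokenBucket:
    """Per-client limiter: allows short bursts, caps the sustained rate."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity        # maximum burst size, in requests
        self.refill_rate = refill_rate  # tokens restored per second
        self.tokens = capacity          # start with a full bucket
        self.last = 0.0                 # timestamp of the previous call

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(capacity=5, refill_rate=1.0)
burst = [bucket.allow(now=0.0) for _ in range(7)]  # 7 requests at once
later = bucket.allow(now=2.0)                      # one request 2 s later
```

Here the first five simultaneous requests pass and the next two are rejected; after two seconds the bucket has refilled enough to admit another. A limiter like this slows bulk collection from a single client but does not by itself detect extraction spread across many accounts, which is why the monitoring mentioned above matters.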

Security analysts note that AI companies face a distinctive challenge: their systems are designed to answer user queries, which creates opportunities for systematic extraction if safeguards are insufficient.


Geopolitical and Regulatory Implications

The accusations arrive amid intensifying technological competition between the United States and China, particularly in advanced AI chips, foundational models, and generative AI applications. Policymakers in Washington have increasingly framed AI leadership as a matter of national security, imposing export controls on high-performance semiconductors and tightening restrictions on technology transfers.

If proven, the allegations could fuel further regulatory scrutiny and deepen tensions between the world’s two largest economies.


Industry-Wide Challenge

Analysts emphasize that the issue of data scraping and model replication is not confined to one region. As AI development accelerates globally, companies across markets are grappling with how to protect proprietary training data, algorithms, and model outputs.

OpenAI and Anthropic have called for clearer international norms and stronger enforcement mechanisms to safeguard innovation while maintaining openness in research.