OpenAI Alleges Chinese Competitor DeepSeek Illegally Used Its AI Models for Training

2025-01-29

Author: Wei

OpenAI's Allegations Against DeepSeek

OpenAI has raised alarms over a potential breach of intellectual property, asserting that Chinese AI start-up DeepSeek has utilized its proprietary models to develop a competing open-source AI product.

Details of the Claims

According to a report from the Financial Times, OpenAI claims to have gathered evidence that DeepSeek engaged in "distillation," a method where developers improve the performance of smaller AI models by leveraging the outputs of larger, more capable models. This practice allows companies to achieve high levels of performance on specific tasks while significantly reducing costs.
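To make the idea concrete, here is a minimal sketch of one common textbook form of distillation, in which a smaller "student" model is trained to match the output probabilities of a larger "teacher." It is an illustration only, not a description of what DeepSeek or OpenAI actually did. The function names, the temperature value, and the example scores are all assumptions chosen for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper: the loss is zero when the student reproduces the
    teacher's output distribution exactly and grows as the two diverge.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up scores over three candidate answers (illustrative values only).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # prefers a different answer

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

Training pushes this loss toward zero, so the student gradually imitates the teacher. When only a model's text responses are available, as with a commercial API, a developer would more plausibly fine-tune on the generated text itself instead of on output probabilities.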

Impact on Investors

While OpenAI has not disclosed specific details about its evidence, the company's terms of service explicitly prohibit users from copying its services or using their outputs to develop competing models. The rise of DeepSeek's R1 reasoning model has shocked investors and technology firms because of its strong performance on reasoning tasks. In the wake of R1's release, Nvidia shares fell 17%, wiping out approximately $589 billion in market value and raising questions about whether substantial investment in AI hardware is necessary. The shares rebounded by 9% in the following days, along with other tech stocks, as the market adjusted to the news.

Industry Perspective on Distillation

A source close to OpenAI noted that distillation is a recognized practice within the industry and that OpenAI itself offers it on its own platform. The concern, the source said, arises when companies use the technique to build competing models of their own. Microsoft and OpenAI had previously investigated accounts allegedly linked to DeepSeek that were using OpenAI's API, and blocked their access on suspicion of distillation in violation of the terms of service.

Opinions from Experts

David Sacks, President Trump's AI and crypto czar, raised the possibility that DeepSeek had committed intellectual property theft. He explained that distillation allows one model to learn from another and extract its knowledge, stating, "There’s substantial evidence that what DeepSeek did here is they distilled the knowledge out of OpenAI models."

DeepSeek's Defense

DeepSeek defended its practices, saying it trained its V3 model, which has 671 billion parameters, using just 2,048 Nvidia H800 graphics cards at a cost of $5.6 million, a fraction of the vast sums spent by OpenAI and Google. Observers noted instances of the V3 model producing responses that seemed to reflect training on outputs from OpenAI's GPT-4, which, if true, would violate OpenAI's terms of service.

Common Industry Practices

Experts in the AI field say that using outputs from established models, such as those from OpenAI, is common practice among startups and researchers, because it provides a valuable feedback loop without the cost of developing such models from scratch. Ritwik Gupta, an AI PhD candidate at UC Berkeley, noted, "It is not surprising to me that DeepSeek supposedly would be doing the same. Stopping this practice may be difficult."

Challenges Facing AI Companies

The situation illustrates the challenges faced by leading AI companies as they strive to maintain their technological edge against rivals who may capitalize on their groundwork. Chinese enterprises have rapidly learned from their American counterparts, devising strategies to enhance their model training while minimizing reliance on expensive chip resources.

OpenAI's Commitment to Protect Intellectual Property

OpenAI emphasized its commitment to protecting its intellectual property through countermeasures and close collaboration with the U.S. government, particularly to safeguard its advanced models from potential exploitation by competitors.

Legal Challenges for OpenAI

The controversy comes as OpenAI faces legal challenges of its own: newspapers and authors, including The New York Times, allege that the company trained its models on their copyrighted material without authorization. As the legal landscape evolves, the outcome of these disputes could shape AI development, intellectual property rights, and international competition.

Ethical Considerations in AI