ChatGPT Maker Suspects China’s Dirt Cheap DeepSeek AI Models Were Built Using OpenAI Data — and the Irony Is Not Lost on the Internet

By Benjamin, Feb 26, 2025

OpenAI suspects that China's DeepSeek AI models, which are significantly cheaper than their Western counterparts, may have been trained using OpenAI's data. The suspicion follows DeepSeek's rapid rise in popularity, which triggered a sharp decline in the stock prices of major AI companies, particularly Nvidia, which suffered the largest single-day loss of market value on record.

DeepSeek's R1 model, based on the open-source DeepSeek-V3, reportedly cost far less to train (an estimated $6 million) and requires far less computing power than comparable Western AI models. Although some dispute that claim, it has raised concerns about the substantial investments American tech companies are making in AI.

OpenAI and Microsoft are investigating whether DeepSeek violated OpenAI's terms of service by using its API for "distillation," a technique in which a smaller model is trained on the outputs of a larger one. OpenAI confirmed that it is aware of attempts by Chinese and other companies to replicate leading U.S. AI models this way and emphasized its commitment to protecting its intellectual property. David Sacks, President Trump's AI czar, has also said there is evidence that DeepSeek used distillation to draw on OpenAI's models.

The irony of OpenAI's accusations has not gone unnoticed, given the company's own earlier claim that creating models like ChatGPT is impossible without using copyrighted material. In a submission to the UK's House of Lords, OpenAI acknowledged that it relies on copyrighted works to train its large language models. That stance has drawn criticism, particularly in light of the copyright infringement lawsuits filed against OpenAI by the New York Times and by a group of 17 authors.

The debate underscores the complex legal and ethical challenges of training AI models on copyrighted material as generative AI evolves rapidly. The U.S. Copyright Office's position that AI-generated art cannot be copyrighted, which a federal judge upheld in August 2023, further complicates the issue.
