Microsoft and OpenAI are investigating whether a group linked to Chinese AI startup DeepSeek improperly obtained data from OpenAI’s technology.
According to Bloomberg, the probe follows concerns that the data extraction could breach terms of service or indicate unauthorised access by individuals associated with DeepSeek.
Microsoft’s security researchers observed suspicious activity in the fall, when individuals believed to be linked to DeepSeek were using OpenAI’s Application Programming Interface (API) to exfiltrate large amounts of data. The API is the licensed channel for integrating OpenAI’s AI models into external applications, and misusing it could violate OpenAI’s terms of service.
DeepSeek’s release of its R1 model in January caused significant market turmoil. The model surpassed expectations and outperformed US competitors despite using far fewer resources than OpenAI or Google. Its rapid rise raised questions about the underpinnings of the US stock market boom, which relies on AI hyperscalers investing heavily in computing power.
Tech stocks, including Microsoft, Nvidia, Oracle, and Alphabet, experienced a substantial drop in value as investors reconsidered their reliance on costly AI hardware. The decline wiped out nearly $1 trillion in market value before stabilising somewhat.
David Sacks, President Donald Trump’s AI czar, told Fox News that there is “substantial evidence” DeepSeek used OpenAI’s models to train its own AI, a process known as distillation. The technique allows smaller models to replicate larger ones by learning from their outputs, and using OpenAI’s outputs in this way could breach its terms of service.
OpenAI responded by acknowledging that Chinese firms often attempt to replicate US technology. The company said it is taking measures to protect its intellectual property and is collaborating with the US government to safeguard advanced models.
While DeepSeek denied any wrongdoing during the Lunar New Year holiday, experts suggest that using outputs from larger models for training is a common practice in AI development. The prevalence of the practice highlights the challenges faced by companies seeking to protect their technical edge.
The probe underscores growing tensions between US and Chinese firms over intellectual property rights, even as OpenAI faces its own legal battles regarding alleged unauthorised use of copyrighted data. In a statement to the House of Lords in early 2024, the company admitted “it would be impossible to train today’s leading AI models without using copyrighted materials.”