
Did DeepSeek Use Gemini to Train Its New AI?

DeepSeek, a Chinese AI company, recently launched an updated version of its R1 reasoning model, called R1-0528. This new model shows strong performance in math and coding tasks. However, some developers and researchers now believe that it may have been trained using data from Google’s Gemini AI models.

Melbourne-based developer Sam Paech pointed out that R1-0528 often uses words and expressions similar to those favored by Gemini 2.5 Pro. In a post on X, he shared examples suggesting this overlap. The developer behind the SpeechMap AI tool likewise noted that the reasoning traces DeepSeek's model produces while solving problems closely resemble those generated by Gemini models.

This is not the first time DeepSeek has been linked to data from other AI systems. Back in December, its older model, DeepSeek V3, sometimes introduced itself as ChatGPT, raising concerns that it had been trained on ChatGPT conversation logs and, by extension, on OpenAI's data.

Earlier this year, OpenAI said it had found evidence linking DeepSeek to a method called distillation, in which a smaller model is trained on the outputs of a larger, more capable one. Distillation is not illegal, but it violates OpenAI's terms of service, which prohibit using its models' outputs to build rival AI systems.
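The core idea of distillation can be sketched in a few lines. This is a minimal illustration of the general technique, not DeepSeek's or OpenAI's actual pipeline; the logit values and temperature are made up for the example.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; a higher temperature "softens" the
    # distribution, exposing more of the teacher's relative preferences.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence from the teacher's softened output distribution to
    # the student's: minimizing this trains the student to mimic the
    # teacher's behavior rather than raw human labels.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy example: the closer the student's logits are to the teacher's,
# the smaller the loss.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 3.0, 2.0]

assert distillation_loss(close_student, teacher) < distillation_loss(far_student, teacher)
```

In practice, a lab without API access to a rival's logits can still distill from sampled text alone, which is why companies worry about their models' public outputs being scraped.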

Microsoft, an OpenAI partner, reportedly detected large volumes of data being pulled out through OpenAI accounts in late 2024, accounts the company believes were tied to DeepSeek. This deepened suspicion that DeepSeek may have harvested OpenAI or Gemini outputs to train its own tools.

The challenge is that today's internet is flooded with AI-generated content, which makes it very hard for companies to filter model outputs out of their training data. AI-generated text appears on blogs, forums, and social media, the very sources AI companies commonly scrape for training, so different models can end up sounding alike even without deliberate distillation.

Still, some experts believe DeepSeek may have used outputs from top-performing models like Gemini. Researcher Nathan Lambert wrote on X that if DeepSeek is short on GPUs but well funded, generating synthetic data from the best available models would be an efficient way to train.

To curb misuse of their models, AI companies are tightening security. OpenAI now requires users to verify their identity with a government ID before accessing its most advanced tools, and verification is available only in countries OpenAI supports, which do not include China.

Google has also added a safeguard: on its developer platform, it now summarizes the raw reasoning traces its models generate, making it harder for rivals to train competing models on them. Anthropic, another AI company, is doing the same to protect its technology.

As of now, Google hasn’t made any public comments on the issue.
