Did DeepSeek Use Gemini to Train Its New AI?

DeepSeek, a Chinese AI company, recently launched an updated version of its R1 reasoning model, called R1-0528. This new model shows strong performance in math and coding tasks. However, some developers and researchers now believe that it may have been trained using data from Google’s Gemini AI models.

Melbourne-based developer Sam Paech pointed out that R1-0528 often favors words and expressions similar to those produced by Gemini 2.5 Pro. In a post on X, he shared examples that hint at this overlap. The developer behind the SpeechMap AI evaluation tool likewise noted that the reasoning traces R1-0528 produces while working through problems read much like those generated by Gemini models.

This is not the first time DeepSeek has been linked to data from other AI systems. Back in December, its older model, DeepSeek V3, sometimes introduced itself as ChatGPT, raising concerns that it had been trained on ChatGPT conversation logs and, by extension, on OpenAI's data.

Earlier this year, OpenAI said it had found evidence linking DeepSeek to a technique called distillation, in which a smaller model is trained on the outputs of a larger, more capable one. Distillation is not illegal, but it violates OpenAI's terms of service, which state that users may not use its models to build competing AI systems.
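
To make the idea concrete: in distillation, a "student" model learns to reproduce a "teacher" model's outputs instead of (or in addition to) learning from raw text alone. The sketch below is a minimal, hypothetical illustration using PyTorch and soft-label (KL divergence) distillation; the toy models, sizes, and random data are placeholders for illustration only, not anything DeepSeek, OpenAI, or Google has published.

```python
# Minimal knowledge-distillation sketch (hypothetical, PyTorch).
# A small "student" model is trained to match the output distribution
# of a larger, frozen "teacher" model on the same inputs.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, hidden = 1000, 64   # toy sizes for illustration
teacher = nn.Sequential(nn.Embedding(vocab_size, hidden), nn.Linear(hidden, vocab_size))
student = nn.Sequential(nn.Embedding(vocab_size, hidden // 2), nn.Linear(hidden // 2, vocab_size))
teacher.eval()                  # the teacher's weights are never updated

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0               # softens the teacher's distribution

for step in range(100):
    tokens = torch.randint(0, vocab_size, (8, 16))   # stand-in for real training text
    with torch.no_grad():
        teacher_logits = teacher(tokens)             # teacher only provides targets
    student_logits = student(tokens)

    # KL divergence between the softened teacher and student distributions
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice, when the "teacher" is only reachable through an API, the signal is usually sampled text rather than logits: the student is fine-tuned on the teacher's generations. The principle, learning from another model's outputs, is the same.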

Microsoft, a major OpenAI partner, reportedly detected large amounts of data being pulled out through OpenAI developer accounts in late 2024, accounts believed to be tied to DeepSeek. The finding added to suspicions that DeepSeek has used OpenAI or Gemini outputs to train its own models.

The challenge is that today's web is flooded with AI-generated content, which makes it very hard for companies to filter model outputs out of their training data. AI-generated text turns up on blogs, forums, and social media, the same sources AI companies scrape for training, so different models can end up sounding alike even without deliberate copying.

Still, some experts believe DeepSeek may well have trained on outputs from top-performing models like Gemini. Researcher Nathan Lambert argued on X that a lab with plenty of funding but limited access to GPUs, as DeepSeek is reported to be, would have a clear incentive to generate synthetic training data from the best available models, since doing so lets it train more efficiently.

To curb misuse of their models, AI companies are tightening access. OpenAI now asks users to verify their identity with a government-issued ID before they can use some of its advanced models, and that verification process is only available in countries OpenAI supports; China is not one of them.

Google has also begun summarizing the reasoning traces its models generate on its developer platform, which makes it harder for rivals to train competing models on those traces. Anthropic, another AI company, is doing the same to protect its technology.

As of now, Google hasn’t made any public comments on the issue.
