News

DeepSeek also said it distilled the reasoning steps used in R1-0528 into Alibaba’s Qwen3 8B Base model. That process created a new, smaller model that surpassed Qwen3’s performance by more ...
Despite the significant attention the R1 model garnered at its launch, the latest update was released with fewer details. However, DeepSeek later disclosed on X that the R1-0528 version boasted ...
DeepSeek today rolled out DeepSeek-R1-0528, an upgraded version of its R1 large language model that it says now rivals OpenAI's o3 and Google's (NASDAQ:GOOG) Gemini 2.5 Pro. The China-based AI ...
This gain is made possible by TNG’s Assembly-of-Experts (AoE) method — a technique for building LLMs by selectively merging the weight tensors ...
For instance, in the AIME 2025 test, DeepSeek-R1-0528’s accuracy jumped from 70% to 87.5%, indicating deeper reasoning processes that now average 23,000 tokens per question compared to 12,000 in ...
Deepseek R1-0528 challenges proprietary AI models like OpenAI’s GPT-4 and Google’s Gemini 2.5 Pro by offering comparable performance at significantly lower costs, providing widespread access ...
Deepseek’s R1-0528 AI model competes with industry leaders like GPT-4 and Google’s Gemini 2.5 Pro, excelling in reasoning, cost efficiency, and technical innovation despite a modest $6 million ...
DeepSeek released an updated version of their popular R1 reasoning model (version 0528) with – according to the company – increased benchmark performance, reduced hallucinations, and native support ...
German firm TNG has released DeepSeek-TNG R1T2 Chimera, an open-source variant twice as fast as its parent model thanks to a ...
Chinese AI startup DeepSeek has not yet determined the timing of the release of its R2 model as CEO Liang Wenfeng is not ...