Open-Source Models
-
DeepSeek V3.2 Achieves GPT‑5‑Level Performance While Cutting Training Costs by 90%
DeepSeek’s new V3.2 model matches OpenAI’s GPT‑5 on reasoning benchmarks while using a fraction of the training FLOPs, thanks to its DeepSeek Sparse Attention (DSA) architecture and efficient token selection. The open‑source base model (93.1% AIME accuracy) and the higher‑performing V3.2‑Speciale variant (gold‑medal scores on the 2025 IMO and IOI) suggest that frontier‑level reasoning no longer requires massive compute budgets. Enterprise users can deploy the models on premises, benefiting from lower cost, strong coding performance, and retained reasoning traces, though DeepSeek plans to improve factual coverage and generation fluency.
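For intuition, here is a minimal PyTorch sketch of indexer‑driven sparse attention in the spirit of DSA, not DeepSeek’s actual implementation: a cheap low‑dimensional “indexer” scores all keys, each query keeps only its top‑k, and full attention runs over just those. The function name dsa_like_attention and the indexer projections qi/ki are illustrative assumptions; the real model learns these projections and adds causal masking, multi‑head structure, and other details omitted here.

```python
import torch
import torch.nn.functional as F

def dsa_like_attention(q, k, v, qi, ki, top_k=64):
    """Hedged sketch of indexer-driven sparse attention.

    q, k, v: (B, S, D) full-dimension query/key/value tensors.
    qi, ki:  (B, S, d) cheap low-dimensional "indexer" projections,
             used only to decide which keys each query attends to.
    """
    B, S, D = q.shape
    top_k = min(top_k, S)

    # 1) Cheap index scores: all-pairs, but in the tiny dimension d.
    index_scores = torch.matmul(qi, ki.transpose(-2, -1))       # (B, S, S)
    idx = index_scores.topk(top_k, dim=-1).indices              # (B, S, k)

    # 2) Gather only the selected keys/values for each query.
    idx_exp = idx.unsqueeze(-1).expand(B, S, top_k, D)          # (B, S, k, D)
    k_sel = k.unsqueeze(1).expand(B, S, S, D).gather(2, idx_exp)
    v_sel = v.unsqueeze(1).expand(B, S, S, D).gather(2, idx_exp)

    # 3) Expensive D-dimensional attention touches only top_k keys per query.
    scores = torch.einsum("bsd,bskd->bsk", q, k_sel) * D ** -0.5
    weights = F.softmax(scores, dim=-1)
    return torch.einsum("bsk,bskd->bsd", weights, v_sel)

# Example: 256 tokens, 64-dim attention, 16-dim indexer, 32 keys per query.
B, S, D, d = 1, 256, 64, 16
q, k, v = (torch.randn(B, S, D) for _ in range(3))
qi, ki = torch.randn(B, S, d), torch.randn(B, S, d)  # stand-in indexer outputs
print(dsa_like_attention(q, k, v, qi, ki, top_k=32).shape)  # (1, 256, 64)
```

The design point: the indexer’s all‑pairs score matrix is still O(S²) but lives in a tiny dimension d, so selection stays cheap, while the expensive D‑dimensional attention is reduced from all S keys to top_k keys per query.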
-
Deep Cogito’s Open LLMs Outperform Similar-Sized Models Using IDA Technique
San Francisco startup Deep Cogito has unveiled open-source LLMs (3B–70B parameters), claiming superior performance to similarly sized models from Meta’s Llama, DeepSeek, and Alibaba’s Qwen families on benchmarks like MMLU and GSM8K. Its training technique, Iterated Distillation and Amplification (IDA), enables self-improvement cycles without human feedback by alternating two phases: spending extra inference compute to reach stronger reasoning (Amplification), then consolidating those gains into the model’s core parameters (Distillation), as sketched below. A 70B IDA-tuned model achieved 91.73% MMLU accuracy, outperforming Llama 3.3 70B. Future plans include larger MoE models (109B–671B parameters) under open licenses, aiming to challenge proprietary AI dominance while raising questions about how to scale intelligence economically.
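Below is a minimal, self‑contained sketch of an IDA‑style loop. Everything in it (the names ida_loop, model_sample, model_update, the best‑of‑n amplification, and the numeric toy at the end) is an illustrative assumption rather than Deep Cogito’s method or API; it only shows the Amplification/Distillation alternation the summary describes.

```python
import random

def ida_loop(model_sample, model_update, problems, verifier,
             rounds=3, n_amplify=8):
    """Toy Iterated Distillation and Amplification loop.

    Hypothetical interfaces (not Deep Cogito's API):
      model_sample(problem) -> one candidate answer
      model_update(pairs)   -> trains on (problem, answer) pairs
      verifier(p, a)        -> score used to rank amplified answers
    """
    for _ in range(rounds):
        # Amplification: spend extra inference compute (best-of-n sampling)
        # to find better answers than a single forward pass would give.
        amplified = []
        for p in problems:
            candidates = [model_sample(p) for _ in range(n_amplify)]
            amplified.append((p, max(candidates, key=lambda a: verifier(p, a))))
        # Distillation: fold the improved behavior back into the model,
        # so the next round amplifies from a stronger starting point.
        model_update(amplified)

# Toy usage: the "model" guesses numbers near its current mean; training
# shifts the mean toward the best verified guesses.
state = {"mean": 0.0}
target = 10.0
sample = lambda p: random.gauss(state["mean"], 2.0)
update = lambda pairs: state.update(mean=sum(a for _, a in pairs) / len(pairs))
verify = lambda p, a: -abs(a - target)

ida_loop(sample, update, problems=[None] * 4, verifier=verify, rounds=5)
print(round(state["mean"], 1))  # drifts toward 10.0 as rounds accumulate
```

Each round distills the answers that extra inference compute found back into the model, so the next round’s amplification starts from a stronger policy; in the toy, the guesser’s mean converges on the target.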