Large Language Models

  • AI Safety Benchmark: Code Model Safety Testing Results Released

CAICT’s AI Institute launched security benchmark testing for code-generating LLMs, assessing risks and capabilities using a dataset of 15,000+ test cases spanning nine programming languages and multiple attack methods. The initial assessment of 15 Chinese models (3B–671B parameters) revealed varied security levels, with most exhibiting medium risk. Models showed weaknesses in scenarios involving malicious intent, highlighting vulnerabilities to cyberattacks. CAICT plans to expand testing to international models and develop mitigation tools, aiming to promote a secure LLM ecosystem.

    July 21, 2025
  • Tiger Zhu: Large Models Will Devour 90% of Agents

    GSR Ventures Managing Partner Zhu Xiaohu predicts that large language models (LLMs) will “devour” 90% of AI Agents. His comments, shared on Xiaohongshu, follow his previous skepticism about embodied AI and underscore his firm’s bullish stance on the broader AI landscape, evidenced by investments in companies like Robopoet and LiblibAI. Zhu likened AI Agent startups to early internet webmasters, suggesting they learn from successful internet companies. His perspective has sparked debate about the long-term viability of standalone AI Agents.

    July 14, 2025
  • Huawei Releases Pangu-7B Dense and 72B Mixture-of-Experts Models as Open Source

    Huawei has open-sourced its Pangu-7B dense and Pangu-Pro MoE 72B large language models, along with Ascend-based inference technology. This move supports Huawei’s Ascend ecosystem strategy, aiming to accelerate AI research and application. The Pangu-Pro MoE 72B model shows strong performance, ranking highly on benchmarks for models under 100 billion parameters.

    June 29, 2025