LLMs

  • 7 AI Werewolves: GPT-5 Dominates, Kimi’s Aggressive Tactics

    In a benchmark simulating social dynamics, seven LLMs played the game Werewolf. GPT-5 significantly outperformed the others with a 96.7% win rate, demonstrating superior strategic thinking and manipulation skills. Other models, including Qwen3 and Kimi-K2, showed respectable performance. Analysis revealed distinct personality traits in each model; for example, Kimi-K2 exhibited aggressive tactics. The experiment suggests that social skills matter for AI agents operating within human teams and should be evaluated alongside traditional benchmarks.

    September 2, 2025
  • Liang Zhihui, VP of 360 Group: Empowering Everyday Users with an “AI Expert Team” Through Super Search Intelligent Agents

    At the AGI Playground, Liang Zhihui of 360 Group introduced “Super Search,” an AI-driven search engine. Moving beyond traditional keyword-based methods, Super Search uses a task engine with autonomous planning to address complex queries. It combines LLMs with specialized tools and models, including domestically developed ones, to break down and execute intricate tasks, such as planning a rock-climbing trip. Features include intelligent agent creation and integration of high-quality models.

    June 24, 2025