Anthropic

  • Anthropic Uses AI Agents to Audit Models for Safety

    Anthropic is using AI agents to audit and improve the safety of its AI models, like Claude. This “digital detective squad” includes Investigator, Evaluation, and Red-Teaming Agents that identify vulnerabilities and potential harms proactively. These agents have successfully uncovered hidden objectives, quantified existing problems, and exposed dangerous behaviors in AI models. While not perfect, these AI safety agents help humans focus on strategic oversight and pave the way for automated AI monitoring as systems become more complex.

    41 mins ago
  • Claude 4 Makes Surprise Debut: Runs Non-Stop for 7 Hours

    Anthropic unveils Claude 4, comprising Opus 4 and Sonnet 4. Both models excel in programming and reasoning, with Opus 4 demonstrating remarkable endurance by refactoring code for seven hours straight. Key advancements include dual-mode reasoning, external memory, and tool use. Claude Code is also officially released. Sonnet 4 is now free for all users, while developers gain new API features such as code execution and a Files API. Pricing for Opus 4 and Sonnet 4 is competitive.

    May 27, 2025