Multimodal

  • OpenAI Introduces GPT‑5.2, Claiming Superior Performance on Professional Tasks

    OpenAI unveiled GPT‑5.2, a higher‑performing generative‑AI model aimed at professional tasks such as spreadsheet creation, presentation design, image interpretation, code generation, and extended‑context work. Released via ChatGPT and the API, it comes in three tiers (Instant, Thinking, and Pro) tailored respectively to speed, structured tasks, and high‑accuracy demands. GPT‑5.2 leads on benchmarks such as SWE‑Bench Pro and GPQA Diamond, though Anthropic’s Opus 4.5 edges it on a narrower coding test. The model is built on a larger transformer, multimodal embeddings, and a near‑million‑token context window. OpenAI emphasizes improved factuality and safety, and positions the release as a driver of enterprise revenue amid intensified AI competition.

    January 18, 2026
  • Baidu Open-Sources ERNIE 4.5 Series, Featuring 10 AI Models

    Baidu has released its ERNIE 4.5 open-source large language model series, which comprises ten models, including Mixture-of-Experts (MoE) variants. The pre-trained weights and inference code are fully open-sourced and available on platforms such as Hugging Face. The series uses a heterogeneous MoE architecture to support multimodal capabilities, and Baidu reports state-of-the-art results on a range of benchmarks, ahead of competing models on both text and multimodal tasks. The models are distributed under the Apache 2.0 license, which permits both academic and commercial use.

    June 29, 2025