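It scores 86.9% on the academic benchmark GPQA Diamond and reaches 76.8% on the multimodal understanding benchmark MMMU Pro. These two numbers are not merely "decent for its tier"; they directly surpass the larger Gemini 2.5 Flash.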
Allowing companies to be arbitrarily irresponsible until something goes horribly wrong is ridiculous in a world where we could be living in a pessimistic or near-pessimistic scenario, but it is what Anthropic pushed for.