PCWorld reports that Anthropic’s Claude Opus 4.8 focuses on improving AI honesty by teaching the model to admit when it lacks information. The model achieved near-perfect scores in honesty benchmarks ...
The newest AI model to hit the market is promoted as being remarkably honest. It’s a well-known fact that AI models can hallucinate and jump to conclusions, but according to Anthropic, early ...
Anthropic today announced the launch of its latest AI model, Claude Opus 4.8. Anthropic claims the model is a "more effective collaborator" with improvements in agentic coding, multidisciplinary ...
On benchmarks, Opus 4.8 is a step up rather than a leap. It scores 88.6% on SWE-bench Verified (vs. 87.6% for Opus 4.7), 69.2% on the harder SWE-bench Pro (vs. 64.3%), and 74.6% on Terminal-Bench 2.1 ...