Claude Beat GPT-5 (But There's a Catch)

367 views
October 6, 2025
Claude Sonnet 4.5 is ranking higher than GPT-5 on LMArena, but benchmark results can mislead because models are tuned to score well on them. Many benchmarks grade an answer as simply right or wrong, so "I don't know" earns the same zero as a wrong answer, and a guess that might be right never scores worse than abstaining. Models learn to hallucinate instead of admitting uncertainty. I've been using Sonnet for everything lately. It's excellent for coding and writing, though some people don't like how bluntly it calls you out when you're wrong.
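
To make the scoring point concrete, here is a rough sketch of the arithmetic rather than any real benchmark's grader. It assumes a correct answer earns 1 point and "I don't know" earns 0. The function name, the 10% confidence figure, and the 1-point penalty for wrong answers are all made up for illustration.

```python
def expected_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected score for answering when the model is right with probability p_correct.

    Assumed (hypothetical) scheme: a correct answer earns 1 point and a wrong
    answer costs wrong_penalty points.
    """
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


# Assumption: "I don't know" earns nothing under either scheme.
ABSTAIN_SCORE = 0.0

# Right-or-wrong grading: a wrong answer costs nothing, so a guess with only
# a 10% chance of being right still has a higher expected score than abstaining.
binary_guess = expected_score(0.10)

# Hypothetical alternative: a wrong answer costs 1 point, so the same
# low-confidence guess now has a lower expected score than abstaining.
penalized_guess = expected_score(0.10, wrong_penalty=1.0)

print("guess, right-or-wrong grading:", binary_guess)
print("guess, wrong answers penalized:", penalized_guess)
print("abstain:", ABSTAIN_SCORE)
```

Under these assumptions, guessing only stops paying off once wrong answers cost something, which is why a model optimized against right-or-wrong grading has little reason to admit uncertainty.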