THE DEEP FEED
Models

Anthropic releases Claude Opus 4.7 — the SWE benchmark just moved

Opus 4.7 brings notable gains on the hardest software engineering tasks, with users reporting they can confidently hand off work that previously required close supervision.


Anthropic shipped Claude Opus 4.7 on Apr 16, calling it “a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks.”

The headline claim from Anthropic’s own announcement: users report they can hand off their hardest coding work, the kind that previously needed close supervision, with confidence. Opus 4.7 handles complex, long-running tasks with rigor and consistency, follows instructions precisely, and now verifies its own output before reporting back.

What changed

  • Better self-verification. Opus 4.7 explicitly checks its own work mid-task and corrects course before returning a final answer; a sketch of the pattern follows this list.
  • Vision improvements. Stronger multimodal reasoning.
  • Same pricing tier as Opus 4.6 — no premium for the upgrade.
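
Anthropic describes self-verification as behavior inside the model and hasn't published how it works. For illustration only, here is a minimal client-side analogue of the same check-then-correct loop, built on the Anthropic Python SDK's Messages API. The model id and the two-pass prompt structure are assumptions for this sketch, not documented behavior.

import anthropic  # official SDK; reads ANTHROPIC_API_KEY from the environment

client = anthropic.Anthropic()
MODEL = "claude-opus-4-7"  # hypothetical id for illustration; check Anthropic's docs


def ask(prompt: str, max_tokens: int = 2048) -> str:
    """Single Messages API call, returning the text of the reply."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text


def solve_with_verification(task: str, max_rounds: int = 2) -> str:
    """Generate an answer, ask the model to audit it, and retry on failure."""
    answer = ask(task)
    for _ in range(max_rounds):
        # Second pass: the model audits its own output before we accept it.
        verdict = ask(
            f"Task:\n{task}\n\nProposed answer:\n{answer}\n\n"
            "Check the answer for errors. Reply with the single word PASS "
            "if it is correct; otherwise describe what is wrong.",
            max_tokens=512,
        )
        if verdict.strip().upper().startswith("PASS"):
            break
        # Feed the critique back and regenerate before reporting to the caller.
        answer = ask(
            f"Task:\n{task}\n\nYour previous answer:\n{answer}\n\n"
            f"A review found problems:\n{verdict}\n\nProduce a corrected answer."
        )
    return answer

The point of the sketch is the shape of the loop, not the prompts: generation and verification are separate calls, and the critique is fed back rather than discarded. Opus 4.7 reportedly folds this into a single task run.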

Industry context

Opus 4.7 lands one week ahead of OpenAI’s GPT-5.5 release, which TechCrunch and TNW frame as OpenAI’s response to Anthropic’s lead in the enterprise coding market. With Google’s Gemini 3.1 Pro powering the Deep Research agents released the same week, the arms race is now three-way, and frontier releases are landing at a roughly weekly industry-wide cadence.

The 2026 picture: three frontier labs trading model releases, each on a roughly monthly cycle, with benchmark deltas measured in single percentage points. Differentiation is shifting from raw intelligence to agentic reliability: can you trust the model to complete a four-hour task without supervision?

Source: Anthropic