Product Launches

Anthropic Launches Claude Sonnet 5: Opus-Level Agentic Power at Sonnet Prices, Now the Default Model for Free and Pro

Anthropic
Jul 1, 202613 min read3 views
+1
Anthropic Launches Claude Sonnet 5: Opus-Level Agentic Power at Sonnet Prices, Now the Default Model for Free and Pro

Anthropic launches Claude Sonnet 5 on June 30, 2026 — the most agentic Sonnet model yet, now the default for Free and Pro plans, priced at $2/M input tokens through August 31.

Article Overview

For the past year, developers who wanted serious agentic AI — models that plan tasks, use tools, and keep working without constant hand-holding — had to pay Opus prices. Claude Sonnet 5 changes that calculation directly.

Launched June 30, 2026, Sonnet 5 closes the capability gap with Opus 4.8 more than any previous Sonnet model has managed, while launching at just $2 per million input tokens through August 31 — less than half of what Opus 4.8 costs. It is now the default model on Free and Pro plans, meaning the most widely used Claude tier just received its biggest upgrade in generations.

This article covers what makes Sonnet 5 different from its predecessors, how it compares to Opus 4.8 across agentic tasks, what the safety evaluations actually found, why it ships with cyber safeguards even though it cannot develop real exploits, and what the pricing timeline means for developers who want to act before introductory rates expire at the end of August.


Introduction

The story of agentic AI at Anthropic has a specific shape. The Sonnet tier — specifically Claude Sonnet 3.5, 3.6, and 3.7 — was where agentic capability first became genuinely impressive. Those models were the first in the Claude lineup to show real skill at using tools, writing and running code, and completing multi-step tasks without requiring a human at every decision point. For many developers, they marked the beginning of what it actually felt like to delegate work to an AI rather than just prompt one.

But over the past several months, the most significant agentic improvements moved up the lineup. The clearest gains landed in Opus-class models — more capable, more expensive, and out of reach for many of the high-volume workloads where agentic AI would be most valuable.

Claude Sonnet 5, launched on June 30, 2026, is Anthropic's direct response to that drift. It is the most agentic Sonnet model ever built, performs close to Opus 4.8 on the tasks that matter most for autonomous work, costs substantially less, and is now the default model for every Free and Pro user on Claude.


Quick Summary

Detail Information
Model Claude Sonnet 5
Launched June 30, 2026
API string claude-sonnet-5
Default for Free and Pro plans
Introductory pricing $2/M input · $10/M output (through August 31, 2026)
Standard pricing $3/M input · $15/M output (from September 1, 2026)
Opus 4.8 pricing (comparison) $5/M input · $25/M output
Available in Claude Code, Claude Platform, Chat, Cowork
Key improvement vs Sonnet 4.6 Reasoning, tool use, coding, knowledge work
Cyber safeguards Enabled by default (same as Opus 4.7 and 4.8)

What Changed — and Why Sonnet Tier Fell Behind

To understand why Sonnet 5 matters, it helps to understand the gap it is closing.

The Sonnet class earned its reputation through the 3.x generation. Sonnet 3.5, 3.6, and 3.7 were the models that first showed developers what agentic AI could actually do — browsing the web, running terminal commands, writing code that worked and then checking it without being asked. For a lot of developers and teams, those were the models that crossed the line from "impressive demo" to "something I can build with."

But capability development is not always linear across tiers. As Anthropic invested in pushing the frontier, the most meaningful agentic improvements landed in Opus-class models. The gap between what a Sonnet model could handle autonomously and what an Opus model could handle grew. Developers who needed reliable agentic performance found themselves paying Opus prices — higher per-token cost, justified by the capability difference, but a real constraint for workloads that run at scale.

Sonnet 5 is built to close that gap. It delivers agentic capability that, in Anthropic's words, just a few months ago would have required a larger and more expensive model to achieve reliably.


How Sonnet 5 Compares to What Came Before

Against Sonnet 4.6

The comparison to Sonnet 4.6 is a clean story. Sonnet 5 is a strict improvement across every dimension that matters for agentic work: reasoning, tool use, coding, and knowledge work all improved substantially. On cost-performance curves using the BrowseComp agentic search evaluation and the OSWorld-Verified computer use evaluation, Sonnet 5 is a strict improvement — meaning at every effort level tested, Sonnet 5 delivers better results than Sonnet 4.6 for the same cost.

Early access partner feedback confirmed this in practical terms. Testers consistently reported that Sonnet 5 finishes complex tasks where previous Sonnet models would stop short — a pattern that maps directly to the difference between a model that can plan and execute multi-step work versus one that completes a step and then waits for direction. Partners also noted that Sonnet 5 checks its own output without being explicitly asked to — a behavior associated with more capable models that Sonnet-class users had not previously seen at this price tier.

Against Opus 4.8

The comparison to Opus 4.8 is more nuanced and more interesting. Sonnet 5 covers a wider range of cost-performance options than Opus 4.8 — not because Opus 4.8 is worse in absolute terms, but because the cost-performance curve works differently. Opus 4.8 sits at a fixed point on that curve: one capability level, one price. Sonnet 5 moves along a curve, and at medium effort levels it delivers substantially better cost efficiency than Opus 4.8. At higher effort levels, Sonnet 5 can match Opus 4.8 on some tasks.

The practical implication is that teams currently running Opus 4.8 for mid-tier agentic work should evaluate whether Sonnet 5 at medium-to-high effort delivers equivalent results at lower cost. For some workloads it will. For the hardest tasks, Opus 4.8 remains the stronger option — but the gap that previously made Sonnet an automatic choice for everything cost-sensitive and Opus an automatic choice for everything capability-critical has narrowed considerably.


Pricing: What the Numbers Actually Mean

The introductory pricing deserves its own section because it significantly changes the value equation for developers evaluating Sonnet 5 right now.

Through August 31, 2026, Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens. Starting September 1, it moves to standard pricing at $3 per million input tokens and $15 per million output tokens.

For context: the cost-performance charts in Anthropic's announcement are drawn at the $3/$15 standard pricing. The actual introductory rate makes Sonnet 5 even cheaper than those charts show. And Opus 4.8 costs $5 per million input tokens and $25 per million output tokens — meaning Sonnet 5 at introductory pricing is less than half the cost of Opus 4.8, and even at standard pricing it is 40% cheaper on input and 40% cheaper on output.

For developers building applications where agentic tasks run at volume, this pricing gap compounds quickly. A workload that generates 100 million input tokens per month costs $200 with Sonnet 5 at introductory pricing, $300 at standard pricing, and $500 with Opus 4.8. The right comparison is always capability per dollar for the specific task — but the gap is large enough that it will push a meaningful number of use cases from Opus to Sonnet.


Safety: More Capable, Also Safer Than Its Predecessor

Anthropic's safety evaluations for Sonnet 5 cover two distinct dimensions that are worth separating: overall behavioral safety and cybersecurity-specific capability. They tell different but related stories.

Overall Behavioral Safety

Across Anthropic's automated behavioral audit — which tests a wide range of misaligned behaviors including cooperation with misuse and deception — Sonnet 5 scored lower overall than Sonnet 4.6. Lower is better on this measure: it means fewer instances of undesirable behavior across a broad range of test scenarios.

Specific improvements include better performance at refusing malicious requests, stronger resistance to prompt injection attacks (a particular concern for agentic models that process external content as part of their tasks), lower hallucination rates, and lower sycophancy rates. Sonnet 5 is also described as generally safer to use in agentic contexts than its predecessor — an important qualification given that agentic use involves more autonomous action with less human oversight at each step.

The positioning relative to other Claude models is honest and consistent with how Anthropic has reported on previous releases:

Model Misaligned Behavior Rate (lower is safer)
Claude Mythos Preview Lowest
Claude Opus 4.8 Second lowest
Claude Sonnet 5 Lower than Sonnet 4.6
Claude Sonnet 4.6 Highest among these four

Sonnet 5 sits where you would expect a model at this capability and price tier to sit — safer than the previous Sonnet, not as safe as the Opus or Mythos-class models that have been more intensively aligned.

Cybersecurity Capability — What Sonnet 5 Can and Cannot Do

This section deserves careful reading because the nuances matter for understanding why Anthropic shipped cyber safeguards with Sonnet 5 despite the model being genuinely weak at offensive cyber tasks.

Anthropic states plainly: Sonnet 5 was not deliberately trained on cybersecurity tasks. It can handle some routine, non-harmful cyber work — the kind of general coding and systems knowledge that appears across thousands of legitimate developer tasks. But on evaluations testing genuinely dangerous capabilities, it performs substantially worse than Opus 4.8 and Mythos 5.

The Firefox 147 exploit evaluation is the most concrete data point. Developed in collaboration with Mozilla — with all vulnerabilities now patched in Firefox 148 — the evaluation tested whether models could develop working exploits for real browser vulnerabilities.

Model Full Working Exploit Partial Success
Sonnet 5 0.0% Slightly higher than Sonnet 4.6
Sonnet 4.6 0.0% Baseline
Opus 4.8 Higher (not disclosed) Higher
Mythos 5 Highest (restricted model) Highest

Neither Sonnet model produced a single working exploit. Sonnet 5's slightly elevated partial success rate is attributed to improvements in general intelligence rather than any specific cybersecurity training — the model is better at reasoning generally, and some of that shows up in partial progress on complex technical tasks even without targeted training.

Despite the low absolute risk level, Anthropic launched Sonnet 5 with cyber safeguards enabled by default. These are the same safeguards present in Claude Opus 4.7 and 4.8 — real-time detection and blocking of dangerous cyber usage. They are less strict than the safeguards on Fable 5, which block a much wider range of cybersecurity queries. The reasoning: Sonnet 5's overall cybersecurity risk is judged low, so the proportionate safeguard is the Opus tier level rather than the Fable 5 tier level. The safeguards exist because the capability improved from Sonnet 4.6, not because the risk became serious.


What "Agentic" Actually Means for Sonnet 5 in Practice

The word agentic gets used a lot in AI without always being pinned to specific behavior. For Sonnet 5, the practical meaning comes from three things that early access partners consistently described.

It keeps going. Where Sonnet 4.6 would complete a step and pause — functionally waiting for human direction even when the next step was obvious — Sonnet 5 plans out the sequence, executes each stage, and continues without being prompted through each transition. This is the behavioral difference between a tool you operate and an assistant you delegate to.

It checks itself. Sonnet 5 reviews its own outputs without being asked to do so. On a coding task, this means running the code and checking whether it produces the expected result before reporting back. On a research task, this means verifying that the information assembled is internally consistent. This self-verification behavior was previously characteristic of Opus-class models.

It works at Sonnet prices. The previous versions of these behaviors existed in Claude's lineup — they just cost more to access reliably. Sonnet 5 makes the combination of autonomous execution, self-verification, and multi-step planning available at a price point where high-volume agentic applications become economically viable.


Rate Limits Increased Alongside the Model Launch

One quiet but practically important change accompanies the Sonnet 5 release: Anthropic has increased rate limits across Chat, Cowork, Claude Code, and the Claude Platform. The reason is specific — higher effort levels in agentic work consume more tokens than standard interactions, and the previous rate limits were calibrated for lighter usage patterns. The increase ensures that developers and users who push Sonnet 5 toward its higher effort settings do not run into artificial ceilings that constrain exactly the use cases the model was built for.


A Note on the BrowseComp Chart Correction

Anthropic disclosed a correction to this announcement that is worth noting directly. The original version of the post included a BrowseComp cost-performance chart based on a simpler methodology that did not match the standard methodology Anthropic uses for agentic search evaluations. The effect of the simpler methodology was to underestimate Sonnet 5's performance on the evaluation — meaning the corrected chart shows Sonnet 5 performing better than the original chart suggested.

The corrected chart uses the methodology described in the Sonnet 5 System Card: a 10 million token budget with compaction and programmatic tool calling. This transparency about a correction that favored the model is worth noting — it reflects the kind of self-correction that the safety section describes Sonnet 5 applying to its own outputs.


Who Should Upgrade — and When

Free and Pro users: The decision is already made. Sonnet 5 is the new default model on both plans as of today. No action required.

Developers on the API currently using Sonnet 4.6: The introductory pricing window through August 31 makes this a low-risk time to evaluate. The performance improvements on reasoning, tool use, and coding are consistent across benchmarks and partner feedback. If agentic tasks are a meaningful part of your workload, the upgrade is worth testing now before the $2/$10 introductory rate expires.

Teams currently paying Opus 4.8 prices for mid-tier agentic work: Evaluate Sonnet 5 at medium-to-high effort for your specific tasks. The cost-performance curves suggest that a meaningful portion of Opus-tier workloads will find Sonnet 5 sufficient at roughly 40-60% of the cost. The tasks that still require Opus 4.8's ceiling will become clearer through direct comparison than through benchmarks.

Enterprise users considering agentic deployments at scale: The rate limit increases and the introductory pricing window create a practical window to evaluate Sonnet 5 at production-level volume before committing to longer-term pricing assumptions.


Final Takeaway

Claude Sonnet 5 is what the Sonnet tier has been working toward since the 3.x generation made agentic capability feel real for the first time. The capability that previously required Opus prices is now available at Sonnet prices — and for the next two months, at introductory Sonnet prices that make it less than half the cost of Opus 4.8.

The safety improvements that accompany the capability upgrade are meaningful in their own right: fewer hallucinations, stronger prompt injection resistance, and lower misaligned behavior rates than Sonnet 4.6, all in a model designed specifically for the agentic contexts where safety failures have the most impact.

For developers, it is now the default Sonnet. For users on Free and Pro, it is already the model they are using. For teams running high-volume agentic workloads, the window between now and September 1 is the most economical time to evaluate what Sonnet 5 can do for the work they are already running.


Original Source

This analysis is based on reporting from Anthropic.

View on Anthropic
Share:

📌 Related Posts

What do you think?
+1
Share:

Comments

Leave a comment

0/2000