Anthropic's new default Claude can plan, browse, and run tools on its own

Anthropic's new default Claude can plan, browse, and run tools on its own — type0 | type0

PREVIEWAnthropic's new default Claude can plan, browse, and run tools on its own · MD

Anthropic shipped Claude Sonnet 5 on Tuesday, and for most readers the change is the one they will feel first: the model that now answers by default in the Free and Pro Claude apps is a meaningfully more autonomous system than the Sonnet 4.6 it replaces. Sonnet 5 plans multi-step tasks, drives a browser, runs shell commands in a terminal, and sustains longer workflows without a human steering it, all behavior that until recently lived behind the paywall of Anthropic's larger, more expensive Opus tier.

That is the structural shift under the launch announcement. Sonnet is the middle band in Anthropic's three-tier lineup (smaller Haiku at the bottom, larger Opus at the top, Sonnet in between), and it is the model most developers and most product teams actually pay for. When the middle band gains genuine planning and tool-use capability, it does not just get incrementally better. It absorbs the workload that previously forced an upgrade to Opus, and it pulls that capability down to the free tier at the same time.

Anthropic's framing is that Sonnet 5 is "the most agentic Sonnet model yet," and the press materials translate that into specific capabilities: planning, browser and terminal tool use, and autonomous execution across longer task chains (Anthropic). The company also claims Sonnet 5 performance is close to its top-tier Opus 4.8 on reasoning, tool use, coding, and knowledge work, with substantial improvements over Sonnet 4.6 across the same axes (9to5Mac). Those claims are Anthropic's own, and as of launch day no independent third-party benchmarks, including SWE-bench or the major agentic eval suites, have been published to confirm them. The "close to Opus 4.8" framing should be read as a company benchmark until external measurement arrives.

For developers, the practical change is the price. Anthropic set introductory API pricing at $2 per million input tokens and $10 per million output tokens, running through August 31, 2026, and reverting to $3 per million input and $15 per million output after that (TechCrunch). The intro price is roughly one-third of what Opus 4.8 launched at, and it lands at the same time Anthropic raised rate limits across Chat, Cowork, Claude Code, and the Claude Platform. For teams running agent-style workloads at scale, from code review agents to browser-driven research loops to multi-step coding pipelines, the calculation is now whether the marginal capability gap to Opus is worth paying triple per token, or whether Sonnet 5 plus better scaffolding closes it.

For users, the rollout is more direct. Sonnet 5 is the default model for both the Free and Pro Claude plans starting today. Anyone who logs into Claude.ai or opens the desktop app will be talking to it without flipping a setting, and the difference shows up less in single answers than in how far a session can run before it needs human intervention. Sonnet 4.6 had a habit of stopping to ask after two or three tool calls; Sonnet 5 is positioned to keep going. Whether it does that reliably in real workflows, rather than just in Anthropic's demo tasks, is the question the next week of user reports will answer.

Two adjacent items are worth noting without letting them take over the story. The broader Anthropic lineup has its own constraints: Mythos 5 ships with looser guardrails for select security researchers and platform owners, and Fable 5 was briefly public before being blocked by the U.S. government, with Anthropic working to restore access. Neither item changes the Sonnet 5 launch; both shape how readers should weigh the rest of the catalog.

What to watch next: independent benchmarks on agentic eval suites, especially ones that measure multi-step tool use over long horizons rather than single-shot reasoning; rate-limit behavior on the Claude Platform under realistic agentic load, since the headline number is only as good as the throttle behind it; and confirmation of the post-August 31 standard pricing, since the introductory tier is what makes the launch feel like a price reset and the standard tier is what developers will actually plan budgets around.

Anthropic's new default Claude can plan, browse, and run tools on its own

Sources