Claude Opus 4.7: What Adaptive Thinking Changes About AI Product Design

WHAT CHANGED FROM OPUS 4.6 TO 4.7

The clearest break between Opus 4.6 and 4.7 is what was removed. The budget_tokens parameter — which let developers specify exactly how many tokens the model could spend on internal reasoning — is gone in 4.7. Passing it now returns a 400 error. The temperature, top_p, and top_k sampling parameters are also removed. The model's internal reasoning process is no longer exposed to the same set of dials.

What replaced it is adaptive thinking: thinking: {type: "adaptive"}. The model decides autonomously how much to think based on the complexity of the prompt. A simple extraction task gets a brief reasoning pass. A complex multi-step code migration gets an extended one. The allocation is invisible to the caller and driven by the model's own assessment of what the task requires.

This might sound like losing control, but in practice it trades a false precision for a real one. Developers who set budget_tokens: 8192 were guessing at what their task needed. The model was constrained to fit its reasoning into that budget whether or not it was appropriate. Adaptive thinking removes that friction — and the benchmarks show it: Opus 4.7 outperforms Opus 4.6 with any fixed budget on reasoning-intensive tasks, because the model can now use exactly as much thought as the task merits.

THE EFFORT PARAMETER IS YOUR NEW CONTROL SURFACE

The replacement for budget_tokens is the effort parameter inside output_config. Where budget_tokens was a token ceiling, effort is a quality-cost dial with five levels on Opus 4.7: "low", "medium", "high", "xhigh", and "max". The default is "high". The "xhigh" tier is new to Opus 4.7 and is the recommended setting for most coding and agentic use cases — it sits between high and max in cost and quality.

The practical decision framework: use "low" for subagent tasks, classification, and short-form generation where speed matters more than depth. Use "high" as the default for most intelligence-sensitive work — it balances quality and token efficiency well. Use "xhigh" when you're doing anything involving code generation, multi-step reasoning, or debugging where getting it right matters. Reserve "max" for the cases where correctness is genuinely paramount and cost is secondary: security review, complex data migrations, or financial analysis where an error has real consequences.

One important implementation note: on Opus 4.7, the effort parameter influences model behaviour more dramatically than on any prior Opus. Teams migrating from Opus 4.6 with budget_tokens should expect to re-tune their effort settings rather than assuming a direct mapping. The same task that needed a high budget on 4.6 may need only "high" effort on 4.7, or may benefit from "xhigh" where 4.6 would have plateaued regardless of tokens spent.

THINKING CONTENT IS NOW HIDDEN BY DEFAULT

One behavioural change that catches teams off guard: in Opus 4.7, the model still generates thinking blocks when reasoning — they still stream — but their text content is empty by default. To see the reasoning, you must explicitly opt in with thinking: {type: "adaptive", display: "summarized"}. The default is "omitted".

For applications that stream reasoning to users as a progress signal — "Claude is thinking about your request..." — this means the experience breaks silently. The model appears to pause with no visible activity, then output begins. Setting display: "summarized" restores visible progress by sending a condensed summary of the reasoning rather than the full chain of thought.

The broader implication for product design: if you were using thinking block content as an explanation or audit trail for the model's decisions, you need to explicitly request it. The default is to hide the reasoning, which makes responses faster to process but removes a transparency layer that some applications genuinely need. Know what your application does with thinking content before migrating.

TASK BUDGETS: THE NEW AGENTIC PRIMITIVE

Opus 4.7 introduced task budgets in beta: a way to tell the model how many tokens it has for an entire agentic loop, not just a single response. Set via output_config: {task_budget: {type: "tokens", total: N}} (minimum 20,000 tokens, requires beta header task-budgets-2026-03-13), the model receives a running countdown of remaining budget and self-regulates its behaviour accordingly.

This is qualitatively different from max_tokens, which is a hard ceiling the model cannot see. Task budgets are advisory — the model knows it has 50,000 tokens left for the whole job and adjusts how deeply it reasons, how many tool calls it makes, and how thoroughly it verifies its own output. As the budget decreases, the model naturally becomes more decisive and less exploratory.

For agentic applications with long-running tasks — code generation spanning multiple files, research agents that browse and synthesize, multi-step data transformation pipelines — task budgets give you a principled way to bound total spend without interrupting the agent mid-task. The model handles the allocation problem internally rather than requiring you to tune per-step limits and hope they add up correctly.

WHAT THIS MEANS FOR HOW YOU BUILD

The Opus 4.7 changes push AI application design toward higher-level abstractions and away from token-level tuning. You're no longer asked to know how many tokens a specific reasoning step requires. You're asked to know how much correctness your application needs relative to its cost tolerance — and to express that through effort levels and task budgets rather than raw token counts.

Teams that invested heavily in prompt engineering to fit reasoning into specific token windows need to revisit those prompts on 4.7. The constraints they were engineering around no longer exist, and the prompts that worked with forced token limits often over-specify instructions that the model's adaptive reasoning would handle naturally given appropriate effort. The migration is not a one-to-one port — it's an opportunity to simplify.

The result for well-tuned applications is better quality at similar or lower cost than comparable Opus 4.6 configurations. The model spends compute where it matters and skips internal deliberation where it doesn't. That's the bet Anthropic made by removing the manual dials — and in practice across the applications we've migrated, it's paying off.