Reasoning model, AI glossary

A reasoning model is a language model that works through a problem before it answers: it first generates intermediate “thinking” (or “reasoning”) tokens, and the final response builds on them. Examples include OpenAI’s o-series (o1, o3), Anthropic’s Claude with extended thinking, and Google’s Gemini thinking models.

The trade-off: reasoning models are slower and more expensive per response than standard models, but they’re meaningfully better on hard problems (multi-step math, complex code refactors, intricate planning).

When to use a reasoning model

Complex coding tasks with multi-file reasoning
Math, logic, planning problems
Research that requires weighing multiple competing factors
Anything where a wrong answer is more expensive than a slow answer

When NOT to use a reasoning model

Routine drafting, summarising, formatting (overkill)
Anything user-facing where latency matters (reasoning adds seconds to minutes per response)
Cost-sensitive workflows (reasoning tokens are billed as output and can multiply your output cost 5-10x)
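
To make the cost point concrete, here is a minimal sketch of the arithmetic. It assumes thinking tokens are billed at the same rate as visible output tokens. The price and token counts are made-up illustration values, not any provider's real figures, and `output_cost` is a hypothetical helper.

```python
def output_cost(visible_tokens: int, thinking_tokens: int, price_per_million: float) -> float:
    """Dollar cost of one response's output, counting thinking tokens as billed output."""
    billed_tokens = visible_tokens + thinking_tokens
    return billed_tokens * price_per_million / 1_000_000


# Hypothetical price: dollars per million output tokens (not a real price list).
PRICE_PER_MILLION = 15.0

# Same 800-token visible answer, with and without 5,600 thinking tokens (made-up counts).
standard = output_cost(visible_tokens=800, thinking_tokens=0, price_per_million=PRICE_PER_MILLION)
reasoning = output_cost(visible_tokens=800, thinking_tokens=5_600, price_per_million=PRICE_PER_MILLION)
multiplier = reasoning / standard

print(f"standard: ${standard:.3f}  reasoning: ${reasoning:.3f}  multiplier: {multiplier:.1f}x")
```

With these numbers the visible answer is identical, but the reasoning response costs about eight times as much, because the thinking tokens dominate the bill.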

How it shows up in Claude Code

Claude Code offers “extended thinking” on models that support it. You can turn it on explicitly via the model picker, or let the harness decide based on the task. Thinking tokens are billed but not displayed by default.

For most Australian SMB workflows we run, standard Sonnet 4.6 without extended thinking is the right default. Switch to thinking-enabled Opus for multi-file refactors or genuine reasoning challenges.
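
The default described above can be written down as a small routing rule. This is only a sketch of the heuristic, not how Claude Code itself decides: `choose_model`, its parameters, and the "more than one file" threshold are hypothetical, and the model labels are plain strings rather than real API model IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    model: str      # illustrative label only, not a real API model ID
    thinking: bool  # whether extended thinking is switched on


DEFAULT = ModelChoice(model="sonnet", thinking=False)
HEAVY = ModelChoice(model="opus", thinking=True)


def choose_model(files_touched: int, needs_multistep_reasoning: bool,
                 latency_sensitive: bool) -> ModelChoice:
    """Pick the cheap default unless the task clearly calls for reasoning."""
    # Latency-sensitive work stays on the fast default, whatever the task looks like.
    if latency_sensitive:
        return DEFAULT
    # Multi-file refactors and genuine reasoning challenges get thinking enabled.
    if files_touched > 1 or needs_multistep_reasoning:
        return HEAVY
    return DEFAULT


print(choose_model(files_touched=1, needs_multistep_reasoning=False, latency_sensitive=False))
print(choose_model(files_touched=6, needs_multistep_reasoning=False, latency_sensitive=False))
```

Writing the rule as an explicit function makes the default visible and easy to review, rather than leaving it to whoever last changed the setting.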

Watch for: extended thinking blowing your budget

A reasoning-enabled session can use around 10x the output tokens of a standard one. Set token caps, and audit your top 5 sessions by token spend each month.
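
A monthly audit along these lines can be sketched in a few lines of Python. It assumes you can export per-session output token counts from your own usage logs. The `Session` record, the `audit` helper, the cap value, and the sample data are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Session(NamedTuple):
    session_id: str
    output_tokens: int  # visible output plus thinking tokens, as billed


def audit(sessions: List[Session], cap: int, top_n: int = 5) -> Tuple[List[Session], List[Session]]:
    """Return the top_n sessions by output tokens, and every session over the cap."""
    ranked = sorted(sessions, key=lambda s: s.output_tokens, reverse=True)
    top = ranked[:top_n]
    over_cap = [s for s in ranked if s.output_tokens > cap]
    return top, over_cap


# Made-up usage data for one month.
sessions = [
    Session("s1", 1_200), Session("s2", 48_000), Session("s3", 3_100),
    Session("s4", 22_500), Session("s5", 900), Session("s6", 15_000),
    Session("s7", 2_700),
]

top, over_cap = audit(sessions, cap=20_000)
for s in top:
    flag = "OVER CAP" if s in over_cap else ""
    print(f"{s.session_id}: {s.output_tokens} {flag}")
```

Ranking by output tokens rather than by session count surfaces the few thinking-heavy sessions that usually account for most of the spend.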

Related terms

Context window

Large Language Model

Token