Token
The unit of text a language model reads and writes. Roughly 4 characters of English text per token. Models are priced per million tokens.
A token is the unit of text a language model processes. It’s neither a word nor a character, it’s a chunk somewhere in between, defined by the model’s tokenizer.
For English text, a rough rule of thumb: 1 token ≈ 4 characters ≈ 0.75 words. So a 1,000-word document is roughly 1,300-1,400 tokens. A short email might be 200 tokens. A novel is 100,000+.
Numbers, code, and non-English text tokenise differently, sometimes more efficiently, sometimes less.
Why tokens matter
You’re billed per token. As of May 2026, Claude pricing in USD:
| Model | Input ($/1M) | Output ($/1M) |
|---|---|---|
| Opus 4.7 | $15 | $75 |
| Sonnet 4.6 | $3 | $15 |
| Haiku 4.5 | $1 | $5 |
In AUD (1 USD ≈ 1.55 AUD), Sonnet costs roughly $4.65 / $23.25 per million input/output tokens.
A typical hour-long coding session with prompt caching might use:
- 80k input tokens × $3/M (USD) = $0.24 input cost
- 12k output tokens × $15/M (USD) = $0.18 output cost
- Total: ~$0.42 USD or ~$0.65 AUD
Multiply across sessions to estimate monthly spend. A solo developer running 4-5 hours/day on Sonnet ends up at $40-100 AUD/month.
Output tokens cost 5x input
This is the single most important pricing fact. Asking for longer outputs (verbose explanations, full files rewritten when you only need a diff) is 5x more expensive than the input tokens you sent. Concise output is good economics as well as good UX.
Counting tokens
- Anthropic’s tokenizer: available via the API (
count_tokensendpoint) or in their Python/TS SDKs. - Rule of thumb: divide character count by 4.
- For code: similar to English, slightly fewer characters per token because of operators and indentation.
Related terms
Want this built for your business?
Book a free 30-minute AI audit. We'll map your business and show you exactly which systems we'd build first. No pitch deck, no scoping fee.
Book my free AI audit