Glossary

Token

The unit of text a language model reads and writes. Roughly 4 characters of English text per token. Models are priced per million tokens.

A token is the unit of text a language model processes. It’s neither a word nor a character, it’s a chunk somewhere in between, defined by the model’s tokenizer.

For English text, a rough rule of thumb: 1 token ≈ 4 characters ≈ 0.75 words. So a 1,000-word document is roughly 1,300-1,400 tokens. A short email might be 200 tokens. A novel is 100,000+.

Numbers, code, and non-English text tokenise differently, sometimes more efficiently, sometimes less.

Why tokens matter

You’re billed per token. As of May 2026, Claude pricing in USD:

ModelInput ($/1M)Output ($/1M)
Opus 4.7$15$75
Sonnet 4.6$3$15
Haiku 4.5$1$5

In AUD (1 USD ≈ 1.55 AUD), Sonnet costs roughly $4.65 / $23.25 per million input/output tokens.

A typical hour-long coding session with prompt caching might use:

  • 80k input tokens × $3/M (USD) = $0.24 input cost
  • 12k output tokens × $15/M (USD) = $0.18 output cost
  • Total: ~$0.42 USD or ~$0.65 AUD

Multiply across sessions to estimate monthly spend. A solo developer running 4-5 hours/day on Sonnet ends up at $40-100 AUD/month.

Output tokens cost 5x input

This is the single most important pricing fact. Asking for longer outputs (verbose explanations, full files rewritten when you only need a diff) is 5x more expensive than the input tokens you sent. Concise output is good economics as well as good UX.

Counting tokens

  • Anthropic’s tokenizer: available via the API (count_tokens endpoint) or in their Python/TS SDKs.
  • Rule of thumb: divide character count by 4.
  • For code: similar to English, slightly fewer characters per token because of operators and indentation.
Related terms

Want this built for your business?

Book a free 30-minute AI audit. We'll map your business and show you exactly which systems we'd build first. No pitch deck, no scoping fee.

Book my free AI audit