Claude Code cost optimisation: how to cut your AUD bill 60% without losing quality
Five concrete tactics that took our Claude Code spend from $280 AUD/month to $110 AUD/month across our businesses, without sacrificing output quality.
Five levers, in priority order: (1) enable prompt caching aggressively, (2) default to Sonnet 4.6, (3) keep your CLAUDE.md tight, (4) use /compact and /clear at task boundaries, (5) audit your top spend sessions monthly. Doing all five took our monthly bill from $280 AUD to $110 AUD with no quality drop.
We run Claude Code daily across DotVA and the Lead Gen Empire network. Before we got serious about cost, our combined bill was roughly $280 AUD/month. After applying the five tactics below, it sits at $110 AUD/month with the same or better output. Here’s what works.
1. Prompt caching: the 60% lever
Prompt caching lets Claude reuse the system prompt + early-conversation content from prior turns at 10% of the normal input rate. On long multi-turn sessions, this is the single biggest cost lever.
How to actually benefit:
- Keep early-conversation content stable. Don’t restructure your CLAUDE.md mid-session. Don’t add and remove tools constantly.
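- Use the Claude Code default cache. Claude Code turns prompt caching on by default, so you don’t need to opt in. If you call messages.create or messages.stream yourself through the API, caching is something you request explicitly with cache_control.
- Avoid /clear mid-session if you can /compact instead. Compact preserves cacheable content; clear nukes it.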
A real example: our typical Lead Gen Empire content session has ~80k input tokens of CLAUDE.md, tool context and example articles. At the Sonnet 4.6 rate of $4.65 AUD per 1M input tokens, the first turn costs ~$0.37 AUD in input. Subsequent turns with a cache hit cost ~$0.037 AUD in input. That is 10x cheaper from turn 2 onwards.
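The arithmetic behind that example can be sketched in a few lines. This is a simplified model, not a billing calculator: it assumes the Sonnet 4.6 input rate from the pricing table in section 2, treats cached input as 10% of the normal rate, and ignores any cache-write surcharge and the new tokens each turn adds. The function names are ours, for illustration only.

```python
# Simplified cost model for a stable context that is re-sent every turn.
# Assumptions: rate taken from this article's pricing table; cached input
# billed at 10% of the normal rate; cache-write surcharges are ignored.
INPUT_RATE_AUD_PER_MTOK = 4.65   # Sonnet 4.6 input, AUD per 1M tokens
CACHE_READ_MULTIPLIER = 0.10     # cached input costs 10% of normal input


def input_cost_aud(tokens: int, cached: bool = False,
                   rate: float = INPUT_RATE_AUD_PER_MTOK) -> float:
    """Input cost in AUD for one turn that sends `tokens` of context."""
    multiplier = CACHE_READ_MULTIPLIER if cached else 1.0
    return tokens / 1_000_000 * rate * multiplier


def session_input_cost_aud(context_tokens: int, turns: int,
                           use_cache: bool) -> float:
    """Total input cost when the same context is sent on every turn."""
    if turns <= 0:
        return 0.0
    first_turn = input_cost_aud(context_tokens)  # first turn is never a cache hit
    later_turns = input_cost_aud(context_tokens, cached=use_cache) * (turns - 1)
    return first_turn + later_turns


if __name__ == "__main__":
    for use_cache in (False, True):
        total = session_input_cost_aud(80_000, turns=20, use_cache=use_cache)
        print(f"20 turns, cache={use_cache}: ${total:.2f} AUD input")
```

The exact dollar figures depend on the rate and FX you plug in; the 10x gap between cached and uncached turns does not.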
2. Right-size your model
Sonnet 4.6 vs Opus 4.7 in AUD (approx, FX 1 USD = 1.55 AUD):
| Rate (AUD per 1M tokens) | Sonnet 4.6 input | Sonnet 4.6 output | Opus 4.7 input | Opus 4.7 output |
|---|---|---|---|---|
| Standard | $4.65 | $23.25 | $23.25 | $116.25 |
| With prompt caching (cached input) | $0.47 | $23.25 | $2.33 | $116.25 |
Opus costs 5x as much as Sonnet on both input and output. For routine coding and Australian SMB workflows, that 5x doesn’t buy 5x more value.
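To see what the table means for a single task, here is a minimal sketch that converts USD rates to AUD and prices one job on each model. The USD figures are back-derived from the AUD table above at FX 1.55, which is our assumption, and the dictionary keys are labels for this example rather than real API model IDs.

```python
# Sketch: price one task on each model in AUD.
# Assumptions: USD rates are back-derived from the article's AUD table
# (AUD / 1.55); model keys are illustrative labels, not API model IDs.
FX_USD_TO_AUD = 1.55
CACHE_READ_MULTIPLIER = 0.10  # cached input billed at 10% of normal input

USD_PER_MTOK = {
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "opus-4.7": {"input": 15.00, "output": 75.00},
}


def aud_rate(model: str, kind: str, cached: bool = False) -> float:
    """AUD per 1M tokens for `kind` ('input' or 'output') on `model`."""
    rate = USD_PER_MTOK[model][kind] * FX_USD_TO_AUD
    if cached and kind == "input":  # caching only discounts input tokens
        rate *= CACHE_READ_MULTIPLIER
    return rate


def task_cost_aud(model: str, input_tokens: int, output_tokens: int,
                  cached_input: bool = False) -> float:
    """Total AUD cost of one task with the given token counts."""
    input_cost = input_tokens / 1_000_000 * aud_rate(model, "input", cached_input)
    output_cost = output_tokens / 1_000_000 * aud_rate(model, "output")
    return input_cost + output_cost


if __name__ == "__main__":
    for model in USD_PER_MTOK:
        cost = task_cost_aud(model, input_tokens=200_000, output_tokens=20_000)
        print(f"{model}: ${cost:.2f} AUD")
```

Because both rates scale by the same factor, the same task costs 5x as much on Opus whatever the input/output mix.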
Default to Sonnet. Switch with /model opus only when:
- You’re planning a multi-file refactor
- You’re debugging something subtle where you’ve already tried Sonnet
- You’re doing open-ended research where reasoning quality matters
Use /model haiku for batch text work, simple edits and throwaway tasks.
3. Keep your CLAUDE.md under 4k tokens
Every turn, your CLAUDE.md re-enters the context (or comes off the cache, at 10%). A 30k-token CLAUDE.md is a 30k-token tax on every session.
Audit it. We had one CLAUDE.md at the Lead Gen Empire repo that had grown to 28k tokens over six months: every past decision, every recipe, every gotcha had ended up there. We trimmed it back to 3.8k tokens, with the long detail moved into docs/ and referenced on demand. Session cost dropped ~30%.
Test: open your CLAUDE.md in any token counter. If it’s over 4k tokens, you’re paying for context you probably aren’t using every session.
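If you don't have a token counter handy, a rough check is easy to script. The sketch below assumes about 4 characters per token, which is a common rule of thumb for English text and not a real tokenizer, so treat the result as a ballpark. The function names and the default path are ours.

```python
# Rough CLAUDE.md size check.
# Assumption: ~4 characters per token (a rule of thumb, not a tokenizer),
# so the count is an estimate only.
from pathlib import Path

CHARS_PER_TOKEN = 4
BUDGET_TOKENS = 4_000  # the 4k target from this article


def estimate_tokens(text: str) -> int:
    """Estimate token count by ceiling-dividing characters by 4."""
    return -(-len(text) // CHARS_PER_TOKEN)


def budget_status(text: str, budget: int = BUDGET_TOKENS) -> str:
    """Return a one-line verdict for the given file contents."""
    tokens = estimate_tokens(text)
    verdict = "OK" if tokens <= budget else "OVER BUDGET"
    return f"~{tokens} tokens ({verdict}, budget {budget})"


def check_file(path: str = "CLAUDE.md", budget: int = BUDGET_TOKENS) -> str:
    """Read a file from disk and report its estimated size against the budget."""
    text = Path(path).read_text(encoding="utf-8")
    return f"{path}: {budget_status(text, budget)}"

# Usage, run from the repo root:
#   print(check_file("CLAUDE.md"))
```

For a file close to the limit, confirm with a proper token counter before you start cutting.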
4. Use /compact and /clear strategically
- /compact when you’ve finished a task but want to keep the high-level context for the next task. Summarises everything to date into a much shorter version.
- /clear when you’re switching to a genuinely unrelated task. Starts fresh, loses the cache.
Don’t be precious about context. The “I might need that earlier conversation” instinct is almost always wrong. If you do need it, you can re-derive it in 10 seconds with a fresh prompt. The accumulated context tax is much higher than the cost of one re-read.
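The "context tax" is easier to see with numbers. The sketch below is a toy model under our own assumptions: every turn re-sends a fixed base context plus all history so far, history grows by a fixed amount each turn, and compacting shrinks history to a fixed summary size. It counts tokens rather than dollars and ignores caching.

```python
# Toy model of accumulated context.
# Assumptions: each turn re-sends base context + full history; history grows
# by a fixed number of tokens per turn; compacting caps history at a fixed
# summary size. Counts tokens only and ignores caching.
from typing import Optional


def cumulative_input_tokens(turns: int, base_tokens: int,
                            tokens_added_per_turn: int,
                            compact_every: Optional[int] = None,
                            summary_tokens: int = 0) -> int:
    """Total input tokens sent across a session of `turns` turns."""
    history = 0
    total = 0
    for turn in range(1, turns + 1):
        total += base_tokens + history       # this turn re-sends everything so far
        history += tokens_added_per_turn     # the turn's own content joins history
        if compact_every and turn % compact_every == 0:
            history = min(history, summary_tokens)  # compact down to a summary
    return total


if __name__ == "__main__":
    plain = cumulative_input_tokens(40, 4_000, 3_000)
    compacted = cumulative_input_tokens(40, 4_000, 3_000,
                                        compact_every=10, summary_tokens=2_000)
    print(f"no compact: {plain:,} tokens; compact every 10 turns: {compacted:,}")
```

Without compacting, the total grows roughly with the square of the number of turns, which is why long uninterrupted sessions dominate the bill.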
5. Audit your top 5 sessions monthly
In Anthropic Console → Usage, you can see cost per session. Once a month, look at the top 5 most expensive sessions. Patterns emerge fast:
- “I let Claude run overnight without /compact” → put a /compact in your morning checklist
- “I switched to Opus for a simple task and forgot to switch back” → a hook that warns when you’ve been on Opus >30 min
- “Same CLAUDE.md tax every session for this project” → trim that file
Our Lead Gen Empire September audit revealed a single workflow (the daily SEO review) was costing $40 AUD/month because we were running it on Opus when Sonnet was fine. Switched. $32 AUD/month saved with zero quality drop.
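The monthly audit itself is just a sort and a group-by. Here is a minimal sketch, assuming you have copied per-session figures into records by hand or from an export. The `Session` shape is hypothetical and does not reflect any real Console export format, and every sample value except the $40 SEO review from this article is a placeholder.

```python
# Sketch of the monthly top-5 audit.
# Assumptions: the Session record shape is hypothetical (not a real export
# format); sample costs other than the $40 SEO review are placeholders.
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Session:
    name: str
    model: str
    cost_aud: float


def top_sessions(sessions: list, n: int = 5) -> list:
    """Return the n most expensive sessions, highest cost first."""
    return sorted(sessions, key=lambda s: s.cost_aud, reverse=True)[:n]


def spend_by_model(sessions: list) -> dict:
    """Total AUD spend per model, to spot work that drifted onto Opus."""
    totals = defaultdict(float)
    for s in sessions:
        totals[s.model] += s.cost_aud
    return dict(totals)


if __name__ == "__main__":
    month = [
        Session("daily SEO review", "opus", 40.0),    # figure from this article
        Session("content batch", "sonnet", 12.0),     # placeholder
        Session("file renames", "haiku", 1.5),        # placeholder
    ]
    for s in top_sessions(month):
        print(f"{s.name:<20} {s.model:<8} ${s.cost_aud:.2f}")
    print(spend_by_model(month))
```

Anything near the top of the list that runs on Opus is the first thing to question.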
Tactic 6 (bonus): batch your throwaway work
If you’ve got a bunch of small jobs (rename these 50 files, summarise these 30 PDFs), run them all in one Claude Code session rather than spinning up a new session for each. The cache amortises across all the work; cost-per-task drops dramatically.
What you should not do to save money
- Don’t skip CLAUDE.md. Saves a few cents per session, costs you hours in repeated explanation. Bad trade.
- Don’t use Haiku for real work. It’s tempting but the quality drop is real. Use Haiku for genuine batch-text work or as a research subagent, not for primary development.
- Don’t disable tools you actually use. Slightly fewer tool definitions = slightly cheaper context. The savings are tiny and the friction is high.
- Don’t constantly switch projects. Cache lives per session. Project-hopping kills the cache.
What a “tight” Claude Code budget looks like
| Use case | Realistic monthly AUD |
|---|---|
| Solo dev, 1-2 hrs/day on Claude Code, Sonnet + cache | $40 - $80 |
| Solo operator, non-dev, weekly business automation | $15 - $40 |
| Active developer, 4-6 hrs/day, heavy MCP usage | $120 - $200 |
| Team of 4 sharing an account (don’t do this) | Pain and confusion |
Don’t share Anthropic Console accounts across team members; you’ll never untangle who’s spending what. Each team member gets their own seat.
What we did at Boring Ventures
Bill timeline across three businesses combined:
- February 2026: $280 AUD/month
- March: $190 (added prompt caching enforcement to our CLAUDE.md template, switched default model)
- April: $135 (cleaned up the worst CLAUDE.md offenders, added /compact discipline)
- May: $110 (audited the top-5 sessions, killed one Opus-only workflow that didn’t need it)
Same output, more sessions, ~60% cheaper. The discipline is more valuable than any one trick.