AI Strategy VIP 2026-04-10

Don't Do Everything With The Best One

Tasks have weight. Tools have weight. Work happens only when the weights match. Today we unpack this principle through Claude's three models.

Most people who start using AI tools experience the same thing a month in — the bill comes in bigger than expected. Why? The answer is in a surprisingly old principle: you didn't divide the work to match the weight of the tool.

This essay explains that principle from start to finish. If you've never used Claude before, you can still follow along — we'll go slowly. Today's example happens to be Claude's three models, but the principle applies to any AI tool, in any era. Three years from now, even when "Claude" is renamed, the spine of this essay still holds.


Let's start with something most people don't realize.

Claude isn't one thing. It's three.

When you open Claude.ai or Claude Code, you see one window. So "Claude" looks like a single product. But behind the scenes, three different models are running. Let me tell you their names first, so it's less confusing.

  • Opus — Latin for "a great work." Like "Opus No. 9" in music. It earned this name because it's the largest model in the Claude series.
  • Sonnet — the 14-line short poem. The middle size.
  • Haiku — the 3-line Japanese short verse. The smallest model.

The names sound poetic and brandy, but they're actually ordered by size. Opus → Sonnet → Haiku, getting smaller. "Bigger" means it has more internal numbers (parameters). Like more brain synapses in a person. More parameters means better at complex reasoning — but more expensive, and slower.

That leads to the natural question: why make three versions? Why not just use the best one?

The answer is simple. Not every task needs the biggest model.


To really understand this, we need a good analogy. Think of your local medical system.

Say you have a mild cold. Where do you go? Your neighborhood clinic. Not a university hospital professor. You could go to the professor, but it's inefficient. Getting an appointment is hard, waits are long, and the bill is several times higher. If the neighborhood doctor sees something serious, they write a referral to the university hospital. For a possible cancer diagnosis, you go straight to the university hospital. You match the doctor to the weight of the problem.

Claude's three models work exactly this way.

  • Opus = the university hospital professor. Complex judgment, big designs, deep reasoning. Expensive and slow, but handles the hard stuff.
  • Sonnet = the neighborhood clinic doctor. Handles well-defined work cleanly. Most tasks end right here.
  • Haiku = the nurse. Handles repetitive tasks quickly. Intake, classification, cleanup.

Giving a light task to Opus is like sending a cold patient to a university professor. Expensive, and wasteful for the professor too — they should be seeing the truly difficult cases in that time.


Good. Now let's look at "how expensive" with numbers. This part matters.

Claude charges by "tokens." Don't overthink it — a token is roughly a word and a half's worth of characters. A Korean sentence is about 30-50 tokens. This whole essay is about 5,000 tokens.

Pricing as of April 2026, per million tokens:

Model Input Output vs Haiku
Opus $15 $75 18×
Sonnet $3 $15 3.75×
Haiku $0.80 $4 baseline

Opus costs 18 times more than Haiku. Pin that number in your head. Again — 18×.

Why does that matter for the bill? Because most people run everything through Opus by default. Renaming files — Opus. Polishing a three-line email — Opus. A simple summary — Opus. Each is only a few cents, but dozens of calls a day stack up past $200 a month. Going back to the medical analogy: it's sending colds, vaccinations, and routine blood draws all to the university hospital. Of course the health insurance breaks.

Here's the first aha moment.

The bill problem isn't a money problem. It's a distribution problem.

If you avoid Opus to save money, hard tasks fall apart. If you use only Opus, the bill explodes. The answer is in the middle. Match the model to the weight of the work.


So how do you split? It's simpler than it looks. Ask one question:

"Does this task actually need a brain?"

The answer splits into three.

"Yes, a lot" → Opus

Strategy planning. Tasks that weigh many factors at once. Designing the architecture of complex code. Debugging that's stuck for 3 hours. Anything that needs deep, single-pass thinking. Send it to Opus.

"In between" → Sonnet

You know what to do; you just need it done well. Converting a designed spec into code. Writing a report from gathered material. Doing a defined refactor. This is why Claude Code defaults to Sonnet — most work lives here.

"Barely any" → Haiku

Mechanical, repetitive tasks. Lowercasing 100 filenames. Turning a long list into a clean table. Standardizing text formats. Simple classification. Closer to execution than thinking. Haiku does it 18× cheaper.

Here's the one-line to remember: Opus = judgment. Sonnet = making. Haiku = repetition. Three words. That's it.


Let's walk through a real example.

Task: I have a 20-page meeting record and need to extract action items.

Old way — hand the whole thing to Opus.

→ "Extract action items from this 20-page meeting record."

Opus reads all 20 pages, understands them, and produces action items. Quality is fine. Cost: around $3.

New way — two steps.

Step 1. Ask Opus for just the plan.

→ "I want to extract action items from this 20-page meeting record. Don't execute — just design the method. Which sections to scan first, what patterns to look for, what format to output."

Opus returns a short answer. "Scan for phrases like 'must do,' 'decided to.' Action items usually cluster at the end of each section, so read those paragraphs first. Output as a 3-column list: owner / action / deadline." Cost: $0.20.

Step 2. Paste that plan to Haiku.

→ "Follow the plan above to extract action items from this 20-page meeting record, formatted as owner / action / deadline."

Haiku executes. Cost: $0.30.

Total: $0.50. One-sixth of the old way. And funnier — the result is often better. Because when Opus only thinks about "how to do it," the thinking sharpens. In the medical analogy, the professor who focuses only on diagnosis and lets others handle intake and blood draws actually diagnoses better.


Now the last piece — how to actually switch models. Also easy.

In Claude Code, specify at launch:

claude --model opus        # start with Opus
claude --model sonnet      # Sonnet (default)
claude --model haiku       # start with Haiku

Or switch mid-session:

/model opus
/model haiku

On Claude.ai web, the model selector is in the top dropdown. Pick Opus / Sonnet / Haiku. You can switch within one conversation easily.

Try it for a week. Just one question running in your head each time: "Is this judgment, making, or repetition?" Awkward at first. Automatic within a few days. And the bill drops fast.


Summary.

Claude has three models. Opus is big, expensive, and smart. Haiku is small, cheap, and fast. Sonnet is in between. Most people run everything on Opus, and the bill explodes. But not every task needs Opus. Match the model to the weight of the work and you get the same outcome at 1/6 the cost.

Three words: Judgment → Opus. Making → Sonnet. Repetition → Haiku.

And hard tasks: split in two. Ask Opus for the plan. Hand the plan to Haiku for execution. Each one does what it's best at.

Using one tool well matters less than having the sense to divide across many tools. Once you develop this sense, any future AI tool works the same way. The names will change. The principle won't.

Edit Section