AI Strategy VIP 2026-04-18

What the 0.1 Doesn't Say

Newer is not always better. Version numbers only tell you the order. Today we unpack this principle through Opus 4.7 vs 4.6.

A new version drops and your heart moves first. "Opus 4.7 is out — does that mean 4.6 is done?" Bigger number, obviously better. But run them side by side on real work and something strange happens. Some tasks, 4.7 wins clearly. Other tasks, 4.6 is better. Why?

This essay walks through what the version number doesn't tell you. Opus 4.6 and 4.7 are today's example, but the principle applies to any numbered product. Three years from now, when "Opus" is a different name, the spine of this essay still holds. We'll go slowly.

Where did version numbers come from?

The notation came from the software era. The original meaning was defined like this.

  • Integer part changes: big change. 4 → 5.
  • First decimal: small feature added. 4.6 → 4.7.
  • Second decimal: bug fix. 4.7.1.

This worked because software is built by adding features. Add a feature, bump 0.1. Add another, bump again. Predictable math.

AI models aren't software. More precisely, they're not made by the addition logic software runs on. Let's see why.

Example — the odd difference between 4.7 and 4.6

In March 2026, Opus 4.7 shipped. A 0.1 bump from 4.6. Nuance: "a bit smarter."

I threw the same tasks at both side by side. The table:

Task 4.6 4.7 Winner
Complex code refactor OK Excellent 4.7
30-page doc summary Deep Flat 4.6
Korean sentence rhythm Natural Slightly stiff 4.6
Math reasoning Fine Excellent 4.7
Emotionally subtle writing Delicate Logical but dry 4.6

Five tasks — 4.7 won two, 4.6 won three.

The Korean rhythm row surprised me most. 4.7 is the smarter model overall, yet 4.6's Korean flows more smoothly. Same prompt asking for a sentence that opens with "자," — 4.6 produces it naturally on the first try, while 4.7 often sneaks in a stiffer clause. For my blog drafts, that single difference is decisive. A 0.1 bump, but which side wins depends on the domain. Why?

Because AI doesn't upgrade by addition. Each new version, the company retrains the model. Training data changes, training method changes, tuning direction changes. The result isn't addition — it's reconstruction. Some capabilities rise; others pay the cost and fall. That's the nature of AI models.

The version number encodes this reconstruction as pure ordering. "Came out later" — that's it. Not "better in every respect." Whatever AI company ships what next, this property repeats.

Analogy — new car models

Picture a car brand that ships a new model every year. 2024, 2025, 2026.

Looking at numbers, 2026 is obviously best. But walk into an owner forum and you hear:

  • "2024's ride is smoother."
  • "2025's fuel economy is great but the engine is louder."
  • "2026 looks beautiful but the trunk shrank."

Car makers don't add everything to each new model. Budget, weight, constraints. Put something in, something comes out. So the right car for your road isn't necessarily the latest. If you commute through the city, 2024 might suit you. If you drive country roads, 2026 might.

AI models are identical. Your model isn't guaranteed to be the latest model.

Here's the first aha moment.

A version number tells you order. It doesn't tell you fit.

So what's the real criterion?

If numbers aren't the criterion, what is? Context. More precisely, your task. Keep one question handy.

"Have I thrown this task at both models myself?"

If not, numbers are your only evidence. If yes, results are. Which is more accurate? Results, obviously.

Do this. A/B test.

  1. Pick 3 tasks you do often (code review, doc summary, email reply, etc.).
  2. Throw one of each at both models (new + previous).
  3. Lay results side by side. Which is better?
  4. Write it into a table.

Ten minutes. That ten minutes automates your choices for the next month. "Code goes to 4.7, summaries go to 4.6," and so on.

Apply — running both models

Concrete commands. If you're on Claude Code:

# Start with the new version
claude --model claude-opus-4-7

# Start with the previous version
claude --model claude-opus-4-6

# Switch mid-session
/model opus-4-6
/model opus-4-7

On Claude.ai web, settings has a "Previous version" option. Pick 4.6 there.

My current split:

  • Code / math / logical analysis → Opus 4.7
  • Long Korean writing / subtle summaries / tone-sensitive work → Opus 4.6

This split will shift in six months. When 4.8 arrives, I'll rerun the 10-minute A/B and update. The criterion isn't the number — it's the task.

Summary

Version numbers are the language of the software era. AI doesn't fit that language cleanly. 0.1 from 4.6 to 4.7 is reconstruction, not addition. Some domains go up; others come down. Today we unpacked this with Opus 4.6/4.7, but the same applies to any AI that comes next. When the names become Opus 7 or GPT-9, the reconstruction property repeats.

Keep one question handy — "Have I thrown this task at both models myself?" Trust the results, not the number. A 10-minute A/B decides the next month.

Three words to remember: Number. Context. Check. Don't look at the number — look at the context, and always check. The technology changes. The principle does not.

Edit Section