AI Strategy VIP 2026-06-11

Separate the Body of the Tool From Its Engine

The Claude Code body doesn't have to run a Claude engine. Change one line through OpenRouter and a free Google engine hums inside. Tools stay free only when body and engine are separable.

Most Claude Code users have felt the bill panic at least once. Opus is brilliant, but at $75 per million tokens, a few heavy runs a day stack fast. Then I learned something. You don't have to put a Claude engine inside Claude Code. You can swap engines. You can even swap in free ones. A few lines in a settings file.

Let's unpack the principle. This is not a story about OpenRouter specifically. This is about the tool and the engine becoming separable. The principle survives even when OpenRouter gets a new name next year. Slowly now.

Tool is the body, model is the engine

Understand the layers first. Any AI product you use today has two.

First, the body. The interface, shortcuts, file access, plugin structure. For Claude Code: the terminal command system, the .claude folder, slash commands. This shell is what makes Claude Code "Claude Code."

Second, the engine. The actual AI model generating sentences. Claude Opus, Sonnet, Gemini, GPT — each a unique engine. This is where intelligence comes from.

These used to be fused. Claude Code meant Claude engine. Cursor meant a Cursor-chosen engine. Body and engine were one unit. Like a car where the chassis and the engine are welded together. To change engines you changed cars.

That has shifted. You can keep the body and swap the engine. Drop a Google Gemma engine into a Claude Code chassis. Drop a free engine in. The chassis still feels like Claude Code, but what runs inside comes from a different company.

An example — OpenRouter as the engine hub

Concretely: there is a service called OpenRouter. One line: "a hub that gathers 600+ of the world's AI models behind a single API."

  • Anthropic's Claude line
  • OpenAI's GPT line
  • Google's Gemini line
  • Meta's Llama line
  • And 28+ free-to-use models

Why does this matter? If the tool (body) accepts OpenRouter as its engine port, you can snap any of those 600 engines into it. Claude Code happens to be such a tool.

An analogy — swapping the engine in a race car

Picture a race car. The body is gorgeous, the wheel feels perfect, the seat is set right. But the stock engine drinks gasoline. Running 100km costs $100 of fuel.

Old race cars welded the engine to the chassis. To get a cheaper engine you had to change cars. So you gritted your teeth and kept paying for the thirsty engine.

Today's Claude Code is an engine-swappable race car. Keep the chassis. Change the engine. Claude engine too expensive? Drop in Google Gemma. Drop in a free engine for experiments. Swap back to Claude for the real thing. Driving feel stays identical; fuel economy changes.

This works because Claude Code's designers published the "Anthropic API format" as a common engine port. OpenRouter mimics that port. From Claude Code's view, it doesn't notice whether it's plugged into Anthropic or OpenRouter. Same port, same engine in its eyes.

The numbers

Same workload — a simple code refactor, around one million tokens — billed across engines.

Engine Price per 1M output tokens Personality
Claude Opus 4.6 $75 Top judgment, complex design
Claude Sonnet 4.6 $15 General production, default
Claude Haiku $5 Simple repetitive work
Gemini 2.5 Flash Lite $0.50 Light practical work (1/10 of Haiku)
Gemma 3 series (OpenRouter) $0 Free, from Google
openrouter/auto (free-only) $0 Auto-pick among free models

Between Opus and free Gemma the price ratio is effectively infinite. Quality differs too — Opus is far smarter. The key is: not every task needs Opus-grade intelligence. Picking a coffee menu doesn't.

I ran a full month this way. Exploration on free engines, production on Sonnet, critical judgment on Opus. The monthly bill dropped to one third. The tool never changed. Claude Code stayed Claude Code. Only the engine rotated.

The aha.

Cost isn't a tool problem. It is a problem of the sense for choosing engines.

Which engine when — one question

With dozens of engines available, one question sorts them.

"How much does this task's outcome affect my life?"

Three answers.

"Little to none (exploring, tinkering)" → free engines

Brainstorming, dumping a hundred ideas, meeting a new concept. Results don't need to be perfect. Free engines are enough — Gemma 3, or openrouter/auto.

"Medium (practical output)" → mid-tier paid engines

Real deliverables. Reports, clean analyses, formal emails. Quality needs to be solid. Sonnet, Gemini 2.5 Flash Lite, that tier.

"Large (strategy, key judgment)" → top-tier engines

Decisions that are costly to reverse. Business strategy, core architecture of critical code, legally sensitive writing. Top-tier: Opus, GPT.

Three words: Explore free. Work mid. Decide top.

The actual config — line by line

Here is how to swap the engine in Claude Code. Inside your project folder, open or create .claude/settings.local.json.

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://openrouter.ai/api/v1",
    "ANTHROPIC_API_KEY": "paste_your_openrouter_api_key_here",
    "ANTHROPIC_MODEL": "openrouter/auto"
  }
}

Three lines. The first, ANTHROPIC_BASE_URL, changes the engine port's address. Claude Code normally looks at Anthropic's servers; this line points it at OpenRouter's. The hose runs from a different tank; the chassis is unchanged.

The second is the OpenRouter API key. Sign up at openrouter.ai, go to API Keys, click Create Key, paste it here.

The third, ANTHROPIC_MODEL, picks the engine. openrouter/auto lets OpenRouter pick an appropriate free engine. Want a specific one? Use a name like google/gemini-2.5-flash-lite.

Save, restart Claude Code. Every question you ask from now on is answered by the free engine. Same chassis, fuel cost gone.

One nuance. Keep this in local.json, not the global settings. Per-project, not system-wide. Because you'll want different engines on different projects — free for play, paid for income. Per-project splits keep that clean.

Summary

Let me close the loop.

A tool is a body and an engine. They used to be fused; they are separable now. Hubs like OpenRouter created a common engine port. Because of that, you can drop a free Google engine into a Claude Code body. Body stays, engine changes.

OpenRouter may be renamed in three years. The principle holds. More AI products will allow engine swaps. More hubs will appear. The essential skill is thinking of body and engine as separate parts. When cost bites, don't give up the tool — change the engine. Tools change. The body/engine split doesn't.

Three words to close.

Explore free. Work mid. Decide top.

Edit Section