AI Workflow VIP 2026-06-22

Separate Design from Execution — The Dual-Model Strategy

Running every task on the most expensive AI is wasteful. Design with the big model, execute with the fast one. The general-and-soldier structure cuts cost and raises speed.

Many people try to cut AI costs this way and that. Switch to a cheaper model — the output gets sloppy. Switch back to the expensive one — next month the bill is massive. A lot of people ping-pong between the two. The answer isn't ping-pong. It's dividing the work. Today we unpack the principle. Slowly.

Here's the spine. Design and execution are different tasks. Different tasks get different tools. Design with the high-intelligence model. Execute with the high-speed one.

AI models usually come in three tiers. Claude has Opus (most expensive, smartest), Sonnet (middle), Haiku (cheapest, fastest). ChatGPT is similar (GPT-5.4, GPT-5, Mini). The gap: the most expensive model runs about 18× more than the cheapest. Eighteen. That number is today's key variable.

The Shape of Confusion — Everything on the Biggest Model

Most people leave defaults on and run everything on the top model. A 3-line email — Opus. Cleaning up meeting notes — Opus. Renaming 50 files — Opus. Each is a few cents, but 50 times a day stacks past $200 a month. That's the source of bill shock.

The other extreme exists too. People who, to save money, run everything on the cheapest model. Now the problem flips. Complex debugging handed to Haiku gets stuck for 3 hours and still doesn't solve. Opus would have finished in 30 minutes. You saved dollars and lost hours. Hours cost more than dollars.

Both extremes fail for the same reason. You tried to do all work with one model. Work isn't one kind. So the tool shouldn't be one kind.

Analogy — General and Soldier

Think of a military operation. Planning a battle is the general's job. On the map, they consider the enemy position, friendly movement, weather, supply lines — all at once — and make the large calls. The general sees the whole.

But the general doesn't carry a rifle to the front. That's the soldier's job. The soldier executes the order. Accurately and fast. Thousands of soldiers each execute the plan in their positions, and the operation moves.

If the general did every soldier's work, the front collapses — one person can execute only so much. If soldiers acted without a general, strategy scatters and the operation falls apart.

AI models work the same way.

Opus (general) — big judgments, whole structure, complex reasoning. Slow and expensive.
Haiku (soldier) — repetitive execution, fixed patterns, high volume. Fast and cheap.

Put the general on soldier's work and money explodes. Put soldiers in the general's chair and the operation fails. Dividing roles correctly is the core skill.

Aha

The real skill is knowing when to switch models. Use the execution model at the design stage and the structure breaks. Use the design model at the execution stage and cost blows up.

This sentence is the key. Let's unpack.

Every task actually has two stages.

Design stage — deciding what to do, in what order, in what format. Many judgments, few repetitions. Once done well, it's done.

Execution stage — following the design to actually produce. Few judgments, many repetitions. Must be done many times fast.

Every task contains both. Most people blur them together and run on one model. Don't blur. Separate them and both sides improve.

Diagnosis — Which Stage Is Your Task

Ask this question.

'Does this task actually need judgment?'

Three answers.

Yes, a lot → strategy, complex debugging, architecture. Opus tier.
In between → implementing a defined spec, refactoring, writing a report. Sonnet tier.
Barely any → renaming files, standardizing formats, simple classification. Haiku tier.

Three words to memorize: Judgment → Opus. Making → Sonnet. Repetition → Haiku. Keep these three words in mind and ask once at the start of every task.

Real Example — 20-Page Meeting Record

Let's walk through. Task: 'extract action items from a 20-page meeting record.'

One-model way — all Opus Hand all 20 pages to Opus. 'Extract action items.' Opus reads, understands, pulls them out. Good result. Cost $3.

Dual-track way — Opus for design, Haiku for execution

Step 1: ask Opus for the plan only.

'I want to extract action items from a 20-page meeting record. Don't execute — design the method. What keywords to scan first, which sections usually hold them, what output format.'

Opus returns a short answer. 'Scan for phrases like must-do, decided-to. Decisions cluster at the end of sections, so read those first. Format as 3 columns: owner / action / deadline.' Cost $0.20.

Step 2: hand that plan to Haiku.

'Follow the plan above and extract action items from this 20-page record, owner / action / deadline.'

Haiku executes. Cost $0.30.

Total $0.50. One-sixth of the baseline. And what's interesting — the result is often better. When Opus focuses only on 'how to do it,' the thinking sharpens. Same as a general focusing only on strategy while soldiers execute — the operation runs better.

Commands — Switching Models

In Claude Code, switch mid-session with /model opus, /model sonnet, /model haiku. On Claude.ai web, use the top dropdown. With API, change the model parameter in the request.

Try it for a week. At the start of each task, whisper the one question — 'is this judgment, making, or repetition?' — and pick accordingly. The first 2-3 days feel awkward. After a week it's automatic. The bill drops visibly.

Summary

Design and execution are different tasks. Use different tools. One model for everything means either expensive or sloppy. Divide and both sides win.

This principle isn't only for AI. It applies at work, at home, in solo projects. Asking one person to both plan and execute leaves both half-baked. Divide and give each role to the right fit and both thrive. AI is the tool that makes you feel this principle 10 times a day.

Judgment → making → repetition. Run your own work through these three words.

자, AI 비용을 아끼려고 이것저것 해보시는 분들 많으실 겁니다. 싼 모델로 바꿔봤더니 결과가 허접해집니다. 다시 비싼 모델로 돌렸더니 이번엔 한 달 청구서가 크게 옵니다. 이 사이에서 왔다갔다만 하시는 분이 꽤 많으세요. 답은 왔다갔다가 아니라 나누는 것에 있습니다. 오늘은 이 원리를 설명드립니다. 끝까지 천천히 가겠습니다.

핵심부터 박습니다. 설계와 실행은 다른 일입니다. 다른 일에는 다른 도구를 씁니다. 설계는 고지능 모델로, 실행은 고속 모델로. 이게 오늘 글의 뼈대입니다.

AI 모델은 보통 세 단계로 나뉩니다. 예로 Claude를 들면 Opus(가장 비쌈, 가장 똑똑함), Sonnet(중간), Haiku(가장 쌈, 가장 빠름) 이렇게 셋이에요. ChatGPT도 비슷합니다(GPT-5.4, GPT-5, Mini). 가격 차이가 얼마나 되냐면, 가장 비싼 모델이 가장 싼 모델의 약 18배입니다. 18배예요. 이 숫자가 오늘 글의 핵심 변수입니다.

혼란의 모양 — 전부 비싼 모델

대부분의 분들이 기본 설정대로 가장 좋은 모델을 켜두고 모든 일을 시킵니다. 3줄짜리 이메일도 Opus, 회의록 정리도 Opus, 50개 파일명 바꾸기도 Opus. 각각은 몇 센트지만 하루 50번 쌓이면 한 달에 $200이 넘어갑니다. 이게 청구서 쇼크의 정체예요. 가장 좋은 걸 쓰고 있으니 결과는 좋은데, 지불 구조가 무너지는 거죠.

반대쪽 극단도 있습니다. 비용이 아까워서 전부 싼 모델로 돌리시는 분. 이번엔 문제가 뒤집힙니다. 복잡한 코드 디버깅을 Haiku에게 시키니까 3시간을 헤매고도 못 풀어요. Opus였으면 30분에 끝날 일이에요. 싸게 쓴 대신 시간을 잃고, 시간은 돈보다 비쌉니다.

두 극단 모두 틀린 이유는 같습니다. 한 모델로 모든 일을 하려고 했기 때문입니다. 일은 한 종류가 아니에요. 그러니 도구도 한 종류면 안 됩니다. 문제는 인터페이스가 단일 모델을 기본값으로 띄워놓는 게 많아서, 우리가 '하나로 해결한다'는 착각에 빠지는 거예요. 이 기본값을 의심하는 것부터 시작입니다.

비유 — 장군과 병사

쉽게 이해하시려면 군대를 떠올려보세요. 전투 작전을 짤 때 장군이 합니다. 지도 위에서 적의 위치, 아군의 움직임, 날씨, 보급로까지 동시에 고려해서 큰 판단을 내립니다. 장군은 전체를 봅니다.

그런데 장군이 직접 소총을 들고 진지에 가지는 않습니다. 그건 병사가 합니다. 병사는 명령을 받은 대로 실행합니다. 정확하고 빠르게. 수천 명의 병사가 장군의 명령을 각자의 위치에서 실행하면 작전이 굴러갑니다.

만약 장군이 모든 병사 일을 혼자 다 하면 어떻게 되느냐. 전선이 못 버팁니다. 장군 한 명이 감당할 수 있는 실행 총량이 너무 적거든요. 반대로 병사들이 장군 없이 각자 판단하면? 전략이 흩어져서 작전이 무너집니다.

AI 모델도 똑같습니다.

Opus(장군) — 큰 판단. 전체 구조 설계. 복잡한 추론. 느리고 비쌈.
Haiku(병사) — 반복 실행. 정해진 패턴. 대량 처리. 빠르고 쌈.

장군에게 병사 일을 시키면 돈이 터지고, 병사에게 장군 일을 시키면 작전이 무너져요. 역할을 제대로 나누는 게 핵심입니다. 훌륭한 장군은 자기가 나설 자리와 병사에게 맡길 자리를 구분하는 사람이에요. AI 작업도 똑같습니다.

아하 모멘트 — 언제 전환하는가

진짜 스킬은 모델을 언제 전환하는지 아는 것입니다. 설계 단계에서 실행 모델을 쓰면 구조가 무너지고, 실행 단계에서 설계 모델을 쓰면 비용이 폭발합니다.

이 문장이 오늘 글의 핵심입니다. 풀어보겠습니다.

모든 작업은 사실 두 단계로 나뉩니다.

설계 단계 — '무엇을, 어떤 순서로, 어떤 형식으로 할 것인가'를 정하는 단계. 판단이 많고 반복이 적습니다. 한 번만 잘 하면 됩니다.

실행 단계 — 설계를 따라 실제로 만들어내는 단계. 판단이 적고 반복이 많습니다. 여러 번 빨리 해야 합니다.

한 작업 안에 이 두 단계가 다 들어있는데, 많은 분들이 둘을 섞어서 한 모델로 처리하십니다. 섞지 마세요. 분리하면 양쪽 다 잘돼요.

분류 — 당신 작업은 어느 단계인가요

지금 하시려는 작업이 어느 쪽인지 이 질문으로 구분하실 수 있습니다.

"이 일에 진짜 판단이 필요한가?"

답이 세 가지로 나뉩니다.

예, 많이 → 전략 수립, 복잡한 디버깅, 아키텍처 설계. Opus급.
중간 → 정해진 스펙 구현, 리팩토링, 보고서 작성. Sonnet급.
거의 없음 → 파일명 바꾸기, 형식 통일, 단순 분류. Haiku급.

세 단어로 외우시면 좋습니다. 판단 → Opus. 제작 → Sonnet. 반복 → Haiku. 이 세 단어만 머리에 두시고 작업 시작할 때마다 한 번씩 물어보세요. 참고로 Claude Code를 기본값으로 쓰시면 Sonnet이 디폴트입니다. 대부분의 작업이 여기에 해당한다는 설계예요. 판단이 필요할 때만 Opus로 올리고, 반복이 주가 될 때 Haiku로 내리시면 됩니다.

실제 예제 — 회의록 20페이지 정리

구체적으로 해보시겠습니다. '20페이지 회의록에서 액션 아이템 뽑기'가 과제입니다.

한 모델 방식 — 전부 Opus 20페이지를 통째로 Opus에게 줍니다. "액션 아이템 뽑아줘." Opus가 읽고, 이해하고, 뽑아냅니다. 결과 좋습니다. 비용 $3.

이원화 방식 — 설계는 Opus, 실행은 Haiku

1단계: Opus에게 계획만 부탁.

"20페이지 회의록에서 액션 아이템 뽑고 싶어. 실행은 하지 말고 방법만 알려줘. 어떤 키워드를 먼저 찾을지, 어떤 섹션에 주로 나올지, 결과는 어떤 형식으로 정리할지."

Opus가 짧은 답을 줍니다. "'해야 한다', '하기로 했다' 같은 표현을 먼저 스캔하세요. 각 섹션 끝에 결정 사항이 모이니 그 문단을 우선 읽으세요. 담당자/할일/기한 3열로 정리하세요." 비용 $0.20.

2단계: 그 계획을 그대로 Haiku에게.

"위 방법대로 이 20페이지에서 액션 아이템을 담당자/할일/기한으로 뽑아줘."

Haiku가 실행합니다. 비용 $0.30.

합계 $0.50. 기존 방식의 6분의 1입니다. 그리고 흥미로운 건 결과가 더 좋을 때가 많다는 거예요. 왜냐하면 Opus가 '어떻게 할지'에만 집중할 때 사고가 더 선명해지거든요. 장군이 전략에만 집중하고 병사가 실행할 때 작전이 더 잘 돌아가는 것과 같습니다. 같은 원리가 장시간 작업에도 적용됩니다. 큰 프로젝트에서 Opus로 전체 설계를 딱 한 번 하시고, 그 뒤로 이어지는 50번의 반복을 전부 Haiku로 돌리시면 체감 속도가 3배 빨라집니다.

명령어 — 모델 바꾸는 법

Claude Code를 쓰시면 터미널에서 /model opus, /model sonnet, /model haiku로 중간에 바꾸실 수 있습니다. Claude.ai 웹은 상단 드롭다운으로 전환하고, API를 쓰신다면 요청할 때 model 파라미터만 바꾸시면 됩니다.

일주일만 이렇게 써보세요. 매번 작업 시작할 때 속으로 이 질문 하나 — "이 일은 판단인가, 제작인가, 반복인가?" — 그리고 거기에 맞는 모델 선택. 처음 2-3일은 어색하지만 한 주 지나면 자동으로 됩니다. 청구서가 확 떨어지는 게 눈에 보일 거예요. 평균 60-70% 감소하십니다. 같은 결과물이에요.

정리

설계와 실행은 다른 일입니다. 다른 도구를 써야 돼요. 한 모델로 다 하시면 비싸거나 허접하거나 둘 중 하나입니다. 나누시면 양쪽이 다 잘됩니다. 분업이라는 오래된 원리를 AI에 그대로 적용하는 것뿐이에요.

이 원리는 AI에만 적용되는 게 아닙니다. 회사에서도, 집에서도, 혼자 프로젝트할 때도 똑같아요. 기획과 실행을 한 사람이 다 하려 하면 둘 다 어설퍼집니다. 나눠서 적합한 사람에게 맡기면 둘 다 살아납니다. AI는 이 원리를 하루에 10번씩 실감하게 해주는 도구예요. 모델 이름은 2년 뒤 바뀔 수 있지만, 설계-실행 분리 원리는 그대로 유효합니다.

판단 → 제작 → 반복. 이 세 단어에 자신의 일을 비춰보세요.