The Four Strategies of Context Engineering — Applying Write, Select, Compress, and Isolate to Multi-Agent Systems
Run an LLM agent for a few days and you are bound to hit at least one of these moments: the agent forgets what it just did, a sub-agent corrupts the orchestrator's conversation history, or a task is forcibly terminated because the 200k-token window fills up. These problems all share the same root: how context is managed.
If prompt engineering was a matter of "what to say," context engineering is a matter of "what information to place, when, and where so that the model behaves optimally." In September 2025, Anthropic systematized agent context management into four strategies (buckets): Write, Select, Compress, and Isolate, through its official engineering blog. This article is aimed at backend and full-stack developers with experience in LLM API calls and covers what each strategy is, when to use it, and how to implement it in actual TypeScript code through a single consistent scenario (News Summary Multi-Agent).
If you read this article to the end, you will be able to diagnose where context bottlenecks occur in your agent code and implement a production-grade agent by using the four strategies independently or in combination.
Key Concepts
The context window is the agent's "working memory".
Like human short-term memory, an LLM's context window is the total amount of information it can process at once. For Claude 3.7 Sonnet this is 200,000 tokens, but as the token count grows, the Context Rot phenomenon sets in: because every token attends to every other token (an O(n²) relationship), the model's ability to accurately recall information degrades as the context gets larger. This is why "just use a longer context" does not solve the problem.
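As a back-of-the-envelope check, token pressure can be estimated before a call is ever made. The sketch below assumes the rough heuristic of ~4 characters per token (the same approximation the compaction example later in this article uses); real tokenizers vary by language and content.

```typescript
// Rough token estimate: ~4 characters per token (a common heuristic,
// not an exact tokenizer — actual counts vary by language and content).
const CONTEXT_WINDOW_TOKENS = 200_000;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Fraction of the window a set of documents would occupy if injected wholesale
function contextUsageRatio(texts: string[]): number {
  const total = texts.reduce((acc, t) => acc + estimateTokens(t), 0);
  return total / CONTEXT_WINDOW_TOKENS;
}

// 200 articles at ~4,000 characters each already fill the entire window:
// 200 × 4,000 chars ≈ 200,000 tokens — before any conversation history is added.
```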
Key Question: Which context configuration best elicits the model's desired behavior?
Anthropic defined four buckets to answer this question.
| Strategy | Key Action | Problem Solved |
|---|---|---|
| Write | Write information to storage outside the context | State lost across session boundaries |
| Select | Dynamically load only relevant information into the context | Unnecessary context inflation and Context Rot |
| Compress | Summarize the conversation history and restart in a fresh context | Forced termination when the context window limit is reached |
| Isolate | Give sub-agents independent context windows | Sub-agent details polluting the orchestrator |
These four strategies are not mutually exclusive. Actual production agents use a combination of two or more of them.
Write — Build memories outside the context
The Write strategy is a pattern in which an agent actively saves information outside the context window (files, databases, runtime state objects). Even if the context is reset or the session is disconnected, the agent can read the scratchpad written externally and continue.
When to apply:
- Long-term tasks spanning tens to hundreds of steps (codebase analysis, game agents)
- When externalizing state that needs to be shared by multiple agents
- When you need to retry from the last checkpoint instead of the beginning if an error occurs
Trade-off: External storage I/O latency is added, and file system dependencies are introduced. This can be mitigated by asynchronous writes and a local cache layer.
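One way to realize that mitigation is a write-behind wrapper: reads are served from an in-memory cache while disk writes are batched and flushed asynchronously. This is a minimal sketch; `CachedStore` and its single-file JSON layout are illustrative choices, not part of any SDK.

```typescript
import fs from "fs/promises";

// [Write] Write-behind cache: reads never touch the disk,
// and writes are batched into a single flush instead of one I/O per set().
class CachedStore {
  private cache = new Map<string, string>();
  private dirty = new Set<string>(); // keys changed since the last flush

  constructor(private filePath: string) {}

  set(key: string, value: string): void {
    this.cache.set(key, value);
    this.dirty.add(key);
  }

  get(key: string): string | undefined {
    return this.cache.get(key); // served from memory — no I/O latency
  }

  // Persist all pending changes in one write
  async flush(): Promise<void> {
    if (this.dirty.size === 0) return;
    await fs.writeFile(this.filePath, JSON.stringify([...this.cache]));
    this.dirty.clear();
  }
}
```

A caller would `set()` freely during the hot path and `flush()` at checkpoints, trading durability granularity for latency.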
Select — Accurately retrieve only the necessary information
The Select strategy is a pattern in which the agent dynamically injects only information relevant to the current task into the context. Instead of putting all documents into the context at once, it searches for and retrieves only highly relevant chunks.
Three Memory Types of Select Strategy
- Episodic: examples of desired behavior (few-shot examples)
- Procedural: guidelines for agent behavior (rule files such as CLAUDE.md)
- Semantic: task-relevant facts and knowledge (RAG, vector DB)
Contextual Retrieval proposed by Anthropic improves upon the core limitations of existing RAGs. In existing RAGs, contextual information is lost when chunks are extracted from a document. Contextual Retrieval prepends and embeds phrases into each chunk that explain the context in which the chunk is situated within the document. Hybridly combining BM25 keyword search and semantic embedding search can reduce search errors by 49% compared to existing methods.
Trade-off: The overall performance of the agent depends on search quality. Injecting irrelevant chunks is counterproductive. Building an evaluation pipeline is essential.
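The hybrid scoring mentioned above can be sketched as a weighted merge of the two score lists. The 0.4/0.6 weights mirror the example code later in this article and are tuning knobs, not fixed values; scores are assumed to be pre-normalized to [0, 1].

```typescript
// [Select] Hybrid ranking sketch: combine a keyword score (BM25) and a
// semantic score (embedding similarity) per chunk, then keep the top-K.
interface ScoredChunk {
  id: string;
  bm25: number;      // keyword relevance, normalized to [0, 1]
  embedding: number; // semantic similarity, normalized to [0, 1]
}

function hybridRank(chunks: ScoredChunk[], topK: number): string[] {
  return [...chunks]
    .map((c) => ({ id: c.id, score: 0.4 * c.bm25 + 0.6 * c.embedding }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((c) => c.id);
}
```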
Compress — Keep running beyond context limits
The Compress strategy is a pattern that summarizes (compacts) the conversation history and restarts with a new context when the context window approaches a threshold. Instead of forcibly terminating the session, it allows you to compress and continue the key content from that point.
Information loss is inevitable during the compression process. You must clearly determine what to preserve and what to discard.
| Items to preserve | Things that can be discarded |
|---|---|
| Task Objectives and Completed Steps | Details of Intermediate Reasoning Process |
| Key Facts and Data Discovered | Path of Repeated Attempts and Failures |
| Current Status and Pending Items | Full Original Response from Successful Tool Calls |
| Constraints and Error History to Note | Original Pre-Compaction Version Already Summarized |
The Claude Agent SDK supports automatic compaction with the compaction_control parameter. Claude Code automatically summarizes the entire history when the context window reaches 95%.
Trade-off: If the summary quality is low, critical information is permanently lost. It is recommended to verify the compression results separately or to back up the original data using Write before compression.
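The backup recommendation can be made concrete with a thin wrapper that persists the raw history via Write before replacing it with the summary. A minimal sketch, assuming a caller-supplied `summarize` function; the `compaction-backups` directory and file naming are illustrative.

```typescript
import fs from "fs/promises";
import path from "path";

interface Message {
  role: "user" | "assistant";
  content: string;
}

// [Compress + Write] Persist the raw history to disk before discarding it,
// so a low-quality summary never means permanent information loss.
async function backupThenCompact(
  messages: Message[],
  taskId: string,
  summarize: (history: Message[]) => Promise<string>
): Promise<Message[]> {
  const backupDir = "./compaction-backups";
  await fs.mkdir(backupDir, { recursive: true });
  await fs.writeFile(
    path.join(backupDir, `${taskId}-${Date.now()}.json`),
    JSON.stringify(messages, null, 2)
  );
  const summary = await summarize(messages);
  // Fresh context seeded with the compressed summary only
  return [
    { role: "user", content: `[Previous work summary]\n${summary}\n\nContinue the task.` },
  ];
}
```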
Isolate — Split context across sub-agents
The Isolate strategy is a pattern that distributes complex tasks across multiple sub-agents, allowing each to have an independent context window. The orchestrator is responsible only for strategy and coordination, while the details of the actual work remain within the sub-agent's context.
Context Pollution: If long search results or error logs from subagents are fed directly into the orchestrator context, the orchestrator is unable to focus on high-level decision-making. The Isolate strategy blocks this by returning only a summary of results from subagents to the orchestrator.
Trade-off: Token costs skyrocket as the number of sub-agents increases. In Anthropic's multi-agent researcher case, up to 15 times more tokens were consumed compared to a single agent. The number of sub-agents and task decomposition criteria must be carefully designed.
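One way to keep that multiplier bounded is to cap the number of sub-agents regardless of how finely the task could be decomposed, merging excess tasks into existing agents instead of spawning new ones. A minimal sketch; the round-robin partitioning and the `runAgent` signature are illustrative choices.

```typescript
// [Isolate] Cap the number of parallel sub-agents so token cost stays bounded.
// Tasks beyond the cap are distributed round-robin into existing agents'
// workloads rather than spawning additional contexts.
async function runWithAgentCap<T>(
  tasks: string[],
  maxAgents: number,
  runAgent: (task: string) => Promise<T>
): Promise<T[]> {
  // Partition tasks into at most maxAgents groups
  const groups: string[][] = Array.from(
    { length: Math.min(maxAgents, tasks.length) },
    () => []
  );
  tasks.forEach((t, i) => groups[i % groups.length].push(t));
  // One sub-agent per group, each with its own isolated context
  return Promise.all(groups.map((g) => runAgent(g.join("\n"))));
}
```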
Practical Application
Example 1: News Summary Multi-Agent — Comparison of 4 Strategies Before and After
This is a scenario for implementing an agent pipeline that collects, classifies, and summarizes hundreds of news articles every day.
Before — Naive implementation without strategy:
// Naive implementation in which every problem occurs
interface Message {
  role: "user" | "assistant";
  content: string;
}

async function naiveNewsAgent(articles: string[]): Promise<string> {
  let messages: Message[] = [];
  for (const article of articles) {
    // Problem 1: every article is appended to the context sequentially → Context Rot
    messages.push({ role: "user", content: `Summarize this article: ${article}` });
    const summary = await callClaudeWithHistory(messages);
    messages.push({ role: "assistant", content: summary });
    // Processing 200 articles exceeds the context window → forced termination around article 100
    // Problem 2: on interruption, restart from scratch — no saved progress
    // Problem 3: single-threaded sequential processing — slow
  }
  return messages[messages.length - 1].content;
}

After — combining Write + Select + Compress + Isolate:
First, define the scratchpad for the Write strategy.
// [Write] Agent scratchpad — persists state across sessions
import fs from "fs/promises";
import path from "path";

interface ScratchpadEntry {
  step: number;
  timestamp: string;
  action: string;
  result: string;
  notes: string;
}

class AgentScratchpad {
  private filePath: string;
  private entries: ScratchpadEntry[] = [];

  constructor(taskId: string) {
    this.filePath = path.join("./scratchpad", `${taskId}.json`);
  }

  async load(): Promise<void> {
    try {
      const raw = await fs.readFile(this.filePath, "utf-8");
      this.entries = JSON.parse(raw);
      console.log(`[Write] Loaded ${this.entries.length} previous checkpoints`);
    } catch {
      this.entries = []; // first run: start with empty state
    }
  }

  async write(entry: Omit<ScratchpadEntry, "step" | "timestamp">): Promise<void> {
    const newEntry: ScratchpadEntry = {
      step: this.entries.length + 1,
      timestamp: new Date().toISOString(),
      ...entry,
    };
    this.entries.push(newEntry);
    await fs.mkdir(path.dirname(this.filePath), { recursive: true });
    await fs.writeFile(this.filePath, JSON.stringify(this.entries, null, 2));
  }

  getSummary(): string {
    return this.entries
      .map((e) => `[Step ${e.step}] ${e.action}: ${e.notes}`)
      .join("\n");
  }
}

The following is the Select strategy's contextual-retrieval-based search.
// [Select] Contextual Retrieval — inject only relevant chunks into the context
interface NewsChunk {
  id: string;
  content: string;
  contextualDescription: string; // describes where the chunk sits within the document
}

class NewsKnowledgeBase {
  private chunks: NewsChunk[] = [];

  // At indexing time: prepend a contextual description to each chunk
  // (the core of Contextual Retrieval)
  async indexArticle(articleId: string, fullText: string): Promise<void> {
    const rawChunks = this.splitIntoChunks(fullText, 400);
    for (const [i, chunk] of rawChunks.entries()) {
      // Generate a description of the role each chunk plays in the full article
      const contextualDescription = await callClaude(
        `Full text of article "${articleId}":\n${fullText}\n\n` +
        `Explain in two sentences where the following chunk sits within this article.\n\nChunk: ${chunk}`
      );
      this.chunks.push({
        id: `${articleId}-${i}`,
        content: chunk,
        contextualDescription,
      });
    }
  }

  // At search time: select only relevant chunks via hybrid BM25 + embedding search
  async buildContext(query: string, topK = 5): Promise<string> {
    // A real implementation would combine a vector DB (e.g. Pinecone) with BM25 scores
    const relevant = this.hybridSearch(query, topK);
    // Inject only relevant chunks instead of full documents → saves tokens, keeps focus
    return relevant
      .map((c) => `[Background]\n${c.contextualDescription}\n${c.content}`)
      .join("\n\n");
  }

  private hybridSearch(query: string, topK: number): NewsChunk[] {
    // Hybrid weighting: 0.4 * BM25 + 0.6 * embedding similarity
    return this.chunks.slice(0, topK); // simplified for this example
  }

  private splitIntoChunks(text: string, maxWords: number): string[] {
    const words = text.split(" ");
    const chunks: string[] = [];
    for (let i = 0; i < words.length; i += maxWords) {
      chunks.push(words.slice(i, i + maxWords).join(" "));
    }
    return chunks;
  }
}

Next, run category-specific sub-agents in parallel with the Isolate strategy.
// [Isolate] Sub-agent — performs a narrow task in an independent context
// and returns only a summary
interface SubTaskResult {
  taskId: string;
  summary: string;
  keyFindings: string[];
}

async function runSubAgent(
  taskId: string,
  specificTask: string
): Promise<SubTaskResult> {
  // The sub-agent knows only the scope of its own task;
  // none of the orchestrator's history is passed in (isolation)
  const response = await callClaude(
    `[Subtask #${taskId}]\n${specificTask}\n\n` +
    `When finished, respond strictly in JSON:\n` +
    `{ "summary": "...", "keyFindings": ["...", "..."] }`
  );
  // Only the summarized result reaches the orchestrator — detail context stays isolated
  const parsed = JSON.parse(extractJson(response));
  return { taskId, ...parsed };
}

The Compress strategy then compacts automatically when the context approaches its limit.
// [Compress] Compaction trigger — summarize the history and restart at 80% usage
const CONTEXT_WINDOW_TOKENS = 200_000;
const COMPACTION_THRESHOLD = 0.80;

function estimateTokenCount(messages: Message[]): number {
  // Rough heuristic: ~4 characters per token
  return messages.reduce((acc, m) => acc + m.content.length / 4, 0);
}

async function compactIfNeeded(
  messages: Message[],
  taskGoal: string
): Promise<Message[]> {
  const usage = estimateTokenCount(messages) / CONTEXT_WINDOW_TOKENS;
  if (usage < COMPACTION_THRESHOLD) return messages;
  console.log(`[Compress] Context ${Math.round(usage * 100)}% full → running compaction`);
  const historyText = messages
    .map((m) => `[${m.role}]: ${m.content}`)
    .join("\n");
  const summary = await callClaude(
    `Summarize the conversation history for task "${taskGoal}" in this format:\n` +
    `1. Completed work (be specific)\n2. Current state and open items\n` +
    `3. Key facts gathered\n4. Constraints and errors to watch for\n\n` +
    `History:\n${historyText}`
  );
  // Start a fresh context from a single compressed message
  return [
    {
      role: "user",
      content: `[Summary of previous work]\n${summary}\n\nContinue the task.`,
    },
  ];
}

Finally, a production-level orchestrator combines all four strategies.
// Orchestrator integrating all four strategies
async function productionNewsAgent(
  taskId: string,
  articleUrls: string[]
): Promise<string> {
  // [Write] Load checkpoints from previous runs
  const scratchpad = new AgentScratchpad(taskId);
  await scratchpad.load();
  const knowledgeBase = new NewsKnowledgeBase();
  let messages: Message[] = [];

  // [Isolate] Run category-specific sub-agents in parallel
  const categories = ["Politics", "Economy", "Technology", "Society"];
  const subResults = await Promise.all(
    categories.map((category) =>
      runSubAgent(
        `${taskId}-${category}`,
        `From the URL list below, select only articles related to "${category}" and summarize the key points:\n` +
        articleUrls.join("\n")
      )
    )
  );

  // [Write] Persist sub-agent results to the scratchpad
  for (const result of subResults) {
    await scratchpad.write({
      action: `Category summary complete: ${result.taskId}`,
      result: result.summary,
      notes: result.keyFindings.join("; "),
    });
  }

  // [Compress] Auto-compact when the orchestrator context hits the threshold
  messages.push({
    role: "user",
    content: `Per-category summaries:\n${scratchpad.getSummary()}`,
  });
  messages = await compactIfNeeded(messages, "Generate daily news briefing");

  // [Select] Retrieve related past articles from the knowledge base to enrich context
  const currentSummary = subResults.map((r) => r.summary).join(" ");
  const backgroundContext = await knowledgeBase.buildContext(currentSummary);
  return await callClaude(
    `[Today's category summaries]\n${scratchpad.getSummary()}\n\n` +
    `[Related background]\n${backgroundContext}\n\n` +
    `Combine the above into a daily reader newsletter of at most 500 characters.`
  );
}

Before vs After quantitative comparison:
| Item | Before (No Strategy) | After (Combination of 4 Strategies) |
|---|---|---|
| Processing 200 articles | Impossible — terminates around article 100 | Possible — Compress + Write combination |
| Recovery from a mid-run error | Full restart from the beginning | Resume from the last checkpoint |
| Token usage (200 articles) | ~160k accumulated (everything piled into one context) | ~40k — only relevant chunks loaded selectively |
| Processing speed | Sequential (200 articles × ~3 seconds each) | Parallel by category (4 sub-agents) |
| Cross-category contamination | Political articles affect the technology summary | Fully separated via Isolate |
Pros and Cons Analysis
Advantages
| Strategy | Key Advantages | Representative Use Cases |
|---|---|---|
| Write | Long-term task persistence across session boundaries | Codebase analysis, Pokémon game agent |
| Select | Reduced Token Costs + Improved Accuracy through Centralized Relevant Information | RAG System, Domain Knowledge Agent |
| Compress | Long-term tasks possible with virtually unlimited context length | Long-term conversation agent, iterative execution pipeline |
| Isolate | Increased throughput through parallel processing + prevention of context pollution | Research agent, multistep analysis system |
Disadvantages and Precautions
| Strategy | Disadvantages | Countermeasures |
|---|---|---|
| Write | External storage I/O latency, file system dependency | Asynchronous writes + added in-memory cache layer |
| Select | Overall performance depends on search quality | Introduction of Contextual Retrieval method + Establishment of search evaluation pipeline |
| Compress | Permanent loss of critical information if summary quality is low | Back up original with Write before compression, explicitly specify items to retain |
| Isolate | Token cost scales with the number of sub-agents (up to 15×) | Cap the number of sub-agents; clarify task-decomposition criteria |
The Most Common Mistakes in Practice
- Using Isolate without Select: the entire document set is injected into each sub-agent's context, so token cost multiplies with every agent added. Isolate and Select should always be applied as a pair.
- Write-only logging: entries are recorded to the scratchpad, but the content is never read back into the context on the next agent run. Write is only meaningful when completed as a "write → read" cycle.
- Triggering Compress too late: the context window fills up, the run dies with an error, and everything restarts from the beginning. It is safer to trigger compaction preemptively at 80% and leave a checkpoint with Write beforehand.
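The "write → read" cycle from the second pitfall can be made explicit: every run starts by loading persisted checkpoints and injecting a recap into the opening prompt. A simplified in-memory sketch; `Checkpoint` and `buildOpeningPrompt` are illustrative names, not part of the scratchpad class shown earlier.

```typescript
// Write is only complete as a "write → read" cycle: what one run persists,
// the next run must load back into its opening context.
interface Checkpoint {
  step: number;
  notes: string;
}

function buildOpeningPrompt(task: string, checkpoints: Checkpoint[]): string {
  if (checkpoints.length === 0) return task; // first run: no prior state
  const recap = checkpoints.map((c) => `[Step ${c.step}] ${c.notes}`).join("\n");
  // Inject persisted progress so the agent resumes instead of restarting
  return `[Previous progress]\n${recap}\n\n${task} (continue from step ${checkpoints.length + 1})`;
}
```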
In Conclusion
The core of context engineering is "putting the right information in the right place at the right time." Externalizing state with Write, selecting only relevant information with Select, breaking through window limits with Compress, and preventing contamination with Isolate—these four strategies complement each other and form the design basis for creating production-grade agents.
3 Steps to Start Right Now:
- Diagnosis: If the agent is currently accumulating the entire conversation history in the context, start by applying the Write + Compress combination. Adding just a single scratchpad file enables the ability to resume long-term tasks.
- Improvement: If you are using RAG, reduce search errors by prepending in-document location descriptions to each chunk using Anthropic Contextual Retrieval. Search accuracy increases significantly without major changes to existing code.
- Extension: If a single agent is performing too many roles, split the tasks into sub-agents, but apply an Isolate strategy to return only the summary to the orchestrator. This achieves the effects of parallel processing and preventing context pollution simultaneously.
Next Post: Building a Custom MCP (Model Context Protocol) Server — A Practical Pattern for Injecting Domain-Specific Context into Agents in Real-Time
Reference Materials
- Effective context engineering for AI agents | Anthropic Engineering
- Context Engineering for Agents | LangChain Blog
- Contextual Retrieval | Anthropic
- Anthropic Multi-Agent Research System | Anthropic
- Context Management and Compaction | Claude Cookbooks DeepWiki
- Agentic Context Engineering: The Complete 2025 Guide | Sundeep Teki
- Context Engineering 101 — What We Can Learn from Anthropic | omnigeorgio
- Claude's Context Engineering Secrets | Bojie Li