Implementing Multi-Agent Orchestration with LangGraph · A2A — From Shared Memory Design to Production Code
If you've spent enough time with Claude Code, you'll eventually hit a wall. Hand it a complex refactoring job and the context window fills up; try running test writing and documentation in parallel and the agent starts mixing everything together. At first I thought, "Maybe I just need to write better prompts?" — but in practice, as projects scale up, cramming everything into a single agent clearly has its limits.
That's why multi-agent orchestration is getting so much attention lately. It's an architecture where each agent maintains its own memory and role, sharing only the information it needs to collaborate. By the end of this article, you'll have hands-on experience building a 3-agent pipeline with LangGraph and implementing agent capability discovery using the A2A protocol.
To be honest, there are plenty of cases where people adopt multi-agent systems because they "look cool," only to get hit with a massive token bill. So this article doesn't just introduce the technology — it also covers the real-world tradeoffs. If you've used LangChain or LangGraph even once as a backend developer, you can follow right along. If this is your first time, skimming the LangGraph official docs Quick Start first will make things smoother.
Core Concepts
What Is Multi-Agent Orchestration
Multi-agent orchestration is an architecture where multiple AI agents each independently maintain their own memory and context while collaborating toward a shared goal. Think of it like a team project: each team member has their own laptop and area of expertise, and they exchange key information through shared documents like Notion or Confluence.
The three core components of this structure are as follows.
| Concept | Description | Analogy |
|---|---|---|
| Private Memory | Independent memory optimized for each agent's role | A team member's personal laptop |
| Shared Context | A shared store that all agents read from and write to | The team's shared document |
| Orchestrator | A higher-level agent that coordinates the overall flow and spawns agents | The team lead |
Private Memory (Independent Agent Memory): Each agent separately maintains short-term and long-term memory specialized for its own role. It cannot directly access the internal state of other agents, and must communicate only through designated channels.
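The separation described above can be sketched in a few lines of plain Python. This is an illustrative toy (the `Agent` class and channel are hypothetical, no framework assumed): each agent owns a private dict, and the only crossing point is an explicit shared channel.

```python
# Minimal sketch of private vs. shared memory — illustrative names, no framework
class Agent:
    def __init__(self, name: str):
        self.name = name
        self.private_memory: dict = {}  # only this agent reads/writes here

    def send(self, channel: list, payload: dict) -> None:
        # The designated channel is the ONLY way state leaves the agent
        channel.append({"from": self.name, **payload})

shared_channel: list = []  # the team's "shared document"

researcher = Agent("researcher")
researcher.private_memory["notes"] = "raw findings stay local"
researcher.send(shared_channel, {"fact": "API latency regressed after v2.3"})

writer = Agent("writer")
# The writer sees only what was explicitly shared, never the researcher's notes
visible = [m["fact"] for m in shared_channel if "fact" in m]
print(visible)  # → ['API latency regressed after v2.3']
```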
⚠️ Only use this when: Multi-agent shines when you have 2 or more subtasks that can be parallelized independently, and each task requires different domain knowledge. Conversely, for sequential and simple tasks, attaching a good prompt to a single agent is far more economical.
MCP and A2A — Two Easy-to-Confuse Protocols
When I first encountered these, I thought "aren't they both agent-related protocols?" — but their roles are clearly distinct.
```
[Agent A] ──A2A──▶ [Agent B]
    │                  │
   MCP                MCP
    │                  │
    ▼                  ▼
[Tool/DB/API]    [Tool/DB/API]
```
- MCP (Model Context Protocol, Anthropic): The connection layer between agents and tools/data sources — the interface through which an agent communicates with the outside world.
- A2A (Agent2Agent Protocol, Google): The agent-to-agent communication layer. Provides a standardized message format for agents to discover capabilities, negotiate tasks, and synchronize state.
Key Insight: MCP and A2A are not competing — they're complementary. You use MCP to connect tools and A2A for agents to negotiate with each other.
When Google announced A2A in April 2025, it made waves with over 50 partners including Microsoft, Salesforce, SAP, and Atlassian joining at launch. It was later incorporated into the Linux Foundation as an open-source project, securing vendor neutrality. That said, as we move through the second half of 2025, most of the agent ecosystem appears to be converging around MCP. A2A continues to show strength in long-running workflow and human-in-the-loop scenarios.
Shared Memory Architecture — Two Key Patterns
When multiple agents share memory, the most critical concern is consistency. It's a situation you'll frequently encounter in production: when two agents simultaneously modify the same shared state, conflicts arise. Let's go over two patterns that are widely used in practice.
| Pattern | Approach | Pros | Cons |
|---|---|---|---|
| Serialized Turns | Agents access shared memory in sequence | Simple to implement, no conflicts | Sacrifices parallelism |
| Semantic Locking | Query vector DB for "already known" facts before writing to prevent duplicate writes | Prevents redundant analysis | Query overhead |
A third option is the bipartite access graph (dynamically connecting users, agents, and resources in a graph to apply fine-grained read/write policies), but honestly, it's almost never used in production — it's usually over-engineering. If you need an advanced architecture, the Collaborative Memory paper is worth a look.
Reducer Pattern: A function that defines how to merge updates when multiple agents update state at the same time. In LangGraph, declaring `Annotated[list, operator.add]` automatically accumulates items into the list. It's the simplest and safest way to prevent concurrent write conflicts.
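The reducer's merge semantics can be verified in isolation: `operator.add` on two lists is plain concatenation, which is the merge LangGraph applies when two node updates arrive for the same annotated key. A minimal sketch (this demonstrates only the merge semantics, not LangGraph internals):

```python
import operator
from typing import Annotated, TypedDict

# The annotation LangGraph reads: merge updates to `facts` via operator.add
class SharedState(TypedDict):
    facts: Annotated[list, operator.add]

# What effectively happens when two agents return updates for the same key
current = ["fact from agent A"]
update = ["fact from agent B"]
merged = operator.add(current, update)  # list concatenation — no overwrites
print(merged)  # → ['fact from agent A', 'fact from agent B']
```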
The Blackboard Architecture also deserves mention. It's a pattern where agents read from and write to a shared chalkboard (the blackboard) to collaboratively solve a problem, with the orchestrator dynamically deciding which agent to activate next based on the blackboard state. There are real research examples of software design systems where 9 specialized agents collaborate using a blackboard-based approach.
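A toy version of the blackboard loop makes the control flow concrete (agent logic is stubbed with hypothetical functions; the research systems mentioned above are far richer): the orchestrator inspects the board and activates the next agent until a stop condition holds.

```python
# Toy blackboard loop — illustrative only; agent bodies are stubs
blackboard = {"query": "summarize incident", "facts": [], "draft": None}

def researcher(board: dict) -> None:
    board["facts"].append("collected fact")

def writer(board: dict) -> None:
    board["draft"] = f"report based on {len(board['facts'])} fact(s)"

def pick_next(board: dict):
    # Orchestrator: choose the next agent purely from blackboard state
    if not board["facts"]:
        return researcher
    if board["draft"] is None:
        return writer
    return None  # done

while (agent := pick_next(blackboard)) is not None:
    agent(blackboard)

print(blackboard["draft"])  # → report based on 1 fact(s)
```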
Practical Application
Example 1: Parallel Codebase Processing with Claude Code
First, let's implement the core structure of a multi-agent system directly in Python — an orchestrator spawning worker agents and sharing state via a blackboard. Since this is written with the vanilla anthropic SDK without LangGraph, the structure should be easier to grasp.
```python
# Claude Code multi-agent — orchestrator + worker agent configuration example
import anthropic
import concurrent.futures

client = anthropic.Anthropic()

# Orchestrator: analyzes the overall task and breaks it into subtasks
orchestrator_prompt = """
You are the orchestrator of a software engineering team.
Analyze the given codebase change request and distribute work to these three agents:
- test_agent: responsible for writing test cases
- refactor_agent: responsible for code refactoring
- doc_agent: responsible for documentation
Each agent has an independent context and shares issues through the blockers list.
"""

def spawn_agent(role: str, task: str, shared_context: dict) -> str:
    agent_prompts = {
        "test_agent": "You are a testing expert. Write tests for the given code, focusing on edge cases.",
        "refactor_agent": "You are a refactoring expert. Improve the code with both readability and performance in mind.",
        "doc_agent": "You are a technical-writing expert. Write documentation a developer can understand immediately.",
    }
    response = client.messages.create(
        model="claude-sonnet-4-5",  # substitute whichever Claude model id you have access to
        max_tokens=4096,
        system=agent_prompts[role],
        messages=[
            {
                "role": "user",
                "content": f"Task: {task}\n\nShared context:\n{shared_context}"
            }
        ]
    )
    return response.content[0].text

# Shared context (acts as the blackboard)
shared_context = {
    "codebase_summary": "Python FastAPI-based REST API server",
    "changed_files": ["src/user_service.py", "src/auth.py"],
    "blockers": []  # blocker list shared between agents
}

tasks = {
    "test_agent": "Write unit tests for user_service.py",
    "refactor_agent": "Refactor the token validation logic in auth.py",
    "doc_agent": "Document the changed API endpoints"
}

# Run the three agents in parallel
results = {}
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = {
        executor.submit(spawn_agent, role, task, shared_context): role
        for role, task in tasks.items()
    }
    for future in concurrent.futures.as_completed(futures):
        role = futures[future]
        results[role] = future.result()
```

| Component | Role | Memory Scope |
|---|---|---|
| `orchestrator_prompt` | Overall task decomposition and agent assignment | Full codebase context |
| `spawn_agent()` | Executes role-specific agents | Individual independent context |
| `shared_context` | Acts as the blackboard (shared state) | Read/write by all agents |
| `blockers` | Issue-sharing channel between agents | Simulates an inter-agent mailbox |
Example 2: LangGraph-Based Multi-Agent System with Shared Vector Memory
Where Example 1 demonstrated "parallel agent execution" itself, this time we express how agents structurally read from and write to shared state using LangGraph's StateGraph. In specialized domains like finance or legal, a valid pattern is having each agent maintain role-specific embedding vector stores while recording and querying facts to a shared vector DB (Qdrant, Weaviate, etc.). It prevents redundant analysis and lets agents reuse facts already discovered — quite effective in production.
```python
# LangGraph-based shared vector memory multi-agent example
# Required packages: pip install langgraph qdrant-client sentence-transformers
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver
from typing import TypedDict, Annotated
import operator

# Shared state schema — the blackboard all agents read from and write to
class SharedState(TypedDict):
    query: str
    facts: Annotated[list, operator.add]  # reducer: multiple agents can append
    analysis_results: Annotated[list, operator.add]
    final_answer: str
    active_agent: str

# ── The functions below are interfaces that require an actual implementation ──
# search_shared_vector_db(query): query the Qdrant collection for similar facts
# store_to_shared_vector_db(facts): store new facts in Qdrant
# collect_facts(query): collect new facts via an LLM or external API
# load_private_memory(agent_id): load domain knowledge from the agent's own vector store
# run_analysis(facts, knowledge): run analysis using collected facts + domain knowledge
# synthesize(results): synthesize multiple analysis results into a final answer
# ──────────────────────────────────────────────────────────────────────────────

# Research agent — dedicated to fact collection
def research_agent(state: SharedState) -> dict:
    # Query the shared vector DB for existing facts (semantic locking pattern)
    existing_facts = search_shared_vector_db(state["query"])
    if existing_facts:
        # Skip redundant collection if the facts are already known
        return {"facts": existing_facts, "active_agent": "analysis_agent"}
    # Collect new facts and store them in the shared vector DB
    new_facts = collect_facts(state["query"])
    store_to_shared_vector_db(new_facts)
    return {"facts": new_facts, "active_agent": "analysis_agent"}

# Analysis agent — fact-based analysis (uses independent long-term memory)
def analysis_agent(state: SharedState) -> dict:
    # Load domain knowledge from this agent's own private memory
    domain_knowledge = load_private_memory("analysis_agent")
    analysis = run_analysis(state["facts"], domain_knowledge)
    return {"analysis_results": [analysis], "active_agent": "synthesis_agent"}

# Synthesis agent — composes the final answer
def synthesis_agent(state: SharedState) -> dict:
    final = synthesize(state["analysis_results"])
    return {"final_answer": final, "active_agent": "end"}

# Routing logic — the orchestrator picks the next agent from the blackboard state
def route_next(state: SharedState) -> str:
    routing_map = {
        "analysis_agent": "analysis_agent",
        "synthesis_agent": "synthesis_agent",
        "end": END
    }
    return routing_map.get(state["active_agent"], END)

# Graph construction
workflow = StateGraph(SharedState)
workflow.add_node("research_agent", research_agent)
workflow.add_node("analysis_agent", analysis_agent)
workflow.add_node("synthesis_agent", synthesis_agent)
workflow.set_entry_point("research_agent")
workflow.add_conditional_edges("research_agent", route_next)
workflow.add_conditional_edges("analysis_agent", route_next)
workflow.add_conditional_edges("synthesis_agent", route_next)

# Checkpointer — enables pause and resume, also useful for debugging
checkpointer = MemorySaver()
app = workflow.compile(checkpointer=checkpointer)
```

The key in the code above is the shared state fields declared with `Annotated[list, operator.add]`. Thanks to this reducer pattern, merging is handled automatically even when multiple agents append items to the same list at the same time.
Semantic Lock: A pattern where an agent first performs a vector similarity search to check whether something similar already exists before writing new information to shared memory. `search_shared_vector_db` plays exactly this role. For a real implementation, you can spin up a local Qdrant instance with `docker run -p 6333:6333 qdrant/qdrant` and connect via the `qdrant-client` library.
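As a dependency-free illustration of the same check-before-write idea, here is an in-memory sketch using cosine similarity over toy 3-dimensional vectors (in production you would replace the list with a Qdrant collection and the toy vectors with real embeddings; all names here are illustrative):

```python
import math

# In-memory stand-in for the shared vector DB: (embedding, fact) pairs
shared_store: list[tuple[list[float], str]] = []

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def store_fact(vec: list[float], fact: str, threshold: float = 0.95) -> str:
    # Semantic lock: search first, write only if nothing similar exists
    for existing_vec, existing_fact in shared_store:
        if cosine(vec, existing_vec) >= threshold:
            return existing_fact  # near-duplicate — reuse instead of rewriting
    shared_store.append((vec, fact))
    return fact

store_fact([1.0, 0.0, 0.0], "Q3 revenue grew 12%")
result = store_fact([0.999, 0.01, 0.0], "Q3 revenue rose 12%")  # near-duplicate
print(result)             # → Q3 revenue grew 12%
print(len(shared_store))  # → 1
```

The duplicate write is silently collapsed into the existing fact, which is exactly what saves the redundant analysis pass in Example 2.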
Example 3: Implementing Agent Capability Discovery with the A2A Protocol
Where the previous two examples dealt with agent collaboration within a single system, A2A is the protocol for collaborating with external agents across service boundaries. It shines when two independent systems need to cooperate using only standard interfaces — without exposing each other's internal implementation. For example, your company's order agent asking a partner's inventory agent for the optimal order quantity. The TypeScript example below is based on Node.js 18+ (crypto.randomUUID and fetch are both built-in since Node 18).
```typescript
// A2A Protocol — Agent Card (capability advertisement) example
// Runtime: Node.js 18+ (crypto.randomUUID and fetch are built in)
const inventoryAgentCard = {
  name: "inventory-optimizer-agent",
  version: "1.0.0",
  description: "Dedicated agent for real-time inventory optimization and order prediction",
  capabilities: {
    streaming: true,              // SSE-based streaming supported
    pushNotifications: true,      // async notifications supported
    stateTransitionHistory: true
  },
  skills: [
    {
      id: "optimize-stock-level",
      name: "Stock level optimization",
      description: "Calculate the optimal order quantity from current inventory data",
      inputModes: ["application/json"],
      outputModes: ["application/json", "text/plain"]
    },
    {
      id: "predict-demand",
      name: "Demand forecasting",
      description: "Forecast demand from historical sales data",
      inputModes: ["application/json"],
      outputModes: ["application/json"]
    }
  ]
}

interface ProductData {
  productId: string
  currentStock: number
  salesHistory: number[]
}

interface OptimizationResult {
  recommendedOrderQuantity: number
  predictedDemand: number
}

// A2A client — delegates a task to another agent
async function delegateToInventoryAgent(
  productData: ProductData,
  agentEndpoint: string
): Promise<OptimizationResult> {
  const taskRequest = {
    id: crypto.randomUUID(),
    message: {
      role: "user",
      parts: [{
        type: "data",
        data: productData
      }]
    }
  }
  // A2A cooperates through standard interfaces without exposing internal memory or tools
  const response = await fetch(`${agentEndpoint}/tasks/send`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(taskRequest)
  })
  return await response.json() as OptimizationResult
}
```

Reports indicate that food companies like Tyson Foods and Gordon Food Service are using similar patterns for supply chain optimization. Each company's inventory agent keeps its internal state hidden while exchanging product data and optimization information only through standardized channels.
Pros and Cons Analysis
Advantages
| Item | Details |
|---|---|
| Parallel Processing | Independent agents execute concurrently, significantly reducing overall processing time |
| Role Specialization | Each agent maintains domain-specific memory, improving accuracy |
| Isolation | Internal state is protected without memory contamination between agents |
| Scalability | Functionality can be extended simply by adding new agents, with minimal impact on existing workflows |
| Standardization | A2A/MCP enables vendor-neutral agent composition |
Disadvantages and Caveats
| Item | Details | Mitigation |
|---|---|---|
| Token Cost | A 3-agent team consumes 2.5–4x more tokens than a single agent | Minimize the number of agents; apply parallelization only to tasks that genuinely need it |
| Shared Memory Bottleneck | Centralized shared memory risks throughput bottlenecks and single points of failure | Use sharding or distributed vector DBs; add a read cache layer |
| Consistency Issues | Conflicts can occur when multiple agents simultaneously modify shared state | Choose from reducer pattern, semantic locking, or serialized turns based on the situation |
| Orchestration Complexity | Coordination logic grows exponentially as the number of agents increases | Start with 3 or fewer agents and scale gradually |
| Debugging Difficulty | Tracing and reproducing distributed agent behavior is harder than with a single agent | LangSmith, checkpoint-based reproduction, and structured logging are essential |
| A2A Ecosystem Uncertainty | As of H2 2025, the ecosystem is converging on MCP; A2A adoption is slowing | Limit A2A to long-running workflows; prioritize MCP for general-purpose connectivity |
In practice, the third item — consistency issues — is the one that trips people up most often. It looks simple at first, but as you add more agents, tracking who touched the shared state and when becomes increasingly difficult. Getting the reducer pattern right early on makes a big difference down the line.
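One lightweight mitigation is to audit every write to the shared state. A thin wrapper (a sketch with illustrative names, not a library API) that records who wrote which key makes the "who touched this and when" question answerable after the fact:

```python
from datetime import datetime, timezone

class AuditedState:
    """Shared-state wrapper that logs every write with its author."""
    def __init__(self) -> None:
        self._data: dict = {}
        self.audit_log: list[dict] = []

    def write(self, agent: str, key: str, value) -> None:
        # Record who wrote what, and when, before mutating the state
        self.audit_log.append({
            "agent": agent,
            "key": key,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self._data[key] = value

    def read(self, key: str):
        return self._data.get(key)

state = AuditedState()
state.write("research_agent", "facts", ["latency regression"])
state.write("analysis_agent", "facts", ["latency regression", "root cause: N+1 query"])

# Every mutation is now attributable to an agent
print([e["agent"] for e in state.audit_log])  # → ['research_agent', 'analysis_agent']
```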
The Most Common Mistakes in Production
- **Creating too many agents from the start** — It's far safer to begin with 3 or fewer agents, confirm where parallelization is actually needed, and then expand. Every extra agent increases token costs and coordination complexity at the same time.
- **Stuffing too much information into the shared context** — Not every agent needs every piece of information. It helps to ask "does this agent actually need this?" for each item: when the shared context grows bloated, token consumption increases for all agents at once.
- **Deferring observability until later** — When one agent produces a wrong result and you can't trace which agent produced what output from what input, debugging becomes nearly impossible. Attach tracing tools like LangSmith and checkpoint storage from the very beginning.
Closing Thoughts
Multi-agent orchestration isn't about "using multiple agents" — it's a design philosophy of each agent guarding its own memory while sharing only the information that's truly necessary.
Three steps you can start right now:
- **Build a 2-agent pipeline with LangGraph** — After installing with `pip install langgraph`, try building a simple pipeline with just `research_agent` and `analysis_agent` from the example code above. You'll get hands-on intuition for shared state and routing.
- **Experiment with shared vector memory on a local Qdrant instance** — Spin one up with `docker run -p 6333:6333 qdrant/qdrant`, then implement `search_shared_vector_db` and `store_to_shared_vector_db` from Example 2 as real Qdrant calls. You'll get a direct feel for the semantic locking pattern.
- **Attach an A2A Agent Card to your own service** — Use the `inventoryAgentCard` structure from Example 3 as a reference and define your agent's capabilities as JSON. When the time comes to collaborate with an external agent, you'll have the foundation ready.
When it comes to resolving shared memory conflicts, the final arbiter must ultimately be a human — in the next article, we'll cover LangGraph's Human-in-the-Loop pattern: interrupt design strategies where agents request human approval before making critical decisions.
References
- Announcing the Agent2Agent Protocol (A2A) - Google Developers Blog
- What Is Agent2Agent (A2A) Protocol? | IBM
- A2A Protocol Explained: Secure Interoperability for Agentic AI 2026 - OneReach
- Empowering multi-agent apps with the open A2A protocol - Microsoft Cloud Blog
- MCP vs A2A: Protocols for Multi-Agent Collaboration 2026 - OneReach
- GitHub - a2aproject/A2A
- What happened to Google's A2A? - fka.dev
- Intrinsic Memory Agents: Heterogeneous Multi-Agent LLM Systems - arXiv
- Collaborative Memory: Multi-User Memory Sharing in LLM Agents - arXiv
- A-MEM: Agentic Memory for LLM Agents - arXiv
- AI Agent Memory: Comparative Analysis of LangGraph, CrewAI, AutoGen - DEV Community
- Claude Code Agent Teams: Multi-Agent Development Guide - Lushbinary
- Multi-Agent Coordination Patterns: Architectures Beyond the Hype - Medium
- Building Intelligent Multi-Agent Systems with MCPs and the Blackboard Pattern - Medium