How to Connect Parallel Execution, Human-in-the-Loop, and Multi-Agent with Mastra Workflow in a Single TypeScript File
If you've ever built an AI pipeline yourself, you've probably run into this situation: you lined up three LLM calls sequentially, only to realize the first two had no dependency on each other and could have run in parallel. Or the opposite — you had everything automated, but then hit a moment where "a human really should look at this," with no way to pause.
Mastra is a TypeScript-native AI agent framework. It bundles workflows, agents, RAG, and memory into a single package. The Workflow feature is particularly interesting — it's a deterministic pipeline that lets you explicitly define execution order, branching, and parallelism in code. Being able to handle an entire AI pipeline with TypeScript alone, without Python, is a significant advantage in practice.
This article walks through three patterns — Parallel Execution, Human-in-the-Loop (HITL), and Multi-Agent Orchestration — with real code, showing how you can combine them inside a single TypeScript pipeline. If you're building backends or full-stack apps in TypeScript and are interested in constructing AI pipelines, you'll be able to apply this right away.
Table of Contents
- Three Patterns: Parallel Execution, HITL, Multi-Agent
- Practical Application
- Pros and Cons
- Closing Thoughts
- References
Three Patterns: Parallel Execution, HITL, Multi-Agent
Parallel Execution
When building real workflows, you'll often find yourself asking, "These two steps have nothing to do with each other — why do they need to wait?" In a content generation pipeline, "topic research" and "competitor analysis" are completely independent tasks.
The .parallel() API runs multiple steps simultaneously and doesn't advance to the next stage until all of them complete.
import { createWorkflow } from '@mastra/core/workflows';
// inputSchema/outputSchema omitted for brevity
const contentWorkflow = createWorkflow({ id: 'content-pipeline' })
  .then(fetchTopicStep) // Fetch the topic
  .parallel([researchStep, competitorAnalysisStep]) // Run concurrently
  .then(editorialStep) // Merge results and edit
  .commit();
In older versions (v0.3 and below), you expressed fan-out with .after(). It still works, but for new projects, .parallel() makes the intent much clearer.
// Legacy approach: fan-out using .after()
myWorkflow
  .step(fetchTopicStep)
  .then(researchStep)
  .after(fetchTopicStep)
  .step(competitorAnalysisStep)
  .commit();
Note: If one step inside a parallel block throws an exception, the entire block fails. If you want to allow partial failures, handle them with try/catch inside each step and include the error state in the return value.
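The partial-failure advice in the note can be sketched without Mastra at all. The example below uses plain Promise.all as a stand-in for a parallel block. The names settle, StepResult, research, competitorAnalysis, and runParallel are hypothetical and exist only for illustration; they are not part of Mastra's API.

```typescript
// A step result that carries its own error state instead of throwing.
type StepResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// Wraps a step body in try/catch so a failure becomes data, not an exception.
async function settle<T>(body: () => Promise<T>): Promise<StepResult<T>> {
  try {
    return { ok: true, value: await body() };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Hypothetical step bodies: one succeeds, one fails.
async function research(topic: string): Promise<string> {
  return `summary of ${topic}`;
}
async function competitorAnalysis(_topic: string): Promise<string> {
  throw new Error('upstream timeout');
}

// Stand-in for a parallel block: both bodies run concurrently and,
// because each is wrapped in settle(), neither can reject the whole group.
async function runParallel(topic: string) {
  const [researchResult, competitorResult] = await Promise.all([
    settle(() => research(topic)),
    settle(() => competitorAnalysis(topic)),
  ]);
  return { researchResult, competitorResult };
}
```

The downstream step then inspects each ok flag and decides whether a missing result is acceptable, rather than having the failure decided for it by the parallel block.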
Human-in-the-Loop (HITL)
This is the trickiest part of an automated pipeline. The requirement "a human should review this under certain conditions" is common, but implementing it means pausing the workflow somewhere, waiting for an external event (an HTTP request, a UI button click, etc.), and then resuming.
Mastra handles this with a suspend() / resume() mechanism.
Understanding how it works before reading the HITL code makes it much easier to follow. When you call suspend(), the workflow pauses. When it's later resumed with resume(), the same execute function runs again from the beginning. At that point, resumeData is populated, so the key pattern is using a branch like if (resumeData?.approved === undefined) to distinguish between the initial run and a resume.
import { createStep, createWorkflow } from '@mastra/core/workflows';
import { z } from 'zod';
const approvalStep = createStep({
  id: 'approval-step',
  inputSchema: z.object({ content: z.string() }),
  outputSchema: z.object({ result: z.string() }),
  resumeSchema: z.object({
    approved: z.boolean(),
    feedback: z.string().optional(),
  }),
  suspendSchema: z.object({ reason: z.string() }),
  async execute({ inputData, resumeData, suspend }) {
    // First run: no approval decision yet, so suspend
    if (resumeData?.approved === undefined) {
      return await suspend({ reason: 'Reviewer approval required.' });
    }
    // After resume: resumeData contains the human's decision
    if (!resumeData.approved) {
      throw new Error(`Rejected: ${resumeData.feedback ?? 'No feedback provided'}`);
    }
    return { result: inputData.content };
  },
});
Resuming is done externally by calling workflow.resume(). You can use it in an HTTP endpoint or webhook handler like this:
// Example: called from a POST /approve handler.
// Note: the exact call shape depends on your Mastra version; in some releases
// resume is invoked on the run object (run.resume({ step, resumeData })) instead.
await workflow.resume({
  runId: 'run-123',
  stepId: 'approval-step',
  resumeData: { approved: true, feedback: 'Looks good' },
});
I was skeptical at first too ("does that actually work?"), but the entire execution state at the point of suspension is serialized as a snapshot and saved to storage, so even if the server restarts or a deployment happens, the workflow is restored exactly as it was.
Snapshot: When suspend() is called, the entire current execution context (inputs, intermediate results, execution pointer) is serialized and saved to persistent storage like LibSQL or PostgreSQL. When resume() is called, that snapshot is loaded and execution continues from the suspended step, whose execute function runs again, this time with resumeData populated. Because this is the default behavior, no separate distributed workflow infrastructure is required, which is quite practical.
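To make the mechanism concrete, here is a deliberately tiny, Mastra-free simulation of the suspend/resume cycle described above. It is a sketch of the idea, not Mastra's implementation: snapshotStore, start, resume, execute, and the Snapshot shape are all hypothetical, and a Map of JSON strings stands in for LibSQL or PostgreSQL.

```typescript
type ResumeData = { approved: boolean; feedback?: string };
type Input = { content: string };
type Snapshot = { stepId: string; inputData: Input };
type StepOutcome = { suspended: true } | { suspended: false; result: string };
type RunOutcome =
  | { status: 'suspended'; runId: string }
  | { status: 'success'; result: string }
  | { status: 'failed'; error: string };

// Stand-in for persistent storage: snapshots are kept as JSON strings.
const snapshotStore = new Map<string, string>();

// The step body. It runs from the top on the first call AND on resume;
// the presence of resumeData is the only thing that tells the two apart.
function execute(inputData: Input, resumeData?: ResumeData): StepOutcome {
  if (resumeData === undefined) return { suspended: true };
  if (!resumeData.approved) {
    throw new Error(`Rejected: ${resumeData.feedback ?? 'No feedback provided'}`);
  }
  return { suspended: false, result: inputData.content };
}

// First run: execute, and if the step suspends, serialize a snapshot.
function start(runId: string, inputData: Input): RunOutcome {
  const out = execute(inputData);
  if (out.suspended) {
    const snapshot: Snapshot = { stepId: 'approval-step', inputData };
    snapshotStore.set(runId, JSON.stringify(snapshot));
    return { status: 'suspended', runId };
  }
  return { status: 'success', result: out.result };
}

// Resume: load the snapshot, then call the same execute() again with resumeData.
function resume(runId: string, resumeData: ResumeData): RunOutcome {
  const raw = snapshotStore.get(runId);
  if (raw === undefined) return { status: 'failed', error: `No snapshot for ${runId}` };
  const snapshot = JSON.parse(raw) as Snapshot;
  try {
    const out = execute(snapshot.inputData, resumeData);
    if (out.suspended) return { status: 'suspended', runId };
    snapshotStore.delete(runId);
    return { status: 'success', result: out.result };
  } catch (err) {
    return { status: 'failed', error: err instanceof Error ? err.message : String(err) };
  }
}
```

Because the snapshot is plain serialized data, nothing in resume() depends on the process that called start() still being alive, which is the property that lets a real workflow survive restarts.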
Multi-Agent Orchestration
Having a single agent try to do everything quickly hits a wall. The context gets too long, or you end up cramming completely different roles into a single prompt. Mastra officially supports the Supervisor pattern: a supervisor agent analyzes a user request and delegates it to specialized sub-agents.
import { Agent } from '@mastra/core/agent';
import { anthropic } from '@ai-sdk/anthropic';
// Specialized sub-agents
const researchAgent = new Agent({
name: 'ResearchAgent',
instructions: 'Deeply research a given topic and summarize the key findings.',
model: anthropic('claude-sonnet-4-6'),
});
const writerAgent = new Agent({
name: 'WriterAgent',
instructions: 'Write a reader-friendly blog post based on research findings.',
model: anthropic('claude-sonnet-4-6'),
});
// Supervisor: register delegation targets via the agents array
const supervisorAgent = new Agent({
name: 'SupervisorAgent',
instructions: 'Analyze user requests and delegate to the appropriate specialist agent.',
model: anthropic('claude-opus-4-7'), // Use a more powerful model for routing decisions
agents: [researchAgent, writerAgent],
});
The onDelegationStart hook intercepts the moment just before the supervisor delegates to a sub-agent. Importantly, the value this hook returns becomes the actual message delivered to the sub-agent. Changing the return value changes the context the sub-agent receives, so you can use it for message transformation or logging, as shown below.
const result = await supervisorAgent.stream('Write a blog post on AI trends', {
onDelegationStart({ agent, messages }) {
console.log(`→ Delegating to ${agent.name}`);
return messages; // Pass through unchanged; return a transformed version to modify
},
});
You can also use agents directly as steps inside a workflow. This uses mastra.getAgent(), which requires the agent to be registered with the Mastra instance beforehand.
import { Mastra } from '@mastra/core';
import { createStep } from '@mastra/core/workflows';
import { z } from 'zod';
// Register the agent with the Mastra instance
const mastra = new Mastra({
agents: { ResearchAgent: researchAgent },
});
const researchStep = createStep({
id: 'research-step',
inputSchema: z.object({ query: z.string() }),
outputSchema: z.object({ result: z.string() }),
async execute({ inputData, mastra }) {
const agent = mastra.getAgent('ResearchAgent'); // Reference a registered agent
const response = await agent.generate(inputData.query);
return { result: response.text };
},
});
Agent Network: The supervisor doesn't simply "call" sub-agents; sub-agents can themselves have tools or further sub-agents. This hierarchical structure lets you decompose complex tasks into discrete units of responsibility.
Practical Application
Example 1: Content Generation Pipeline — Parallel Research + HITL Quality Review
This is a situation you encounter often in practice. Run topic research and competitor analysis in parallel, then if the editorial output falls below a quality threshold, a human reviews it before the workflow resumes.
import { createWorkflow, createStep } from '@mastra/core/workflows';
import { z } from 'zod';
// fetchTopicStep: assumed to be injected externally in this example (scaffolding or separate file)
// Minimal stub example:
// const fetchTopicStep = createStep({
// id: 'fetch-topic', inputSchema: z.object({ keyword: z.string() }),
// outputSchema: z.object({ topic: z.string() }),
// async execute({ inputData }) { return { topic: inputData.keyword }; },
// });
// 1. Two steps to run in parallel
const researchStep = createStep({
id: 'research',
inputSchema: z.object({ topic: z.string() }),
outputSchema: z.object({ summary: z.string() }),
async execute({ inputData, mastra }) {
const agent = mastra.getAgent('ResearchAgent');
const res = await agent.generate(`Research the topic: ${inputData.topic}`);
return { summary: res.text };
},
});
const competitorStep = createStep({
id: 'competitor-analysis',
inputSchema: z.object({ topic: z.string() }),
outputSchema: z.object({ analysis: z.string() }),
async execute({ inputData, mastra }) {
const agent = mastra.getAgent('ResearchAgent');
const res = await agent.generate(`Analyze the competitive landscape for: ${inputData.topic}`);
return { analysis: res.text };
},
});
// assessQuality: returns a quality score between 0 and 1 for a draft.
// The actual implementation varies by project — LLM-based evaluation, rule-based checks
// (length, keyword density, etc.), etc. Simple character-count dummy example below:
function assessQuality(text: string): number {
if (text.length < 300) return 0.4;
if (text.length < 800) return 0.6;
return 0.85;
}
// 2. Editorial + quality review step (with HITL)
// Assumption: the step that follows .parallel() receives an object keyed by
// each parallel step's id, so the input schema mirrors those ids.
const editorialStep = createStep({
  id: 'editorial',
  inputSchema: z.object({
    research: z.object({ summary: z.string() }),
    'competitor-analysis': z.object({ analysis: z.string() }),
  }),
  outputSchema: z.object({ published: z.boolean(), draft: z.string().optional() }),
  resumeSchema: z.object({ approved: z.boolean(), feedback: z.string().optional() }),
  suspendSchema: z.object({ draft: z.string(), reason: z.string() }),
  async execute({ inputData, resumeData, suspend, mastra }) {
    // Resume path: a human has made a decision
    if (resumeData !== undefined) {
      if (!resumeData.approved) {
        // This error puts the entire workflow into a failed state.
        // If retry logic is needed, handle it separately at the workflow level.
        throw new Error(`Editorial rejected: ${resumeData.feedback ?? 'No feedback provided'}`);
      }
      return { published: true };
    }
    // Initial run: generate draft
    const { summary } = inputData.research;
    const { analysis } = inputData['competitor-analysis'];
    const agent = mastra.getAgent('WriterAgent');
    const draft = await agent.generate(
      `Research: ${summary}\nCompetitor analysis: ${analysis}\n\nWrite a blog post draft based on this.`
    );
    const qualityScore = assessQuality(draft.text);
    // Request human review if quality is below threshold
    if (qualityScore < 0.7) {
      return await suspend({ draft: draft.text, reason: `Quality score below threshold: ${qualityScore}` });
    }
    return { published: true, draft: draft.text };
  },
});
// 3. Assemble the workflow
const contentPipelineWorkflow = createWorkflow({ id: 'content-pipeline' })
  .then(fetchTopicStep)
  .parallel([researchStep, competitorStep]) // Parallel execution
  .then(editorialStep) // Merge results, edit, + HITL
  .commit();
| Component | Role |
|---|---|
| .parallel([researchStep, competitorStep]) | Runs both steps concurrently, advances only after both complete |
| resumeData !== undefined branch | The key pattern for distinguishing a resume from an initial run |
| suspend({ draft, reason }) | Serializes the entire execution state as a snapshot and waits |
| assessQuality(draft.text) < 0.7 | HITL trigger condition: a clear threshold for automatic vs. manual branching |
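The HITL trigger depends entirely on how assessQuality is implemented. As one possible rule-based variant, the sketch below averages a length score with a keyword-coverage score. The function names, the 800-character saturation point, and the equal weighting are illustrative assumptions, not part of Mastra and not a recommended scoring model.

```typescript
// Hypothetical rule-based scorer: the mean of a length score and a
// keyword-coverage score, both in the range 0..1.
function assessQualityByRules(text: string, keywords: string[]): number {
  // Length score grows linearly and saturates at 800 characters.
  const lengthScore = Math.min(text.length / 800, 1);

  // Coverage score: fraction of expected keywords found (case-insensitive).
  const lower = text.toLowerCase();
  const hits = keywords.filter((k) => lower.includes(k.toLowerCase())).length;
  const coverageScore = keywords.length === 0 ? 1 : hits / keywords.length;

  return (lengthScore + coverageScore) / 2;
}

// Mirrors the threshold check used in the editorial step.
function needsHumanReview(score: number, threshold = 0.7): boolean {
  return score < threshold;
}
```

Keeping the scorer a pure function makes the suspend condition easy to unit-test separately from the workflow, and swapping in an LLM-based evaluator later only changes this one function.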
Example 2: Multi-Agent Customer Support — Supervisor Delegation + Sensitive Data Masking
The previous example was in the content generation domain; this one switches to customer support. The pattern here classifies customer inquiries and delegates them to specialist agents for technical support, billing, and refunds.
Honestly, building this kind of structure by hand — with routing logic, context passing, and error handling all tangled together — gets complicated fast. The Supervisor pattern keeps it remarkably clean.
import { Agent } from '@mastra/core/agent';
import { anthropic } from '@ai-sdk/anthropic';
const techSupportAgent = new Agent({
name: 'TechSupportAgent',
instructions: 'Diagnose technical issues and guide users to solutions.',
model: anthropic('claude-sonnet-4-6'),
});
const billingAgent = new Agent({
name: 'BillingAgent',
instructions: 'Handle billing inquiries. Work only with masked personal information.',
model: anthropic('claude-sonnet-4-6'),
});
const refundAgent = new Agent({
name: 'RefundAgent',
instructions: 'Review refund requests and guide users through the process.',
model: anthropic('claude-sonnet-4-6'),
});
const supportSupervisor = new Agent({
name: 'SupportSupervisor',
instructions: `Analyze customer inquiries and delegate to the appropriate specialist agent.
- Technical issues → TechSupportAgent
- Billing inquiries → BillingAgent
- Refund requests → RefundAgent`,
model: anthropic('claude-opus-4-7'), // A powerful model handles routing decisions
agents: [techSupportAgent, billingAgent, refundAgent],
});
// The return value of onDelegationStart is the actual message delivered to the sub-agent.
// Card numbers are masked before being sent to billing/refund agents.
const result = await supportSupervisor.stream(customerMessage, {
onDelegationStart({ agent, messages }) {
if (['BillingAgent', 'RefundAgent'].includes(agent.name)) {
return messages.map((msg) => ({
...msg,
content:
typeof msg.content === 'string'
? msg.content.replace(/\d{4}-\d{4}-\d{4}-\d{4}/g, '****-****-****-****')
: msg.content,
}));
}
return messages;
},
});
| Component | Role |
|---|---|
| agents: [...] | The list of agents the supervisor can delegate to |
| onDelegationStart return value | The message array this hook returns is passed directly to the sub-agent |
| Card number regex masking | Prevents sensitive data from being exposed in the sub-agent's context |
| Supervisor model = Opus 4.7 | Routing and judgment use a more powerful model; execution uses a more efficient model |
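The masking logic is easier to test when it lives in a standalone function rather than inline in the hook. The sketch below is one way to factor it out. Note that it goes slightly beyond the example: the pattern also accepts space-separated and unseparated 16-digit numbers, which is an assumption about what inputs you want covered. The Message type and both function names are hypothetical.

```typescript
// Hypothetical message shape, loosely modeled on the example above.
type Message = { role: string; content: string | unknown[] };

// Four groups of four digits, separated consistently by "-", " ", or nothing.
// The backreference \1 requires the same separator between every group.
const CARD_PATTERN = /\b\d{4}([- ]?)\d{4}\1\d{4}\1\d{4}\b/g;

function maskCardNumbers(text: string): string {
  return text.replace(CARD_PATTERN, '****-****-****-****');
}

// Returns new message objects; non-string content is passed through untouched.
function maskMessages(messages: Message[]): Message[] {
  return messages.map((msg) =>
    typeof msg.content === 'string'
      ? { ...msg, content: maskCardNumbers(msg.content) }
      : msg,
  );
}
```

A regex like this only catches well-formed card numbers in plain strings, so treat it as a first line of defense rather than a complete PII filter.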
Pros and Cons
Here's a summary of the advantages I noticed from hands-on use.
Pros
| Item | Description | Notes |
|---|---|---|
| TypeScript native | Build an entire AI pipeline in full-stack TypeScript without Python | Backend and frontend teams can share the same codebase |
| Type safety | Zod schemas validate step inputs and outputs, catching runtime errors early | inputSchema, outputSchema enforced |
| State persistence | Snapshot-based suspend/resume; workflow survives server restarts | LibSQL, PostgreSQL supported |
| Deployment speed | npx mastra deploy deploys to Vercel quickly | Cloudflare Workers also supported |
| Official Supervisor pattern | Agent delegation can be declared directly at the stream() / generate() level | v0.4+ official API |
There are also some rough edges worth noting.
Cons and Caveats
| Item | Description | Mitigation |
|---|---|---|
| API instability | Major breaking changes from v0.3 → v0.4 workflow API | Pin your version and follow the migration guide carefully |
| Parallel failure propagation | A single failure in a parallel block halts the entire block | Handle partial failures inside each step with try/catch |
| Immature ecosystem | 50–60 third-party integrations (vs. LangChain's hundreds) | MCP-based external tool connections cover a significant portion |
| Documentation gaps | Advanced features like memory persistence rely on Discord more than official docs | Supplement with the official Discord and GitHub Issues |
| No Python support | Cannot directly use the Python ecosystem (scikit-learn, etc.) | Expose Python tools via MCP servers or separate microservices |
MCP (Model Context Protocol): A standard protocol proposed by Anthropic that lets AI agents connect to external tools and data sources in a consistent way. Mastra can indirectly leverage Python-based tools through MCP. It goes a long way toward bridging the ecosystem gap.
The Most Common Mistakes in Practice
- Forgetting the resumeData branch: When a workflow resumes after suspend(), execute() is called again from the beginning. If you don't check for the presence of resumeData, the step calls suspend() again on every resume and never completes, which behaves like an infinite loop. The if (resumeData !== undefined) branch is a mandatory pattern for any HITL step.
- Ignoring partial failures in parallel blocks: If a step that calls an external API is inside a parallel block, timeouts or network errors can bring down the entire block. For low-priority steps, it's safer to catch errors internally and return an empty result, rather than letting them propagate. This connects directly to the parallel failure propagation behavior described earlier.
- Using the highest-performance model for both supervisor and sub-agents: Having the supervisor use a powerful model like Opus for routing decisions and sub-agents use Sonnet-class models for execution lets you optimize for both performance and cost. The example code intentionally separates them this way.
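For the second mistake above, a small helper can turn both timeouts and rejections into an empty result. This is a minimal sketch in plain TypeScript; withFallback is a hypothetical name, not a Mastra utility, and it does not cancel the underlying work when the timer wins.

```typescript
// Resolves with `fallback` if `work` rejects or takes longer than `ms`.
// Otherwise resolves with the value produced by `work`.
function withFallback<T>(work: () => Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    // Timer path: give up waiting and use the fallback.
    const timer = setTimeout(() => resolve(fallback), ms);

    work()
      .then(
        (value) => resolve(value), // success path
        () => resolve(fallback), // failure path: swallow the error
      )
      .finally(() => clearTimeout(timer));
  });
}
```

Inside a low-priority step, the body of execute would wrap its external call in this helper, for example passing an empty string as the fallback, so the parallel block always receives a well-formed result. Swallowing errors silently is a trade-off, so consider logging the failure before falling back.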
Closing Thoughts
Mastra Workflow lets you handle three complex problems — parallel execution, human intervention, and multi-agent delegation — in a consistent way within a single TypeScript codebase.
The API is still stabilizing and the ecosystem is still maturing, so you'll need to pay attention to version management for production deployments. That said, for TypeScript-based teams that want to build and operate AI pipelines directly without Python, it's one of the most realistic options available right now.
Here's a step-by-step path to getting started:
- Install and initialize a project: Run npx create-mastra@latest to bootstrap a project. It comes with scaffolding that includes one agent and a basic workflow.
- Experience the HITL pattern: Paste in the approvalStep code shown above and run pnpm mastra dev to bring up Mastra Studio. You can test suspend/resume directly through the GUI, which really helps you internalize how it works.
- Add parallel execution: If you already have a workflow running sequentially, try wrapping two steps that have no dependency on each other with .parallel([stepA, stepB]). You can see how execution time changes in Mastra Studio's timeline view.
References
- Workflows Overview | Mastra Official Docs
- Control Flow (Parallel Execution) | Mastra Official Docs
- Human-in-the-Loop | Mastra Official Docs
- Suspend & Resume | Mastra Official Docs
- Supervisor Agents | Mastra Official Docs
- Multi-agent Systems Concepts | Mastra Official Docs
- Supervisor Agent Example | Mastra
- Human in the Loop Example | Mastra
- Changelog 2026-02-26 | Mastra Blog
- Mastra Agent Workflow: Human In The Loop, Suspend and Resume | Medium
- Orchestrating Agents with Mastra Workflows (Why I Stopped Using Temporal) | Substack
- AI Agent Framework Comparison | Speakeasy
- Mastra in 2026: What It Is, When to Use It, and How It Compares | DEV.to
- Mastra Agents with Memory Sharing | Trigger.dev
- mastra-ai/mastra | GitHub