Structurally Tracking Agent User State with Mastra Working Memory + Zod Schema
There's a frustration that's common when you first build an AI agent. You clearly told it "I live in Seoul" in a previous conversation, but in the next request the agent responds as if it's meeting you for the first time. Stuffing the entire conversation history into the context every time drives token costs sky-high, but leaving it out means the agent loses its memory. I struggled with this myself, trying to attach a custom state management layer, but Mastra's Working Memory solves it pretty cleanly.
This post covers how to connect Zod to the schema option of Mastra Working Memory so that an agent can structurally track user state. Rather than a simple API walkthrough, I want to focus on which schema design choices lead to which outcomes, centered on situations you frequently encounter in practice. After reading this, you'll be able to design a structure that lets your agent create natural continuity, like "The deadline you mentioned last time was this Friday, right?"
The content is aimed at developers familiar with TypeScript who want to integrate an AI agent into a real service. If you're new to Mastra, start from the core concepts section; if you already know the basics, you can jump straight to the practical application section.
Core Concepts
Why Working Memory Is Different from Conversation History
Mastra is a TypeScript-first AI agent framework, created with the involvement of the Gatsby founder. Where LangChain's ecosystem is centered on Python, Mastra targets TypeScript and integrates its memory layer at the framework level.
Mastra's memory system is broadly divided into four layers.
| Layer | Role | Characteristics |
|---|---|---|
| Conversation History | Stores the most recent N messages | Injected directly into the context window |
| Semantic Recall | Retrieves past messages via embedding-based search | Requires vector DB integration (not covered in this post) |
| Working Memory | Structured user state | Schema definition, JSON merge |
| Observational Memory | Stores compressed/summarized conversations | Handled by an Observer agent (not covered in this post) |
Working Memory is the layer that stores "the core state the agent needs to know about this user right now." Unlike conversation history, it does not store entire messages; it keeps only the structured data the agent has explicitly decided it needs to remember.
Working Memory: A mechanism that stores the core state an agent must maintain between conversation turns as JSON. It is automatically injected into the system prompt on every request, allowing the agent to immediately reference user context.
Schema Mode vs Template Mode — This Difference Is Key
There are two ways to configure Working Memory, and this choice changes the entire way the agent updates its memory.
| Mode | Configuration | Update Semantics | Best For |
|---|---|---|---|
| Schema-based | schema: z.object({...}) | Merge: only provide changed fields; the rest are preserved | User profiles that build up incrementally |
| Template-based | template: "..." string | Replace: provide the entire content fresh each time | Memory at the level of simple text notes |
Honestly, at first I thought "can't I just use template?" but the difference becomes clear once you actually use it. In Template mode, when the agent updates memory it must return all existing content along with the changes. If it leaves something out, that data is lost. In Schema mode, returning just { goals: [...existingGoals, newGoal] } is enough to automatically preserve other fields like userContext. Note that the goals array itself still has to be complete: merging applies per field, not per array item.
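The two update semantics can be sketched in a few lines. This is only an illustration of the behavior described above, not Mastra's actual implementation: mergeWorkingMemory and replaceWorkingMemory are hypothetical names, and the recursive merge of nested objects is an assumption of this sketch.

```typescript
// Illustrative only: hypothetical helpers that mimic the two update semantics.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isPlainObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Schema mode (merge): fields missing from the update keep their old values.
// In this sketch nested objects merge recursively; arrays and primitives are replaced.
function mergeWorkingMemory(current: JsonObject, update: JsonObject): JsonObject {
  const result: JsonObject = { ...current };
  for (const [key, value] of Object.entries(update)) {
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? mergeWorkingMemory(existing, value)
        : value;
  }
  return result;
}

// Template mode (replace): whatever the agent returns becomes the whole memory.
function replaceWorkingMemory(_current: string, update: string): string {
  return update;
}

const stored: JsonObject = {
  userContext: { focusArea: "backend", workingHours: "09:00-18:00" },
  goals: [],
};

// Only focusArea is sent; workingHours and goals survive the update.
const merged = mergeWorkingMemory(stored, { userContext: { focusArea: "docs" } });

// The update omits the name line, so the name is gone afterwards.
const replaced = replaceWorkingMemory("- Name: Minsu\n- City: Seoul", "- City: Busan");
```

The practical consequence is that a forgetful update costs nothing in merge mode but silently drops data in replace mode.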
What Happens When You Inject a Zod Schema
import { Memory } from "@mastra/memory";
import { z } from "zod";
const memory = new Memory({
options: {
workingMemory: {
enabled: true,
schema: z.object({
name: z.string().optional(),
location: z.string().optional(),
preferences: z.object({
communicationStyle: z.enum(["formal", "casual"]).optional(),
projectGoal: z.string().optional(),
}).optional(),
activeDeadlines: z.array(z.object({
label: z.string(),
date: z.string(),
})).optional(),
lastTopicDiscussed: z.string().optional(),
}),
},
},
});
When you put a Zod object in the schema field, Mastra internally converts this schema to JSON Schema. It is then injected into the agent's system prompt in roughly the following form.
# Working Memory (Current State)
{
"name": null,
"location": null,
"preferences": null,
"activeDeadlines": [],
"lastTopicDiscussed": null
}
When you identify user information in the conversation, update it to match the JSON structure above.
You only need to include fields that have changed; the rest will be preserved automatically.
When the agent identifies user information during the conversation, it generates JSON matching that structure to update working memory, and starting from the next request it references that state directly from the system prompt.
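To make the round trip concrete, here is a rough sketch of how a stored state could be rendered into a prompt section like the one shown above. renderWorkingMemoryBlock is a hypothetical function written for this post; the exact wording and layout Mastra uses internally may differ.

```typescript
// Hypothetical sketch: turns the current state into a prompt section like the one above.
type WorkingMemoryState = Record<string, unknown>;

function renderWorkingMemoryBlock(state: WorkingMemoryState): string {
  return [
    "# Working Memory (Current State)",
    JSON.stringify(state, null, 2),
    "When you identify user information in the conversation, update it to match the JSON structure above.",
    "You only need to include fields that have changed; the rest will be preserved automatically.",
  ].join("\n");
}

// State before the user has said anything about themselves.
const emptyState: WorkingMemoryState = {
  name: null,
  location: null,
  preferences: null,
  activeDeadlines: [],
  lastTopicDiscussed: null,
};

// After the user says "I live in Seoul", the next request sees the updated field.
const afterFirstTurn: WorkingMemoryState = { ...emptyState, location: "Seoul" };
const systemPromptSection = renderWorkingMemoryBlock(afterFirstTurn);
```

Because the whole state is re-rendered on every request, the agent does not need the original message in its history to know where the user lives.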
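From a TypeScript perspective, z.infer<typeof schema> derives a static type from the schema, so code that reads memory is type-checked at compile time. The data itself is produced by the model at runtime, so that guarantee holds once the stored value has been validated against the same schema.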
const userProfileSchema = z.object({
name: z.string().optional(),
location: z.string().optional(),
});
type UserProfile = z.infer<typeof userProfileSchema>;
// → { name?: string | undefined; location?: string | undefined }
// You can safely handle memory query results with this type
Zod: A TypeScript-first schema declaration and validation library. It provides runtime validation and TypeScript type inference simultaneously. Mastra supports both Zod v3 and v4, and schemas in JSON Schema format can also be used directly in the schema field.
Practical Application
Example 1: Personalization Agent Based on User Profile
This is the most typical pattern. It tracks the user's basic information alongside current session context and ongoing goals.
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";
import { z } from "zod";
const userProfileSchema = z.object({
profile: z.object({
name: z.string().optional(),
timezone: z.string().optional(),
preferredLanguage: z.string().optional(),
}),
currentSession: z.object({
topic: z.string().optional(),
openQuestions: z.array(z.string()).default([]),
}),
goals: z.array(z.object({
description: z.string(),
deadline: z.string().optional(),
status: z.enum(["active", "completed", "paused"]),
})).default([]),
});
type UserProfile = z.infer<typeof userProfileSchema>;
const agent = new Agent({
name: "PersonalAssistant",
// Agent also requires a model option; set it to the model from your provider.
instructions: `You are the user's personal assistant.
When you identify the following information in conversation, update working memory immediately:
- User's name, timezone, preferred language → profile field
- Current conversation topic or unresolved questions → currentSession field
- Goals or deadlines the user mentions → add to goals array
When adding a new item to goals, return the full array including existing items.
Do not ask the user again for information already in working memory — use it naturally.`,
memory: new Memory({
options: {
workingMemory: {
enabled: true,
schema: userProfileSchema,
scope: "resource", // Share state across all threads for the same user
},
},
}),
});
const response = await agent.generate("Hello, my name is Kim Minsu.", {
resourceId: "user-kim-minsu",
threadId: crypto.randomUUID(),
});
| Point | Description |
|---|---|
| scope: "resource" | All threads sharing the same resourceId reference the same working memory |
| .default([]) | Setting a default value for array fields allows type-safe access even before the first update |
| optional() | Fields the agent hasn't learned yet are left as undefined and filled in incrementally |
| instructions | If you don't explicitly specify what information to remember, when, and how, the LLM may not update memory autonomously |
The advantage of this pattern is that the agent can create natural continuity like "Kim Minsu, the deadline you mentioned last time was this Friday, right?" Thanks to scope: "resource", information learned in a customer support thread can be used as-is in a recommendation thread.
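One way to picture scope: "resource" is that the working memory record is looked up by resourceId instead of threadId. The sketch below models that with an in-memory Map. WorkingMemoryStore and keyFor are hypothetical names for this illustration; Mastra keeps the record in whatever storage you configure.

```typescript
// Hypothetical in-memory model of scoping; not Mastra's storage layer.
type Scope = "resource" | "thread";

interface CallContext {
  resourceId: string;
  threadId: string;
}

class WorkingMemoryStore {
  private readonly scope: Scope;
  private readonly records = new Map<string, Record<string, unknown>>();

  constructor(scope: Scope) {
    this.scope = scope;
  }

  // Resource scope keys by user; thread scope keys by conversation.
  private keyFor(ctx: CallContext): string {
    return this.scope === "resource"
      ? `resource:${ctx.resourceId}`
      : `thread:${ctx.threadId}`;
  }

  read(ctx: CallContext): Record<string, unknown> {
    return this.records.get(this.keyFor(ctx)) ?? {};
  }

  write(ctx: CallContext, update: Record<string, unknown>): void {
    this.records.set(this.keyFor(ctx), { ...this.read(ctx), ...update });
  }
}

const resourceScoped = new WorkingMemoryStore("resource");
resourceScoped.write(
  { resourceId: "user-kim-minsu", threadId: "support-thread" },
  { name: "Kim Minsu" },
);

// A different thread for the same user sees the same record.
const seenInOtherThread = resourceScoped.read({
  resourceId: "user-kim-minsu",
  threadId: "recommend-thread",
});

// With thread scope, the second thread would start from an empty record.
const threadScoped = new WorkingMemoryStore("thread");
threadScoped.write(
  { resourceId: "user-kim-minsu", threadId: "support-thread" },
  { name: "Kim Minsu" },
);
const emptyInOtherThread = threadScoped.read({
  resourceId: "user-kim-minsu",
  threadId: "recommend-thread",
});
```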
Example 2: Task Management Agent — Making Proper Use of Merge Semantics
(Import statements are omitted in subsequent examples — they are the same as in Example 1.)
const taskSchema = z.object({
tasks: z.array(z.object({
id: z.string(),
title: z.string(),
priority: z.enum(["high", "medium", "low"]),
done: z.boolean().default(false),
createdAt: z.string(),
})).default([]),
userContext: z.object({
focusArea: z.string().optional(),
workingHours: z.string().optional(),
preferredNotification: z.enum(["email", "slack", "none"]).optional(),
}).optional(),
});
const taskAgent = new Agent({
name: "TaskManager",
instructions: `When adding a task or changing its status, update only the relevant field.
When adding a new item to the tasks array, return the full array including existing items.
Do not update userContext unless there is an explicit request to change it.`,
memory: new Memory({
options: {
workingMemory: {
enabled: true,
schema: taskSchema,
},
},
}),
});
There's an important practical detail here. Even with Merge semantics, array fields like tasks require the agent to return the full array including existing items for the update to work correctly. Returning only a single new item will overwrite the existing array. It's worth making this explicit in the instructions.
const result = await taskAgent.generate(
"I need to finish writing the API documentation by tomorrow at 3 PM. Add it as high priority.",
{
resourceId: "user-dev",
threadId: crypto.randomUUID(),
}
);
// Structure the agent updates internally (example)
// {
// tasks: [
// ...existingTasks,
// {
// id: crypto.randomUUID(),
// title: "Write API documentation",
// priority: "high",
// done: false,
// createdAt: "2026-05-21"
// }
// ]
// // userContext is not touched → existing value automatically preserved
// }
Example 3: Multi-Agent Memory Sharing
Even when a customer support bot and a product recommendation bot operate separately, they can share the same record in storage by using the same Memory configuration and the same resourceId.
const sharedMemory = new Memory({
options: {
workingMemory: {
enabled: true,
schema: z.object({
customerProfile: z.object({
name: z.string().optional(),
purchaseHistory: z.array(z.string()).default([]),
preferences: z.array(z.string()).default([]),
}),
currentIssue: z.object({
description: z.string().optional(),
status: z.enum(["open", "resolved", "escalated"]).optional(),
}).optional(),
}),
},
},
});
const supportAgent = new Agent({ name: "SupportBot", memory: sharedMemory });
const recommendAgent = new Agent({ name: "RecommendBot", memory: sharedMemory });
// Same resourceId → reads and writes the same working memory record in storage
await supportAgent.generate("There's a problem with my product.", {
resourceId: "user-123",
threadId: crypto.randomUUID(),
});
await recommendAgent.generate("Please recommend another product.", {
resourceId: "user-123", // Directly references the customerProfile recorded by SupportBot
threadId: crypto.randomUUID(),
});
The key point here is that the two agents not only share the sharedMemory instance at the code level, but because resourceId: "user-123" is the same, they read and write the same record in storage. That is, even if the two agents run in separate processes, as long as they point to the same storage and use the same resourceId, they share the same working memory.
The recommendation bot can directly use the purchase history and preferences the support bot learned from the conversation, without a separate query. This is quite a useful pattern when building agent systems with separated roles, like microservices.
Pros and Cons Analysis
These are the points I actually noticed while wiring this into real services.
Pros
| Item | Content |
|---|---|
| Type Safety | Zod validates data at runtime, and the inferred TypeScript type catches structural mistakes in code that reads memory at compile time |
| Merge Semantics | Preserves the full state with only partial updates — reduces the risk of accidentally losing data |
| Cross-Thread Sharing | scope: 'resource' enables sharing state across all conversations for the same user |
| Multi-Agent Compatibility | Supports distributed architectures where multiple specialized agents share a single working memory |
| Mutex Protection | Internal locking ensures working memory updates are processed sequentially without conflicts when parallel requests come in for the same user |
| Storage Flexibility | Can be swapped out to suit the environment — LibSQL (local), PostgreSQL (production), Upstash (serverless), etc. |
Drawbacks and Caveats
| Item | Content | Mitigation |
|---|---|---|
| Storage Dependency | External storage is strictly required to persist data between sessions | Use LibSQL in development and move to PostgreSQL for production |
| Upstash Cost | High conversation volume can lead to unexpected costs with the pay-as-you-go model | Set usage alerts; switch to self-hosted Redis if needed |
| Memory Decay in Long Conversations | Working memory alone becomes insufficient when conversations grow very long | Recommended to use Observational Memory in parallel |
| Schema Evolution Management | Renaming fields or changing types in a Zod schema while data is already stored can cause parsing errors | Mark a field optional() first to stay backward compatible, migrate existing data, then remove the field |
| Mastra Cloud Limitation | Storage configuration at the agent level is not available when using Mastra Cloud Store | Storage can only be configured at the Mastra instance level |
| Repeated Instructions in Multi-Agent | Design complexity from having to repeat the same memory collection instructions to multiple agents | Extract into a shared instructions template function for reuse |
LibSQL: An open-source fork of SQLite with local file-based storage. It works immediately without a separate DB server, making it suitable for prototyping in the development phase. Integrated via the @mastra/libsql package.
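For the repeated-instructions row in the drawbacks table, one possible shape for a shared template function is sketched below. buildInstructions and sharedMemoryRules are hypothetical names; the field names are borrowed from the multi-agent example earlier in the post.

```typescript
// Hypothetical helper: one place to define memory-collection rules shared by several agents.
interface MemoryRule {
  trigger: string; // what the agent should notice in conversation
  field: string; // which working memory field to update
}

const sharedMemoryRules: MemoryRule[] = [
  { trigger: "Customer's name, purchases, or preferences", field: "customerProfile" },
  { trigger: "A problem the customer reports or its status", field: "currentIssue" },
];

// Combines a role-specific opening line with the shared memory rules.
function buildInstructions(role: string, rules: MemoryRule[]): string {
  const ruleLines = rules.map((rule) => `- ${rule.trigger} → ${rule.field} field`);
  return [
    role,
    "When you identify the following information in conversation, update working memory immediately:",
    ...ruleLines,
    "When adding an item to an array field, return the full array including existing items.",
    "Do not ask the user again for information already in working memory.",
  ].join("\n");
}

const supportInstructions = buildInstructions(
  "You are a customer support agent.",
  sharedMemoryRules,
);
const recommendInstructions = buildInstructions(
  "You are a product recommendation agent.",
  sharedMemoryRules,
);
```

Each string can then be passed as the instructions option of its agent, so a change to the memory rules only has to be made once.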
The Most Common Mistakes in Practice
1. Misunderstanding Array Field Merge
Even in schema-based mode, for arrays like tasks, the agent must return the full array including existing items for the update to work correctly. The concept of "only changed fields" does not apply at the level of individual items within an array. Returning only a single new item will overwrite the entire existing array. I didn't know this at first and spent a long time tracking down a bug where only one task ever remained. It's worth stating this explicitly in the instructions.
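The bug is easy to reproduce outside of Mastra. In the sketch below, applyUpdate is a simplified stand-in for a field-level merge (a hypothetical function, not a Mastra API), and the Task shape is a trimmed-down version of the schema from Example 2.

```typescript
// Illustration of the pitfall: a field-level merge replaces arrays wholesale.
interface Task {
  id: string;
  title: string;
  done: boolean;
}

interface TaskMemory {
  tasks: Task[];
  focusArea?: string;
}

// Simplified stand-in for the merge step: each provided field overwrites the stored field.
function applyUpdate(current: TaskMemory, update: Partial<TaskMemory>): TaskMemory {
  return { ...current, ...update };
}

const current: TaskMemory = {
  tasks: [{ id: "t1", title: "Review PR", done: false }],
  focusArea: "backend",
};
const newTask: Task = { id: "t2", title: "Write API documentation", done: false };

// Wrong: only the new item is sent, so the stored array is overwritten.
const buggy = applyUpdate(current, { tasks: [newTask] });

// Right: the update carries the existing items plus the new one.
const correct = applyUpdate(current, { tasks: [...current.tasks, newTask] });
```

In both cases focusArea survives, which is what makes the bug easy to miss: the merge looks like it works until you count the tasks.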
2. Casually Changing the Schema in Production
If you rename a field or change its type in the Zod schema while there is already stored data, parsing errors will occur. If you need to remove a field, the safe sequence is to first convert it to optional() to migrate existing data, and then remove it.
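A rename can follow the same sequence. The sketch below assumes a hypothetical change from location to city: both fields stay optional during the transition, and a small migration copies old values across before the old field is removed. migrateRecord and the two interfaces are illustrative names, not part of Mastra or Zod.

```typescript
// Hypothetical migration: the old schema stored `location`, the new one calls it `city`.
interface StoredV1 {
  name?: string;
  location?: string;
}

interface StoredV2 {
  name?: string;
  city?: string;
}

// Copies the old field into the new one and drops the old key.
// An existing `city` value wins over a leftover `location`.
function migrateRecord(record: StoredV1 & Partial<StoredV2>): StoredV2 {
  const { location, ...rest } = record;
  const city = rest.city ?? location;
  return city === undefined ? rest : { ...rest, city };
}

const legacy: StoredV1 = { name: "Kim Minsu", location: "Seoul" };
const migrated = migrateRecord(legacy);

// Records already in the new shape pass through unchanged.
const alreadyNew = migrateRecord({ name: "Lee Jiwoo", city: "Busan" });
```

Once every stored record has passed through a step like this, the old field can be deleted from the schema without breaking parsing.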
3. Calling Without resourceId
Omitting resourceId makes the scope: "resource" setting meaningless. It's good practice to establish a team convention of always explicitly passing a resourceId that identifies the user when calling the agent. Adding resourceId validation to a wrapper function can prevent omissions at the source.
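A wrapper along these lines can enforce the convention. This is a minimal sketch: generateForUser and AgentLike are hypothetical, and AgentLike only models the generate(message, options) call shape used in this post rather than Mastra's full Agent type.

```typescript
// Hypothetical wrapper: refuses to call the agent without a usable resourceId.
interface GenerateOptions {
  resourceId: string;
  threadId: string;
}

// Minimal structural type for "something with a generate method".
interface AgentLike<T> {
  generate(message: string, options: GenerateOptions): T;
}

function generateForUser<T>(
  agent: AgentLike<T>,
  message: string,
  options: { resourceId?: string; threadId: string },
): T {
  // Treat missing, empty, and whitespace-only ids the same way.
  const resourceId = options.resourceId?.trim();
  if (!resourceId) {
    throw new Error("resourceId is required so working memory is tied to a user");
  }
  return agent.generate(message, { resourceId, threadId: options.threadId });
}

// Stub agent for demonstration; a real agent would return a promise of a response.
const echoAgent: AgentLike<string> = {
  generate: (message, options) => `${options.resourceId}: ${message}`,
};

const reply = generateForUser(echoAgent, "Hello", {
  resourceId: "user-123",
  threadId: "t-1",
});

// A call without resourceId fails fast instead of silently losing memory.
let rejected = false;
try {
  generateForUser(echoAgent, "Hello", { threadId: "t-2" });
} catch {
  rejected = true;
}
```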
Closing Thoughts
Once you connect a Zod schema to Working Memory, your perspective on the agent shifts a little. Instead of "how do I store state," you start by designing "what does the agent need to know about this user." The schema defines the agent's memory structure, TypeScript checks the code that reads that structure at compile time, and Merge semantics reduce the risk of data loss. The agent is no longer pretending to remember; it is tracking state in a way you can actually audit.
Three steps you can start right now:
- Install packages and configure a basic agent. After installing with pnpm add @mastra/core @mastra/memory zod, you can copy the userProfileSchema example introduced above and connect it with local LibSQL storage. It works immediately without a separate DB server.
- Verify Working Memory in real time with Mastra Playground. In the Playground UI, you can directly query and edit the values the agent currently holds in memory. It's useful for quickly verifying whether the agent is filling fields as intended during the early stages of schema design.
- Test the scope: "resource" and resourceId combination. Call the agent from two separate threads with the same resourceId and confirm that user information learned in the first thread is correctly referenced in the second. Once this behavior is confirmed, you're ready to scale up to a multi-agent architecture.