Infrastructure for Multi-Agent Systems
In this lesson: set the rules and let the agent finish, treat the state as the hard part, and build swarms with a watchdog.
Top 3 takeaways
Define the rules and let the agent finish the job
Managed agents move you away from chatting back and forth. You define the primitives (the agent, the environment, and the session), and the work then runs on hosted infrastructure until it is done.
State is the hard part
A demo becomes a product once sessions, memory, and environments are managed for you. A finished session goes idle so you pay nothing for compute while it sits, and anything you want to keep lives in a memory store.
Build swarms with a watchdog
A coordinator plus specialist sub-agents keeps context isolated while work runs in parallel, and a sentinel agent watches long-running jobs so they do not fail silently. You can reuse the same pattern inside Claude Code.

Matt Wood
Teacher, Gauntlet AI
Matt Wood is an AI instructor at Gauntlet and a veteran software builder with more than 20 years of experience creating products for companies ranging from JPMorgan to NBC. He specializes in AI agents, analytics, and turning complex systems into practical tools that teams can use every day.
Lesson notes
A written walkthrough of the lecture, covering the patterns, the code, and the things that trip people up.
Managed Agents Move Work Off Your Machine
Claude Managed Agents represent a shift away from the traditional AI workflow where engineers run agents on their own machines and stitch together infrastructure with frameworks like LangChain or the AI SDK.
Instead, Anthropic hosts and manages the infrastructure. You define the rules, tools, and environment, then the agent executes the work on your behalf. The focus moves from continuously interacting with an agent to designing a system and letting it run.
The key concept is primitives: reusable building blocks that can be combined to create workflows declaratively.
The Core Primitives
Managed Agents are built from a handful of foundational components:
- Agents define the model, instructions, and available tools.
- Environments provide the compute environment where work runs.
- Sessions combine agents and environments to execute tasks.
- Skills package prompts, scripts, and reusable workflows.
- Vaults securely store credentials and external connections.
Together, these primitives make it possible to build sophisticated workflows without managing servers or custom infrastructure.
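To make the relationship between the first three primitives concrete, here is a minimal sketch that models them as plain Python dataclasses. It is not the Managed Agents API: the class names, fields, and the `example-model` placeholder are all assumptions chosen for illustration. The only point it carries over from the list above is that a session is the pairing of an agent with an environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Model, instructions, and the tools the agent may call."""
    model: str
    instructions: str
    tools: tuple[str, ...] = ()


@dataclass(frozen=True)
class Environment:
    """Where the work runs; these fields are illustrative only."""
    name: str
    packages: tuple[str, ...] = ()


@dataclass
class Session:
    """Pairs one agent with one environment to execute a task."""
    agent: Agent
    environment: Environment
    task: str
    status: str = "pending"

    def describe(self) -> str:
        return (
            f"{self.agent.model} in {self.environment.name}: "
            f"{self.task} [{self.status}]"
        )


researcher = Agent(
    model="example-model",  # placeholder, not a real model ID
    instructions="Summarize the linked report.",
    tools=("web_search",),
)
sandbox = Environment(name="python-sandbox", packages=("pandas",))
session = Session(agent=researcher, environment=sandbox, task="summarize report")
print(session.describe())
```

Because the agent and environment are defined separately, the same agent definition could be paired with a different environment in another session without being rewritten.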
New Capabilities After Launch
Anthropic later introduced additional primitives that expanded what managed agents can do.
Outcomes allow developers to define success criteria and evaluation rubrics. Sandboxes let sensitive workloads run on infrastructure you control while still leveraging Claude's reasoning capabilities. MCP Connectors make it possible to connect agents to external tools and internal systems.
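One way to picture an outcome is as a weighted rubric that an output either passes or fails. The sketch below is a hypothetical stand-in, not the platform's Outcomes format: the `Criterion` class, the `score` function, the weights, and the 0.7 pass threshold are all assumptions, and a real rubric would likely be judged by a model rather than by string checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float
    check: Callable[[str], bool]


def score(output: str, rubric: list[Criterion]) -> float:
    """Return the weighted fraction of criteria the output satisfies."""
    total = sum(c.weight for c in rubric)
    earned = sum(c.weight for c in rubric if c.check(output))
    return earned / total


rubric = [
    Criterion("mentions revenue", 2.0, lambda text: "revenue" in text.lower()),
    Criterion("under 50 words", 1.0, lambda text: len(text.split()) < 50),
    Criterion("cites a source", 1.0, lambda text: "source:" in text.lower()),
]

draft = "Revenue grew 12% year over year."
result = score(draft, rubric)
passed = result >= 0.7  # assumed threshold for "success"
print(f"{result:.2f}", passed)  # 0.75 True
```

Writing the criteria down before the agent runs is what lets a workflow decide for itself whether it is finished.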
The result is a workflow that feels much closer to a serverless application than a traditional chatbot.
Building Agent Workforces
These primitives make it possible to build multi-agent systems where a coordinator agent delegates work to specialized sub-agents.
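The delegation pattern can be sketched locally with `asyncio`. This is an illustration under stated assumptions, not how the hosted platform schedules work: `run_specialist` is a hypothetical stub that stands in for a sub-agent call, and the role names are made up. What it shows is the shape of the pattern, where a coordinator hands each specialist its own brief and collects the results.

```python
import asyncio


async def run_specialist(role: str, brief: str) -> dict:
    """Stand-in for a sub-agent call; a real one would invoke a model."""
    await asyncio.sleep(0.01)  # simulate work
    return {"role": role, "result": f"{role} finished: {brief}"}


async def coordinate(task: str, roles: list[str]) -> list[dict]:
    """Fan the task out to specialists concurrently and collect results."""
    # Each specialist receives only its own brief, never another
    # specialist's output, which is what keeps context isolated.
    briefs = {role: f"{task} ({role} portion)" for role in roles}
    jobs = [run_specialist(role, brief) for role, brief in briefs.items()]
    return await asyncio.gather(*jobs)  # results come back in input order


results = asyncio.run(
    coordinate("launch page", ["planner", "researcher", "reviewer"])
)
for item in results:
    print(item["result"])
```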
A single workflow might include planners, researchers, designers, reviewers, and watchdog agents running in parallel. Breaking work into specialized agents helps reduce context overload and improves reliability.
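A watchdog can be as simple as a heartbeat check. The sketch below assumes each job reports in periodically and flags any job that has been silent longer than a timeout. The `Watchdog` class, its method names, and the 30-second timeout are hypothetical. In the lecture's pattern the watchdog is itself an agent, so treat this only as the core check such an agent might run.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    last_heartbeat: float


@dataclass
class Watchdog:
    """Flags jobs whose last heartbeat is older than the timeout."""
    timeout_seconds: float
    jobs: dict[str, Job] = field(default_factory=dict)

    def heartbeat(self, name: str, now: float) -> None:
        self.jobs[name] = Job(name, now)

    def stalled(self, now: float) -> list[str]:
        return sorted(
            job.name
            for job in self.jobs.values()
            if now - job.last_heartbeat > self.timeout_seconds
        )


watchdog = Watchdog(timeout_seconds=30.0)
start = time.monotonic()
watchdog.heartbeat("researcher", now=start)
watchdog.heartbeat("designer", now=start)
watchdog.heartbeat("researcher", now=start + 40)  # still reporting in

# At 45 seconds the designer has been silent for 45s, so it is flagged.
print(watchdog.stalled(now=start + 45))  # ['designer']
```

Passing `now` in explicitly, rather than reading the clock inside the class, keeps the check easy to test.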
The lecture also demonstrates Braid, a tool that generates managed-agent workflows from natural-language descriptions. Rather than manually configuring every component, users can describe the workflow they want and generate the underlying agents, environments, and evaluation criteria automatically.
The Tradeoffs and Costs
Managed Agents simplify infrastructure but introduce new considerations.
They are billed through API usage rather than a standard Claude subscription, so costs scale with the workload and can range from a few dollars to hundreds of dollars. Environments are ephemeral, so anything that needs to persist must be stored in dedicated memory systems rather than on the underlying machine.
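The ephemeral-environment point is easy to demonstrate locally. In this sketch a temporary directory plays the role of the environment and a JSON file plays the role of a memory store. Both are stand-ins chosen for illustration, and `run_session` is a hypothetical helper, not a platform call. Anything written only to the working directory is gone once the session ends, while whatever was copied to the store is available to the next session.

```python
import json
import tempfile
from pathlib import Path


def run_session(memory_path: Path, note: str) -> list[str]:
    """Do work in a throwaway directory, then persist only what matters."""
    with tempfile.TemporaryDirectory() as workdir:
        scratch = Path(workdir) / "scratch.txt"
        scratch.write_text(note)  # lost when the session ends
        notes = json.loads(memory_path.read_text()) if memory_path.exists() else []
        notes.append(scratch.read_text())
        memory_path.write_text(json.dumps(notes))  # survives the session
    return notes


with tempfile.TemporaryDirectory() as store_dir:
    memory = Path(store_dir) / "memory.json"  # stand-in for a memory store
    run_session(memory, "customer prefers weekly reports")
    remembered = run_session(memory, "report due Friday")
    print(remembered)
```

The second session sees the first session's note only because it was written to the store, not because the working directory was reused.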
Failures should be expected during development. The platform provides visibility into tool calls, reasoning steps, and execution history, making it easier to debug workflows and improve reliability over time.
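When a run fails, the execution history is usually the first place to look. The sketch below assumes the history has been exported as a list of event dictionaries. That event shape, including the `type`, `tool`, and `ok` keys, is invented for illustration and will not match any real trace format. It shows one simple debugging move: count which tool fails most often and find where the failures begin.

```python
from collections import Counter

# Hypothetical event shape; a real platform's trace format will differ.
history = [
    {"step": 1, "type": "tool_call", "tool": "web_search", "ok": True},
    {"step": 2, "type": "reasoning", "ok": True},
    {"step": 3, "type": "tool_call", "tool": "run_python", "ok": False},
    {"step": 4, "type": "tool_call", "tool": "run_python", "ok": False},
    {"step": 5, "type": "tool_call", "tool": "web_search", "ok": True},
]

# Count failed tool calls per tool.
failures = Counter(
    event["tool"]
    for event in history
    if event["type"] == "tool_call" and not event["ok"]
)
# Find the first step at which anything went wrong.
first_failure = next(e["step"] for e in history if not e["ok"])

print(failures.most_common())  # [('run_python', 2)]
print(first_failure)           # 3
```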
The Bigger Trend
The broader takeaway is that agent infrastructure is becoming a managed service. Instead of building and operating custom agent systems from scratch, developers can increasingly focus on defining workflows, evaluation criteria, and business logic while infrastructure providers handle the execution layer.
As these platforms mature, the challenge shifts away from running agents and toward designing effective systems of agents that can operate reliably at scale.
FAQ
What is a multi-agent system?
What does it take to ship a multi-agent app?
What is Braid?
Why is state the hard part of multi-agent apps?
What can you build with it?
How does this fit with my existing stack?
What separates a multi-agent demo from a shipped app?
What's next?
Keep building with the rest of Night School, or apply to Gauntlet: twelve weeks of technical intensity with the best AI engineers we can find.