How to Build Your Own Software Factory
In this lesson: become the director rather than the coder, see why the toolkit outlasts the tools, and put gates where you are vulnerable
Top 3 takeaways
Your job is now directing agents
Code generation is no longer the bottleneck. The work that counts is writing clear specs, placing quality gates, and orchestrating agents from start to finish.
The toolkit outlasts any single tool
Every factory needs the same 11 swappable primitives, including record, memory, orchestrator, runtime, gates, and observability. Swapping a provider takes three steps: updating the MCP, the skill, and the key.
Put gates where you're vulnerable
Human checkpoints belong at the spec, the architecture, and the final review. Build them as interrupts that save state, so a stalled agent can retry without restarting the whole run.

Ash Tilawat
CTO, Gauntlet AI
CTO of Gauntlet AI, leading the company's technical direction and AI-native training programs. Has trained 1,200+ engineers across 104 companies and run multiple corporate trainings this year — including the AI sales course that firms like a16z, Mainsail, and PwC brought Gauntlet in to teach. Focused on turning AI from a prototype tool into something teams use in real production workflows, with an emphasis on evaluation and systems thinking.
Lesson notes
A written walkthrough of the lecture, covering the patterns, the code, and the things that trip people up.
The Engineer Becomes a Director
The core idea of the lecture is that software engineering's bottleneck has shifted. Tools like Claude Code, Codex, and Replit Agents can generate code for hours without supervision, so the constraint has moved from code generation to planning, testing, deployment, and verification.
The modern engineer acts less like a coder and more like a director, managing AI agents that execute work. A software factory makes this possible by taking a specification and turning it into a deployed application with minimal human involvement. The technology is still early, however. Factories can automate much of the work, but human engineers are still required to review outputs, resolve edge cases, and push projects from 80% complete to production-ready.
The 11 Building Blocks of a Software Factory
A software factory is less about any single tool and more about how the components work together.
The key building blocks are:
- Record System – Tracks work and ownership (e.g. Linear).
- Memory – Persistent project context stored in files.
- Orchestrator – Coordinates agents and workflows (e.g. LangGraph).
- Execution Environment – Sandboxed environments where agents operate.
- Agent Runtime – The coding agents themselves (Claude Code, Codex, etc.).
- Integration Layer – Connections to tools like GitHub, Vercel, Supabase, and Slack.
- Quality Gates – Human review checkpoints.
- Delivery Target – Where applications are deployed.
- Observability – Tracing and monitoring agent behavior.
- Skills – Reusable domain knowledge and workflows.
- Identity & Secrets – API keys, credentials, and access control.
The specific tools may change over time. The architecture remains largely the same.
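One way to picture this is a configuration with one slot per primitive and the current tool as the value. The sketch below is illustrative only: the class name `FactoryConfig`, the field names, and every default that is not a tool named in the lecture (for example "sandboxed container" and "secrets manager") are assumptions, not the lecture's implementation.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class FactoryConfig:
    """Hypothetical config: one slot per primitive, tool names as values."""
    record_system: str = "Linear"
    memory: str = "files in the repository"
    orchestrator: str = "LangGraph"
    execution_environment: str = "sandboxed container"  # assumption
    agent_runtime: str = "Claude Code"
    integration_layer: tuple = ("GitHub", "Vercel", "Supabase", "Slack")
    quality_gates: tuple = ("spec", "architecture", "final review")
    delivery_target: str = "Vercel"                      # assumption
    observability: str = "tracing tool"                  # assumption
    skills: str = "skill files"
    identity_and_secrets: str = "secrets manager"        # assumption


default = FactoryConfig()

# Swapping one tool leaves the other ten slots untouched.
swapped = replace(default, agent_runtime="Codex")
```

The point of the frozen dataclass is that a swap produces a new configuration rather than mutating the old one, so the shape of the factory stays fixed while individual values change.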
How the Factory Works
The process begins with a ticket in a project management system such as Linear.
A webhook triggers project setup, creating the repository, deployment environment, database, and project memory. A product-management agent writes the initial specification, which is reviewed by a human. An architecture agent then produces the system design and implementation plan, followed by another review checkpoint.
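The planning half of that flow (spec, review, architecture, review) can be sketched as a short pipeline with a human gate after each agent. Everything here is a hypothetical stand-in: `pm_agent` and `architecture_agent` return placeholder strings instead of calling a model, and `approve` is a callback where a real factory would wait for a person.

```python
from typing import Callable


def pm_agent(ticket: str) -> str:
    """Hypothetical product-management agent: ticket in, spec out."""
    return f"SPEC for: {ticket}"


def architecture_agent(spec: str) -> str:
    """Hypothetical architecture agent: spec in, design out."""
    return f"DESIGN based on: {spec}"


def run_planning(ticket: str, approve: Callable[[str, str], bool]) -> dict:
    """Run spec then architecture, stopping at the first rejected review."""
    artifacts = {}
    current = ticket
    for name, agent in [("spec", pm_agent), ("architecture", architecture_agent)]:
        current = agent(current)
        if not approve(name, current):  # human checkpoint
            return {"status": f"rejected at {name}", "artifacts": artifacts}
        artifacts[name] = current
    return {"status": "approved", "artifacts": artifacts}


result = run_planning("Build a todo app", approve=lambda stage, text: True)
rejected = run_planning("Build a todo app",
                        approve=lambda stage, text: stage != "architecture")
```

Only approved artifacts are kept, so a rejected design never reaches the coding agents.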
The orchestrator breaks the project into parallel tasks and assigns each one to an isolated coding agent. As work progresses, agents contribute to a shared pull request while automated review and testing agents validate the output.
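The fan-out step might look roughly like the following. It is a minimal sketch using Python's standard thread pool; `coding_agent`, the branch naming scheme, and the task names are made up for illustration, and a real agent would run in its own sandbox rather than a thread.

```python
from concurrent.futures import ThreadPoolExecutor


def coding_agent(task: str) -> dict:
    """Hypothetical agent: works only from its own task and shares no state."""
    branch = "agent/" + task.replace(" ", "-")
    return {"task": task, "branch": branch, "status": "done"}


def fan_out(tasks: list[str], max_workers: int = 4) -> list[dict]:
    """Run one agent per task in parallel and collect results in task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the tasks were submitted.
        return list(pool.map(coding_agent, tasks))


results = fan_out(["auth api", "todo model", "list view"])
```

Each result carries its own branch name, which is one simple way to keep agents isolated until their work is combined in the shared pull request.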
Once final human approval is given, the code merges and deploys.
Memory and Agent Coordination
Agents do not share memory by default. Each starts with minimal context and must rely on documentation, files, and generated artifacts to understand the project.
Ash argues that file-based memory is often more reliable than graph-based memory because it is easier to maintain and less prone to drift. The same philosophy underlies Anthropic's Skills system: persistent knowledge stored in files and revealed when needed.
Three principles make memory effective:
- Never overwrite history; append instead.
- Read relevant context before writing code.
- Avoid multiple agents editing the same memory simultaneously.
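The first two principles can be sketched with a small append-only log. The `MemoryLog` class, the JSON Lines file format, and the example notes are assumptions made for illustration; the lecture only specifies that memory lives in files. Avoiding simultaneous edits needs a coordination mechanism that this sketch leaves out.

```python
import json
import tempfile
from pathlib import Path


class MemoryLog:
    """Hypothetical append-only project memory stored as JSON Lines."""

    def __init__(self, path: Path):
        self.path = path

    def read(self) -> list[dict]:
        """Return every entry written so far, oldest first."""
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def append(self, agent: str, note: str) -> None:
        # Mode "a" only adds to the end; earlier entries are never rewritten.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"agent": agent, "note": note}) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    memory = MemoryLog(Path(tmp) / "memory.jsonl")
    memory.append("architect", "Use Postgres for storage")
    context = memory.read()  # read relevant context before writing code
    memory.append("coder", f"Implemented schema after reading {len(context)} note(s)")
    history = memory.read()
```

Because nothing is ever overwritten, a later agent can see both the decision and the work that followed from it.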
Agent coordination remains one of the hardest problems in multi-agent systems. A practical solution is assigning ownership of specific files and enforcing file locks to prevent conflicts.
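A minimal sketch of file ownership, assuming a single process: the `FileOwnership` class and the agent and file names are hypothetical. A real factory with agents in separate sandboxes would need a shared store for these claims, which this example does not attempt.

```python
class FileOwnership:
    """Hypothetical in-memory registry: at most one owning agent per file."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def claim(self, path: str, agent: str) -> bool:
        """Return True if the agent owns the file after this call."""
        # setdefault records the agent only if the file has no owner yet.
        return self._owners.setdefault(path, agent) == agent

    def release(self, path: str, agent: str) -> None:
        """Give up the file, but only if this agent is the current owner."""
        if self._owners.get(path) == agent:
            del self._owners[path]


locks = FileOwnership()
first = locks.claim("src/auth.py", "agent-a")   # agent-a takes ownership
second = locks.claim("src/auth.py", "agent-b")  # refused while agent-a owns it
locks.release("src/auth.py", "agent-a")
third = locks.claim("src/auth.py", "agent-b")   # succeeds after the release
```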
Orchestration, Cost, and Governance
The orchestrator acts as the factory manager. It maintains state, launches agents, handles retries, and resumes work after interruptions without restarting the entire workflow.
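Resuming without restarting comes down to recording which steps have finished and skipping them on the next run. The sketch below is plain Python rather than any orchestrator's API; the function name `run_with_checkpoints`, the shape of `state`, and the simulated stall are all assumptions, and a real orchestrator would persist the state outside the process.

```python
def run_with_checkpoints(steps, state, max_retries=2):
    """Run named steps in order, skipping any already recorded in state."""
    for name, step in steps:
        if name in state["done"]:
            continue  # finished in an earlier run; do not redo it
        for attempt in range(max_retries + 1):
            try:
                state["outputs"][name] = step()
                state["done"].append(name)
                break
            except RuntimeError:
                if attempt == max_retries:
                    return state  # stop here; completed steps stay recorded
    return state


calls = {"plan": 0, "build": 0}


def plan():
    calls["plan"] += 1
    return "plan ready"


def build():
    calls["build"] += 1
    if calls["build"] == 1:
        raise RuntimeError("agent stalled")  # simulated stall on first attempt
    return "build ready"


steps = [("plan", plan), ("build", build)]
state = {"done": [], "outputs": {}}
state = run_with_checkpoints(steps, state, max_retries=0)  # build stalls
state = run_with_checkpoints(steps, state, max_retries=0)  # resumes at build
```

On the second call the planning step is not repeated, which is the behavior the paragraph above describes.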
Cost can be managed by pairing models strategically. A common pattern is to have a lower-cost model generate code while a more capable model reviews it. Because generation usually accounts for most of the tokens, this can cut spend substantially while keeping a stronger model on the quality check.
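The arithmetic behind this pattern can be sketched as follows. Every number is made up for illustration: the prices (1 and 10 dollars per million tokens) and the token volumes are not real vendor prices or measurements from the lecture, and the single blended price per model ignores the usual split between input and output tokens.

```python
def run_cost(gen_tokens: int, review_tokens: int,
             gen_price: float, review_price: float) -> float:
    """Cost in dollars, with prices given per million tokens."""
    return (gen_tokens * gen_price + review_tokens * review_price) / 1_000_000


# Hypothetical workload: generation dominates the token count.
GEN_TOKENS = 5_000_000
REVIEW_TOKENS = 1_000_000

# Hypothetical prices per million tokens.
CHEAP, CAPABLE = 1, 10

all_capable = run_cost(GEN_TOKENS, REVIEW_TOKENS, CAPABLE, CAPABLE)
paired = run_cost(GEN_TOKENS, REVIEW_TOKENS, CHEAP, CAPABLE)
savings = 1 - paired / all_capable
```

Under these assumed numbers the paired setup costs 15 dollars against 60, a 75% reduction. The real figure depends entirely on actual prices and on how much rework the reviewer sends back.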
Governance starts with visibility. Every agent action should be traceable through observability tools. Once visibility exists, teams can add evaluations and scoring systems to measure agent performance and continuously improve the factory.
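A minimal sketch of that progression, visibility first and scoring second: a decorator records every agent action, and a simple metric is computed from the record. The names `TRACE`, `traced`, and `success_rate`, and the two example actions, are hypothetical; a production setup would send these events to an observability tool instead of a list.

```python
import functools
import time

TRACE: list[dict] = []  # hypothetical in-memory trace store


def traced(agent_name: str):
    """Record the agent, action name, outcome, and duration of each call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "error"
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            finally:
                # Runs on success and on failure, so no action goes unrecorded.
                TRACE.append({
                    "agent": agent_name,
                    "action": fn.__name__,
                    "status": status,
                    "seconds": time.perf_counter() - start,
                })
        return wrapper
    return decorator


@traced("coder")
def write_file(path: str) -> str:
    return f"wrote {path}"


@traced("tester")
def run_tests() -> None:
    raise RuntimeError("2 tests failed")  # simulated failure


def success_rate(trace: list[dict]) -> float:
    """Share of traced actions that completed without raising."""
    return sum(e["status"] == "ok" for e in trace) / len(trace)


write_file("src/app.py")
try:
    run_tests()
except RuntimeError:
    pass
```

With the two calls above, `success_rate(TRACE)` is 0.5. The same trace could feed richer evaluations later without changing how agents are instrumented.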
The overall takeaway is that successful software factories are not built from dozens of specialized agents. They are built from a small number of reusable agents, clear workflows, strong memory systems, and well-designed human review checkpoints.
FAQ
What is a software factory?
Why isn't just using Claude enough anymore?
What does the planning step look like?
How is this different from vibe coding?
Do you still need to understand the code the AI writes?
What roles does AI play across the loop?
How do you keep quality high when AI writes most of the code?