How to Build Your Own Software Factory

In this lesson: become the director rather than the coder, build around the toolkit rather than the tools, and put gates where you are vulnerable

Ash Tilawat · 60 min · March 2026
Released March 25, 2026

Top 3 takeaways

01

Your job is now directing agents

Code generation is no longer the bottleneck. The work that counts is writing clear specs, placing quality gates, and orchestrating agents from start to finish.

02

The toolkit outlasts any single tool

Every factory needs the same 11 swappable primitives, including record, memory, orchestrator, runtime, gates, and observability. Swapping a provider takes three steps: update the MCP, the skill, and the key.

03

Put gates where you're vulnerable

Human checkpoints belong at the spec, the architecture, and the final review. Build them as interrupts that save state, so a stalled agent can retry without restarting the whole run.

Ash Tilawat

CTO, Gauntlet AI

CTO of Gauntlet AI, leading the company's technical direction and AI-native training programs. Has trained 1,200+ engineers across 104 companies and run multiple corporate trainings this year — including the AI sales course that firms like a16z, Mainsail, and PwC brought Gauntlet in to teach. Focused on turning AI from a prototype tool into something teams use in real production workflows, with an emphasis on evaluation and systems thinking.

Lesson notes

A written walkthrough of the lecture, covering the patterns, the code, and the things that trip people up.

The Engineer Becomes a Director

The core idea of the lecture is that software engineering's bottleneck has shifted. Tools like Claude Code, Codex, and Replit Agent can generate code for hours without supervision, so the scarce work is now planning, testing, deployment, and verification rather than writing code.

The modern engineer acts less like a coder and more like a director, managing AI agents that execute work. A software factory makes this possible by taking a specification and turning it into a deployed application with minimal human involvement. The technology is still early, however. Factories can automate much of the work, but human engineers are still required to review outputs, resolve edge cases, and push projects from 80% complete to production-ready.

The 11 Building Blocks of a Software Factory

A software factory is less about any single tool and more about how the components work together.

The key building blocks are:

  1. Record System – Tracks work and ownership (e.g. Linear).
  2. Memory – Persistent project context stored in files.
  3. Orchestrator – Coordinates agents and workflows (e.g. LangGraph).
  4. Execution Environment – Sandboxed environments where agents operate.
  5. Agent Runtime – The coding agents themselves (Claude Code, Codex, etc.).
  6. Integration Layer – Connections to tools like GitHub, Vercel, Supabase, and Slack.
  7. Quality Gates – Human review checkpoints, alongside automated tests and evals.
  8. Delivery Target – Where applications are deployed.
  9. Observability – Tracing and monitoring agent behavior.
  10. Skills – Reusable domain knowledge and workflows.
  11. Identity & Secrets – API keys, credentials, and access control.

The specific tools may change over time. The architecture remains largely the same.
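One way to picture the swappable toolkit is a plain registry that maps each of the 11 building blocks to whichever provider currently fills it. The sketch below is illustrative only: the `Factory` class, the snake_case primitive names, and the `swap` and `missing` methods are hypothetical and not taken from any tool named in the lecture.

```python
from dataclasses import dataclass, field

# The 11 building blocks from the list above, as hypothetical registry keys.
PRIMITIVES = (
    "record_system", "memory", "orchestrator", "execution_environment",
    "agent_runtime", "integration_layer", "quality_gates", "delivery_target",
    "observability", "skills", "identity_and_secrets",
)


@dataclass
class Factory:
    """Maps each primitive to the provider currently filling that slot."""
    providers: dict[str, str] = field(default_factory=dict)

    def swap(self, primitive: str, provider: str) -> None:
        # Only the slot changes; the rest of the factory is untouched.
        if primitive not in PRIMITIVES:
            raise KeyError(f"unknown primitive: {primitive}")
        self.providers[primitive] = provider

    def missing(self) -> list[str]:
        # Primitives that still have no provider assigned.
        return [p for p in PRIMITIVES if p not in self.providers]


factory = Factory()
factory.swap("record_system", "Linear")
factory.swap("orchestrator", "LangGraph")
factory.swap("agent_runtime", "Claude Code")
factory.swap("agent_runtime", "Codex")  # replace the runtime, keep everything else
```

The point of the shape is that the set of keys stays fixed while the values change, which mirrors the claim that the architecture outlasts the tools.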

How the Factory Works

The process begins with a ticket in a project management system such as Linear.

A webhook triggers project setup, creating the repository, deployment environment, database, and project memory. A product-management agent writes the initial specification, which is reviewed by a human. An architecture agent then produces the system design and implementation plan, followed by another review checkpoint.

The orchestrator breaks the project into parallel tasks and assigns each one to an isolated coding agent. As work progresses, agents contribute to a shared pull request while automated review and testing agents validate the output.

Once final human approval is given, the code merges and deploys.
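The flow described above can be sketched as an ordered list of stages, three of which pause for human approval. This is a minimal illustration under assumptions: the stage names, the `Run` record, and the `advance` function are hypothetical, and appending to `run.log` stands in for real agent work.

```python
from dataclasses import dataclass, field

# (stage name, whether a human must approve before the run continues)
STAGES = [
    ("setup", False),
    ("spec", True),
    ("architecture", True),
    ("parallel_build", False),
    ("automated_review", False),
    ("final_review", True),
    ("merge_and_deploy", False),
]


@dataclass
class Run:
    ticket: str
    next_stage: int = 0
    approved: set[str] = field(default_factory=set)
    log: list[str] = field(default_factory=list)


def advance(run: Run) -> str:
    """Run stages in order; stop at the first gate that lacks approval."""
    while run.next_stage < len(STAGES):
        name, gated = STAGES[run.next_stage]
        if name not in run.log:
            run.log.append(name)  # stand-in for the stage's real work
        if gated and name not in run.approved:
            return f"waiting_for_approval:{name}"
        run.next_stage += 1
    return "deployed"
```

Calling `advance` again after adding a stage name to `run.approved` continues from the paused stage without redoing earlier ones, which is the behavior the gates are meant to have.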

Memory and Agent Coordination

Agents do not share memory by default. Each starts with minimal context and must rely on documentation, files, and generated artifacts to understand the project.

Ash argues that file-based memory is often more reliable than graph-based memory because it is easier to maintain and less prone to drift. The same philosophy underlies Anthropic's Skills system: persistent knowledge is stored in files and loaded only when needed.

Three principles make memory effective:

  • Never overwrite history; append instead.
  • Read relevant context before writing code.
  • Avoid multiple agents editing the same memory simultaneously.
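The first two principles can be sketched with an append-only file. The example below is an assumption-laden illustration, not the lecture's implementation: the `FileMemory` class, the JSON Lines layout, and the keyword filter are all hypothetical choices. The third principle (no simultaneous edits) is not handled here and would need a lock or a single writer.

```python
import json
import time
from pathlib import Path


class FileMemory:
    """Append-only project memory stored as one JSON object per line."""

    def __init__(self, path):
        self.path = Path(path)

    def append(self, agent: str, note: str) -> None:
        # Opening in "a" mode only ever adds lines; history is never rewritten.
        entry = {"ts": time.time(), "agent": agent, "note": note}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")

    def read(self, keyword=None) -> list:
        # Agents call this before writing code to load the relevant context.
        if not self.path.exists():
            return []
        lines = self.path.read_text(encoding="utf-8").splitlines()
        entries = [json.loads(line) for line in lines if line.strip()]
        if keyword is None:
            return entries
        return [e for e in entries if keyword.lower() in e["note"].lower()]
```

A coding agent in this sketch would call `read("auth")` before touching authentication code and `append(...)` once it finishes, so later agents can see what was decided and why.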

Agent coordination remains one of the hardest problems in multi-agent systems. A practical solution is assigning ownership of specific files and enforcing file locks to prevent conflicts.
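File ownership with locks can be sketched as a small table that records which agent holds which path. The `FileLockTable` class below is hypothetical and works only inside one process; agents running in separate sandboxes would need the table in a shared store, which this sketch does not attempt.

```python
import threading
from typing import Optional


class FileLockTable:
    """Tracks which agent currently owns each file path."""

    def __init__(self):
        self._owners: dict[str, str] = {}
        self._guard = threading.Lock()  # protects the table itself

    def acquire(self, path: str, agent: str) -> bool:
        # Succeeds if the file is free or already owned by this agent.
        with self._guard:
            owner = self._owners.setdefault(path, agent)
            return owner == agent

    def release(self, path: str, agent: str) -> None:
        # Only the current owner can release; other callers are ignored.
        with self._guard:
            if self._owners.get(path) == agent:
                del self._owners[path]

    def owner(self, path: str) -> Optional[str]:
        with self._guard:
            return self._owners.get(path)
```

An agent that gets `False` back from `acquire` would skip the file or wait, rather than writing over another agent's work.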

Orchestration, Cost, and Governance

The orchestrator acts as the factory manager. It maintains state, launches agents, handles retries, and resumes work after interruptions without restarting the entire workflow.
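Retry and resume can be sketched with a checkpoint file that records which steps have finished. This is a simplified stand-in, not how LangGraph or any specific orchestrator does it: `run_workflow`, the JSON checkpoint format, and the `max_retries` parameter are hypothetical.

```python
import json
from pathlib import Path


def run_workflow(steps, checkpoint: Path, max_retries: int = 2) -> list:
    """Run (name, callable) steps in order, skipping ones already checkpointed."""
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    for name, step in steps:
        if name in done:
            continue  # finished in an earlier run; do not redo it
        for attempt in range(max_retries + 1):
            try:
                step()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up; the checkpoint still holds earlier steps
        done.append(name)
        checkpoint.write_text(json.dumps(done))  # save state after every step
    return done
```

If a step exhausts its retries, the exception propagates, but the next call with the same checkpoint path picks up at the failed step instead of starting over.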

Cost can be managed by pairing models strategically. A common pattern is to have a lower-cost model generate code while a more capable model reviews it. This can reduce spend considerably while the stronger reviewer protects quality.
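The generate-then-review pairing can be sketched as a loop that passes the reviewer's feedback back to the generator. The two models are passed in as plain callables, so no real provider API is assumed; `generate_with_review`, its parameters, and the feedback format are hypothetical.

```python
from typing import Callable, Tuple


def generate_with_review(
    task: str,
    cheap_generate: Callable[[str, str], str],
    strong_review: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Draft with the lower-cost model until the stronger model approves."""
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        draft = cheap_generate(task, feedback)           # lower-cost model writes
        approved, feedback = strong_review(task, draft)  # stronger model judges
        if approved:
            return draft, round_no
    raise RuntimeError(f"no approved draft after {max_rounds} rounds")
```

Capping the rounds keeps a disagreement between the two models from looping indefinitely; in a real factory, that failure would be a natural place to hand off to a human.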

Governance starts with visibility. Every agent action should be traceable through observability tools. Once visibility exists, teams can add evaluations and scoring systems to measure agent performance and continuously improve the factory.
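Visibility first, scoring second, can be sketched with a decorator that records every agent action and a function that computes a simple score from the record. The in-memory `TRACE` list, the `traced` decorator, and the `success_rate` metric are hypothetical stand-ins for a real observability tool and evaluation suite.

```python
import functools
import time

TRACE: list[dict] = []  # in-memory stand-in for an observability backend


def traced(agent: str):
    """Record every call made on behalf of `agent`, including failures."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "error"
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            finally:
                TRACE.append({
                    "agent": agent,
                    "action": fn.__name__,
                    "status": status,
                    "seconds": time.perf_counter() - start,
                })
        return wrapper
    return decorator


def success_rate(agent: str):
    """Share of an agent's traced actions that succeeded, or None if none exist."""
    events = [e for e in TRACE if e["agent"] == agent]
    if not events:
        return None
    return sum(e["status"] == "ok" for e in events) / len(events)
```

Success rate is only the simplest possible score; the same trace records could feed richer evaluations once the basic visibility is in place.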

The overall takeaway is that successful software factories are not built from dozens of specialized agents. They are built from a small number of reusable agents, clear workflows, strong memory systems, and well-designed human review checkpoints.

FAQ

What is a software factory?
A software factory is an operating model where one engineer runs the full planning, building, testing, and shipping loop with AI doing most of the execution. You orchestrate AI across every stage rather than hand-coding everything yourself.
Why isn't just using Claude enough anymore?
Six months ago, using an AI assistant set you apart. Now everyone has one, so the edge has moved to engineers who restructure how they work, delegating planning, implementation, testing, and review to AI in a repeatable system.
What does the planning step look like?
You write the plan and the specs first, then let the model build against them. Clear upfront direction is what keeps AI output on track.
How is this different from vibe coding?
Vibe coding improvises prompts to produce a demo, while a software factory produces shippable software through a deliberate, repeatable pipeline with checks at each stage.
Do you still need to understand the code the AI writes?
Yes. You act as the tech lead and reviewer who sets direction, verifies output, and catches mistakes. AI scales how much you can execute, and your judgment is what holds the work together.
What roles does AI play across the loop?
AI drafts plans, generates the implementation, writes and runs tests, and reviews diffs. You stitch these stages together so each one feeds the next.
How do you keep quality high when AI writes most of the code?
Add quality gates such as tests, reviews, and evals at each stage, so regressions get caught before they ship. The AI Code Review lesson covers this in more detail.

What's next?

Keep building with the rest of Night School, or apply to Gauntlet — twelve weeks of technical intensity with the best AI engineers we can find.
