We were six weeks into a new platform initiative. AI coding tools were everywhere. Features were shipping, PRs were merging, and velocity was up. However, two months later, I was staring at a codebase nobody could fully explain. Inconsistent IDs across services. Error shapes that differed across endpoints from ‘the same spec’ — except there was no spec. There were three prompts in three chat windows nobody had saved. 

The AI hadn’t done anything wrong. We had. We’d treated AI codegen as a shortcut past the planning process instead of as a faster way to execute a planning process that still needed to happen. That’s when we found Spec Kit and SQUAD. 

The Real Problem: Institutional Amnesia 

The failure mode isn’t the AI writing bad code. Modern AI coding tools write pretty good code, most of the time. The failure mode is institutional amnesia. 

When an engineer decides to use UUIDs for primary keys, that decision lives in their head and maybe in a PR description. When an AI makes the same decision because you told it to “build a REST API,” the decision lives nowhere — it just manifests in the code. In a traditional workflow, tech stack choices, error handling conventions, authentication approach and API versioning strategy get captured in ADRs. In an AI-first workflow, they get swallowed by the chat window. 

Figure 2: Solving Invisible AI Decisions via Spec-Driven Development 

What Spec Kit Does in a DevOps Context 

Spec Kit is a CLI maintained by GitHub. Think of it as a structured documentation pipeline: You put a feature description in one end, and five phases of spec artifacts come out the other. 

Figure 3: The Spec Kit Five-Phase Pipeline — From Feature Description to a Versioned Artifact Trail in Specs 

 The part that matters most for DevOps teams isn’t any individual phase — it’s the artifact trail. 

The specs/ folder tells you what the feature does (spec.md), why it’s built that way (research.md), what the API contract looks like (contracts/api.md) and the full ordered task list (tasks.md). These are real files in your repo. They’re version-controlled and searchable, and they answer the ‘why did we do it this way?’ question that your 2 a.m. incident call always gets stuck on. 
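One way to treat these files as real artifacts is to check that they exist before work starts. The following is a minimal sketch, not part of Spec Kit: the helper name missing_artifacts is hypothetical, and it assumes the four files named above sit under a single folder that you pass in.

```python
from pathlib import Path

# Artifact paths named in the article, relative to the folder being checked.
EXPECTED_ARTIFACTS = ("spec.md", "research.md", "contracts/api.md", "tasks.md")


def missing_artifacts(spec_dir: str) -> list[str]:
    """Return the expected artifacts that do not exist under spec_dir."""
    root = Path(spec_dir)
    # Keep the original order so the report reads the same way every time.
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).is_file()]
```

An empty list means the artifact trail is complete for that folder. Anything else is a list of files to produce before execution begins.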

SQUAD: The Multi-Agent Team That Executes the Spec 

SQUAD is a separate tool built by Brady Gaster. Where Spec Kit creates the planning artifacts, SQUAD executes them using four specialized agents. 

Figure 4: SQUAD Architecture — Four Single-Purpose Agents, one Source of Truth 

The key design decision: No agent tries to do everything. Linus writes, Livingston reviews. They never share the same context window. An agent reviewing its own work catches almost nothing, because the gaps in its generation are the same gaps in its review. Livingston has never seen Linus’s reasoning — only the output — which is the same adversarial posture a good code reviewer has. 

Install SQUAD: npm install -g @bradygaster/squad-cli@latest 

 

The Constitution: Highest ROI, Lowest Adoption 

The constitution is a one-time setup per project. You run:  

/speckit.constitution create Python FastAPI application 

 

You get a Constitution.md that encodes your team’s non-negotiables: test runner, error shape standard, ID strategy, API versioning. Here’s what it does in a SQUAD workflow: Basher reads it before accepting any plan. If the plan proposes integers for primary keys (against Principle I), Basher blocks it and explains why. The violation gets logged in decisions.md. 

This is policy as code for your development process. The same way a Rego policy in OPA enforces infrastructure standards at deploy time, the constitution enforces development standards at design time. It works in natural language and costs nothing to set up. For platform teams standardizing AI codegen across squads, it’s the enforcement mechanism you didn’t know you needed. 

What Livingston Actually Catches 

Once the spec folder exists, SQUAD runs execution. Here’s what Livingston’s review produces on the TODO API from our example — built from the spec, not from vibes: 

LIVINGSTON REVIEW REPORT 
SPEC COMPLIANCE 
✅  17/17 acceptance scenarios have test coverage 
✅  Constitution I: UUIDs used for all primary keys 
✅  Constitution III: 93% test coverage (gate: 90%) 
SECURITY 
✅  No raw SQL concatenation found 
✅  Input validation on all route handlers (Pydantic v2) 
⚠️  SQL injection surface via sort_by parameter — BLOCKING 
QUALITY 
✅  Error shapes consistent with Constitution II standard 
✅  Async/await correctly used in all route handlers 
⚠️  Missing test: pagination reserved parameters return 501 
STATUS: 2 blocking issues require resolution before merge. 

The SQL injection finding is real: sort_by query parameters passed directly to ORDER BY are a textbook OWASP injection vector. Linus missed it. Livingston caught it because Livingston has a security checklist that runs every time. The missing test for US4 Scenario 3 was found by checking spec.md line by line. Neither finding would have surfaced in a typical AI codegen workflow. They’d have shipped. 
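A common remedy for this class of finding is to validate sort_by against an allowlist before it reaches the query. The sketch below illustrates that technique only. It is not the fix SQUAD produced, and the column names and the build_order_by helper are assumptions, since the article does not show the TODO API’s schema.

```python
# Hypothetical column names; the real TODO API schema is not shown in the article.
ALLOWED_SORT_COLUMNS = {"created_at", "title", "due_date"}
ALLOWED_DIRECTIONS = {"asc", "desc"}


def build_order_by(sort_by: str, direction: str = "asc") -> str:
    """Return an ORDER BY clause, or raise ValueError for unknown input."""
    if sort_by not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort_by: {sort_by!r}")
    normalized = direction.lower()
    if normalized not in ALLOWED_DIRECTIONS:
        raise ValueError(f"unsupported direction: {direction!r}")
    # Both values now match known constants from the allowlists, so
    # interpolating them cannot carry attacker-controlled SQL.
    return f"ORDER BY {sort_by} {normalized.upper()}"
```

Bind parameters protect values, but most database drivers cannot bind identifiers such as column names, which is why an allowlist is the usual approach for ORDER BY. A route handler would translate the ValueError into a client error in the standard error shape.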

How This Changes Incident Response 

Here’s the concrete difference Spec Kit makes when things go wrong.  

Before:  

“Why is this endpoint returning integers for IDs? I don’t know, the AI probably just did it that way. Who made this call? Nobody, really.” 

After:  

Run grep -r "integer" specs/ and you find this in research.md: “Auto-increment integer IDs considered and rejected. Reason: Enumeration attack surface. UUID selected per Constitution Principle I.” The resolution time drops from a week of archaeology to 20 minutes of reading. At 2 a.m. during an incident, that’s the difference between a bad night and a catastrophic one. 

How it Compares to the Alternatives 

Approach                 | Traceability         | Consistency              | New Eng. Ramp-Up     | Incident Debug
-------------------------|----------------------|--------------------------|----------------------|---------------------
Just use Copilot inline  | None                 | Depends on the day       | Weeks of archaeology | Archaeology
Wikis + ADRs (manual)    | Good (if maintained) | Low (nobody updates)     | Days to weeks        | ADRs are stale
Single-agent ‘build it’  | None                 | Varies by context        | Archaeology          | Archaeology
Spec Kit + SQUAD         | Full, versioned      | Enforced by constitution | 30 minutes           | Read the spec folder

 

The ‘full, versioned’ entry deserves a note: The spec folder is a first-class citizen in version control. You can git blame it. You can diff it in PRs. You can grep it during incidents. It’s not a separate system; it lives next to the code and moves with it. 

The DevOps Readiness Checklist 

When I look at a team adopting AI-assisted development, these are the things I check: 

  • A Constitution.md in version control, owned by a named person, updated when standards change. 
  • Spec Kit artifacts committed to specs/ in the same repo as the code they describe. 
  • CI rule that flags PRs where code files change without corresponding spec file updates. 
  • SQUAD’s decisions.md treated as an architectural decision log — committed, reviewed, referenced. 
  • Livingston’s review report in the PR as a machine-generated comment, human-reviewed before merge. 
  • Postmortem template that includes ‘what does research.md say about this decision?’ as a standard question. 
  • New engineer onboarding — read Constitution.md + research.md before touching code. 
  • Spec folder included in backup and disaster recovery — it’s an artifact, not a scratch pad. 

Miss the first three and you get the benefits of AI speed without the traceability you need when things break. 
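The CI rule in the checklist can be approximated with a small script that inspects the list of files changed in a PR. This is a minimal sketch under stated assumptions: the function name, the specs directory constant and the set of suffixes treated as code are all illustrative choices, not something Spec Kit or SQUAD provides.

```python
from pathlib import PurePosixPath

# Assumptions for illustration: specs live under specs/ and these
# suffixes count as code. Adjust both to match the repository.
SPEC_DIR = "specs"
CODE_SUFFIXES = {".py", ".ts", ".go", ".java"}


def code_changed_without_spec(changed_files: list[str]) -> bool:
    """Return True when a change set touches code but nothing under specs/."""
    paths = [PurePosixPath(name) for name in changed_files]
    # A path belongs to the spec folder when its first component is specs.
    touches_spec = any(p.parts[:1] == (SPEC_DIR,) for p in paths)
    touches_code = any(
        p.suffix in CODE_SUFFIXES and p.parts[:1] != (SPEC_DIR,) for p in paths
    )
    return touches_code and not touches_spec
```

In a pipeline, the input could come from git diff --name-only against the target branch, and a True result would flag the PR for a reviewer rather than fail it outright, since some code changes legitimately need no spec update.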

Advice for the DevOps Team Starting Today 

Start with the constitution. Before the next feature gets built, run /speckit.constitution and spend one hour writing down the five or six things your team should never have to redecide. That document is the most valuable thing you’ll produce this month. 

Then, for the next feature, run the full four-command pipeline before touching the code. This adds maybe two hours to the front of a feature that probably takes a week. In return, you get the spec folder for the life of the project. 

The AI coding problem is not a code quality problem. It’s a knowledge management problem. The code is usually fine. The decisions that produced the code are invisible. Spec Kit and SQUAD make them visible. 
