How I set up a state-of-the-art Agentic PDLC in two weeks โ with an AI assistant doing the research, evaluation, and implementation alongside me.
Wake word detection, real-time Gemini conversation, tool execution for calendar, lights, music, project management. Runs 24/7 on a Mac mini. 86+ PRs, 160+ tickets, full observability.
Physics-based stick figure animations with Rapier.js WASM, scroll-driven scene orchestration, real-time text reflow. Same pipeline, different project, same agents.
"The models are incredible. The problem is the orchestration."
Monitors pipeline, cron orchestration, status reports, maintains living specification documents updated by retros.
Receives Linear webhooks, spawns Claude Code sessions, routes work to the right context.
Plans approach, delegates to 3-4 parallel subagents, verifies all output before commit.
Evidence ร Alignment, RICE scoring, pre-mortem analysis. The gate that prevents bad tickets.
Every 10 minutes: merge, build, test, diagnose, file tickets, trigger agents. 144 runs per day.
Nightly analysis of entire pipeline. Classifies issues, proposes fixes, catches what humans miss.
The real timeline from project kickoff to first autonomous PR.
Four major failures surfaced. Each one taught us what autonomous systems actually need.
30+ tickets filed without evaluation. Agents wasted ~12 hours on low-value work.
Created Strata PM Agent. Every feature goes through Evidence ร Alignment, RICE, pre-mortem.
100% of early PRs lacked integration tests. 3 PRs merged that failed in production.
Hard rule in CLAUDE.md: integration tests mandatory before every push.
50 tickets on Linear board. 28 stale research tasks. 9 duplicates. No clear priorities.
Cleanup: 50 โ 13 relevant tickets. Added automated board hygiene to retro process.
2-4 hour delay from approval to coding. 100% required human intervention.
Automated brief โ Spec Kit โ tickets โ Cyrus webhook. <5 minute routing.
Something unexpected happened. The system started catching its own problems.
Every night, analyzes: Linear tickets, GitHub PRs, error logs, cron health, Strata decisions, and industry research. Produces structured report with classified issues and concrete proposals. By end of Week 2: 20+ spec changes applied in one bulk update.
Day 1: Basic orchestration assistant
Week 2: Full PM with living spec documents updated by daily retros
April 1 Rule: "NEVER write code. You are PM + DevOps. Cyrus is the developer. Non-negotiable."
Added after Morty wrote Rust code all day โ every fix got overwritten by Cyrus's PRs.
Born from: "You should really be using a process for these tickets"
Now: Every feature โ Evidence ร Alignment โ RICE โ Pre-mortem โ Case-against โ Product Brief
Updated to read project constitution + architecture before evaluating. 28KB specification document.
Pattern: Opus tech lead โ 3-4 parallel subagents โ verify โ commit
Best day: 16 PRs merged, 0 reverts
Worst day: Idle 48 hours (webhook silently failed)
Both caught by retrospectives.
Evolution:
Merge bot โ Test monitor โ Ticket filer โ State manager โ Duplicate preventer โ Channel-aware
Each capability added after a real problem caught by retrospectives. 144 runs per day.
These findings would have persisted indefinitely without automated daily analysis.
The honest โ marks are the point. The system surfaces its own problems every day so they get fixed.
The AI doesn't decide if code is good. The pipeline decides.
First autonomous PR by end of day.
Issue tracking for your pipeline
Compile, lint, test, coverage โ automated from Day 1
Connect Linear โ GitHub โ Claude Code
Secure, permanent webhook endpoint
Unlimited access for all agents
Operating instructions for your coding agent
Your first autonomous PR
Self-hosted CI runner, 95% coverage gate, mutation testing, PR review agent, cron monitoring every 30 minutes.
Architecture knowledge base, AI PM daily monitoring, retrospectives, agents maintaining agent infrastructure.
Paul Koch ยท Head of Technology ยท Blueprint Equity