The core insight: consensus as a relay baton
The hardest problem in multi-agent AI isn't prompting. It's state persistence across sessions.
LLMs don't remember. Every invocation starts fresh. So if you want agents to build on previous work — across hours, days, or weeks — you need a coordination mechanism.
Auto-co uses what I call the relay baton pattern: a single markdown file (memories/consensus.md) that every cycle reads at the start and writes at the end.
```markdown
# Auto Company Consensus

## Last Updated
2026-03-06T02:00:00Z

## Current Phase
Distribution -- Phase 3

## What We Did This Cycle
- Fixed server-side analytics tracking
- Added social proof badges to landing page

## Next Action
Write technical architecture deep-dive for content distribution
```

No vector database. No Redis. No embeddings. Just a structured markdown file that fits in the context window. The entire company state — decisions, metrics, active projects, next steps — lives in this one document.
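Because the whole system hinges on this one file, it pays to sanity-check it before injecting it into a prompt. A minimal sketch, assuming the section names from the example above (this guard is illustrative, not part of auto-co itself):

```shell
#!/usr/bin/env bash
# Sketch: refuse to start a cycle if the consensus file is missing the
# sections the prompt template expects. Section names are assumptions
# based on the example consensus file shown above.

validate_consensus() {
  local file="$1"
  local section
  for section in "## Last Updated" "## Current Phase" "## Next Action"; do
    if ! grep -qF "$section" "$file"; then
      echo "consensus invalid: missing '$section'" >&2
      return 1
    fi
  done
  return 0
}
```

A loop could call `validate_consensus memories/consensus.md || exit 1` before each cycle, failing loudly instead of handing the agents a corrupted baton.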
The loop: embarrassingly simple
```bash
while true; do
  # Read previous state
  CONSENSUS=$(cat memories/consensus.md)

  # Build prompt with state injected (printf expands the newlines;
  # a literal "\n" inside double quotes would be passed through as-is)
  FULL_PROMPT=$(printf '%s\n\n%s' "$PROMPT_TEMPLATE" "$CONSENSUS")

  # Run one cycle
  claude -p "$FULL_PROMPT" \
    --model opus \
    --dangerously-skip-permissions \
    --output-format stream-json

  # Sleep, then repeat
  sleep 120
done
```

That's the core. A bash `while true` loop invoking the Claude Code CLI every 2 minutes. Each invocation is one “cycle” — one sprint of autonomous work.
In practice, auto-loop.sh adds production hardening: 30-minute watchdog timer, circuit breaker (3 consecutive errors = 5-minute cooldown), rate limit detection, atomic writes, log rotation, and cost tracking.
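The circuit breaker is the piece most worth copying. A hedged sketch of the logic described above, with `run_cycle` and the thresholds as stand-ins rather than the actual auto-loop.sh code:

```shell
#!/usr/bin/env bash
# Illustrative circuit breaker: after 3 consecutive failed cycles,
# pause for 5 minutes before trying again. run_cycle is a placeholder
# for whatever invokes one Claude Code cycle.

ERROR_LIMIT=3
COOLDOWN_SECS=300
errors=0

run_with_breaker() {
  if run_cycle; then
    errors=0                                  # any success resets the breaker
  else
    errors=$((errors + 1))
    if [ "$errors" -ge "$ERROR_LIMIT" ]; then
      echo "circuit open: cooling down ${COOLDOWN_SECS}s" >&2
      sleep "$COOLDOWN_SECS"
      errors=0                                # cooldown over: allow retries
    fi
  fi
}
```

The point of the cooldown is to stop a misbehaving loop from burning API budget at full speed: transient failures (rate limits, network blips) get absorbed, while persistent ones at least fail slowly.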
The team: 14 agents, 4 layers
Auto-co doesn't use one big prompt. It spawns specialized agents, each modeled on a world-class expert's thinking patterns.
Strategy Layer
| Agent | Expert Model | Job |
|---|---|---|
| CEO | Jeff Bezos | Day 1 mindset, PR/FAQ, customer obsession |
| CTO | Werner Vogels | Architecture decisions, tech debt, reliability |
| Critic | Charlie Munger | Inversion thinking, Pre-Mortem, veto power |
Product Layer
| Agent | Expert Model | Job |
|---|---|---|
| Product | Don Norman | User experience, usability, design principles |
| UI | Matias Duarte | Visual design, design system, motion |
| Interaction | Alan Cooper | User flows, personas, navigation |
Engineering Layer
| Agent | Expert Model | Job |
|---|---|---|
| Full-Stack | DHH | Code, features, technical decisions |
| QA | James Bach | Test strategy, quality gates, bug triage |
| DevOps | Kelsey Hightower | Deploy, CI/CD, infrastructure |
Business Layer
| Agent | Expert Model | Job |
|---|---|---|
| Marketing | Seth Godin | Positioning, content, distribution |
| Operations | Paul Graham | User acquisition, retention, community |
| Sales | Aaron Ross | Pricing, conversion, CAC |
| CFO | Patrick Campbell | Unit economics, financial models |
| Research | Ben Thompson | Market analysis, competitive intel |
Each cycle selects 3-5 relevant agents. Not all 14 — that would be expensive and slow. The CEO reads the consensus, decides what to do, and picks the right team.
The convergence rules: how decisions don't stall
The biggest risk with multi-agent systems isn't bad decisions — it's no decisions. Agents love to discuss, research, and plan. Left unchecked, they'll brainstorm forever.
- Cycle 1: Brainstorm. Each agent proposes one idea. Rank top 3.
- Cycle 2: Validate. Munger runs Pre-Mortem, Research validates the market, CFO runs the numbers. Verdict: GO or NO-GO.
- Cycle 3+: If GO, write code. Discussion is forbidden. If NO-GO, try idea #2. If all fail, force-pick one and build.
- Every cycle after Cycle 2 must produce artifacts — files, repos, deployments. Pure discussion cycles are banned.
- Same “Next Action” appearing twice: You're stalled. Change direction or narrow scope and ship immediately.
The priority hierarchy: Ship > Plan > Discuss.
What 30+ cycles actually produced
| Metric | Value |
|---|---|
| Cycles completed | 32+ |
| Total API cost | ~$45 |
| Average cost/cycle | ~$1.41 |
| Infrastructure cost/month | ~$5 (Railway) |
| Revenue | $0 |
| GitHub stars | 5+ |
| Waitlist signups | 2 |
| Human interventions | 1 (API key for email service) |
Artifacts shipped
- Landing page at runautoco.com (Next.js, Tailwind, Railway)
- Live demo dashboard at /demo (6-panel real-time view)
- Pricing page at /pricing (Free/Pro/Enterprise tiers)
- Admin dashboard at /admin (analytics, waitlist tracking)
- Waitlist API with Supabase backend
- Server-side analytics tracking page views
- DEV.to article (written and published by the agents)
- Show HN post (submitted by the agents)
- This blog post (planned and written by the agents)
Failure modes (the interesting part)
Failure #1: Gold-plating
In early cycles, the agents would spend an entire cycle perfecting a color scheme: 45 minutes of agent time on whether the CTA button should be orange-500 or orange-600. Fix: The convergence rules.
Failure #2: Discussion loops
Without convergence rules, agents would say “Let's do more research before deciding.” Three cycles later: three beautiful research documents, zero code. Fix: The “Cycle 3+ must produce artifacts” rule.
Failure #3: Silent failures
Analytics tracking was implemented client-side. In production, ad blockers silently killed it. Zero page views for weeks. Fix: Moved to server-side API route. Same-origin fetch('/api/track') can't be blocked.
Failure #4: Stale consensus
Twice, the same “Next Action” appeared in consecutive cycles — the agents were reading it, doing something adjacent, and rewriting the same next action. Fix: Auto-detection. If the same Next Action appears twice, the prompt forces a direction change.
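The check itself is mechanical. A sketch of one way to detect the repeat, assuming the consensus file keeps its "## Next Action" heading (the extraction logic here is an assumption, not auto-loop.sh code):

```shell
#!/usr/bin/env bash
# Sketch: compare the "## Next Action" section of the current and
# previous consensus snapshots; a repeat means the loop is stalled.

next_action() {
  # Print the text between "## Next Action" and the next "## " heading.
  awk '/^## Next Action/{flag=1; next} /^## /{flag=0} flag' "$1"
}

stalled() {
  [ -n "$(next_action "$1")" ] &&
  [ "$(next_action "$1")" = "$(next_action "$2")" ]
}
```

A loop could snapshot the consensus before each cycle and, when `stalled` fires, append a force-change instruction to the next prompt instead of letting the agents circle the same task again.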
The self-referential trick
Auto-co is building auto-co. The product is the framework. The framework runs the product. The agents commit code to the same repo that contains their own definitions. They improve their own prompts, fix their own bugs, and ship their own marketing.
The README you read on GitHub? Written by the agents. The landing page? Built by the agents. This blog post? Planned by the marketing agent, structured by the CEO, and reviewed by the critic. It's turtles all the way down.
How to run your own
Auto-co is MIT licensed. You can run it today:
```bash
git clone https://github.com/NikitaDmitrieff/auto-co-meta
cd auto-co-meta

# Set your Anthropic API key
export ANTHROPIC_API_KEY=your_key_here

# Start the loop
./auto-loop.sh
```

You'll need an Anthropic API key (Claude Opus recommended), the Claude Code CLI installed, and Node.js. The agents will read the consensus, form a team, decide what to do, and start building.
What I learned
- State management > prompt engineering. The relay baton pattern is more important than any individual agent's prompt. Get the coordination mechanism right and the agents figure out the rest.
- Constraints produce output. Without convergence rules, agents philosophize. With hard deadlines and artifact requirements, they ship.
- Expert personas are surprisingly effective. The Munger agent consistently catches flaws that other agents miss. The thinking frameworks encoded in each role file make a measurable difference.
- Costs are predictable and low. ~$1-2 per cycle, ~$45 for 32 cycles that built a complete product. The whole company runs for less than a coffee habit.
- The hardest part is knowing when to stop. The agents will iterate forever if you let them. The convergence rules are the most important engineering decision in the system.
Want to run your own AI company?
Auto-co is open source. Self-host free, or join the waitlist for the fully hosted version.
This post was outlined by the marketing-godin agent, structured by the CEO agent, and fact-checked by the critic-munger agent during the auto-co autonomous loop.