Building an Autonomous Agent Stack That Governs Itself
build-log · ai-systems · arbiter · local-first
The premise
What if your AI agent could plan, write code, review its own work, create a pull request, and merge it — all while staying under budget and following rules you set?
That's what we built. Not as a thought experiment. As a running system.
What's running
Ten processes managed by PM2, each with a specific role:
- LiteLLM routes between local Ollama models (free) and cloud models (paid)
- Brain serves the content pipeline — research, deep research, content production
- Arbiter governs every decision: cost limits, approval gates, model routing, feature flags
- Poller watches a task queue and triggers the orchestrator
- Nanochat runs continuous autonomous research on local models, posting findings to Discord
- Scheduler seeds daily research topics and posts cost summaries
- Dashboard visualizes everything at localhost:8008
- Discord bot provides the human interface: /task, /capture, /status, /codemap
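As a minimal sketch of how a poller like the one above hands work to an orchestrator (the queue shape and the `dispatch` callback are our own illustration, not the actual implementation):

```python
from queue import Queue, Empty

def poll_once(tasks: Queue, dispatch) -> bool:
    """Pull one task off the queue and hand it to the orchestrator, if any."""
    try:
        task = tasks.get_nowait()
    except Empty:
        return False  # nothing queued this tick
    dispatch(task)
    return True

# Usage: in the real system this loop runs under PM2 against shared state;
# here a plain list stands in for the orchestrator.
tasks = Queue()
tasks.put({"id": 1, "goal": "add a settings page"})
handled: list = []
poll_once(tasks, handled.append)
```

In a long-running process this would sleep between ticks; the important part is that the poller only moves tasks, while all policy decisions stay with the governance layer.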
The governance layer
The most interesting part isn't the agent; it's the governance. We use Arbiter, a purpose-built governance language, to enforce rules in roughly 200 nanoseconds per evaluation:
- Cost circuit breaker: halt all execution if daily spend exceeds $20
- Approval gates: destructive operations (file deletes, force push) require human approval via Discord
- Model routing: critical code generation goes to Claude Opus, routine work to Sonnet, research to local Qwen
- Feature flags: autonomous task spawning, auto-merge, nanochat — all toggle-able without redeployment
Every decision the agent makes passes through Arbiter first. The agent can't overspend, can't merge without approval (unless we flip the flag), and can't research off-topic.
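To make those semantics concrete, here is a Python sketch of the circuit breaker and approval gate. Arbiter rules are written in its own language; the function name, the `Decision` type, and the flag names here are our illustration, with only the $20 budget taken from the rules above.

```python
from dataclasses import dataclass

DAILY_BUDGET_USD = 20.0  # halt all execution past this spend (from the rules above)
APPROVAL_REQUIRED = {"file_delete", "force_push", "merge"}  # destructive ops

@dataclass
class Decision:
    allowed: bool
    reason: str

def evaluate(action: str, daily_spend: float, approved: bool, flags: dict) -> Decision:
    """Every agent action passes through a check like this before running."""
    # Circuit breaker: nothing runs once the daily budget is exhausted.
    if daily_spend >= DAILY_BUDGET_USD:
        return Decision(False, "budget_exhausted")
    # Feature flag: auto-merge bypasses the merge approval gate when flipped.
    if action == "merge" and flags.get("auto_merge"):
        return Decision(True, "auto_merge_flag")
    # Approval gate: destructive operations need a human sign-off via Discord.
    if action in APPROVAL_REQUIRED and not approved:
        return Decision(False, "needs_approval")
    return Decision(True, "ok")
```

The ordering matters: the budget check comes first, so even an approved destructive operation is refused once spend crosses the line.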
The agentic loop
The latest addition is a multi-turn execution loop inspired by claw-code. Instead of a rigid planner→coder→reviewer pipeline, the agent now:
1. Analyzes the codebase structure via codemap (powered by gotreesitter — 206 language grammars, pure Go)
2. Plans with full knowledge of every function, component, and import in the project
3. Writes code using tools it can invoke: read_file, write_file, npm_build, codemap
4. Verifies its work — if the build fails, it reads the error and fixes it
5. Only claims completion after a successful build
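The write→build→fix cycle above can be sketched as a small loop. The `plan` object, `Patch` type, and tool dictionary are hypothetical stand-ins for the real tool-invocation layer; the fake harness below exists only to show the loop self-correcting.

```python
from dataclasses import dataclass

@dataclass
class Patch:
    path: str
    content: str

def agentic_loop(plan, tools, max_turns=5):
    """Write, build, read the error, fix; claim completion only on a green build."""
    for turn in range(1, max_turns + 1):
        patch = plan.next_patch()
        tools["write_file"](patch.path, patch.content)
        ok, log = tools["npm_build"]()   # verify instead of trusting the model
        if ok:
            return {"done": True, "turns": turn}
        plan.observe_error(log)          # the failure feeds the next attempt
    return {"done": False, "turns": max_turns}

# Fake harness: a "plan" that writes a buggy patch first, then fixes it
# after seeing the build error.
class FakePlan:
    def __init__(self):
        self.seen_errors = []
    def next_patch(self):
        return Patch("src/app.ts", "fixed" if self.seen_errors else "buggy")
    def observe_error(self, log):
        self.seen_errors.append(log)

files = {}
def npm_build():
    return ("buggy" not in files.values(), "TS2304: name not found")

result = agentic_loop(FakePlan(), {"write_file": files.__setitem__,
                                   "npm_build": npm_build})
```

Note that the loop never inspects the model's claims; only the build result decides whether the task is done.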
The research loop
Nanochat runs continuously on local models (zero cost). It:
- Pulls topics from a backlog
- Searches the web via Firecrawl (handles JavaScript-heavy sites)
- Synthesizes findings using Ollama
- Scores confidence — low-confidence topics escalate to cloud models automatically
- Spawns follow-up topics from each research cycle
- Posts summaries to Discord so we see what it's learning
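The escalation and follow-up steps can be sketched in a few lines; the confidence threshold, the tier names, and the `open_questions` field are our own illustration of the mechanism, not the system's actual schema.

```python
CONFIDENCE_FLOOR = 0.6  # assumed threshold; the real value lives in the governance config

def next_tier(confidence: float) -> str:
    """High-confidence findings stay on the free local model; low ones escalate."""
    return "local" if confidence >= CONFIDENCE_FLOOR else "cloud"

def spawn_followups(finding: dict) -> list[str]:
    """Each research cycle seeds the backlog with the open questions it surfaced."""
    return [q for q in finding.get("open_questions", []) if q]
```

The same two primitives explain the loop's behavior over time: cheap local passes by default, paid cloud passes only when the local synthesis scores poorly, and a backlog that refills itself from its own output.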
After two days, it had researched 230+ topics autonomously, staying focused on studio-relevant domains thanks to topic guardrails.
What we learned
Governance is not optional. Without Arbiter, the agent burned $15 in one night on tasks that didn't need cloud models. With governance rules, it routes intelligently and halts when it should.
Structural awareness changes everything. Before codemap, the agent generated duplicate components and broken imports. With it, the agent sees the full project structure before writing a single line.
The loop matters more than the model. A mediocre model in a self-correcting loop outperforms a brilliant model in a single-shot pipeline. The ability to try, fail, read the error, and fix is what makes autonomous code generation viable.
Local-first is practical. Most of our research runs on ollama/qwen2.5:14b — free, private, fast enough. Cloud models are reserved for critical work. The system decides, not us.
What's next
We're now offering this as a service through Torque Engineering — installing private, governed AI operating systems for solopreneurs and small teams. The same architecture, configured for your goals.
If you're interested in how any of this works, reach out. We publish everything we learn.