Agent Operations: the crews I run, and why

How to read this

Six archetypes, then four principles. The archetypes are what runs and when. The principles are why the lines sit where they do.

01 · The attended heartbeat

The day starts with me at the keyboard. I sit down, the agent shows me what's waiting, and we decide what moves.

The agent reads the board and drafts a brief. Nothing changes until I approve, redirect, or defer each item.

The heartbeat only runs while I'm here. When I step away, it stops. You don't automate direction.

How it actually works

When I open a session, the agent pulls the current queue: open items, tagged comments, and overnight digests I haven't read. It ranks them into a brief: what needs a call today, what can wait, what can be dropped.

I work through the brief item by item. Anything I approve is written to the work system right then. Deferred items stay in the queue. The session ends when the brief is clear or I close it. Nothing gets written after that.
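The session loop above can be pictured with a minimal sketch. It assumes a hypothetical in-memory queue: `Item`, `build_brief`, and `run_session` are illustrative names, not the real work-system API, and the ranking rule (anything needing a call today goes first) is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Item:
    title: str
    needs_call_today: bool = False

def build_brief(queue):
    """Rank the queue: items needing a call today come first (assumed rule)."""
    return sorted(queue, key=lambda item: not item.needs_call_today)

def run_session(queue, decide):
    """Walk the brief; `decide` is the human, returning 'approve', 'defer', or 'drop'."""
    written, remaining = [], []
    for item in build_brief(queue):
        decision = decide(item)
        if decision == "approve":
            written.append(item.title)   # written to the work system right then
        elif decision == "defer":
            remaining.append(item)       # stays in the queue
        # 'drop' removes the item without writing anything
    return written, remaining
```

The human sits inside the loop as the `decide` callback, so nothing is written unless that call returns an approval.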

02 · Scheduled drafters

These agents run on a clock and turn out drafts and proposals. They never publish on their own.

Daily runner

Fires once each morning

Drafts comments, summaries, and next steps for items I've marked safe to draft on. Everything waits as a draft.

Scope: draft + report

Under the hood

The runner finds items I've tagged with the draft-only label, works through them one by one, and posts its draft as a comment in the thread. It never changes status, assigns an owner, or moves an item along. The draft sits until I act on it.
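A rough sketch of that pass, under stated assumptions: the board is a plain list, the label is literally named `draft-only`, and `write_draft` stands in for whatever actually produces the text. None of these names come from the real system.

```python
from dataclasses import dataclass, field

DRAFT_ONLY = "draft-only"  # assumed label name

@dataclass
class BoardItem:
    title: str
    labels: set
    status: str = "open"
    comments: list = field(default_factory=list)

def write_draft(item):
    """Stand-in for the drafting step; a real runner would call a model here."""
    return f"[DRAFT] Suggested next step for: {item.title}"

def daily_run(board):
    """Post a draft comment on each draft-only item; touch nothing else."""
    drafted = []
    for item in board:
        if DRAFT_ONLY not in item.labels:
            continue
        item.comments.append(write_draft(item))  # the only write the runner makes
        drafted.append(item.title)
    return drafted
```

The function never assigns to `status` or any owner field, which is the point: the comment is the only thing it can add.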

Weekly board health

Fires once per week

Walks the backlog. Flags stale items, missing owners, and work that has wandered from what it set out to do. Writes a short digest.

Scope: observe + report

Under the hood

The agent reads every open item and checks where it stands against why it was opened and when it last moved. It changes nothing. The digest comes back as a report, and I decide what to triage.
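As an illustration only, the stale and missing-owner checks might look like the sketch below. The 21-day threshold is an assumption (the text names none), `OpenItem` is a hypothetical record, and the check for work that has drifted from its original purpose is left out because it needs judgment, not a date comparison.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

STALE_AFTER = timedelta(days=21)  # assumed threshold; the source names none

@dataclass
class OpenItem:
    title: str
    owner: Optional[str]
    last_moved: date

def board_health(items, today):
    """Return digest lines; read-only, nothing on the board is changed."""
    digest = []
    for item in items:
        if today - item.last_moved > STALE_AFTER:
            digest.append(f"stale: {item.title} (last moved {item.last_moved})")
        if item.owner is None:
            digest.append(f"no owner: {item.title}")
    return digest
```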

Idea incubator

Fires once per week

Takes a one-line idea and works it into a short case: what it is, what it would cost to chase, how it stacks up against what's already on my plate. Commits to nothing.

Scope: draft

Under the hood

The incubator gets a cleaned list from the intake poller. For each idea it writes a one-page brief: the problem, rough effort, what it leans on, and a call to pursue, defer, or drop. The brief is a draft. Nothing lands in the work system until I promote it.

fig. 02 · draft, then wait

Every scheduled drafter has the same shape: fire, draft, wait. Nothing ships until a human says so.

03 · Sweepers & digesters

These agents pull from scattered places and boil them down to one short daily digest. The point is that nothing gets missed, not that nothing stays unread.

Commitment sweep

Fires each morning

Reads recent email, meeting transcripts, and notes. Pulls out the things I said I'd do, the deadlines, and the follow-ups. Hands back one short digest. No deleting, no replying.

Scope: read + report

What it reads

The sweep covers the last 24 hours of inbound email, any finished meeting transcripts, and notes I've flagged. It looks for the wording that signals a promise: a deadline, "I'll send," "by Friday," anything that sounds like an action. The digest is plain text, grouped by source, with a short flag on anything time-sensitive.
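To make the idea of wording that signals a promise concrete, here is a small sketch built on regular expressions. The patterns are assumptions chosen for illustration; the real sweep may well use a model instead of fixed phrases.

```python
import re
from collections import defaultdict

# Assumed patterns; the real sweep may use a model rather than regexes.
COMMITMENT = re.compile(r"\bI['’]ll\b|\bI will\b|\bfollow up\b", re.IGNORECASE)
TIME_SENSITIVE = re.compile(
    r"\bby (?:monday|tuesday|wednesday|thursday|friday|saturday|sunday|tomorrow|eod)\b"
    r"|\bdeadline\b",
    re.IGNORECASE,
)

def sweep(messages):
    """messages: iterable of (source, text). Returns a plain-text digest grouped by source."""
    by_source = defaultdict(list)
    for source, text in messages:
        # Split into rough sentences at ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            urgent = bool(TIME_SENSITIVE.search(sentence))
            if COMMITMENT.search(sentence) or urgent:
                flag = " [time-sensitive]" if urgent else ""
                by_source[source].append(f"- {sentence.strip()}{flag}")
    lines = []
    for source, found in by_source.items():
        lines.append(f"{source}:")
        lines.extend(found)
    return "\n".join(lines)
```

The function only reads its input and returns text, matching the read + report scope: there is no path in it that deletes or replies.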

Link digester

Standing, runs on each drop

Watches one channel where I drop links and loose notes. Reads each one, sorts it into a bucket, and summarizes it. Hands back the sort. Acts on none of it.

Scope: read + triage

The buckets

Four buckets: build (something to install or make), reference (docs or context to file), inspiration (mood or direction, nothing to do), monitor (worth watching over time). The bucket is a suggestion, not an order. I read the digest and decide what actually happens.

04 · The responder

A listener that wakes when someone mentions it and replies right there in the channel. It speaks for me. It doesn't act for me.

The responder reads the thread and posts a reply. It won't create a task, move a record, or message anywhere outside the channel unless I ask it to.

A reply in a thread is talking. Creating a task or sending a message out is doing. The responder talks. The doing always needs a human.

How the line is held

The responder only has the tools for its own channel: read the history, post a reply. The tools to create records, change the work system, or message anywhere else are not handed to it. The line is built into what it can touch, not a rule it's asked to follow.

05 · Intake pollers

These agents turn dropped links and half-formed ideas into sorted, scoped items, ready for me to promote or drop.

Link triage poller

Standing, checks for new drops

Picks up new links from the intake channel, reads what they point to, works out what they're for, and writes a short summary with a suggested next step. The step itself waits for me.

Scope: read + sort + draft

How it sorts

It reads each link and sorts it into one of five buckets: build (something to make), reference (something to file), inspiration (direction or mood), monitor (worth watching over time), or archive (nothing worth doing soon). The suggested step is part of the summary; the poller never runs it.

Idea intake

Weekly batch

Gathers one-line ideas from a running notes file and tagged messages. Groups them, drops the duplicates, and hands the clean list to the incubator to work up.

Scope: collect + organize
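A minimal sketch of the duplicate-dropping step, assuming that two ideas count as duplicates when they match after lowercasing and trimming punctuation. That rule and the function names are illustrative; the grouping step is not shown.

```python
def normalize(idea):
    """Lowercase, trim end punctuation, collapse whitespace (assumed duplicate rule)."""
    return " ".join(idea.lower().strip(" .!?").split())

def clean_ideas(raw_ideas):
    """Drop blanks and duplicates, keeping the first wording seen and its order."""
    seen, clean = set(), []
    for idea in raw_ideas:
        key = normalize(idea)
        if key and key not in seen:
            seen.add(key)
            clean.append(idea.strip())
    return clean
```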

06 · Health & hygiene crons

Recurring checks that keep the system honest. They don't do the work; they report on whether the things that do are healthy.

Smoke test

Runs daily

Checks that the key services are answering. It speaks up when something fails and stays quiet when it doesn't. Silence means green.

Scope: probe + report

What gets checked

The smoke test pokes the pieces the fleet leans on: the work board, the channel the responder listens to, the outside sources the sweepers read, and the scheduler that fires the timed agents. One failed check sends a notification, and the rest of the fleet keeps running. Fixing it is my job.
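The "silence means green" behavior can be sketched as below. The probes and the `notify` callback are hypothetical placeholders; a real check would call each service.

```python
def smoke_test(checks, notify):
    """Run every probe; notify once per failure, say nothing on success."""
    failures = []
    for name, probe in checks.items():
        try:
            ok = probe()
        except Exception as exc:  # a probe that raises counts as a failure
            ok, detail = False, str(exc)
        else:
            detail = "check returned False"
        if not ok:
            failures.append(name)
            notify(f"smoke test failed: {name} ({detail})")
    return failures
```

Each probe runs inside its own `try`, so one failing check never stops the others from running, and a clean run sends nothing at all.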

Knowledge hygiene

Runs weekly

Walks the knowledge base for stale pages, dead links, and entries that have sat too long. Writes a report. I decide what to update.

Scope: audit + report

What "stale" means

A page is flagged stale when it hasn't been touched for longer than its kind allows: 30 days for system pages, 90 days for reference pages, and, for project pages, 14 days after the project last moved. The report lists each flagged page with its age and a one-line reason. The agent deletes and changes nothing.
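The thresholds translate into a small lookup, sketched here. The numbers come from the text; the kind names (`system`, `reference`, `project`) and the tuple layout are assumptions, as is reading the 14-day rule as applying to project pages.

```python
from datetime import date

# Limits in days, taken from the text above; the kind names are assumed labels.
MAX_AGE_DAYS = {"system": 30, "reference": 90, "project": 14}

def stale_report(pages, today):
    """pages: iterable of (title, kind, last_activity). Read-only; returns report lines.

    For project pages, last_activity is assumed to be when the project last moved.
    """
    report = []
    for title, kind, last_activity in pages:
        age = (today - last_activity).days
        limit = MAX_AGE_DAYS[kind]
        if age > limit:
            report.append(f"{title}: {age} days old, over the {limit}-day limit for {kind} pages")
    return report
```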

why it's built this way

THE PRINCIPLES

Four rules that govern every agent in the fleet. Not guidelines. Load-bearing.

07 · Steward scope

Every agent here is a steward, not a principal. It drafts and reports. It doesn't decide and ship.

Draft and report, never decide and ship.

An agent can write a comment, a summary, a proposal, a sort. It can't send the email, publish the doc, or move money. Anything you can't take back belongs to a human, not a process running in the background.

Why the line is what you can't undo

You can fix a draft, a report, or a staged change before it matters. You can't unsend a message, un-merge code, un-publish a post, or claw back money. The line isn't about what an agent can do. It's about what it can't take back. As an agent earns trust, it can draft more. That line stays where it is.

A clear yes is the gate.

It can take many forms: a review click, a reply, a labeled approval. There's always one. "The agent ran fine" doesn't count. The gate is me saying yes to a specific thing.

What counts as a yes

A yes is a deliberate act aimed at one thing: approving this draft, clicking this review action, replying to this report. Doing nothing doesn't count. Not deleting a message, leaving a tab open, not complaining: none of that is a yes. The system makes that yes as cheap as it can be, but it has to be there.
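One way to encode that rule is sketched below. The `Approval` record and the kind names are hypothetical; the point is that approval is looked up per target, and that an empty record means no.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Approval:
    target_id: str   # the one draft or report this yes is aimed at
    actor: str       # who said yes
    kind: str        # e.g. "review_click", "reply", "label"

EXPLICIT_KINDS = {"review_click", "reply", "label"}  # assumed names for the forms of yes

def is_approved(draft_id, approvals, human="me"):
    """True only if a human made an explicit approval aimed at this exact draft."""
    return any(
        a.target_id == draft_id and a.actor == human and a.kind in EXPLICIT_KINDS
        for a in approvals
    )
```

Because the default answer is `False`, silence, an open tab, or an agent's own label can never pass the gate.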

08 · The site is glass, not load-bearing

The dashboard is a window onto the system. It is not the system.

Every button is a real write into real machinery.

Clicking "approve" posts a comment to the work system. Clicking "halt" sends a stop signal to the agent's channel. The dashboard itself does nothing. It's a view and a trigger. The writes land in systems that keep running if the dashboard goes dark.

What that means in practice

Each button maps to one write you can trace: a labeled comment, a status change, a message to a channel. Nothing magic. They're the same writes I could make by hand. The dashboard just makes them faster and consistent. If it breaks, I can still do the write myself. Nothing in the system needs the dashboard to exist.
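A sketch of that one-button, one-write mapping, with an assumed table of targets and payloads. The `write` callable is a placeholder for whatever client talks to the work system or the channel; none of the names are real APIs.

```python
# Hypothetical button-to-write table; targets and payloads are illustrative.
BUTTON_WRITES = {
    "approve": ("work_system", "comment", "approved"),
    "halt": ("agent_channel", "message", "stop"),
}

def press(button, item_id, write):
    """Perform the single write behind a button; `write` is the same call made by hand."""
    target, kind, body = BUTTON_WRITES[button]
    return write(target, kind, item_id, body)
```

The dashboard holds no state of its own here: delete `press` and the same `write` call still works from anywhere else.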

If the site dies, the system keeps running.

Timed agents fire on a clock, not on a heartbeat from the dashboard. The work system holds the state. The knowledge base stands on its own. The dashboard is replaceable.

09 · Three planes

The system splits cleanly into three layers. Each has its own job and its own owner.

fig. 09 · three planes

Agents write into the work plane. The ops plane gathers the state. The surface plane is where a human looks and acts.

Work plane: the system of record.

Where tasks, drafts, and history live. Agents and humans both write here. Nothing counts until it lands here.

Why one system of record matters

Spread the state across a database, a channel, a spreadsheet, and memory, and you get drift: the "real" answer depends on which one you ask. The work plane is the one source of truth. Every other layer reads from it and writes to it. An agent that runs without writing here produces work the rest of the system can't see.

Ops plane: fleet state.

A short view of what the agents are up to: recent events, service health, how deep the queue is, when each last moved. Built from what the agents and services report, not from what the screen shows.

Surface plane: where you steer.

The layer you actually touch. It reads from the ops plane and writes back to the work plane through approved actions. It can go down without stopping anything else.

10 · A human gate on everything that matters

Automation earns trust by being predictable. To stay predictable, you keep narrow what any one agent can do on its own.

Narrow scope by default.

Each agent has a set, checkable scope: read this source, produce this output, stop. New abilities are added on purpose. The question isn't "should we let it do more?" It's "has it earned more, and have we written more down?"

How scope is set

Each agent's scope is written out as a list of the tools it can use and the places it can write. A daily runner allowed to read the board and post draft comments can't, by that same list, send email or add calendar entries. The list is the authority. Anything not on it isn't available. Changing the scope is a deliberate review, not a slow drift.
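A scope list of this kind could be written down as data, as in the sketch below. The `Scope` class and the tool and target names are assumptions for illustration, not the fleet's actual configuration format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scope:
    tools: frozenset          # tools the agent may call
    write_targets: frozenset  # places it may write

    def permits(self, tool, target=None):
        """Allow a call only if the tool, and any write target, is on the list."""
        if tool not in self.tools:
            return False
        return target is None or target in self.write_targets

# Assumed tool and target names for the daily runner described above.
DAILY_RUNNER = Scope(
    tools=frozenset({"read_board", "post_draft_comment"}),
    write_targets=frozenset({"board_comments"}),
)
```

With this shape, `DAILY_RUNNER.permits("send_email")` is false simply because the tool is absent from the list. The dataclass is frozen, so changing the scope means editing the definition, which is the deliberate review step.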

Output you can read, not action you can't see.

An agent that hands you a readable draft can be checked. An agent that just does the thing, with nothing to look at first, can't. I lean on drafts, reports, and summaries because you can read them and take them back, not because they're timid.

The gate is permanent, not temporary.

The human gate isn't training wheels. It's the design. A well-run fleet never graduates to "fully autonomous." It graduates to faster drafts I approve with less friction. The gate stays. What changes is how long it takes to say yes.

What "graduating" actually looks like

Graduating means the gate gets cheaper, not gone. An agent with a solid track record earns a one-click approval instead of a review-and-confirm flow. The yes still happens. It just takes two seconds instead of thirty. The aim is to make the right call the easy call, not to drop the need for a human to make it.

want a system like this

Building one of these from scratch takes longer than it looks.

The archetype list is short. The hard part is what triggers what, how the planes connect, and how you keep it from sprawling. The discovery kit is where we map that out before anything gets built.

Start with the discovery kit