Progress report · June 6, 2026

We're at Level 3 of 5. Here's what actually crossed.

Our last report said Level 2, and said we'd show you when we moved. This is that post, earlier than the quarterly cadence, because the crossing was real and the receipts are specific.

What crossed, exactly

On May 28 we hit the milestone that defines Level 3 on the ladder we track ourselves against: the first full initiative completed end to end with the system, not a person, carrying most of the gates. Fewer than half of the checkpoints in that initiative needed a human decision. The rest ran on the machinery.

What “the machinery” means in practice:

Every piece of work runs through Helm, the work-tracking system we built. Cards, handoffs, acceptance criteria, status changes. Work gets dispatched to working sessions in batches, and the sessions report back into the same system.
Completion loops run on schedules now, not on someone remembering. Scheduled jobs pick up work, execute it against named acceptance criteria, and close it out.
AI is a defined team member with explicit capability tags, not a tool someone opens. Work routes to it the way work routes to a person.
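
To make "work routes to it the way work routes to a person" concrete, here is a minimal sketch of capability-tag routing. It is an illustration, not Helm's actual code: the Member and Card types, the tag names, and eligible_members are all hypothetical, and we assume a card is routable to anyone whose tags cover everything the card requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    kind: str  # "human" or "ai"; routing below never looks at this field
    capabilities: frozenset


@dataclass(frozen=True)
class Card:
    title: str
    required: frozenset  # capability tags the work needs


def eligible_members(card, team):
    """Return members whose tags cover every capability the card requires."""
    return [m for m in team if card.required <= m.capabilities]


# Hypothetical team and card, for illustration only.
team = [
    Member("person-1", "human", frozenset({"review", "client-call"})),
    Member("assistant-1", "ai", frozenset({"drafting", "review"})),
]
card = Card("Review draft", frozenset({"review"}))
names = [m.name for m in eligible_members(card, team)]
```

The point of the design is that the routing function has no special case for AI: a person and an AI team member are both just sets of tags.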

That's the Level 3 description from the ladder, lived: standardized, repeatable, semi-automated, fewer surprises. The same work runs the same way twice without us holding it.

The honest asterisk

Our Level 2 post measured the strictest way we could: nine business functions, and the published number is the floor, the weakest one. By that measure we said Level 2, because four functions were still climbing out of Level 1.

The Level 3 crossing is a different measurement, and we'd rather explain the difference than blur it. The milestone ladder measures the operating system itself: can the machinery run an initiative end to end with humans on judgment instead of on every gate? As of May 28, it can, and that's what we now report on the AI Journey page.

The function-by-function floor is still climbing. Sales, delivery, customer success, and finance, the four we named last time, got their first real scaffolding and are mid-climb, not done. The machinery is Level 3. The coverage isn't yet, and we'll keep publishing both numbers until they meet.
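
For readers who want the two numbers as arithmetic, a rough sketch follows. The function names and level values are placeholders we made up for the example, not our real scores; the only thing taken from this post is the rule that the floor is the weakest function's level and that it is reported next to the machinery level.

```python
def coverage_floor(function_levels):
    """The published floor is the level of the weakest function."""
    if not function_levels:
        raise ValueError("need at least one function level")
    return min(function_levels.values())


def two_number_report(machinery_level, function_levels):
    """Report both measurements side by side instead of blending them."""
    floor = coverage_floor(function_levels)
    return {
        "machinery_level": machinery_level,
        "coverage_floor": floor,
        "numbers_meet": floor >= machinery_level,
    }


# Placeholder values for illustration only.
example = two_number_report(3, {"function_a": 3, "function_b": 2, "function_c": 1})
```

Taking the minimum rather than an average is deliberate in this sketch: one strong function cannot hide a weak one.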

If that asterisk feels like a lot of honesty for a progress announcement, that's the point. A number you can't interrogate is marketing. A number with its measurement attached is a report.

What got us here since the last post

Overnight execution, gated. Work can now be dispatched on a schedule with a default-deny safety check: nothing runs unattended unless it's on a safe-set a human approved. Autonomy grew, and so did the guardrails around it.
The content engine runs as a playbook. The workflow that publishes posts like this one now has a weekly operator cadence, an end-to-end validated chain from source material to platform-ready output, and email capture live on the site.
The front door got smarter. Ideas enter the system as one line and come back shaped: classified, scored, and filed where they belong. Triage stopped being a founder memory exercise.
Cost intelligence started. We began instrumenting what the system costs to run, so “is this workflow worth it” becomes math instead of vibes. Early, but it's named work now.
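
The default-deny check on overnight execution is the easiest of these to show in code. What follows is a minimal sketch under assumptions, not our production gate: the Job type, the job kinds, and the function names are hypothetical. The one property it demonstrates is that a job runs unattended only on an explicit match against the human-approved safe-set, and everything else is held.

```python
from dataclasses import dataclass

# Hypothetical safe-set: job kinds a human has approved for unattended runs.
APPROVED_SAFE_SET = frozenset({"refresh-report", "sync-cards"})


@dataclass(frozen=True)
class Job:
    job_id: str
    kind: str


def may_run_unattended(job, safe_set=APPROVED_SAFE_SET):
    """Default-deny: only an explicit match on the safe-set allows a run."""
    return job.kind in safe_set


def split_batch(jobs, safe_set=APPROVED_SAFE_SET):
    """Separate a scheduled batch into jobs to run and jobs held for a human."""
    allowed = [j for j in jobs if may_run_unattended(j, safe_set)]
    held = [j for j in jobs if not may_run_unattended(j, safe_set)]
    return allowed, held


# Illustrative batch: one approved kind, one kind nobody has approved.
batch = [Job("1", "refresh-report"), Job("2", "drop-table")]
allowed, held = split_batch(batch)
```

There is no blocklist anywhere in the sketch. A new kind of job is held until someone adds it to the safe-set, which is what default-deny means in practice.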

What broke

Same rule as always: this section is real or the whole post is worthless.

A dispatch bug let the button say “success” while nothing actually launched, because the preview environment and the live agent were watching two different databases. Found it, fixed the round-trip, and learned to verify on the environment that matters.
An integration test suite quietly wiped a shared development database. Nothing customer-facing, but a full day of re-seeding and a permanent lesson about what test isolation actually means.
The agent that launches working sessions can sit alive-but-deaf after a dropped connection, failing silently. We've restarted it more times than we'd like; making that failure loud instead of silent is on the list.
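
One way to make the alive-but-deaf failure loud is a staleness check on acknowledgements. This is a sketch of the idea, not the fix we have shipped (it's still on the list): HeartbeatMonitor, DeafAgentError, and the 30-second limit are hypothetical, and we assume the agent acknowledges something at regular intervals while its connection is healthy.

```python
import time


class DeafAgentError(RuntimeError):
    """Raised when the agent has been silent for longer than allowed."""


class HeartbeatMonitor:
    def __init__(self, max_silence_seconds, clock=time.monotonic):
        self._max_silence = max_silence_seconds
        self._clock = clock  # injectable so the check can be tested without waiting
        self._last_ack = clock()

    def record_ack(self):
        """Call whenever the agent acknowledges a message over its connection."""
        self._last_ack = self._clock()

    def check(self):
        """Raise loudly if the agent has been silent past the limit."""
        silence = self._clock() - self._last_ack
        if silence > self._max_silence:
            raise DeafAgentError(
                f"no acknowledgement for {silence:.0f}s (limit {self._max_silence}s)"
            )


# Simulated clock for illustration: 45 seconds pass with no acknowledgement.
now = [0.0]
monitor = HeartbeatMonitor(max_silence_seconds=30, clock=lambda: now[0])
now[0] = 45.0
try:
    monitor.check()
    went_loud = False
except DeafAgentError:
    went_loud = True
```

A process that is still running passes an "is it alive" check. This check asks a different question, whether it has heard anything recently, which is the one that matters after a dropped connection.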

What Level 4 requires

Both Level 4 milestones are open, and neither is close enough to date:

A first process where human involvement is the exception, not the default. Today a human is still in the loop by design on nearly everything. Level 4 means at least one process inverts that.
Replacing the legacy measure of effort. When the system does the executing, hours stop being the honest metric. We have to finish replacing it with measures of judgment and outcome, in our own operation first.

And the floor still has to catch up: the four lagging functions reach Level 2, then earn Level 3 the same way the machinery did. We won't claim Level 4 before any of that is lived.

Why we publish this

Most of our category sells Level 4 and operates at Level 2. We refuse to do that. The work of climbing from one rung to the next is the marketing. Not the polish, not the press release, not the inflated outcome claim.

Four refusals hold the line:

We won't sell you a level we haven't lived in.
We won't promise outcomes we can't ground in lived work.
We won't take more clients than we can fully deliver to.
We won't position done-with-you as something we run while you watch.

Last time we wrote: read this in 90 days, we'll have moved. It took less than that, and we showed you exactly how. The next report gets the same treatment, including the parts that don't flatter us.

— Vance and the GBT team
