before you switch models, run this 30-minute audit on your openclaw stack

the audit kit that finds your cost leak in 30 minutes, plus the consultant playbook to package the same fix as a paid service.

OpenClaw Unboxed and Josh Davis

Apr 26, 2026


most people blame the model first.

sometimes that’s actually true.

a lot of the time, the bill got ugly because the stack design got lazy.

heartbeat is doing work that wanted cron.

a premium lane is handling routine checks.

main-session context keeps dragging old baggage into cheap work.

tool-heavy runs keep escalating without a stop rule.

the stack works.

the bill still makes no sense.

openclaw’s own docs draw the line clearly enough that this shouldn’t stay fuzzy.

cron is for exact timing and isolated execution.

heartbeat is a periodic main-session turn with full session context.

cron executions create task records.

heartbeat turns don’t.

if you treat those as the same thing, you make the stack harder to inspect and easier to overpay for.
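
a minimal sketch of that split as a decision rule. the `Job` fields and the `suggest_lane` helper are hypothetical names for illustration, not part of openclaw. the rule only encodes the two properties above: exact timing and isolated execution point to cron, approximate timing with full session context points to heartbeat.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    needs_exact_time: bool       # must fire at a fixed clock time
    needs_session_context: bool  # benefits from the main session's history


def suggest_lane(job: Job) -> str:
    """Return 'cron' or 'heartbeat' for a job (illustrative heuristic only)."""
    # Assumption: anything that needs exact timing, or that can run without
    # the main session's context, is a cron candidate.
    if job.needs_exact_time or not job.needs_session_context:
        return "cron"
    return "heartbeat"


jobs = [
    Job("daily report", needs_exact_time=True, needs_session_context=False),
    Job("inbox awareness", needs_exact_time=False, needs_session_context=True),
]
for job in jobs:
    print(job.name, "->", suggest_lane(job))
```

a job that wants both exact timing and full context is a judgment call. this sketch sends it to cron, which is an assumption you may want to flip for your own stack.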

operators can no longer hide cost pressure behind flat-rate assumptions. operator threads are still full of people discovering that a cheap stack stopped feeling cheap once recurring checks, premium models, and growing sessions piled up.

this article does one job.

it shows you how to run a token autopsy before you gut the whole system. the same process can become a paid offer if you want to sell it.

what a token autopsy is

a repeatable way to answer five questions.

which agent or workflow is spending the most.

which jobs are paying for full context when they shouldn’t.

which recurring checks belong on cron instead of heartbeat.

which model lanes are stronger than the work needs.

whether the fix worked after you changed the stack.

that’s the point of the kit. it turns the part most people guess at into something you can inspect line by line.

where the bill usually comes from

you don’t need twenty theories. you need the leak map.

heartbeat bloat

heartbeat is useful when the work benefits from approximate checks and full context. inbox awareness fits. calendar awareness fits. notification awareness fits.

the cost problem starts when heartbeat carries a premium model, bloated session files, or jobs that wanted exact timing and isolated execution.

that’s not a heartbeat problem.

that’s a design problem.

wrong model in the wrong lane

a lot of builders treat “best model” like a permanent identity choice.

routine checks, status summaries, classification, cleanup, and extraction don’t need the strongest reasoning lane every turn. once you split the lanes, you put the strong model where the judgment lives and a cheaper model on everything else.
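
one way to picture the lane split is a small routing function. the task types come from the paragraph above. the lane names `cheap` and `strong` and the `pick_lane` helper are placeholders, not openclaw settings.

```python
# Task types listed in the paragraph above; labels are illustrative.
ROUTINE_TASK_TYPES = {
    "routine_check",
    "status_summary",
    "classification",
    "cleanup",
    "extraction",
}


def pick_lane(task_type: str) -> str:
    """Send routine work to the cheap lane and everything else to the strong one."""
    return "cheap" if task_type in ROUTINE_TASK_TYPES else "strong"


for task in ("classification", "refund_dispute_decision"):
    print(task, "->", pick_lane(task))
```

unknown task types default to the strong lane here. that costs more but keeps judgment work from landing on a model that can’t handle it. if your stack is mostly routine, you may prefer the opposite default.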

tool-heavy loops on expensive lanes

browser steps, screenshot paths, pdf work, and repeated execution loops add up fast when every hop climbs into a premium lane.

the bill rises even when the stack never feels smarter.
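
a stop rule can be as small as a hop cap plus a cost cap. the sketch below is a generic guard, not an openclaw feature. `StopRule`, `run_loop`, and the default limits are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class StopRule:
    max_hops: int = 8           # illustrative default
    max_cost_usd: float = 0.50  # illustrative default

    def should_stop(self, hops: int, cost_usd: float) -> bool:
        return hops >= self.max_hops or cost_usd >= self.max_cost_usd


def run_loop(step_costs, rule):
    """Run hops until the rule trips. Returns (hops_run, total_cost_usd)."""
    hops, total = 0, 0.0
    for cost in step_costs:
        # Check the rule before paying for the next hop.
        if rule.should_stop(hops, total):
            break
        hops += 1
        total += cost
    return hops, total


hops, total = run_loop([0.20, 0.20, 0.20, 0.20], StopRule())
print(f"stopped after {hops} hops, ${total:.2f}")
```

the check runs before each hop, so the loop can overshoot the cost cap by at most one hop. if that matters, compare against the cap minus the expected cost of the next step.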

history drag

main sessions get heavier quietly. each turn looks small. the session still gets fatter. eventually the stack pays to re-explain itself on every routine call.
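
the arithmetic behind history drag, as a rough sketch. it assumes every turn re-sends the whole session so far and that nothing compacts or trims the history. the token counts are invented numbers, not measurements.

```python
def cumulative_input_tokens(turns: int, base_tokens: int, added_per_turn: int) -> int:
    """Total input tokens paid over `turns` calls when context grows each turn."""
    total = 0
    context = base_tokens
    for _ in range(turns):
        total += context           # this turn pays for everything so far
        context += added_per_turn  # and leaves the session a bit heavier
    return total


growing = cumulative_input_tokens(turns=100, base_tokens=2000, added_per_turn=300)
isolated = cumulative_input_tokens(turns=100, base_tokens=2000, added_per_turn=0)
print(growing, isolated)  # 1685000 200000
```

each turn only adds 300 tokens, but over 100 turns the growing session pays for about eight times the input of the isolated version. that’s the quiet part.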

unowned recurring jobs

once background work starts piling up without a ledger, most operators lose the ability to answer the most basic question.

what ran, how often, and at what price.
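
a ledger that answers those three things doesn’t need much. a minimal sketch, assuming you can get one record per run with a job name and a cost. the field names and sample rows are hypothetical.

```python
from collections import defaultdict


def build_ledger(runs):
    """Group run records by job: how many times it ran and what it cost in total."""
    ledger = defaultdict(lambda: {"runs": 0, "cost_usd": 0.0})
    for run in runs:
        entry = ledger[run["job"]]
        entry["runs"] += 1
        entry["cost_usd"] += run["cost_usd"]
    return dict(ledger)


# Invented sample records for illustration.
runs = [
    {"job": "inbox check", "cost_usd": 0.04},
    {"job": "inbox check", "cost_usd": 0.05},
    {"job": "twice-daily summary", "cost_usd": 0.20},
]
for job, entry in build_ledger(runs).items():
    print(f"{job}: {entry['runs']} runs, ${entry['cost_usd']:.2f}")
```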

the use case that makes this concrete

picture a small ecommerce team.

one openclaw setup watching a shared inbox, checking a spreadsheet export, nudging follow-up tasks, and writing twice-daily summaries.

they wanted a cheap assistant. instead they built a stack quietly paying for context-aware reasoning to babysit routine admin work.

from the outside it looked like openclaw got expensive.

from the inside the problem was smaller.

scheduled work was living in the wrong lane. recurring checks were heavier than they needed to be, and oversized context files were dragging old material into routine awareness work.

after the audit on this stack:

44 percent of estimated weekly spend was tied to heartbeat rows
the highest-cost rows were routine checks, not real reasoning work
two recurring jobs should’ve moved to cron on day one
the post-change baseline dropped 44 percent in the sample pack

before: $93.90 per week. after: $52.60. saved: $41.30. that’s the case study that anchors the kit.
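
the numbers above check out against each other. a quick worked version, using only the before and after figures from the case study:

```python
before_usd = 93.90  # weekly spend before the changes
after_usd = 52.60   # weekly spend after the changes

saved_usd = before_usd - after_usd
saved_pct = saved_usd / before_usd * 100

print(f"saved ${saved_usd:.2f} per week ({saved_pct:.0f} percent)")
# prints: saved $41.30 per week (44 percent)
```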

the first 30 minutes

if you’ve never run an audit, this is the path.

run the kit on the example data first. it ships with sample logs, a sample config, and a sample task map.

open dashboard.html in any browser. you should see total cost, total tokens, cost grouped by job type, and the top ten highest-cost rows.
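
if you want to sanity-check those four dashboard numbers by hand, a sketch like this works on any ledger-style csv. the column names (`job_type`, `tokens`, `cost_usd`) and the sample rows are assumptions. the kit’s real files may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample; real column names may differ.
SAMPLE_LEDGER = """job_type,tokens,cost_usd
heartbeat,12000,0.36
cron,3000,0.03
heartbeat,15000,0.45
summary,8000,0.24
"""


def dashboard_numbers(csv_text, top_n=10):
    """Compute total cost, total tokens, cost by job type, and the top-cost rows."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "job_type": raw["job_type"],
            "tokens": int(raw["tokens"]),
            "cost_usd": float(raw["cost_usd"]),
        })

    cost_by_type = defaultdict(float)
    for row in rows:
        cost_by_type[row["job_type"]] += row["cost_usd"]

    top_rows = sorted(rows, key=lambda r: r["cost_usd"], reverse=True)[:top_n]
    return {
        "total_cost": sum(r["cost_usd"] for r in rows),
        "total_tokens": sum(r["tokens"] for r in rows),
        "cost_by_type": dict(cost_by_type),
        "top_rows": top_rows,
    }


numbers = dashboard_numbers(SAMPLE_LEDGER)
print(numbers["total_tokens"], round(numbers["total_cost"], 2))
```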

then replace the sample files with your own.

start with one agent or one workflow. the goal is to find the first leak, not to audit your whole stack on day one.

open heartbeat_audit.md. look for premium models on heartbeat rows.
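
the same check, written out as a filter. this is a sketch of the idea, not how the kit produces heartbeat_audit.md. the model names and row fields are hypothetical.

```python
# Hypothetical model names; substitute whatever your config calls its lanes.
PREMIUM_MODELS = {"premium-model"}


def premium_heartbeat_rows(rows):
    """Return rows where a heartbeat turn ran on a premium model."""
    return [
        r for r in rows
        if r["job_type"] == "heartbeat" and r["model"] in PREMIUM_MODELS
    ]


# Invented sample rows for illustration.
rows = [
    {"job_type": "heartbeat", "model": "premium-model", "cost_usd": 0.45},
    {"job_type": "heartbeat", "model": "cheap-model", "cost_usd": 0.02},
    {"job_type": "cron", "model": "premium-model", "cost_usd": 0.30},
]
flagged = premium_heartbeat_rows(rows)
print(len(flagged), "heartbeat row(s) on a premium model")
```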

open spend_ledger.csv. sort by cost. check the top rows. if they’re mostly heartbeat or routine summaries, you found the wrong lane.
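
“mostly” can be made concrete as a share of the top rows. a small sketch, assuming each row carries a `job_type` and a `cost_usd`. the type labels and the idea of using one half as the cutoff are assumptions, not rules from the kit.

```python
ROUTINE_TYPES = {"heartbeat", "summary"}  # hypothetical labels


def routine_share_of_top(rows, top_n=10):
    """Fraction of the top-cost rows that are heartbeat or routine summaries."""
    top = sorted(rows, key=lambda r: r["cost_usd"], reverse=True)[:top_n]
    if not top:
        return 0.0
    return sum(r["job_type"] in ROUTINE_TYPES for r in top) / len(top)


# Invented sample rows for illustration.
rows = [
    {"job_type": "heartbeat", "cost_usd": 0.45},
    {"job_type": "summary", "cost_usd": 0.24},
    {"job_type": "heartbeat", "cost_usd": 0.36},
    {"job_type": "analysis", "cost_usd": 0.30},
]
share = routine_share_of_top(rows, top_n=3)
if share > 0.5:
    print(f"{share:.0%} of the top rows are routine work: wrong lane")
```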

open cron_recommendations.csv. pick one exact-time job to move. good first candidates: a daily report, a fixed-time reminder, a weekly review, a recurring follow-up nudge.
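
for reference, here’s what those first candidates look like in standard five-field crontab syntax (minute, hour, day of month, month, day of week). whether openclaw’s cron config accepts this exact syntax is an assumption, so check its docs. the times themselves are arbitrary examples.

```python
# Standard crontab expressions; the schedule times are illustrative.
CANDIDATE_SCHEDULES = {
    "daily report": "0 9 * * *",          # every day at 09:00
    "fixed-time reminder": "30 14 * * *",  # every day at 14:30
    "weekly review": "0 16 * * 5",        # Fridays at 16:00
    "follow-up nudge": "0 10 * * 1-5",    # weekdays at 10:00
}


def has_five_fields(expression: str) -> bool:
    """Cheap shape check: a standard crontab expression has five fields."""
    return len(expression.split()) == 5


for job, expression in CANDIDATE_SCHEDULES.items():
    print(f"{job}: {expression}")
```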

upgrading here gets you the exact build behind this article: deployable scripts, configs, install steps, monitoring services, hardening checklists, the consultant playbook, and 38 passing tests, so you can trust the code before you run it on real data. these are operator-grade assets, plus the system to ship the audit as your own service.
