the free openclaw compute ladder
How builders run agents 80% free and only pay when deep reasoning actually matters.
Most AI agent setups fail for one reason.
They rely on one AI provider.
When that provider rate limits or goes down, the agent retries the same provider until everything stops working.
You hit 429 errors.
Your quota burns.
Workflows stall.
You don’t need unlimited AI.
You need a compute ladder.
Fast models handle most tasks.
Fallback providers keep the system alive.
Paid models only run when they actually matter.
This stack’s currently running in my own OpenClaw environment using Groq, OpenRouter, Ollama, and OpenAI.
the compute ladder
tier 4 openai/gpt-4o
tier 3 openrouter/moonshotai/kimi-k2:free
tier 2 openrouter/z-ai/glm-4.5-air:free
tier 1 groq/llama-3.1-8b-instant
tier 0 ollama/qwen2.5:0.5b
Most tasks finish on tier 1.
Tier 2 and tier 3 act as fallback providers if earlier models fail.
Tier 4’s the break-glass paid model for deep reasoning, complex coding, or long analysis.
Tier 0 runs locally so your system never completely dies even if every cloud provider fails.
how requests move through the ladder
request
↓
groq fast model
↓
glm reasoning fallback
↓
kimi heavier reasoning
↓
paid model if needed
↓
local model safety net
Most requests resolve at the top.
Lower tiers only activate when needed.
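OpenClaw handles this routing internally from the config in step 4, but the traversal logic can be sketched in Python. This is a hypothetical illustration, not OpenClaw's implementation: `call_model` stands in for a real provider SDK call, and a simulated Groq outage shows the fallback firing. Only the model IDs come from the ladder above.

```python
# Hypothetical sketch of ladder traversal. Model IDs match the ladder
# above; call_model and RateLimitError are stand-ins for real SDK calls.
LADDER = [
    "groq/llama-3.1-8b-instant",           # tier 1: fast free model
    "openrouter/z-ai/glm-4.5-air:free",    # tier 2: reasoning fallback
    "openrouter/moonshotai/kimi-k2:free",  # tier 3: heavier reasoning
    "openai/gpt-4o",                       # tier 4: break-glass paid
    "ollama/qwen2.5:0.5b",                 # tier 0: local safety net
]

FAILED = {"groq/llama-3.1-8b-instant"}  # simulate a Groq outage

class RateLimitError(Exception):
    pass

def call_model(model: str, prompt: str) -> str:
    """Placeholder for a real provider call; one provider is 'down'."""
    if model in FAILED:
        raise RateLimitError(model)
    return f"{model}: ok"

def route(prompt: str) -> str:
    for model in LADDER:
        try:
            return call_model(model, prompt)      # first success wins
        except (RateLimitError, TimeoutError):
            continue                              # 429/timeout: next tier
    raise RuntimeError("every tier failed")
```

With Groq simulated as down, a request lands on the tier 2 GLM model instead of looping on retries.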
before you start
This assumes you’ve already got OpenClaw installed and running.
If not, follow the beginner install guide here
You should already be able to run:
openclaw status
openclaw gateway status
openclaw doctor
step 1 install the local fallback model
Install Ollama if you don’t already have it.
https://ollama.com
Pull the local model:
ollama pull qwen2.5:0.5b
Test it:
ollama run qwen2.5:0.5b
Type:
say hello in one sentence
If it replies, your local fallback works.
Press Ctrl+C to exit.
step 2 create provider keys
You’ll need API keys for:
Groq
https://console.groq.com
OpenRouter
https://openrouter.ai
OpenAI
https://platform.openai.com
During OpenClaw onboarding you can connect these providers.
Note: OpenRouter free models have rate limits. Expect roughly 20 requests per minute and daily caps unless credits are added.
That’s why Groq sits at the top of the ladder.
step 3 open the openclaw dashboard
Run:
openclaw dashboard
If authentication’s required:
openclaw config get gateway.auth.token
If a token doesn’t exist:
openclaw doctor --generate-gateway-token
Paste the token when prompted.
step 4 configure the ladder
Open the dashboard.
Go to:
config → raw json
Paste this:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "groq/llama-3.1-8b-instant",
        "fallbacks": [
          "openrouter/z-ai/glm-4.5-air:free",
          "openrouter/moonshotai/kimi-k2:free",
          "openai/gpt-4o"
        ]
      },
      "heartbeat": {
        "model": "ollama/qwen2.5:0.5b"
      }
    }
  }
}
Save the configuration.
Restart OpenClaw if prompted.
step 5 verify providers
Run:
openclaw models status --probe
You should see:
providers authenticated
models resolving
fallback chain visible
That confirms your ladder’s working.
step 6 verify inside openclaw
Open a chat session and run:
/model status
You should see the active model and fallback chain.
step 7 add a routing guardrail
Send this once to the agent:
you are running on a multi provider ai ladder.
rules
use the primary model first
if a provider returns 429 or timeout move immediately to the next fallback
never retry the same provider more than once
use the paid model only for deep reasoning or coding
prefer free models for short tasks
That prevents retry loops that burn tokens.
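The guardrail above is a prompt, but the same rules can be expressed as a routing function to make the behavior concrete. A sketch under my own assumptions: the keyword heuristic for "deep reasoning" is hypothetical, not anything OpenClaw ships.

```python
# Sketch of the guardrail rules as code. The deep-task keywords are a
# made-up heuristic for illustration; model IDs match the ladder config.
FREE_LADDER = [
    "groq/llama-3.1-8b-instant",
    "openrouter/z-ai/glm-4.5-air:free",
    "openrouter/moonshotai/kimi-k2:free",
]
PAID = "openai/gpt-4o"
LOCAL = "ollama/qwen2.5:0.5b"

def pick_model(task: str, failed: set[str]) -> str:
    # Rule: paid model only for deep reasoning or coding.
    deep = any(w in task.lower() for w in ("debug", "refactor", "prove"))
    if deep and PAID not in failed:
        return PAID
    # Rule: prefer free models; never retry a provider that failed.
    for model in FREE_LADDER:
        if model not in failed:
            return model
    if PAID not in failed:
        return PAID
    return LOCAL  # local safety net always answers
```

Because `failed` accumulates per request, a provider that returned a 429 is never asked twice, which is exactly what the prompt rule enforces.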
step 8 test the ladder
Test 1
Ask:
summarize this article in three sentences
Expected result: fast response using Groq.
Test 2
Disable your Groq key temporarily.
Ask the same question.
Expected result: OpenClaw falls back to OpenRouter.
Test 3
Switch to the local model:
/model ollama/qwen2.5:0.5b
Ask:
confirm heartbeat is active
You should receive a local response.
example workflow using the ladder
Here’s a simple real task to test the stack.
Paste this prompt into your OpenClaw chat.
analyze this article and return:
1 short summary
3 key takeaways
1 action step
keep the response under 150 words
Most runs will complete on the Groq fast model.
If Groq fails, the task automatically moves down the ladder.
That’s the entire point.
Reliable agents without burning paid tokens.
important note about fallbacks
Fallback models activate when providers fail.
They don’t activate because a task’s complex.
If you want to intentionally use a stronger model:
/model openai/gpt-4o
the real goal
The goal isn’t just free AI.
The goal’s resilient AI infrastructure.
A system that:
stays online during provider outages
keeps most tasks free
escalates only when necessary
stays easy to debug
That’s what this stack provides.



