This is one of the clearest explanations of tool overload I’ve seen. The “it doesn’t crash, it drifts” line is exactly what we’ve observed in production-style agent runs too.
One thing that helped us was adding a tiny pre-routing step before tool exposure (task class -> allowed tool set), then logging wrong-tool picks as a first-class metric. It reduced both latency variance and rework loops.
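The pre-routing step above can be sketched roughly like this. All names here (`TASK_TOOL_MAP`, `visible_tools`, `record_tool_pick`) are hypothetical illustrations, not anyone's actual implementation:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("tool_router")

# Hypothetical mapping: task class -> allowed tool set, maintained by hand
TASK_TOOL_MAP = {
    "retrieval": {"search_docs", "read_file"},
    "math": {"calculator"},
    "code": {"run_tests", "read_file", "edit_file"},
}

def visible_tools(task_class: str) -> set[str]:
    """Restrict the tool set exposed to the agent before the model call."""
    return TASK_TOOL_MAP.get(task_class, set())

def record_tool_pick(task_class: str, picked: str, wrong_counts: dict) -> None:
    """Log wrong-tool picks as a first-class metric, per task class."""
    if picked not in TASK_TOOL_MAP.get(task_class, set()):
        wrong_counts[task_class] = wrong_counts.get(task_class, 0) + 1
        log.info("wrong-tool pick: task=%s tool=%s", task_class, picked)

wrong: dict[str, int] = {}
record_tool_pick("math", "search_docs", wrong)  # calculator expected, search_docs picked
print(wrong)  # {'math': 1}
```

The key point is that the filtering happens *before* tool exposure, so the model never sees the tools it shouldn't pick, and the wrong-pick counter gives you a trend line to watch.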
Curious if you’ve tested a hard cap per task type (e.g., max 3 visible tools) vs dynamic filtering at runtime — which one held up better for consistency?
Loved the way you framed this as decision overload, not capability gaps. The practical test that worked for us: if a tool can’t show a measurable lift in accuracy-per-token, it stays out of the visible set. Have you tested wrong-first-tool rate as an early warning metric?
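A toy version of that accuracy-per-token gate, with made-up numbers purely for illustration:

```python
def accuracy_per_token(accuracy: float, avg_tokens: float) -> float:
    """Accuracy normalized by average token spend for a task."""
    return accuracy / avg_tokens

def keep_tool(baseline: tuple[float, float], with_tool: tuple[float, float]) -> bool:
    """Keep a tool in the visible set only if it lifts accuracy-per-token."""
    return accuracy_per_token(*with_tool) > accuracy_per_token(*baseline)

# baseline: 70% accuracy at 1000 avg tokens; with tool: 75% at 1500 tokens
print(keep_tool((0.70, 1000.0), (0.75, 1500.0)))  # False: the lift doesn't cover the token cost
```

A 5-point accuracy gain can still fail the gate if the tool inflates token usage enough, which is exactly the "decision overload" cost the post describes.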
Pre-routing works well. Dynamic filtering is generally preferable, but it really depends on the use case.
Less is more, honestly. I don't want these things invading my privacy.
Love your images... Josh, I'm pondering when to jump into more Claw. Claude co-work is now my go-to.