adding more tools makes your agent worse
most openclaw agents don’t fail because they’re missing capability.
i 100% didn’t believe that at first.
i kept adding tools thinking i was making the system stronger.
it looked more powerful.
it felt more complete.
but every time i added something new, performance slipped.
slower runs
weirder decisions
higher cost
less trust
i blamed the model.
that was wrong.
the real problem
every tool is a decision.
and the model has to get that decision right.
every step.
not just:
“can this be done”
but:
should i use a tool
which one
why this one over the others
what inputs
when to stop
this is where things break.
as tool count increases, selection accuracy drops and inefficiency rises
you are not just adding capability.
you are increasing the chance of being wrong.
the failure you don’t see
it doesn’t crash.
it drifts.
it picks a “good enough” tool instead of the right one
it chains unnecessary steps
it ignores better paths
it behaves differently across runs
you end up with something that looks smart…
but you don’t trust it.
the part most people miss
tools are not just actions.
they are context.
every tool adds:
description
parameters
structure
that all gets injected into the prompt.
and models don’t process everything equally.
they prioritize, skip, and compress.
so what happens:
important tools get buried
irrelevant tools get chosen
decisions get noisier
this is not randomness.
it’s overload.
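a rough way to see the overload: serialize your tool schemas the way they’d be injected and look at the size. the schemas below are hypothetical, and the 4-chars-per-token ratio is a crude rule of thumb for comparison, not an exact count.

```python
import json

# hypothetical tool schemas in an OpenAI-style function format
tools = [
    {
        "name": "search_web",
        "description": "search the web for a query and return top results",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "search_docs",
        "description": "search internal docs for a query and return top results",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
]

def rough_token_cost(tools: list[dict]) -> int:
    """every tool schema gets injected into the prompt on every step.
    ~4 chars per token is a crude estimate, fine for comparing setups."""
    return len(json.dumps(tools)) // 4

# this cost is paid on every single model call, not once per run
print(rough_token_cost(tools))
```

notice the two tools above overlap almost completely. the model pays for both, every step, and has to disambiguate them every step.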
the punchline
your agent isn’t failing because it’s weak.
it’s failing because it has too many choices.
the moment it clicked
i stripped a workflow down.
didn’t improve it.
didn’t upgrade the model.
just removed tools.
and everything got better.
faster
cleaner
predictable
this is not just anecdotal.
real systems are seeing the same thing.
one team removed ~80% of their tools and saw:
success rate: 80% to 100%
execution time: ~275s to ~77s
tokens: ~102k to ~61k
steps: ~12 to ~7
they didn’t add intelligence.
they removed noise.
the money part (this is what actually matters)
this is the difference between:
a workflow that costs cents
and one that quietly burns dollars every run
same task.
different tool exposure.
one makes money.
one leaks it.
what’s actually happening
three things compound fast:
1. decision overload
more tools = more comparisons
the model spends more time deciding than executing
2. context dilution
more tools = more noise
relevant options get buried
3. path explosion
more tools = more possible chains
more chains = more failure paths
this is why performance degrades as systems scale in complexity
not because models are weak
because systems get messy
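the path explosion part is just exponent math. a toy illustration (the numbers are illustrative, not measured from any real system):

```python
def possible_chains(tool_count: int, max_steps: int) -> int:
    """number of distinct tool sequences of length 1..max_steps.
    each step can pick any tool, so paths grow as tool_count**k."""
    return sum(tool_count**k for k in range(1, max_steps + 1))

# 5 tools, up to 3 steps: 5 + 25 + 125 = 155 possible paths
# 20 tools, up to 3 steps: 20 + 400 + 8000 = 8420 possible paths
print(possible_chains(5, 3), possible_chains(20, 3))
```

4x the tools gives you ~54x the paths. most of those extra paths are wrong, and the model has to avoid every one of them.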
the hidden cost nobody tracks
bad tool paths don’t always fail.
they just become inefficient.
more steps
more retries
more tokens
and you don’t notice.
until the bill shows up.
the deeper problem
tool usage has three failure points:
deciding if a tool is needed
selecting the correct one
using it correctly
all three get worse as tool count increases.
and most systems do nothing to reduce that burden.
they just keep adding more.
what actually works
not more tools.
less exposure.
systems that perform well:
limit visible tools per task
merge similar tools
hide tools unless needed
replace repeatable paths with deterministic logic
there is active work showing that filtering tools before exposure improves both accuracy and efficiency in large tool environments
the direction is clear.
not expansion.
compression.
use this immediately
run this on any workflow you have.
act as a systems optimizer for agent workflows.
goal: reduce tool-induced failure without reducing the outcome.
input:
- workflow goal
- current tools
- tool descriptions
tasks:
1. find overlapping tools
2. find tools that create ambiguity
3. find tools that are rarely needed
4. identify tools that should be conditional
5. identify tools that should be replaced with deterministic functions
6. reduce to the minimum viable toolset
7. rank remaining tools by:
- necessity
- risk
- likelihood of incorrect selection
output:
- failure risks
- tools to remove
- tools to merge
- simplified architecture
- why this improves performance

the rule i use now
before adding anything:
does this really change the outcome
is there already overlap
will this create confusion
can this be merged
what happens if it’s used incorrectly
if the answer isn’t obvious
it doesn’t get added
a better starting point
start with only three buckets:
retrieval
transformation
action
force everything into one.
if it doesn’t fit cleanly
you probably don’t need it yet
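one way to enforce the bucket rule in code. only the three bucket names come from above; the registry itself is a sketch:

```python
BUCKETS = {"retrieval", "transformation", "action"}
registry: dict[str, str] = {}  # tool name -> bucket

def register(name: str, bucket: str) -> None:
    """refuse any tool that doesn't fit one of the three buckets.
    if you can't classify it, you probably don't need it yet."""
    if bucket not in BUCKETS:
        raise ValueError(f"{name}: no bucket fits, don't add it yet")
    registry[name] = bucket

register("search_docs", "retrieval")
register("summarize", "transformation")
register("send_email", "action")
# register("do_everything", "misc")  # raises ValueError
```

the rejection path is the whole point. the friction of classifying a tool forces the question “does this really change the outcome” before anything gets added.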
the uncomfortable truth
most people are building agents like feature lists.
more integrations
more connectors
more actions
it feels like progress.
it isn’t.
real systems don’t fail because they lack features.
they fail because they can’t choose correctly.
and every tool you add makes that harder.
final point
the best agents i’ve seen are not impressive.
they’re mostly boring.
they do one thing.
with very few options.
and they work every time.
that’s what actually wins.




less is more.