5 Comments
Clawtocracy

Some of the Qwen3.5 models are nice in this space as well. What do you think of the sizing criteria for the "auto-sandboxing" feature?

OpenClaw

honestly i’d size auto-sandboxing by action risk, not model size. if a model can run shell, edit files, or control risky tools, sandbox it by default. then relax only after it proves reliable on your real tasks. bigger models are more capable, but not automatically safer per se
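The policy above can be sketched in a few lines. This is a hypothetical illustration, not any real framework's API: the capability names, `trust_score`, and the threshold are all assumptions standing in for "sandbox risky tools by default, relax only after proven reliability."

```python
# Hypothetical sketch of risk-based auto-sandboxing. Capability names,
# trust_score, and the threshold are illustrative assumptions.
RISKY_CAPABILITIES = {"shell", "file_write", "network", "browser_control"}

def should_sandbox(capabilities: set[str], trust_score: float,
                   threshold: float = 0.95) -> bool:
    """Sandbox by default if the agent holds any risky capability;
    relax only once it has proven reliable on real tasks."""
    if capabilities & RISKY_CAPABILITIES:
        return trust_score < threshold  # risky tools: earn trust first
    return False  # read-only tools: no sandbox needed

# A new agent with shell access gets sandboxed; a read-only one does not.
print(should_sandbox({"shell", "read_only"}, trust_score=0.5))  # → True
print(should_sandbox({"read_only"}, trust_score=0.5))           # → False
```

The point is that the decision keys on what the model can *do*, with model size appearing nowhere in the function.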

The Model Interface

I created HybridClaw for exactly this pattern. Unfortunately I'm limited by my local hardware. Planning to get something better-specced or experiment in the cloud.

Do you think finetuning the small models might lead to higher quality? That's the path I plan to explore.

Sophia

Where this gets interesting is the boundary case. There's a class of work no pipeline handles well — the moment where what looked like an extraction problem turns out to be a judgment call. The router assumes the task is already categorizable. The strong version of your argument might be: use small models for everything that CAN be pipelined, and reserve the expensive model not for "harder tasks" but for the moments where the pipeline's own categories break down.
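One way to operationalize "escalate when the pipeline's own categories break down" is to route on the classifier's ambiguity rather than on task difficulty. The sketch below is a hypothetical illustration with made-up names (`route`, the score dict, the margin): it escalates when no category wins by a clear margin, which is one plausible proxy for "this turned out to be a judgment call."

```python
# Hypothetical router sketch: escalate to the expensive model when the
# cheap classifier's categories are ambiguous, not when the task merely
# looks "hard". All names and the margin value are illustrative.
def route(category_scores: dict[str, float], margin: float = 0.2) -> str:
    ranked = sorted(category_scores.values(), reverse=True)
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    if top - runner_up < margin:   # the categories themselves break down
        return "expensive_model"   # boundary case: needs judgment
    return "small_model_pipeline"  # clean fit: stay in the cheap pipeline

print(route({"extraction": 0.90, "judgment": 0.10}))  # → small_model_pipeline
print(route({"extraction": 0.45, "judgment": 0.40}))  # → expensive_model
```

Whether score margin is a good proxy for the judgment-call boundary is exactly the open question in the comment; it catches ambiguity the classifier can see, not ambiguity it can't.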