@okwhateverdude

okwhateverdude@lemmy.world · 4 days ago

I feel like every time I’ve gone looking for an info page, it was just the man page content, but now I’ve got some useless shit I installed.

I mean, maybe this is a debian thing.

okwhateverdude@lemmy.world · 4 days ago

What’s worse, is if this is GNU-ware, there is a good chance the answer IS NOT IN THE MAN PAGE. I think it was bash or maybe gawk. I don’t remember exactly, but I had a question that simply wasn’t answered in each man page. GNU docs are absolute trash, written without any consideration for the audience.

okwhateverdude@lemmy.world · 4 days ago

Cool man. It is really refreshing to see this level of engagement. You’ve really thought this though. You’re right about the routing model moving it up a level and also about retraining. It’s all trade-offs.

Are you intending this for others to use or is this really just for you? Because I think what you’re slowly building is a power tool with a whack-a-mole set of routing tweaks specifically for you. Nothing wrong with that, but the barrier to entry for others to use this is reading that routing and understanding the foibles that have been baked in with your preferences in mind, and even adding fixes and tweaks of their own which kinda breaks the magic a little.

This was really the point I was making about transparency.

I appreciate others also doing real work with potato GPUs because I, too, have a potato GPU (6GB). I think there is real utility in continuing to develop this.

I’ll give this a star and follow along. It doesn’t really fit my mental model of how I’d like my harness to behave, but I will totally steal some of these ideas.

okwhateverdude@lemmy.world · 4 days ago

So I was curious about how you accomplished this and took a look with the robots to figure it out.

TL;DR: the router is a massive decision tree using heuristics and regex to avoid LLM calls on unprefixed prompts.

I think this is an interesting, brute force approach to the problem, but one that will always struggle with edge cases. The other bit it will struggle with is transparency. Yes, it might be deterministic because it is a decision tree, but unless you really understand how that decision tree works under the hood and know where the pitfalls are, you’re going to end up talking to the LLM a lot of the time anyhow.

Something you might want to consider is doing a fine-tune of a smol model (think something like qwen3:1.7B or even smaller like one of the gemma3n sub-1B) that will do the routing for you. You can easily build the dataset synthetically or harvest your own logs. I think this might end up covering more edge cases more smoothly without resorting to a big call to a larger model