Speaking generically -- anywhere in your workflow where the task isn't hard, you can use a smaller, cheaper LM.
Smaller LMs come with an accuracy reduction, particularly on tail cases, so in the real world this often doesn't work out.
Also, is the Gumbel softmax usage intentional? It looks like a straightforward classifier that just needs a regular softmax.
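To make the distinction concrete, here's a toy NumPy sketch (the 3-class logits are made up for illustration): Gumbel softmax perturbs the logits with Gumbel noise to get a differentiable approximation of *sampling* a discrete category, which only matters when you need gradients through a stochastic choice. A deterministic classifier just needs plain softmax plus argmax.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def gumbel_softmax(logits, tau=1.0, rng=None):
    # Add Gumbel(0, 1) noise to the logits, then apply a temperature-
    # scaled softmax: a differentiable relaxation of categorical sampling.
    rng = rng or np.random.default_rng(0)
    u = rng.uniform(1e-9, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    return softmax((logits + g) / tau)

logits = np.array([2.0, 0.5, -1.0])
p = softmax(logits)          # deterministic class probabilities
s = gumbel_softmax(logits)   # stochastic, differentiable "sample"
print(p.argmax())            # plain softmax: the prediction is just argmax
```

If you're only ever taking the argmax at inference time, the Gumbel noise buys you nothing except nondeterminism.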
You'd still need to figure out what payload to give to the tool based on your context.
But I guess depending on your business case it might be worth it. It's not something I'd do from the beginning, though.
If LLMs could handle determinism better, I'd say having a single chat-based entrypoint into a plethora of services makes sense. But as they stand, it doesn't. I think the way to go is simpler control flow, plus constraining the number and type of downstream services that sit behind a single interface.
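A minimal sketch of what "constraining the downstream services" might look like (service names and handlers are made up for illustration): the LLM is only allowed to emit an intent from a fixed, typed set, and anything outside it is rejected rather than improvised on.

```python
from typing import Callable

# A fixed whitelist of downstream services behind the single interface.
# The handlers here are stand-ins for real service calls.
SERVICES: dict[str, Callable[[str], str]] = {
    "billing": lambda q: f"billing handled: {q}",
    "support": lambda q: f"support handled: {q}",
}

def route(intent: str, query: str) -> str:
    # Deterministic control flow: reject unknown intents outright
    # instead of letting the model free-form its way to a service.
    if intent not in SERVICES:
        raise ValueError(f"unknown intent: {intent}")
    return SERVICES[intent](query)

print(route("billing", "refund order 42"))
```

The point is that the LLM's job shrinks to picking one of two labels, which is the kind of task where even a smaller model is hard to get wrong.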
That said, I agree we should keep the ambition of moving to a one-size-fits-all approach.
From the article:
Each LLM call incurs latency, cost, and token overhead. More subtly, it compounds context:
every step includes not only the original query, but intermediate outputs and scratchpad logic from earlier prompts.
This creates a growing burden on both inference and model performance.
I was working with agents over a year ago, before the common workflows had really been set in stone. At that time we were heavily doctoring the context to give the LLM a very streamlined representation of what had occurred during a given run. Is this not standard practice?

For complex real-world agent flows, though, tool use is often the only thing the LLM is expected to do. Like in a coding agent:
```
# each tool call returns (exit_code, output)
User:  Develop a program to ...
Agent: Bash("touch main.py")          -> 0, ""
Agent: Edit("main.py", initial_patch) -> 0, ""
Agent: Bash("python main.py")         -> 1, "SyntaxError: ..."
Agent: Edit("main.py", fix_patch)     -> 0, ""
Agent: Bash("python main.py")         -> 0, "OK"
Agent: FINISH
```
Here, tool selection (+ writing the arguments) is actually the whole job. It's also easy to see that if you omit even one of the tool use records in the middle, the agent wouldn't work at all.
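The context "doctoring" mentioned above can be sketched roughly like this (a hypothetical transcript shape, not any particular framework's API): shrink verbose tool outputs, but never drop a tool-use record, since omitting one breaks the agent's picture of the run.

```python
def compact(history, max_output_len=80):
    # Compact an agent transcript: truncate long tool outputs,
    # but keep every record so no step of the run disappears.
    compacted = []
    for msg in history:
        if msg["role"] == "tool" and len(msg["content"]) > max_output_len:
            msg = {**msg, "content": msg["content"][:max_output_len] + " ...[truncated]"}
        compacted.append(msg)  # never drop a record, only shrink it
    return compacted

history = [
    {"role": "user", "content": "Develop a program to ..."},
    {"role": "assistant", "content": 'Bash("python main.py")'},
    {"role": "tool", "content": "SyntaxError: " + "x" * 500},
]
slim = compact(history)
```

This keeps per-step context growth bounded while preserving the full chain of tool calls the agent needs to reason about what it already tried.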
https://gist.github.com/viksit/c67d1d960c4cec89488290496defb...
I guess that applies when you're not able to fine-tune the LLM you're using. Presumably Anthropic has a lot of data too.