
> There isn't, pretty much everyone wants the best of the best.

For direct user interaction or coding problems, perhaps. But as API calls get cheaper, it becomes more realistic to use them for completely automated workflows against data-sets, or as sub-agents called from expensive SOTA models.

For example, in Claude, using Opus as an orchestrator that calls Sonnet sub-agents is a popular usage "hack." That only gets more powerful as the Sonnet-equivalent model gets cheaper. Now you can spawn entire teams of small, specialized sub-agents, each with a small context window and a limited scope.
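
A rough sketch of that fan-out pattern, in case it helps. Nothing here is a real Claude API: `orchestrate`, `stub_sub_agent`, and `stub_orchestrator` are made-up names, and the two stubs stand in for whatever cheap and expensive model calls you actually use. The point is only that each sub-agent sees the task plus its own chunk, and the expensive model sees just the short reports.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

ModelFn = Callable[[str], str]  # prompt in, text out (hypothetical interface)


def orchestrate(
    task: str,
    chunks: Sequence[str],
    sub_agent: ModelFn,
    orchestrator: ModelFn,
    max_workers: int = 4,
) -> str:
    """Fan a task out to cheap sub-agents, then merge with the expensive model."""
    # Each sub-agent prompt carries the task and one chunk only (small context).
    prompts = [f"{task}\n\n---\n{chunk}" for chunk in chunks]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the prompts.
        partials = list(pool.map(sub_agent, prompts))
    # The orchestrator only reads the reports, not the raw chunks.
    merged_prompt = f"{task}\n\nSub-agent reports:\n" + "\n".join(partials)
    return orchestrator(merged_prompt)


# Stubs so the sketch runs without any API key; swap in real model calls.
def stub_sub_agent(prompt: str) -> str:
    return "report: " + prompt.splitlines()[-1]


def stub_orchestrator(prompt: str) -> str:
    count = sum(1 for line in prompt.splitlines() if line.startswith("report: "))
    return f"{count} reports merged"


if __name__ == "__main__":
    print(orchestrate("Summarize", ["a", "b", "c"], stub_sub_agent, stub_orchestrator))
```

Threads are used here only because model calls are typically I/O-bound; an async client would work just as well.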



Exactly.

I built my own MCP server with custom agents that combine several tools into a single one. For example, WebSearch, WebFetch, and Context7 are all exposed as a single "web research" tool, backed by the cheapest model that passes evaluation. The same goes for a codebase research tool.

Using it with both Claude and Opencode saves a lot of time and tokens.
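
For anyone wondering what "cheapest model that passes evaluation" can look like, here is a minimal sketch. It is not the actual setup described above: the `Candidate` class, the model names, the prices, and the 0.9 threshold are all invented for illustration, and the `run` callables are stubs standing in for real API calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# An eval case is a prompt plus a checker that says whether the output is acceptable.
EvalCase = Tuple[str, Callable[[str], bool]]


@dataclass(frozen=True)
class Candidate:
    name: str                  # hypothetical model label
    cost_per_mtok: float       # hypothetical price, USD per million tokens
    run: Callable[[str], str]  # stand-in for a real model call


def pass_rate(candidate: Candidate, cases: Sequence[EvalCase]) -> float:
    """Fraction of eval cases whose output the checker accepts."""
    if not cases:
        raise ValueError("need at least one eval case")
    hits = sum(1 for prompt, check in cases if check(candidate.run(prompt)))
    return hits / len(cases)


def cheapest_passing(
    candidates: Sequence[Candidate],
    cases: Sequence[EvalCase],
    threshold: float = 0.9,
) -> Optional[Candidate]:
    """Try candidates from cheapest to priciest; return the first that clears the bar."""
    for candidate in sorted(candidates, key=lambda c: c.cost_per_mtok):
        if pass_rate(candidate, cases) >= threshold:
            return candidate
    return None  # nothing passed; fall back to a stronger model or fix the evals


# Toy data: the task is "uppercase the input"; the tiny stub always fails it.
CASES = [
    ("abc", lambda out: out == "ABC"),
    ("hi", lambda out: out == "HI"),
]
CANDIDATES = [
    Candidate("large-stub", 15.0, str.upper),
    Candidate("tiny-stub", 0.1, lambda prompt: ""),
    Candidate("mid-stub", 1.0, str.upper),
]

if __name__ == "__main__":
    best = cheapest_passing(CANDIDATES, CASES)
    print(best.name if best else "no candidate passed")
```

Stopping at the first passing candidate means the pricier models are never called during selection, so the evaluation itself stays cheap.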


Smart approach combining tools behind a single interface. The "cheapest model that passes evaluation" pattern is underrated.

One data point that might help with the research tool: not all APIs work equally well when called by agents. We scored 387 APIs on agent-friendliness and 54% fail the bar. The main gaps are no CLI tool (66%) and no machine-readable pricing (72%). If your research tool is helping agents pick APIs to integrate, the scores at clirank.dev/api/apis?sort=score could save the expensive model from wasting tokens on APIs that'll fail headless.


I'd be interested in seeing the source for this if you have a moment


> But as API calls get cheaper, it becomes more realistic to use them for completely automated workflows against data-sets

Seems like a huge waste of money and electricity for processes that can be implemented as a traditional deterministic program. One would hope that tools would identify recurrent jobs that can be turned into simple scripts.


It depends on the specific task.

For example: "Here is our dataset that contains customer feedback comment fields; look through them, draw out themes and associations, and look for trends." Solving that with a deterministic program isn't trivial, and it is likely cheaper to solve with an LLM.


A cheaper model makes sense if the dataset is so large that frontier LLM cost becomes prohibitive. Otherwise a frontier LLM has the advantage of producing a better result.


That is a very complex, high-level use case that takes time to configure and orchestrate.

There are many simpler tasks that would work fine with a simpler, local model.



