Taming LLMs: Using Executable Oracles to Prevent Bad Code
40 points by mad44
by dktoao
4 subcomments
"Our goal should be to give an LLM coding agent zero degrees of freedom"
Wouldn't that just be called inventing a new language, with all the overhead of the languages we already have? Are we getting to the point where getting LLMs to be productive and also write good code requires so much overhead, additional procedure, and tooling that we might as well write the code ourselves? Hmmm...
by shubhamintech
0 subcomments
The oracle problem is tractable when the output is code: you can compile it, run tests, diff the output. For conversational AI it's much harder. We've seen teams use LLM-as-judge as their validation layer and it works until the judge starts missing the same failure modes as the generator.
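The compile/test/diff pipeline described above can be sketched as a small validation gate. This is a minimal illustration, not anything from the article: the `oracle` function and the `add` candidate are hypothetical, and a real setup would sandbox execution rather than `exec` untrusted output directly.

```python
# Minimal executable-oracle sketch: accept LLM-generated code only if it
# compiles and passes a fixed test suite. Names here are illustrative.

def oracle(candidate_source: str, func_name: str, tests: list[tuple[tuple, object]]) -> bool:
    """Return True only if the candidate compiles, defines func_name,
    and passes every (args, expected) test case."""
    try:
        code = compile(candidate_source, "<candidate>", "exec")  # syntax oracle
    except SyntaxError:
        return False
    namespace: dict = {}
    try:
        exec(code, namespace)                 # definition oracle (sandbox this in practice)
        func = namespace[func_name]
        return all(func(*args) == expected    # behavioral oracle
                   for args, expected in tests)
    except Exception:
        return False

# Hypothetical generator output for an `add` function:
candidate = "def add(a, b):\n    return a + b\n"
print(oracle(candidate, "add", [((1, 2), 3), ((-1, 1), 0)]))  # True
```

Unlike an LLM-as-judge, this gate can't drift: a candidate that fails a test is rejected deterministically, whatever the generator's failure mode.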
by JSR_FDED
0 subcomments
> JustHTML was effectively tested into existence using a large, existing test suite.
I love the phrase “tested into existence”.
by RS-232
4 subcomments
Has anyone had success using 2 agents, with one as the creator and one as an adversarial "reviewer"? Is the output usually better or worse?