sam0x17 20 days ago:
Didn't want to bury the lede, but I've done a bunch of work with this myself. It goes fine as long as you give it both the textual representation and the ability to walk along the AST. You give it the raw source code, and then also give it the ability to ask a language server to move a cursor that walks along the AST, and then every time it makes a change you update the cursor location accordingly. You basically have a cursor in the text and a cursor in the AST and you keep them in sync so the LLM can't mess it up. If I ever have time I'll release something, but right now I'm just experimenting locally with it for my Rust stuff.

On the topic of LLMs understanding ASTs, they are also quite good at this. I've done a bunch of applications where you tell an LLM a novel grammar it's never seen before _in the system prompt_, and that plus a few translation examples is usually all it takes for it to learn fairly complex grammars. Combine that with a feedback loop between the LLM and a compiler for the grammar, where you don't let it produce invalid sentences and when it does you just feed it back the compiler error, and you get a pretty robust system that can translate user input into valid sentences in an arbitrary grammar.
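A rough sketch of that compiler feedback loop, in case it helps. This is not my actual code, just the shape of the idea: `check_balanced` is a toy stand-in for a real grammar compiler, `fake_model` is a stub standing in for the LLM call, and `translate_with_feedback` and `max_attempts` are names made up for illustration.

```python
from typing import Callable, Optional, Tuple


def check_balanced(sentence: str) -> Optional[str]:
    """Toy 'compiler': return None if parentheses balance, else an error message."""
    depth = 0
    for i, ch in enumerate(sentence):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return f"unexpected ')' at column {i}"
    if depth > 0:
        return f"{depth} unclosed '('"
    return None


def translate_with_feedback(
    user_input: str,
    generate: Callable[[str, Optional[str]], str],
    compile_check: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Ask the model for a sentence; on a compile error, retry with the error attached.

    Returns the first valid sentence and the number of attempts it took.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(user_input, feedback)
        error = compile_check(candidate)
        if error is None:
            return candidate, attempt
        # Invalid output never leaves the loop; the error goes back to the model.
        feedback = f"Your output {candidate!r} was rejected: {error}"
    raise ValueError(f"no valid sentence after {max_attempts} attempts")


def fake_model(user_input: str, feedback: Optional[str]) -> str:
    """Stub for the LLM: wrong on the first try, corrected once it sees feedback."""
    if feedback is None:
        return "(add 1 (mul 2 3)"
    return "(add 1 (mul 2 3))"


if __name__ == "__main__":
    sentence, attempts = translate_with_feedback("1 + 2*3", fake_model, check_balanced)
    print(sentence, attempts)
```

In a real setup `generate` would call the model with the grammar in the system prompt, and `compile_check` would wrap whatever parser or compiler exists for the grammar.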
I seem to have gotten 'lucky' and it split an emoji just right.
---
For anyone curious: this is great for large, disjointed, and/or poorly documented code bases. If you keep yours tight and files smaller than ~600 lines, it is almost always better to nudge LLMs into reading whole files.
For example, under the "Why CK?" section, "For teams" has no substance compared to "For developers".
~/c/l/web % ck --sem 'error handling'
ℹ Semantic search: top 10 results, threshold ≥0.6
⠹ Searching with semantic mode...
All I got was a spinning M2 Mac fan after a minute, and I gave up. I added the VSCode plugin but it didn't seem to help; likewise, searching around yesterday I didn't see anything, surprisingly.
Looks like you have to build an index. When should it be rebuilt? Any support for automatic rebuilds?
I did look into the core features and I gotta say, that looked quite cool. It's like Google search, but for the codebase. What does it take to support other languages?