Executing programs inside transformers with exponentially faster inference
14 points by u1hcw9nx
by andy12_
This seems like a really interesting path for interpretability, especially if a big chunk of a model's behavior occurs pseudo-symbolically. I had thought about this idea before, integrating tools into the main computation path of a model, but I never imagined that it could be done efficiently with just a vanilla transformer.
Truly, attention is all you need (I guess).
by galsapir
One of the most interesting pieces I've read recently. I'm not sure I agree with all the statements there (e.g. that without execution the system has no comprehension), but it's extremely cool.
by pennomi
It makes sense that a next token predictor could execute assembly code. This is fascinating work, especially with the memory implementation.