Hacker News

Executing programs inside transformers with exponentially faster inference

14 points by u1hcw9nx

by andy12_

This seems like a really interesting path for interpretability, especially if a big chunk of a model's behavior occurs pseudo-symbolically. This is an idea I had thought about, integrating tools into the main computation path of a model, but I never imagined that it could be done efficiently with just a vanilla transformer.
Truly, attention is all you need (I guess).

by galsapir

One of the most interesting pieces I've read recently. Not sure I agree with all the statements there (e.g. that without execution the system has no comprehension), but extremely cool.

by pennomi

It makes sense that a next-token predictor could execute assembly code. This is fascinating work, especially the memory implementation.