> As a consequence of this succinctness, we show that basic verification problems for transformers, such as emptiness and equivalence, are provably intractable: specifically, EXPSPACE-complete.
If you were hoping to formally prove the correctness of a large transformer, it turns out that you're going to need an exponentially larger amount of space to do your verification, more than you could possibly afford.
Transformers Are Inherently Succinct (2025) - https://news.ycombinator.com/item?id=48014197 - May 2026 (9 comments)
The authors used LTL (linear temporal logic) to express what are basically non-reduced, non-ordered binary decision diagrams, or just plain binary decision diagrams (BDDs).
Plain BDDs are almost guaranteed to have exponential size because they do not employ reduction (sharing of common subexpressions). Reduced BDDs are more succinct, and reduced ordered BDDs are even more succinct.
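
To make the size gap concrete, here is a rough sketch (mine, not from the paper) that counts internal nodes in a full, unreduced decision tree versus a reduced ordered BDD for the same Boolean function. The names `tree_size`, `robdd_size`, and `parity` are made up for illustration, the variable order is fixed as x0, x1, ..., and the builder enumerates all inputs by brute force, so it is only usable for small n.

```python
def tree_size(n):
    """Internal nodes of a full, unreduced decision tree over n variables."""
    return 2 ** n - 1


def robdd_size(f, n):
    """Internal nodes of the reduced ordered BDD of f (order x0 < x1 < ...).

    Illustrative brute-force builder: it visits every assignment,
    so it takes time exponential in n even when the result is small.
    """
    unique = {}  # (var, low, high) -> node id; this table does the sharing

    def build(i, prefix):
        if i == n:
            return int(bool(f(prefix)))  # terminal nodes are 0 and 1
        low = build(i + 1, prefix + (0,))
        high = build(i + 1, prefix + (1,))
        if low == high:
            return low  # reduction rule 1: drop a test whose branches agree
        key = (i, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2  # ids 0 and 1 are the terminals
        return unique[key]  # reduction rule 2: reuse isomorphic subgraphs

    build(0, ())
    return len(unique)


def parity(bits):
    """XOR of all input bits."""
    return sum(bits) % 2


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, tree_size(n), robdd_size(parity, n))
```

For parity the reduced ordered diagram needs only 2n - 1 internal nodes (7 for n = 4) against 2^n - 1 for the tree (15 for n = 4). Parity is a friendly case, though; other functions can still blow up under every variable order.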
Also, the transformers in the paper are constructed, not trained. Training any model to express some truth table is very hard. They also did not compare against, say, the Kolmogorov-Arnold representation, which is also a universal approximator.
So this paper is not as deep as one may think it is.