FRESH Hacker News
Forcing Flash Attention onto a TPU and Learning the Hard Way
27 points by azhng