Show HN: Tiny Diffusion – A character-level text diffusion model from scratch
152 points by nathan-barry
by simonw
0 subcomment
This is really neat.
I noticed the diffusion-process.py demo was using matplotlib in a window, but I figured it would be cute if it used a terminal UI instead - so I had Claude Code convert it to use curses. Code and demo GIF here: https://gist.github.com/simonw/9033ebd8dd17b4c0ad101ddda7a54...
by mlmonkey
0 subcomment
I'm curious: has there been any work done on generating embedding vectors instead of discrete tokens via diffusion? What would that look like? Please point me to some references. Thanks!
by yugretcx
4 subcomments
Why do these text diffusion demos always look like the number of allowed tokens is fixed for a specific unfilled region?
Is this the case?
I.e., if the region only has four tokens (here, characters) but the model calculates that the best word is “forget”, does it just abandon the best fit or truncate it to fit?
Are there text diffusion models with lax infill directives?
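A toy sketch of why the slot count looks fixed, assuming the demo uses masked diffusion over a fixed-length sequence (an assumption, not something confirmed here). Each masked position is filled with exactly one character, so a four-slot gap can only ever receive four characters. `toy_predict` is a hypothetical stand-in for a trained denoiser, and `infill`, `MASK`, and `per_step` are names made up for the illustration.

```python
import random

MASK = "_"  # placeholder for a not-yet-denoised position


def toy_predict(seq, pos):
    """Hypothetical stand-in for a trained denoiser.

    A real model would return a distribution over the vocabulary for
    each position; here we just pick a letter deterministically.
    """
    return "abcdefghijklmnopqrstuvwxyz"[(pos * 7 + len(seq)) % 26]


def infill(text, per_step=2, seed=0):
    """Fill every MASK slot, a few per step, without changing the length."""
    rng = random.Random(seed)
    seq = list(text)
    masked = [i for i, ch in enumerate(seq) if ch == MASK]
    while masked:
        # Pick which masked slots to commit on this step.
        chosen = rng.sample(masked, min(per_step, len(masked)))
        for pos in chosen:
            seq[pos] = toy_predict(seq, pos)  # one slot in, one character out
        masked = [i for i in masked if i not in chosen]
    return "".join(seq)


prompt = "never ____ this"
result = infill(prompt)
print(result, len(result) == len(prompt))
```

Under this scheme a six-letter word like “forget” cannot be produced for a four-slot gap; the model can only choose the best four characters for those positions. Variable-length infilling would need a different mechanism, which this sketch does not attempt.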
by Majromax
1 subcomments
The basic MLP block in this model uses a ReLU^2 activation function (x <- ReLU(x)^2). That seems to be copied from the nanochat project, and it's not present in nanoGPT. Is there some documentation on the choice of this activation function?
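For reference, the activation described above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than code from the project: the names `relu_squared`, `linear`, and `mlp_block`, the bias-free layers, and the hand-picked weights are all assumptions made for the example.

```python
def relu_squared(x):
    """ReLU^2: zero for negative inputs, x**2 otherwise."""
    return max(x, 0.0) ** 2


def linear(weights, x):
    """Bias-free matrix-vector product, one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mlp_block(w_in, w_out, x):
    """Hypothetical MLP block: up-projection, ReLU^2, down-projection."""
    hidden = [relu_squared(h) for h in linear(w_in, x)]
    return linear(w_out, hidden)


# Toy weights chosen by hand (assumption, purely illustrative).
W_IN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
W_OUT = [[1.0, 1.0, 1.0, 1.0], [0.5, 0.0, 0.0, 0.0]]

out = mlp_block(W_IN, W_OUT, [2.0, -3.0])
print(out)
```

Compared with plain ReLU, the squared variant is still exactly zero for negative inputs but grows quadratically for positive ones.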
by gdiamos
2 subcomments
One year later, and there is still no inference engine for diffusion LLMs.
Students looking for a project to break into AI - please!
by embedding-shape
1 subcomments
Fun project, easy to understand, with nice-looking results: everything one could ask for! I played around with it locally, made a few low-hanging-fruit optimizations without making it much more complicated, and was going to send over a PR. But then I noticed there is no license attached to the project. What are your plans regarding licensing?