When you save a series of images as 16x16 JPEGs at the same JPEG quality level without optimization, you notice that the files share a lot of common data. The common data includes things like the file header (the FF D8 marker and the FF E0 segment), the quantization tables, and the Huffman tables. If you cut away all the common data, the actual image data that remains is extremely small, usually under 64 bytes, though not a fixed size.
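To see which parts of a file are shared, you can walk the JPEG segments that come before the compressed scan data. This is only a rough sketch: `list_segments` is a name I made up for illustration, and it assumes a simple baseline JPEG with a single scan and no padding bytes between segments.

```python
import struct

SOI = b"\xff\xd8"   # start-of-image marker
SOS = 0xDA          # start-of-scan marker; compressed data follows its header


def list_segments(jpeg: bytes):
    """Return ([(marker, length), ...], offset where the scan data starts).

    `length` is the value stored in the segment, which counts the two
    length bytes themselves but not the two marker bytes.
    """
    if jpeg[:2] != SOI:
        raise ValueError("missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segments.append((marker, length))
        pos += 2 + length
        if marker == SOS:
            break  # everything after this is entropy-coded image data
    return segments, pos
```

Everything before the returned offset (APP0 at FF E0, the quantization tables at FF DB, the frame header at FF C0, the Huffman tables at FF C4, and the scan header at FF DA) is a candidate for the shared part when the dimensions and quality settings are identical.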
Here are the sizes of the four example images (just the unique image data) when resized to 16x16, then saved at quality 20:
First image: 48 bytes
Second image: 42 bytes
Third image: 31 bytes
Fourth image: 35 bytes
After adding back the 625 bytes of common data, you end up with a regular JPEG that can be decoded and displayed using the browser's fast native code.
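The strip-and-rebuild step could look roughly like the sketch below. The function names are hypothetical, and it assumes the shared part is simply the longest common prefix of the files plus the trailing FF D9 end-of-image marker, which should hold when every file has the same dimensions, quality, and default tables.

```python
EOI = b"\xff\xd9"  # end-of-image marker, present at the end of every file


def common_prefix_len(blobs):
    """Length of the longest prefix shared by all byte strings."""
    shortest = min(len(b) for b in blobs)
    for i in range(shortest):
        byte = blobs[0][i]
        if any(b[i] != byte for b in blobs):
            return i
    return shortest


def strip_common(files):
    """Split files into (shared_header, [unique_payload, ...])."""
    bodies = []
    for data in files:
        if not data.endswith(EOI):
            raise ValueError("file does not end with an EOI marker")
        bodies.append(data[:-len(EOI)])
    n = common_prefix_len(bodies)
    return bodies[0][:n], [body[n:] for body in bodies]


def rebuild(header, payload):
    """Reassemble a complete JPEG from the shared header and one payload."""
    return header + payload + EOI
```

One caveat: if the files in the sample happen to start their image data with the same bytes, the computed prefix will run a little past the real header, so in practice you would fix the header once from a large sample (or cut at the end of the scan header) rather than recompute it per batch.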
The ThumbHash page includes a comparison against "Potato WebP", which is probably a similar idea.
[1] I used my own avatars and icons as a test set. For example, https://avatars.githubusercontent.com/u/323836?s=400&v=4
Can you share the reasons someone might want to compress an image to 16 bytes?