Hacker News

Show HN: I ran a language model on a PS2

43 points by xaskasdf

by randkyp

2 subcomments

Neat! While the physicality of having the CD spin while running inference is undeniably cool, I wonder if you could run larger models at higher speeds through the PS2 HDD accessory/Memory Card Micro SD adapter/the PS2's USB port.
I doubt the VUs can help with inference given their small scratchpad sizes and instruction set though, haha.

by mghackerlady

1 subcomments

I'm excited for the PS2 SDK. Currently there isn't a lot in that space that won't get you sued.

by pooparse

0 subcomment

IIRC the EE had some interesting hardware with vector units. Were these of any use/benefit here?

by SilentEditor

1 subcomments

Love this project. The CD streaming trick is such a smart constraint hack, and honestly the best part is you trained the model for the hardware instead of forcing a desktop recipe onto PS2.
Curious about two things, if you can share:
1. What's your per-token latency on real hardware?
2. How much quality loss came from PSNT quantization vs. an fp16 baseline?
Either way, this is peak hacker energy; shipping on actual hardware makes it 10x cooler.

by SachitRafa

0 subcomment

The CD-ROM streaming approach is the real insight here: keeping only activations and the KV cache in RAM and streaming weights one matrix at a time sidesteps the 32MB constraint entirely. It's essentially the same trick modern edge inference does with flash storage, just on hardware from 2000. Curious about the latency profile. With CD-ROM read speeds around 1.6 MB/s on PS2, the 77MB SmolLM2 model being too slow makes sense, but how does the 10MB brandon-tiny feel in practice? Are you getting a few tokens per minute, or more like a token every few seconds? Also interested in the custom PSNT format decision: was the main motivation the PS2's MIPS alignment constraints, or was there something about the existing GGUF/llama.c formats that made them impractical to parse on the Emotion Engine?
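
For a rough feel of the numbers in that question, here is a minimal Python sketch, not the project's actual code. It assumes every weight byte is read from disc once per token and ignores seek time and compute, so the latency figure is only a lower bound. The file layout (`rows`, `cols`, then float32 values) and the helper names `write_matrices` and `stream_forward` are hypothetical and have nothing to do with the real PSNT format. The toy forward pass has no attention, KV cache, or nonlinearity, and it streams row by row rather than one whole matrix at a time; it only illustrates the idea that weights pass through memory while the activation vector stays resident.

```python
import io
import struct

CD_READ_MB_PER_S = 1.6  # read speed quoted in the comment above


def min_seconds_per_token(model_mb, read_mb_per_s=CD_READ_MB_PER_S):
    """Lower bound on latency if all weights are read once per token."""
    return model_mb / read_mb_per_s


def write_matrices(f, matrices):
    """Hypothetical layout: [rows:u32][cols:u32][rows*cols float32], repeated."""
    for m in matrices:
        rows, cols = len(m), len(m[0])
        f.write(struct.pack("<II", rows, cols))
        for row in m:
            f.write(struct.pack(f"<{cols}f", *row))


def stream_forward(f, x):
    """Multiply x by each stored matrix in turn, reading one row at a time.

    Only the current row and the activation vectors are held in memory.
    """
    while True:
        header = f.read(8)
        if len(header) < 8:  # end of file: no more matrices
            return x
        rows, cols = struct.unpack("<II", header)
        if cols != len(x):
            raise ValueError("matrix width does not match activation size")
        y = []
        for _ in range(rows):
            row = struct.unpack(f"<{cols}f", f.read(4 * cols))
            y.append(sum(w * v for w, v in zip(row, x)))
        x = y


if __name__ == "__main__":
    for size_mb in (10, 77):
        print(size_mb, "MB ->", min_seconds_per_token(size_mb), "s/token (lower bound)")
    buf = io.BytesIO()  # stands in for the disc
    write_matrices(buf, [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0]]])
    buf.seek(0)
    print(stream_forward(buf, [1.0, 1.0]))
```

Under those assumptions the 10MB model comes out at roughly 6 seconds per token and the 77MB model at roughly 48 seconds, which is consistent with the larger model being described as too slow. Real numbers would depend on how much of the file is actually touched per token and on seek overhead.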