Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch
51 points by vforno
by AndReics
1 subcomments
Wow, that's really cool, I'll definitely check it out!
I've played around with machine learning algorithms built from scratch in C/CUDA too, but once I hit the CUDA part I kind of just left it to the side.
I'm curious: how did you use CUDA to optimize the matrix multiplications?
How optimized is training? Does it take much longer than using PyTorch?
by tdesilva
1 subcomments
Mentioning neural ODEs doesn't make sense here, as they are unrelated. Basically any transformer implementation uses residual connections, but you're not really training a neural ODE here.
Also consider getting rid of the em-dashes. I don't know if you mostly vibe-coded this or not, but the README is pretty clearly AI generated.
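To make the residual point concrete, here is a minimal sketch (not taken from the NanoEuler source). A residual update x <- x + f(x) has the same form as one forward Euler step of dx/dt = f(x) with step size h = 1, which is presumably where the analogy comes from. A neural ODE, by contrast, treats depth as continuous and integrates it with an ODE solver. The names `toy_block`, `euler_step`, and `residual_step` are hypothetical, and `toy_block` is just f(x) = -x standing in for a real attention or MLP sub-block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a transformer sub-block f(x); here f(x) = -x. */
static void toy_block(const float *x, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = -x[i];
    }
}

/* One forward Euler step for dx/dt = f(x): x <- x + h * f(x).
 * scratch must have room for n floats. */
static void euler_step(float *x, float *scratch, size_t n, float h) {
    assert(x != NULL && scratch != NULL);
    toy_block(x, scratch, n);
    for (size_t i = 0; i < n; i++) {
        x[i] += h * scratch[i];
    }
}

/* Residual connection: x <- x + f(x), i.e. euler_step with h fixed at 1. */
static void residual_step(float *x, float *scratch, size_t n) {
    euler_step(x, scratch, n, 1.0f);
}
```

With a fixed number of layers, h is always 1 and no solver is involved, so the stack is an ordinary residual network rather than a trained continuous-depth model.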
by ali_chherawalla
1 subcomments
This is super interesting. Looking forward to trying this out!
by isatty
1 subcomments
I'm genuinely curious how much of this is LLM generated?
by ericb
1 subcomments
How long was it trained for? How many tokens?
by valentynkit
0 subcomment
[dead]
by Chu4eeno
3 subcomments
Very weird coding style. Did you run astyle --style=python on C code?
Also, your LLM left a comment in the CUDA source saying that it is untested. Does the CUDA stuff work?