Hacker News

Making a vintage LLM from scratch

101 points by croqaz

by mg794613

4 subcomments

"The code is semi-vibe-coded with whatever LLM I had with VS-Code and PI (OpenRouter models)."
I appreciate the honesty, but now there's no journey, and that's what I'm interested in. I can ask an LLM myself.

by tancop

2 subcomments

> These samples have very good scores overall, but they are useless. I am guessing it's not English text... I counted a few hundred examples mostly from LOC-PD and other few hundred in the OTA datasets. Imagine if I feed that crap to my LLM, what will it learn?
I'm pretty sure it's real text in Welsh. There might be typos from OCR, but yeah, that's what the language really looks like. I don't speak it, but it's easy to recognize.

by dennysora-main

2 subcomments

by croqaz

1 subcomment

I am creating my tiny Llama 340M base model from scratch. If you're curious about the steps, challenges and cost, read on. I am still working on the instruct model.

by cyberge99

4 subcomments

There are certain things you can only truly learn by doing. I remember doing Linux From Scratch over a weekend, and I still understand Linux at that depth to this day.
Thanks for the writeup. A more granular followup would be cool too.

by rxm

0 subcomments

by macwhisperer

1 subcomment

by HexPhantom

0 subcomments

Instead of always trying to make models more current and general, there may be value in making them deliberately narrow, historically constrained, and weird in a well-defined way.

by nnnnnmnnnnnn

0 subcomments