FRESH

Hacker News

Home

Show HN: Z80-μLM, a 'Conversational AI' That Fits in 40KB

506 points by quesomaster9000

by nineteen999

4 subcomments

This couldn't be more perfectly timed .. I have an Unreal Engine game with both VT100 terminals (for running coding agents) and Z80 emulators, and a serial bridge that allows coding agents to program the CP/M machines:
https://i.imgur.com/6TRe1NE.png
Thank you for posting! It's unbelievable how someone sometimes just drops something that fits right into what you're doing. However bizarre it seems.

by rahen

1 subcomments

I love it, instant Github star. I wrote an MLP in Fortran IV for a punched card machine from the sixties (https://github.com/dbrll/Xortran), so this really speaks to me.
The interaction is surprisingly good despite the lack of attention mechanism and the limitation of the "context" to trigrams from the last sentence.
This could have worked on 60s-era hardware and would have completely changed the world (and science fiction) back then. Great job.

by Dwedit

1 subcomments

In before AI companies buy up all the Z80s and raise the prices to new heights.

by giancarlostoro

4 subcomments

This is something I've been wondering about myself. What's the "Minimally Viable LLM" that can have simple conversations. Then my next question is, how much can we push it so it can learn from looking up data externally, can we build a tiny model with an insanely larger context window? I have to assume I'm not the only one who has asked or thought of these things.
Ultimately, if you can build an ultra tiny model that can talk and learn on the fly, you've just fully localized a personal assistant like Siri.

by andrepd

1 subcomments

We should show this every time a Slack/Teams/Jira engineer tries to explain to us why a text chat needs 1.5GB of ram to start up.

by vedmakk

2 subcomments

If one would train an actual secret (e.g. a passphrase) into such a model, that a user would need to guess by asking the right questions. Could this secret be easily reverse engineered / inferred by having access to models weights - or would it be safe to assume that one could only get to the secret by asking the right questions?

by bitwize

0 subcomment

Don't be surprised if you're paid a visit by the SCP Foundation: https://scp-wiki.wikidot.com/scp-079
(edit: change url)

by roygbiv2

1 subcomments

Awesome. I've just designed and built my own z80 computer, though right now it has 32kb ROM and 32kb RAM. This will definitely change on the next revision so I'll be sure to try it out.

by gcanyon

1 subcomments

So it seems like with the right code (and maybe a ton of future infrastructure for training?) Eliza could have been much more capable back in the day.

by orbital-decay

0 subcomment

Pretty cool! I wish free-input RPGs of old had fuzzy matchers. They worked by exact keyword matching and it was awkward. I think the last game of that kind (where you could input arbitrary text when talking to NPCs) was probably Wizardry 8 (2001).

by Peteragain

1 subcomments

There are two things happening here. A really small LLM mechanism which is useful for thinking about how the big ones work, and a reference to the well known phenomenon, commonly dismissively referred to as a "trick", in which humans want to believe. We work hard to account for what our conversational partner says. Language in use is a collective cultural construct. By this view the real question is how and why we humans understand an utterance in a particular way. Eliza, Parry, and the Chomsky bot at http://chomskybot.com work on this principle. Just sayin'.

by gwern

0 subcomment

So if it's not using attention and it processes the entire input into an embedding to process in one go, I guess this is neither a Transformer nor a RNN but just a MLP?

by bartread

1 subcomments

This is excellent. Thing I’d like to do if I had time: get it running on a 48K Spectrum. 10 year old me would have found that absolutely magical back in the 1980s.

by Zee2

2 subcomments

This is super cool. Would love to see a Z80 simulator set up with these examples to play with!

by MagicMoonlight

1 subcomments

What I really want is a game where each of the NPCs has a tiny model like this, so you can actually talk to them.

by vatary

1 subcomments

It's pretty obvious this is just a stress test for compressing and running LLMs. It doesn't have much practical use right now, but it shows us that IoT devices are gonna have built-in LLMs really soon. It's a huge leap in intelligence—kind of like the jump from apes to humans. That is seriously cool.

by anonzzzies

0 subcomment

Luckily I have a very large amount of MSX computers, zx, amstrad cpc etc and even one multiprocessor z80 cp/m machine for the real power. Wonder how gnarly this is going to perform with bankswitching though. Probably not good.

by alfiedotwtf

1 subcomments

An LLM in a .com file? Haha made my day

by jacquesm

0 subcomment

Between this and RAM prices Zilog stock must be up! Awesome hack. Now apply the same principles to a laptop and take a megabyte or so, see what that does.

by boznz

2 subcomments

Great work. What is your timeline to AGI ?

by a_t48

2 subcomments

Nice - that will fit on a Gameboy cartridge, though bank switching might make it super terrible to run. Each bank is only 16k. You can have a bunch of them, but you can only access one bank at a time (well, technically two - bank 0 is IIRC always accessible).

by jasonjmcghee

0 subcomment

For future projects and/or for this project, there are many LLMs available more than good enough to generate that kind of synthetic data (20 Qs) with permissive terms of use. (So you don’t need to stress about breaking TOS / C&D etc)

by Zardoz84

0 subcomment

Meanwhile, Eliza was ported to BASIC and was run on many home computers in the 80s.

by magicalhippo

1 subcomments

As far as I know, the last layer is very quantization-sensitive, and is typically not quantized, or quantized lightly.
Have you experimented with having it less quantized, and evaluated the quality drop?
Regardless, very cool project.

by coolius

0 subcomment

This is impressive, those are some very restrictive requirements. I wonder what we are able to run on more powerful hardware such as ESP32 or RP2040, has anyone tried this?

by pdyc

1 subcomments

interesting, i am wondering how far can it go if we remove some of these limitations but try to solve some extremely specific problem like generating regex based on user input? i know small models(270M range) can do that but can it be done in say < 10MB range?

by dirkt

0 subcomment

Eliza's granddaughter.

by DrNosferatu

0 subcomment

Awesome! Anyone for a port to the MSX?
A web version would also be cool.

by Y_Y

0 subcomment

Very cool. Did you consider using sparse weights?

by integricho

0 subcomment

Someone add it to collapseos please :)

by bytesandbits

0 subcomment

it's giving Eliza! Ha, fun

by NooneAtAll3

0 subcomment

did you measure token/s?

by lostmsu

0 subcomment

Did you train the model with quantization awareness? How?

by codetiger

5 subcomments

Imagine, this working on a Gameboy, in those days. Would've sounded like magic

by devhouse

0 subcomment

[dead]