Back in the day getting the 16KB expansion pack for my 1KB RAM ZX81 was a big deal. And I also wrote code for PIC microcontrollers that have 768 bytes of program memory [and 25 bytes of RAM]. It's just so easy to not think about efficiency today, you write one line of code in a high level language and you blow away more bytes than these platforms had without doing anything useful.
Instead - here's [0] Ben Daglish (on flute) performing "Wastelands" together with the Norwegian C64/Amiga tribute band FastLoaders. He unfortunately passed away in 2018, just 52 years old.
If that tickled your fancy, here's [1] a full concert with them where they perform all songs from The Last Ninja.
[0] https://www.youtube.com/watch?v=ovFgdcapUYI [1] https://www.youtube.com/watch?v=PTZ1O1LJg-k
Or a convincing representation of that. A lot of old tricks mean that the games are doing less than you think they are, and are better understood when you stop thinking “how do they do that” and start thinking “how are they convincing my brain that is what they are doing”.
Look at how little RAM the original Elite ran in on a BBC Model B, with some swapping of code on disk⁰. 32KB, less the 7.75KB taken by the game's custom screen mode² and a little more reserved for other things¹. I saw breathless reviews at the time, and have seen similar nostalgic reviews more recently, talking about “8 whole galaxies!” when the game could easily have had far more than that and was at one point going to. They cut it down not for technical reasons but because having more didn't feel usefully more fun and might actually put people off. The galaxies were created by a clever little procedural generator, so adding more would have only added a couple of bytes each (to hold the seed and maybe other params for the generator).
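To make the "couple of bytes per galaxy" point concrete, here is a minimal sketch of a seeded generator. It is not Elite's actual algorithm: the 16-bit linear congruential step, its constants, the syllable table and every name below are assumptions chosen only to show that a whole galaxy can be rebuilt on demand from one small seed.

```python
from dataclasses import dataclass

MASK16 = 0xFFFF

# Hypothetical syllable table used to build system names.
SYLLABLES = ["la", "ve", "ti", "qu", "or", "en", "di", "ra",
             "so", "us", "ce", "bi", "ge", "za", "on", "ar"]


@dataclass(frozen=True)
class System:
    x: int
    y: int
    economy: int
    name: str


def next_state(state: int) -> int:
    """One 16-bit linear congruential step (illustrative constants)."""
    return (state * 25173 + 13849) & MASK16


def generate_galaxy(seed: int, count: int = 256) -> list[System]:
    """Rebuild every system of a galaxy from a single 16-bit seed."""
    state = seed & MASK16
    systems = []
    for _ in range(count):
        state = next_state(state)
        x, y = state >> 8, state & 0xFF          # position on a 256x256 chart
        state = next_state(state)
        economy = state >> 13                    # top three bits: 0..7
        parts = []
        for _ in range(3):
            state = next_state(state)
            parts.append(SYLLABLES[state >> 12])  # top four bits pick a syllable
        systems.append(System(x, y, economy, "".join(parts).capitalize()))
    return systems
```

In this sketch a galaxy costs two bytes of storage (its seed), so eight galaxies would cost sixteen bytes; everything else is recomputed whenever the chart is needed.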
Another great example of not quite doing what it looks like the game is doing is the apparently live-drawn 3D view in the game Sentinel on a number of 8-bit platforms.
--------
[0] There were two blocks of code that were swapped in as you entered or left a space station: one for while docked and one for while in-flight. Also the ship blueprints were not all in memory at the same time, and a different set was loaded as you jumped from one system to another.
[1] The CPU call stack (technically up to a quarter K, though the game code needed less than half of that), scratch space on page zero mostly used for game variables but some of which was used by things like the disk controller ROM and sound generator, etc.
[2] Normal screen modes close to that consumed 10KB. Screen memory consumption on the BBC Master Enhanced version was doubled as it was tweaked to use double the bit depths (4bpp for the control panel and 2bpp for the exterior, instead of 2bpp and 1bpp respectively).
https://bunsen.itch.io/the-snake-temple-by-rax
We lost something in the bloat, folks. It's time to turn around and take another look at the past - or at least re-adjust the rearview mirror to actually look at the road and not one's makeup.
The Z-machine games, ditto. A few KB and an impressive simulated environment will run even on 8-bit machines running a virtual machine. Of course z3 games will have fewer features for parsing and object interaction than z8 games, but anything from a 16-bit machine up (nothing by today's standards; a DOS PC would count) will run z8 games and get you pretty complex text adventures. Compare Tristam Island or the first Zork I-III to Spiritwrak, where a subway is simulated, or Anchorhead.
And you can code the games with Inform6 and Inform6lib with maybe a 286 with DOS or a 386 and any text editor. Check the Inform Beginner's Guide and DM4.pdf. And not just DOS: Windows, Linux, BSD, Macs... even Android under Termux. And the games will run under either Frotz for Termux, Lectrote, or Fabularium. Under iOS, too.
NetHack/Slash'EM weighs mere MBs and has tons of replayability. Written in C. It will even run under a 68020 System 7 based Mac... emulated under 9front with an 720 CPU as the host. It will fly on a 486 CPU and up.
Meanwhile, Cataclysm DDA uses C++ and needs a huge chunk of RAM and a fast CPU to compile today. Some high-end Pentium 4 with 512MB of RAM will run it well enough, but you need to cross-compile it.
If I had the skills I would rewrite (no AI/LLMs please) CDDA:BN in Golang. The compile times would plummet and the CPU usage would be nearly the same. OFC the GC would shine here, pruning tons of unused code and data from generated worlds.
Feels like they were closer to programs, while modern games are closer to datasets.
We can't compare a 40 KB image today to a 40 KB image from 1980-something if the contemporary one relies on 100 MB of external cruft, like a rich programming language runtime (fetched and installed separately) and packages.
Honestly though, I don't read much into the sizes. Sure, they were small games and had lots of gameplay for some definition of gameplay. I enjoyed them immensely. But it's hard to go back to just a few colors, low-res graphics, often no way to save, etc... for me at least, the modern affordances mean something. Of course I don't need every game to look like Horizon Zero Dawn. A Short Hike was great. It's also 400 MB (according to Steam).
Most of my games are roughly in that range though. I think my MMO was 32KB, and it had a sound effects generator and speech synth in it. (Jsfxr and SAM)
I built it in a few days for a game jam.
I'm not trying to brag, I'm trying to say this stuff is easy if you actually care. Just look at JS13K. Every game there is 13KB or below, and there's some real masterpieces there. (My game was just squares, but I've seen games with whole custom animation systems in them.)
Once you learn how, it's pretty easy. But you'll never learn if you don't care.
You have to care because there's nothing forcing you. Arguably The Last Ninja would have been a lot more than 40KB if there weren't the hardware limitations of the time.
They weren't trying to make it 40KB, they were just trying to make a game.
In my case, I enjoy the challenge! (Also I like it when things load instantly :)
I think I'll make a PS1 game next. I was inspired by this guy who made a Minecraft clone for Playstation:
https://youtu.be/aXoI3CdlNQc?is=sDNnrGbQGJt_qnV6
P.S. most Flash games were only a few kilobytes, if you remove the music!
By comparison, COD Modern Warfare 3 is 6,000,000 times larger at 240GB. Imagine telling that to someone in 1987.
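The ratio is easy to check. The sketch below takes the 240GB and 40KB figures from this thread as given, and the variable names are just for illustration; the only assumption is whether the units are decimal or binary, which moves the result slightly.

```python
# Decimal units (1 KB = 1000 bytes, 1 GB = 1000**3 bytes).
KB, GB = 1000, 1000**3
ratio_decimal = (240 * GB) // (40 * KB)

# Binary units (1 KiB = 1024 bytes, 1 GiB = 1024**3 bytes).
KIB, GIB = 1024, 1024**3
ratio_binary = (240 * GIB) // (40 * KIB)

print(ratio_decimal)  # 6000000
print(ratio_binary)   # 6291456
```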
Elite was £20 in 1984 and that would be £66 today, which is not very different from what a good game for the PS5 costs today.
Except that games then were made by one or two people and nowadays games are made by teams with coders, musicians, artists, etc.
I have got 1.1 GB of MP3s with just remixes of the music from the three games, some of which are from a Kickstarter from the composer for the second.
How times have changed. My best-selling program "Apple Writer", for the Apple II, ran in eight kilobytes. It was written entirely in 6502 assembly language.
It is quite a big game: the main executable is 117KB, plus around 50 overlay files of 1.5 KB each for the different dungeons and cities, plus the graphics files. I guess it was even too big for the average PC hardware at that time, or it was a limitation inherited from the original Apple II version: when you want to cast a spell you have to enter the number of the spell from the manual, maybe because there was not enough memory to fit the names of the 94 spells into RAM. Apart from that, the limited graphics and the lack of sound, the internal ruleset is very complete. You have all kinds of spells and objects, capabilities, an aging mechanism, shops, etc. The usual stuff that you also see in today's RPGs.
The modern uninstall.exe that came with it (I bought the game on GOG) was 1.3MB big.
Previously: https://news.ycombinator.com/item?id=38707095
Want to prove a point? Give me Skyrim in 64k of ram. Go ahead! I dare you!
And the loading screens were also amazing, particularly for tape loading.
Anybody remember this one?
I never finished the game, sadly.
Of course, luckily our SSDs got bigger too.
It is still impressive to me how much game they could squeeze out of the NES ROM chips.
[1] Or something like that, I don't remember the exact number.
Highly related: two videos covering exactly how they fit...
- Super Mario Bros 1 into 40KiB (https://www.youtube.com/watch?v=1ysdUajrhL8)
- and Super Mario Bros 2 into 256KiB (https://www.youtube.com/watch?v=UdD26eFVzHQ)
I highly advise watching the actual videos to best understand, since all the techniques used were very likely devised from a game-dev perspective, rather than by invoking any abstract CS textbook learning.
But if I did want to summarize the main "tricks" used, in terms of such abstract CS concepts:
1. These old games can be understood as essentially having much of their data (level data, music data, etc) "compressed" using various highly-domain-specific streaming compressors. (I say "understood as" because, while the decompression logic literally exists in the game, there was likely no separate "compression" logic; rather, the data "file formats" were likely just designed to represent everything in this highly-space-efficient encoding. There were no "source files" using a more raw representation; both tooling and hand-edits were likely operating directly against data stored in this encoding.)
2. These streaming compressors act similarly to modern multimedia codecs, in the sense that they don't compress sequences-of-structures (which would give low sequence correlation), but instead first decompose the data into distinct, de-correlated sub-streams / channels / planes (i.e. structures-of-sequences), which then "compress" much better.
3. Rather than attempting to decompose a single lossless description of the data into several sub-streams that are themselves lossless descriptions of some hyperplane through the data, a different approach is used: that of each sub-channel storing an imperative "painting" logic against a conceptual mutable canvas or buffer shared with other sub-channels. The data stream for any given sub-channel may actually be lossy (i.e. might "paint" something into the buffer that shouldn't appear in the final output), but such "slop"/"bleed" gets overwritten — either by another sub-channel's output, or by something the same sub-channel emits later on in the same "pass". The decompressor essentially "paints over" any mistakes it makes, to arrive at a final flattened canvas state that is a lossless reproduction of the intended state.
4. Decompression isn't something done in its entirety into a big in-memory buffer on asset load. (There isn't the RAM to do that!) But nor is decompression a pure streaming operation, cleanly producing sequential outputs. Instead, decompression is incremental: it operates on / writes to one narrow + moving slice of an in-memory data "window buffer" at a time. Which can somewhat be thought of as a ring buffer, because the decompressor coroutine owns whichever slice it's writing to, which is expected to not be read from while it owns it, so it can freely give that slice to its sub-channel "painters" to fill up. (Note that this is a distinct concept from how any long, larger-than-memory data [tilemaps, music] will get spooled out into VRAM/ARAM as it's being scrolled/played. That process is actually done just using boring old blits; but it consumes the same ring-buffer slices the decompressor is producing.)
5. Different sub-channels may be driven at different granularities and feed into more or fewer windowing/buffering pipeline stages before landing as active state. For example, tilemap data is decompressed into its "window buffer" one page at a time, each time the scroll position crosses a page boundary; but object data is decompressed / "scheduled" into Object Attribute Memory one column at a time (or row at a time, in SMB2, sometimes) every time the scroll position advances by a (meta)tile width.
6. Speaking of metatiles — sub-channels, rather than allowing full flexibility of "write primitive T to offset Y in the buffer", may instead only permit encodings of references to static data tables of design-time pre-composed patterns of primitives. For tilemaps, these patterns are often called "meta-tiles" or "macro-blocks". (This is one reason sub-channels are "lossy" reconstructors: if you can only encode macro-blocks, then you'll often find yourself wanting only some part of a macro-block — which means drawing it and then overdrawing the non-desired parts of it.)
7. Sub-channels may also operate as fixed-function retained-mode procedural synthesis engines, where rather than specifying specific data to write, you only specify for each timestep how the synthesis parameters should change. This is essentially how modular audio synthesis encoding works; but more interestingly, it's also true of the level data "base terrain" sub-channel, which essentially takes "ceiling" and "ground" brush parameters, and paints these in per column according to some pattern-ID parameter referencing a table of [ceiling width][floor height] combinations. (And the retained-mode part means that for as long as everything stays the same, this sub-channel compresses to nothing!)
8. Sub-channels may also contain certain encoded values that branch off into their own special logic, essentially triggering the use of paint-program-like "brushes" to paint arbitrarily within the "canvas." For example, in SMB1, a "pipe tile" is really a pipe brush invocation, that paints a pipe into the window, starting from the tile's encoded position as its top-left corner, painting right two meta-tiles, and downward however-many meta-tiles are required to extend the pipe to the current "base terrain" floor height.
9. Sub-channels may encode values ("event objects") that do not decode to any drawing operation to the target slice buffer, but which instead just execute some code, usually updating some variable being used during the decompression process or at game runtime. This happens either immediately upon being encountered ("decompression-time event objects") or when they would be "placed" or "scheduled" if they were regular objects ("placement-time event objects"). (The thing that prevents you from scrolling the screen past the end of map data is a screen-scroll-lock event object dropped at just the right position that it comes into effect right before the map would run out of tiles to draw. The thing that determines where a "warp-enabled pipe" will take you is a warp-pipe-targeting event object, which sets the destination for all warp-enabled pipes placed after it runs, until the next warp-pipe-targeting event object is encountered.)
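A toy version of points 3, 7 and 8 may help. This is a sketch only, not the real SMB1 data format: the page size, the tuple-based command layout, the tile characters and all function names are assumptions. It shows a retained-mode terrain channel, a pipe "brush" that extends down to the current floor, and a later channel painting over what an earlier one emitted.

```python
WIDTH, HEIGHT = 16, 12          # one "page" of tiles; row 0 is the top
SKY, GROUND, PIPE, BRICK = ".", "#", "|", "B"


def paint_terrain(grid, terrain_cmds):
    """Retained-mode channel: the floor height persists until a command changes it."""
    changes = dict(terrain_cmds)            # column -> new floor height
    floor = 0
    floors = []
    for col in range(WIDTH):
        floor = changes.get(col, floor)     # no command means "same as before"
        floors.append(floor)
        for row in range(HEIGHT - floor, HEIGHT):
            grid[row][col] = GROUND
    return floors


def paint_objects(grid, object_cmds, floors):
    """Object channel: paints over whatever the terrain channel already drew."""
    for col, row, kind in object_cmds:
        if kind == "pipe":
            # Brush: two columns wide, from `row` down to the top of the floor.
            for r in range(row, HEIGHT - floors[col]):
                for c in (col, col + 1):
                    if c < WIDTH:
                        grid[r][c] = PIPE
        elif kind == "gap":
            # Overdraw the terrain in this column with sky.
            for r in range(HEIGHT):
                grid[r][col] = SKY
        elif kind == "brick":
            grid[row][col] = BRICK
        else:
            raise ValueError(f"unknown object kind: {kind!r}")


def decode_page(terrain_cmds, object_cmds):
    """Run both channels against one shared canvas and return it as text rows."""
    grid = [[SKY] * WIDTH for _ in range(HEIGHT)]
    floors = paint_terrain(grid, terrain_cmds)
    paint_objects(grid, object_cmds, floors)
    return ["".join(row) for row in grid]


if __name__ == "__main__":
    page = decode_page(
        terrain_cmds=[(0, 2), (8, 4)],
        object_cmds=[(3, 7, "pipe"), (5, 0, "gap"), (10, 5, "brick")],
    )
    print("\n".join(page))
```

Two terrain commands describe the floor of all sixteen columns here, and the pipe's height is never stored: the brush derives it from the floor it lands on. The "gap" object only works because the terrain channel is allowed to paint ground that gets overwritten afterwards.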
If at least some of these sub-channels are starting to sound like essentially a bytecode ISA for some kind of abstract machine — yes, exactly. Things like "event objects" and "brush invocations" can be more easily understood as opcodes (sometimes with immediates!); and the "modal variables" as the registers of these instruction streams' abstract machines.
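The "bytecode for an abstract machine" reading can be sketched as a tiny interpreter. The opcode values, the byte layout and the object IDs below are invented for illustration and do not match any real game's format; the point is only that event objects behave like instructions that update registers, and that ordinary objects read those registers when they are placed.

```python
# Hypothetical opcodes; each is followed by a fixed number of immediate bytes.
OP_PLACE = 0x00        # immediates: x, y, object id
OP_SET_WARP = 0x01     # immediate: destination world
OP_SCROLL_LOCK = 0x02  # immediate: column where scrolling stops
OP_END = 0xFF

WARP_PIPE = 0x07       # hypothetical object id


def run(stream: bytes):
    """Interpret an object stream; returns (placed objects, scroll-lock column)."""
    warp_target = None   # modal register: applies to every later warp pipe
    scroll_lock = None   # modal register: last scroll-lock column seen
    placed = []
    pc = 0
    while pc < len(stream):
        op = stream[pc]
        if op == OP_END:
            break
        if op == OP_SET_WARP:
            warp_target = stream[pc + 1]
            pc += 2
        elif op == OP_SCROLL_LOCK:
            scroll_lock = stream[pc + 1]
            pc += 2
        elif op == OP_PLACE:
            x, y, obj = stream[pc + 1:pc + 4]
            # A warp pipe captures the register value in force when it is placed.
            dest = warp_target if obj == WARP_PIPE else None
            placed.append((x, y, obj, dest))
            pc += 4
        else:
            raise ValueError(f"unknown opcode {op:#04x} at offset {pc}")
    return placed, scroll_lock


LEVEL = bytes([
    OP_SET_WARP, 4,
    OP_PLACE, 10, 8, WARP_PIPE,
    OP_PLACE, 12, 8, 0x01,
    OP_SET_WARP, 6,
    OP_PLACE, 20, 8, WARP_PIPE,
    OP_SCROLL_LOCK, 32,
    OP_END,
])

if __name__ == "__main__":
    print(run(LEVEL))
```

Neither warp pipe stores its own destination: each one costs the same four bytes as any other object, and the destination comes from whichever set-warp instruction ran most recently.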
[continued...]