The hardware was not very good. Too much wire wrap and slow, arrogant maintenance.
I once had a discussion with the developers of Franz LISP. The way it worked was that it compiled LISP source files and produced .obj files. But instead of linking them into an executable, you had to load them into a run-time environment. So I asked, "Could you put the run-time environment in another .obj file, so you could just link the entire program and get a standalone executable?" "Why would you want to do that?" "So we could ship a product." This was an alien concept to them.
So was managing LISP files with source control, like everything else. LISP gurus were supposed to hack.
And, in the end, 1980s "AI" technology didn't do enough to justify that hardware.
A novice was trying to fix a broken Lisp machine by turning the power off and on.
Knight, seeing what the student was doing, spoke sternly: “You cannot fix a machine by just power-cycling it with no understanding of what is going wrong.”
Knight turned the machine off and on.
The machine worked.
Boss: Hey there, you like learning new things right?
Him (sensing a trap): Errr, yes.
Boss: But you don’t program in lisp do you?
Him (relieved, thinking he’s getting out of something): No.
Boss: Good thing they sent these (gesturing at a literal bookshelf full of manuals that came with the Symbolics).
So he had to write a TCP stack. He said it was really cool because it had time-travel debugging: the ability to hit a breakpoint, walk the execution backwards, change variables, resume, etc. This was in the 1980s. Way ahead of its time.
You can go read about the real differences on sites like Chips and Cheese, but those aren't pop-sciencey and fun! It's mostly boring engineering details like the size of reorder buffers and the TSMC process node, and it takes more than 5 minutes to learn. You can't just pick it up one day like a children's story with a clear conclusion and moral. Just stop. If I can acquire all of your CPU microarchitecture knowledge from a Linus Tech Tips video, you shouldn't have an opinion on it.
If you look at the finished product and you prefer the M series, that's great. But that doesn't mean you understand why it's different from the Zen series.
They showed signs that some people there understood that their development environment was it, but it obviously never fully got through to decision-makers: They had CLOE, a 386 PC deployment story in partnership with Gold Hill, but they’d have been far better served by acquiring Gold Hill and porting Genera to the 386 PC architecture.
> No, it wasn’t.
I kind of think it was. The best argument, I think, is embodied in Kent Pitman's comments in this usenet thread [1], where he argues that for the Lisp Machine romantics (at least the subset that includes him) what they are really referring to is the total integration of the software, and he gives some pretty good examples of the benefits it brings. He freely admits there's no reason the experience could not be reproduced on other systems; the problem is that it hasn't been.
I found his two specific examples particularly interesting. Search for

* Tags Multiple Query Replace From Buffer

* Source Compare

which is how he introduced them. He also describes "One of the most common ways to get a foothold in Genera for debugging", which I find pretty appealing, and still not available in any modern system.

[1] https://groups.google.com/g/comp.lang.lisp/c/XpvUwF2xKbk/m/X...
https://userpages.umbc.edu/%7Evijay/mashey.on.risc.html
explains a lot of "what happened in the 1980s?" particularly why VAX and 68k were abandoned by their manufacturers. The last table shows how processors that had really baroque addressing modes, particularly involving indirection, did not survive. The old 360 architecture was by no means RISC but it had simple addressing modes and that helped it survive.
A Lisp-optimized processor would be likely to have indirection and generally complex ways in which instructions can fail, which gets in the way of efficient pipelined implementations. People like to talk about "separation of specification and implementation", but Common Lisp was designed with one eye on the problem of running it efficiently on the "32-bit" architectures of the 1980s. It did OK on the 68k, which was big then, and also on the various RISC architectures and on x86, which is simple enough that it is practical to rewrite the instruction stream into microinstructions that can be easily executed.
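To make that concrete, here's a rough C sketch of the problem (my own illustration, not any real ISA): a memory-indirect operand of the kind a VAX, or a Lisp machine chasing tagged pointers, would offer means the operand fetch depends on an earlier, data-dependent memory read, while the RISC equivalent breaks into simple loads the hardware can pipeline.

    /* Sketch: emulating one CISC-style instruction with a memory-indirect
       source operand, e.g.  ADD dst, @(r1)  -- the operand's *address* is
       itself stored in memory.  Hypothetical encoding, not a real ISA. */
    #include <stdint.h>

    uint32_t add_indirect(uint32_t dst, const uint32_t *mem, uint32_t r1) {
        uint32_t ptr = mem[r1];   /* 1st memory access: fetch the pointer    */
        uint32_t src = mem[ptr];  /* 2nd, data-dependent access: the operand */
        return dst + src;         /* only now can the ALU start the add      */
    }

    /* The RISC version is three simple instructions, each with at most one
       memory access and a trivial failure model, so they pipeline cleanly:
           LW  t0, 0(r1)
           LW  t1, 0(t0)
           ADD dst, dst, t1                                                  */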
I feel it would be cool to sometime run code on a radiation hardened Forth chip, or some obscure Lisp hardware, but would it be life changing? I doubt it.
It's not like it's the only system that suffers this, but "working well with others" is a big key to success in almost every field.
I'm absolutely fascinated by what worked and was possible in that venue, just like I find Rust code fascinating. These days Lisp is much more workable, as implementations slowly come to terms with having to coexist with other software. There are still things that are really hard to express in other computer languages.
It's hard to find where to draw the line when it comes to specialized hardware, and the line moves back and forth all the time. From personal experience, it went from something like "multiple input boards, but handle the real-time Very Fast interrupts on the minicomputer", and we spent six months shaving off half a millisecond so that it worked (we're in the eighties here). Next step: shift those boards into a dedicated box, let it handle the interrupts and DMA and all that, and just do the data demuxing on the computer. Next step (and I wasn't involved in that): do all the demuxing in the box, let the computer sit back and just shove all of it to disk. And that's the step which went too far; the box got slow. Next step: make the box simpler again, do all of the heavy demuxing and assembling on the computer, computers are fast after all...
And so on and so forth.
i barely got to play with one for a few hours during an "ai" course, so i didn't really figure much of it out but ... oh yeah, it was "cool"! also way-way-way over my budget. i then kept an eye for a while on the atari transputer workstation but no luck, it never really took off.
anyway, i find this article quite out of place. what hordes of romantically spoiled lisp machine nostalgia fanatics harassed this poor guy to the extent that he had to go on this (pretty pointless) disparaging spree?
What? They’re awesome. They present a vision of the future that never happened. And I don’t think anyone serious expects lisp machines to come back btw.
FWIW: Technology Connections did a teardown of why Betamax wasn't better than VHS: https://www.youtube.com/watch?v=_oJs8-I9WtA&list=PLv0jwu7G_D...
And the whole series if you actually enjoy watching these things: https://www.youtube.com/playlist?list=PLv0jwu7G_DFUrcyMYAkUP...
It's probably worth reading this Alan Kay comment, which I excerpted from https://www.quora.com/Papers-about-the-Smalltalk-history-ref... on Quora before it started always blocking me as a robot:
> The idea of microcode was invented by Maurice Wilkes, a great pioneer who arguably made the earliest programmable computer — the EDSAC (pace Manchester Baby). The idea depends partly on the existence of a “large enough” memory that is much faster (3–10 times) than the 1st level RAM of the computer.
> A milestone happened when the fast memory for microcoding was made reloadable. Now programmable functions that worked as quickly as wired functions could be supplied to make a “parametric” meta-machine. This technique was used in all of the Parc computers, both mainframes and personal computers.
> Typical ratios of speed of microcode memory to RAM were about 5x or more, and e.g the first Altos had 4kbytes (1k microinstructions) that could be loaded on the fly. The Alto also had 16 program counters into the microcode and a shared set of registers for doing work. While running, conditions on the Alto — like a disk sector passing, or horizontal retrace pulse on the CRT — were tied to the program counters and these were concurrently scanned to determine the program counter that would be used for the next microinstruction. (We didn’t like or use “interrupts” … )
> This provided “zero-overhead tasking” at the lowest level of the machine, and allowed the Alto to emulate almost everything that used to be the province of wired hardware.
> This made the machine affordable enough that we were able to build almost 2000 of them, and fast enough to do the functionality of 10–15 years in the future.
> Key uses of the microcode were in making suitable “language machines” for the VHLLs we invented and used at Parc (including Smalltalk, Mesa, etc.), doing real time high quality graphical and auditory “animations/synthesis”, and to provide important systems functions (e.g. certain kinds of memory management) as they were invented.
> It’s worth looking at what could have been done with the early 16 bit VLSI CPUs such as the Intel 8086 or the Motorola 68K. These were CISC architectures and were fast enough internally to allow a kind of microcoding to support higher level language processing. This is particularly important to separate what is a kind of interpreter from having its code fetched from the same RAM it is trying to emulate in.
> The 68K in fact, used a kind of “nano-coding”, which could have been directed to reloadability and language processing.
> The big problem back then was that neither Intel nor Motorola knew anything about software, and they didn’t want to learn (and they didn’t).
> The nature of microcode is that architectures which can do it resemble (and anticipated) the RISC architectures. And some of the early supercomputers — like the CDC 6600 — were essentially RISC architectures as well. So there was quite a bit of experience with this way of thinking.
> In the 80s, the ratio between RAM and CPU cycles was closing, and Moore’s Law was starting to allow more transistors per chip. Accessing a faster memory off CPU chip started to pay off less (because going off chip costs in various ways, including speed).
> Meanwhile, it was well known that caching could help most kinds of architectures (a landmark study by Gordon Bell helped this understanding greatly), and that — if you are going to cache — you should have separate caches for instructions and for data.
> Up to a point, an instruction cache can act like a microcode memory for emulating VHLLs. The keys are for it (a) to be large enough to hold the inner loops of the interpreter, (b) to not be flushed spuriously, and (c) for the machine instructions to execute quickly compared to the cache memory cycle.
> Just to point the finger at Intel again, they did a terrible job with their cached architectures, in part because they didn’t understand what could be gained with VHLLs.
> A really interesting design was the first ARM — which was a pretty clean RISC and tidy in size. It could have been used as an emulator by wrapping it with fast instruction memory, but wasn’t. I think this was a “point of view” disconnect. It was a very good design for the purpose of its designers, and there wasn’t enough of a VHLL culture to see how it could be used at levels much higher than C.
> If we cut to today, and look at the systems that could be much better done, we find that the general architectures are still much too much single level ones, that ultimately think that it is good to have the lowest levels in a kind of old style machine code programmed in a language like C.
> A very different way to look at it might be to say: well, we really want zillions of concurrent and safe processes with very fast intermessaging programmed at the highest levels — what kind of architecture would facilitate that? We certainly don’t want either “interrupts” or long latency process switching (that seems crazy to “old Parc people”). We probably want to have “data” and “processing” be really close to each other rather than separated in the early von Neumann ways.
> And so forth. We won’t be able to be perfect in our hardware designs or to anticipate every future need, so we must have ways to restructure the lowest levels when required. One way to do this these days is with FPGAs. And given what it costs to go off chips, microcoding is far from dead as another way to help make the systems that we desire.
> The simple sum up here is that “hardware is just software crystallized early”, and a good systems designer should be able to design at all levels needed, and have the chops to make any of the levels if they can’t be purchased …
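Kay's point about an instruction cache acting like a microcode memory is easier to see with an interpreter in front of you. Here's a minimal bytecode dispatch loop in C, purely my own sketch (the opcodes and the little stack machine are invented, not from any Parc system): the whole switch plus its handlers compiles to a few hundred bytes, and if that stays resident in the I-cache it plays roughly the role the writable microstore played on the Alto.

    /* Toy bytecode interpreter: the "inner loop" that has to fit in the
       instruction cache.  Opcodes and stack machine invented for illustration. */
    #include <stdint.h>
    #include <stdio.h>

    enum { OP_PUSH, OP_ADD, OP_MUL, OP_PRINT, OP_HALT };

    static void run(const uint8_t *code) {
        int32_t stack[64];
        int sp = 0;                                 /* data-stack pointer     */
        for (const uint8_t *pc = code;;) {
            switch (*pc++) {                        /* the hot dispatch point */
            case OP_PUSH:  stack[sp++] = (int8_t)*pc++;         break;
            case OP_ADD:   sp--; stack[sp - 1] += stack[sp];    break;
            case OP_MUL:   sp--; stack[sp - 1] *= stack[sp];    break;
            case OP_PRINT: printf("%d\n", stack[--sp]);         break;
            case OP_HALT:  return;
            }
        }
    }

    int main(void) {
        const uint8_t prog[] = { OP_PUSH, 6, OP_PUSH, 7, OP_MUL, OP_PRINT, OP_HALT };
        run(prog);                                  /* prints 42              */
        return 0;
    }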
I don't know a lot of Lisp. I did some at school as a teenager, on BBC Micros, and it was interesting, but I never did anything really serious with it. I do know about Forth though, so perhaps people with a sense of how both work can correct me here.
Sadly, Forth, much as I love it and have done since I got my hands on a Jupiter Ace when I was about 9 or 10 years old, has not been a success, and probably for the same reasons as Lisp.
It just looks plain weird.
It does. I mean I love how elegant Forth is: you can implement a basic inner interpreter and a few primitives in a couple of hundred lines of assembler, and then the rest is just written in Forth in terms of those primitives (okay, pages and pages of dw ADDRESS_OF_PRIMITIVE instructions rather than Forth proper). I'm told that you can do the same trick with Lisp, and maybe I'll look into that soon.
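For anyone who hasn't seen the trick, the shape of it looks roughly like this, sketched in C rather than assembler (the names and the threading style are mine; a real Forth would use indirect threading and keep the NEXT loop in a few machine instructions): a handful of primitives, an inner loop that walks a table of execution tokens, and a high-level word that is literally just a list of addresses of primitives, i.e. those dw ADDRESS_OF_PRIMITIVE lines.

    /* Toy threaded-code inner interpreter, sketched in C.  A real Forth would
       write the primitives and the NEXT loop in assembler and use indirect
       threading; this only shows the shape of the idea. */
    #include <stdio.h>

    static int stack[32];
    static int sp = 0;                      /* data-stack pointer */

    typedef void (*prim_t)(void);

    /* A few primitives -- the part you'd hand-code in assembler. */
    static void lit3(void) { stack[sp++] = 3; }
    static void lit4(void) { stack[sp++] = 4; }
    static void add(void)  { sp--; stack[sp - 1] += stack[sp]; }
    static void dup_(void) { stack[sp] = stack[sp - 1]; sp++; }
    static void mul(void)  { sp--; stack[sp - 1] *= stack[sp]; }
    static void dot(void)  { printf("%d\n", stack[--sp]); }    /* Forth "." */

    /* A high-level word is just a table of primitive addresses:
       : SQUARE-OF-SUM   3 4 + DUP * . ;  */
    static const prim_t square_of_sum[] = { lit3, lit4, add, dup_, mul, dot, NULL };

    /* The inner interpreter -- the NEXT loop. */
    static void execute(const prim_t *ip) {
        while (*ip)                         /* fetch the next execution token */
            (*ip++)();                      /* ...and run it                  */
    }

    int main(void) {
        execute(square_of_sum);             /* prints 49 */
        return 0;
    }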
But the code itself looks weird.
Every language that's currently successful looks like ALGOL.
At uni, I learned Turbo Pascal. That gave way to Modula-2 in "real" programming, but by then I'd gotten my hands on an account on the Sun boxes and was writing stuff in C. C looked kind of like Pascal once you got round the idea that curly brackets weren't comments any more, so it wasn't a hard transition. I wrote lots of C, masses and masses, and eventually shifted to writing stuff in Python for doing webby stuff and C for DSP. Python... looks kind of like ALGOL, actually: you don't use "begin" and "end", you just indent properly, which you should be doing. Then Go, much later, which looks kind of like Pascal to me, which in turn looks kind of like ALGOL.
And so on.
You write line after line after line of "this thing does this to that", and it works. It's like writing out a recipe, even more so if you declare your ingredients^W variables at the top.
I love Forth, I really want to love Lisp but I don't know enough about it, but everyone uses languages that look like ALGOL.
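To make the "looks weird" point concrete, here's the same one-liner three ways; the C is just standing in for the ALGOL family, and the Forth and Lisp renderings are shown as comments.

    /* The same computation, three notations.  ALGOL family (here, C): */
    #include <stdio.h>

    int main(void) {
        int price = 100, qty = 3, tax = 7;
        int total = price * qty + tax;      /* "this thing does this to that" */
        printf("%d\n", total);              /* prints 307 */
        return 0;
    }

    /* Forth (postfix, stack-based):        100 3 * 7 + .           */
    /* Lisp (prefix, fully parenthesized):  (print (+ (* 100 3) 7)) */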
In the late 1960s Citroën developed a car where the steering and speed were controlled by a single joystick mounted roughly where the steering wheel would be. No throttle, no clutch, no gears, just a joystick with force feedback to increase the amount of force needed to steer as the car sped up. Very comfortable, very natural, even more so when the joystick was mounted in the centre console like in some aircraft. Buuuuut, everyone uses steering wheels and pedals. It was too weird for people.
Author falls into the same trap he talks about in the article. AI is not going away, we are not going back to the pre-AI world.