- I was born in 1973. My grandson was born in 2022. He won't know a world without 'AI' much like my kids didn't know a world without the Internet and I didn't know a world without refrigerators.
One thing I regret learning only very late in my children's development was the value of boredom and difficult challenges. I think I've successfully passed those lessons on to my kids as they raise their own. I have no idea what to say about 'AI' and the rapid reconfiguration of our relationship with the world that's going to happen as a result. All I can tell them is that we're in this together and we'll try to figure it out as we go.
Good luck everybody!
by NitpickLawyer
2 subcomments
- Misses a few interesting early models: GPT-J (by EleutherAI, using the GPT-2 architecture) was the first-ish model runnable on consumer hardware. I actually had it running in prod with real users for a while. And GPT-NeoX was their attempt to scale to GPT-3 levels; at 20B, it was maybe the first glimpse that local models might someday be usable (although running locally at the time was questionable, quantisation wasn't widely used yet, etc.).
- This would be interesting if each of them had a high-level picture of the NN, "to scale", perhaps color coding the components somehow. OnMouseScroll it would scroll through the models, and you could see the networks become deeper, wider, colors change, almost animated. That'd be cool.
by badsectoracula
1 subcomment
- Interesting site, though it does seem to miss some of Mistral's stuff. Specifically, Mistral Small 3, which was released under Apache 2.0 (AFAIK the first in the Mistral Small series to use a fully open license; previous Mistral Small releases were under their own non-commercial research license), and its derivatives (e.g. Devstral, aka Devstral Small 1, which is derived from Mistral Small 3.1). It is also missing Devstral 2 (which is not really open source but more of a "MIT unless you have a lot of money" license) and Devstral Small 2 (which is under Apache 2.0 and the successor to Devstral [Small], and interestingly also derived from Mistral Small 3.1 instead of 3.2).
- Shameless plug, but I made a similar tree here: https://sajarin.com/blog/modeltree/
by wobblywobbegong
2 subcomments
- Calling this "The complete history of AI" seems wrong.
LLMs are not all there is to AI, and AI has existed for far longer than people realize.
by das-bikash-dev
0 subcomments
- Interesting to see the evolution mapped out like this. For those building on top of these models (RAG systems, agent frameworks), the real inflection point wasn't just model count but the shift from completion-only to reasoning and structured output capabilities. Are you planning to add annotations for capability changes alongside release dates?
by jvillasante
2 subcomments
- Why is it so hard, in a time when AI itself can do it, to add a light mode to these black websites!? There are people who just can't read dark mode!
by hmokiguess
1 subcomment
- Would be nice to see some charts, and perhaps an average of the cycle lengths with a prediction of the next one based on it.
by anshumankmr
0 subcomments
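The "average of the cycles" idea above is easy to prototype: take the release dates of successive flagship models, compute the mean gap, and extrapolate. A minimal sketch in Python, using a few well-known OpenAI release dates purely as illustrative placeholders (not the site's actual data):

```python
# Rough sketch of the "average release cycle" idea from the comment above.
# The dates below are illustrative placeholders, not the site's dataset.
from datetime import date, timedelta

releases = [
    date(2020, 6, 11),   # GPT-3 API
    date(2022, 11, 30),  # ChatGPT
    date(2023, 3, 14),   # GPT-4
    date(2024, 5, 13),   # GPT-4o
]

# Gaps in days between consecutive releases.
gaps = [(b - a).days for a, b in zip(releases, releases[1:])]
avg_gap = sum(gaps) / len(gaps)

# Naive extrapolation: next release = last release + mean gap.
predicted_next = releases[-1] + timedelta(days=round(avg_gap))
print(f"average cycle: {avg_gap:.0f} days, next predicted: {predicted_next}")
```

With real data one would obviously want more than a mean (the gaps shrink over time), but even this naive version would make an interesting overlay on the timeline.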
- Could include some more of the Sarvam models, including the ones recently announced. But happy to see their names mentioned. Had tried joining them but they ghosted me after one round :(
- 750+ here:
https://lifearchitect.ai/models-table/
- Nice overview. Some of the descriptions are quite thin on detail, like "new model by X" or "latest model by Y". Of course it was new at the time, but that doesn't really add information.
- It would be nice if clicking on a list entry (say, Llama 3) opened that model's home page in a new window/tab.
by piinbinary
0 subcomments
- I think it's missing Google's Bard
by youngprogrammer
0 subcomments
- Should go a bit earlier, with word2vec, NMT, seq2seq, attention, and self-attention.
by YetAnotherNick
2 subcomments
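For readers unfamiliar with the pre-GPT lineage the comment above lists, the key building block it ends on, self-attention from the 2017 Transformer, fits in a few lines of NumPy. A minimal sketch with toy sizes and random weights, purely illustrative:

```python
# Minimal scaled dot-product self-attention, sketched with NumPy.
# Toy dimensions and random weights; illustrative only, not a real model.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model) -> (seq_len, d_v)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])            # (seq_len, seq_len)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                 # mix values by attention

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))        # 4 tokens, d_model = 8
w_q = rng.standard_normal((8, 8))
w_k = rng.standard_normal((8, 8))
w_v = rng.standard_normal((8, 8))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)                       # (4, 8): one mixed vector per token
```

Everything from word2vec through seq2seq-with-attention was leading toward this step, which is why a timeline starting at GPT-1 feels truncated.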
- It misses almost every milestone, yet lists Llama 3.1 as one. T5 was a much bigger milestone than almost everything on the list.
by varispeed
1 subcomment
- The models used for apps like Codex: are they designed to mimic human behaviour, in the sense that they deliberately introduce errors in code that you then have to spend time debugging and fixing, or is it a natural flaw, and the fact that humans make the same mistakes just a coincidence?
This keeps bothering me: why do they need several iterations to arrive at a correct solution instead of getting it right the first time? Prompts like "repeat solving it until it is correct" don't help.
- Great site! I noticed a minor visual glitch where the tooltips seem to be rendering below their container on the z-axis, possibly getting clipped or hidden.
by johntheagent
0 subcomments