Hacker News

Vintage Large Language Models

83 points by pr337h4m

by unleaded

0 subcomment

Someone has sort of done this:
https://www.reddit.com/r/LocalLLaMA/comments/1mvnmjo/my_llm_...
I doubt a better one would cost $200,000,000.

by carsoon

2 subcomments

We need a Library of Alexandria for primary sources. With source transparency, tracing claims back to their original sources would be clearer. We could do cool things like these vintage models to reduce bias from current events. Books in every language, and books for teaching each language, would also help with multimodality. Copyright makes it difficult to achieve the best results for LLM creation and usage, though.

by abeppu

3 subcomments

The talk focuses for a bit on having pure data from before the given date. But it doesn't consider that the data available from before that time may be subject to strong selection bias, based on what's interesting to people doing scholarship or archival work after that date. E.g. have we disproportionately digitized the notes/letters/journals of figures whose ideas have gained traction after their death?
The article makes a comparison to financial backtesting. If you form a dataset of historical prices of stocks which are _currently_ in the S&P500, even if you only use price data before time t, models trained against your data will expect that prices go up and companies never die, because they've only seen the price history of successful firms.
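A minimal sketch of that survivorship effect, using simulated prices rather than real market data. Everything here is an assumption made for illustration: the zero-drift random walk, the 3% step volatility, the rule that a firm "dies" when its price falls below 50, and the names `simulate_universe` and `mean_total_return`.

```python
import random


def simulate_universe(n_firms=500, n_steps=250, seed=0):
    """Simulate toy price paths; a firm is delisted (price 0) once it drops below 50."""
    rng = random.Random(seed)
    histories = []
    for _ in range(n_firms):
        price = 100.0
        path = [price]
        alive = True
        for _ in range(n_steps):
            if alive:
                # Zero-mean shock: no firm has a built-in upward drift.
                price *= 1.0 + rng.gauss(0.0, 0.03)
                if price < 50.0:
                    alive = False
                    price = 0.0
            path.append(price)
        histories.append(path)
    return histories


def mean_total_return(histories):
    """Average total return from the first to the last price of each path."""
    return sum(h[-1] / h[0] - 1.0 for h in histories) / len(histories)


histories = simulate_universe()
# "Currently in the index": only firms still alive at the final step.
survivors = [h for h in histories if h[-1] > 0.0]

full_mean = mean_total_return(histories)
survivor_mean = mean_total_return(survivors)

print(f"firms: {len(histories)}, survivors: {len(survivors)}")
print(f"mean return, full universe:  {full_mean:+.3f}")
print(f"mean return, survivors only: {survivor_mean:+.3f}")
```

The survivor-only average comes out higher than the full-universe average even though no firm was given any drift, which is the bias a model would absorb if it only saw the histories of firms that made it to the present.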

by kingkongjaffa

1 subcomments

This would be a good way to verify emergent model capability to synthesize new knowledge.
You give an LLM all the information from right before a topic was discovered or invented, and then you see if it can independently generate the new knowledge or not.
It would be hard to know for sure if a discovery was genuine or accidentally included in the training data though.
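A rough sketch of that setup: keep only documents dated before a cutoff, then scan what remains for terms that should not exist yet. The `Doc` record, the function names, the toy corpus, and the plain substring matching are all assumptions made for illustration; real leak detection would need much more than keyword search.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    published: date  # claimed publication date, which may itself be wrong
    text: str


def before_cutoff(docs, cutoff):
    """Keep documents whose claimed date is strictly before the cutoff."""
    return [d for d in docs if d.published < cutoff]


def find_leaks(docs, leak_terms):
    """Return documents mentioning any term that should postdate the cutoff."""
    terms = [t.lower() for t in leak_terms]
    return [d for d in docs if any(t in d.text.lower() for t in terms)]


# Made-up toy corpus; the second entry stands in for a mis-dated reprint.
corpus = [
    Doc(date(1905, 6, 30), "On the electrodynamics of moving bodies."),
    Doc(date(1910, 1, 1), "Reprint with a later foreword mentioning general relativity."),
    Doc(date(1916, 3, 20), "The foundation of the general theory of relativity."),
]

training_docs = before_cutoff(corpus, date(1915, 1, 1))
leaks = find_leaks(training_docs, ["general relativity"])

print(f"kept {len(training_docs)} of {len(corpus)} documents")
for d in leaks:
    print(f"possible leak dated {d.published}: {d.text}")
```

A date filter alone lets the mis-dated reprint through, which is why a second check on the content is needed before trusting any "independent discovery".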

by carsoon

1 subcomments

Using old models is a good way to receive less biased information about an active event. Once a major event occurs, information wars break out that try to change narratives and erase old information. But because these models were trained before the event, the bias it causes is not yet present in them.

by digdugdirk

0 subcomment

I've been wanting to do this on historical court records - building upon the existing cases, one by one, using LLMs as the "Judge". It'd be interesting to see which cases branch off from the established precedent, and how that cascades into the present.
Any thoughts on how one could get started with this?

by ontouchstart

0 subcomment

Cool idea. This might be an interesting literary project along this line ;-)
https://www.gutenberg.org/cache/epub/86/pg86-images.html

by ideashower

0 subcomment

I like the idea of using vintage LLMs to study explicit and implicit bias. For example, you could compare text from before the mid-19th century, which often took racial superiority, gender discrimination, imperial authority, or slavery for granted, with text written since then. I'm sure there are more ideas once you put temporal constraints on training data.

by UltraSane

3 subcomments

Over the long term LLMs are going to become very interesting snapshots of history. Imagine prompting an LLM from 2025 in 2125.

by ijk

0 subcomment

I was hoping that this would be about Llama 1 and comparison with GPT-contaminated models.

by nxobject

0 subcomment

I love the ideas about how we might use historical LLMs to inquire into the past!
I imagine (and the author hints at this) that to do this rigorously, spelling out assumptions and so on, you'd have to build on the theoretical frameworks currently used in history and the social sciences to inductively synthesize and qualify interviews and texts.

by mountainriver

0 subcomment