I believe that the main reason for SO's decline starting around 2018 was that most of the core technical questions had been answered. There was an enormous existing corpus of accepted answers around fundamental topics, and technology just doesn't change fast enough to sustain the site. Then the LLMs digested the site's (beautifully machine-readable) corpus along with the rest of the internet and now the AIs can give users that info directly, resulting in a downward spiral of traffic to SO, fewer new questions, etc.
Vale, Stack Overflow. You helped me solve many tricky problems.
Stack Overflow peaked in 2014 before beginning its decline. How is that at all related to GenAI? GPT-4 is when we really started seeing these tools get used to replace SO, and that would be early 2023 - and indeed the drop gets worse there - but after the COVID-era spike, SO was already crashing hard.
Tailwind's business model was providing a component library built on top of their framework. It's a business model that relies on the framework being good enough for people to want to use it in the first place, but bad enough that they'd rather pay for the component library than build it themselves. The more comfortable and productive the framework is, the worse the value proposition for the premium upsell. Even other "open core" business models don't have this inherent tension, much less open source on the whole, so it's really weird to try to extrapolate from this.
The thing is, people turn to LLMs to solve problems and answer questions. If they can't turn to the LLM to solve that problem or answer that question, they'll either turn elsewhere, in which case there is still a market for that book or blog post, or they'll drop the problem or question and move on. And if they were willing to drop it and move on without investigating post-LLM, were they ever invested enough to buy your book, or to check more than the first couple of results on Google?
Like, the irony is pretty deep with this one.
I'm not sure if they could've gotten the trademark from Inscryption, or if they even needed it, but if they'd really wanted: I've found Inscryption's Ouroboros card to look the best, and it was honestly how I discovered the Ouroboros in the first place! (It became my favourite card; I love Inscryption.)
https://static1.thegamerimages.com/wordpress/wp-content/uplo...
Even just searching for Ouroboros on the internet gave me some genuinely beautiful Ouroboros illustrations (some stock photos, some not), and even using a stock photo might have been a better idea than using an AI-generated Ouroboros image.
Rather, we became the product.
This is ridiculous - AI doesn't need to be fed a PDF of a Terraform book to know how to use Terraform. Blowing out the context window with hundreds of OCR'd pages of generic text about Terraform isn't going to help anything.
The model that's really broken here is "content for hire". That's the industry that is going to be destroyed, because it's simply redundant now. Actual artwork, actual literature, actual music... these things are all safe as long as people actually want to experience the creations of others. Corporate artwork, simple documentation, elevator music... these things are done; I'm sorry if you made a living making them, but you were ultimately performing an artisanal task in a mostly soulless way.
I'm not talking about video game artists, mind you, I'm talking about the people who produced Corporate Memphis and Flat Design here. We'll all be better off if these people find a new calling in life.
In fact, this might be an overall good thing, because original content will finally be in high demand, since those companies need it to train their models. But we are probably just in a transition phase.
The other thing is that new sources of input will come, probably from LLM usage itself - they cut out the middle layer. Users' input to the LLM is also a form of input, and hybrid co-creation between users and AI would generate content at a much faster rate, which again would be used to train the models and improve their quality.
The new world is one where someone can have an LLM-assisted insight, post it on their blog for free, have it indexed by every agentic search engine, and it becomes part of the zeitgeist. That's the new data that'll feed the new models: a better information diet over time. And guess what else: models are getting better at identifying - at scale - the high-quality info that's worth using as training data.
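As a back-of-the-envelope sketch of what that kind of filtering looks like (the scoring below is a crude heuristic standing in for whatever learned classifier a lab would actually use; none of this is anyone's real pipeline):

    def quality_score(text: str) -> float:
        """Toy stand-in for a learned quality classifier."""
        words = text.split()
        if len(words) < 20:                    # too short to be substantive
            return 0.0
        return len(set(words)) / len(words)    # penalize repetitive boilerplate

    def filter_corpus(docs: list[str], threshold: float = 0.5) -> list[str]:
        """Keep only documents that clear the quality bar."""
        return [d for d in docs if quality_score(d) > threshold]

The interesting part isn't the heuristic, it's that the filter runs over the whole crawl, so better classifiers compound into a better diet.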
If they did actually stumble on AGI (assuming it didn’t eat them too) it would be used by a select few to enslave or remove the rest of us.
I do not know what will replace it, but I will not miss websites trying to monetise my attention.
Copyright was predicated on the notion that ideas and styles cannot be protected, but that explicit expressive works can. For example, a recipe can't be protected, but the story you wrap around it about how your grandma used to make it would be.
LLMs are particularly challenging to grapple with because they perform language alchemy. They can (and do) re-express the core ideas, styles, themes, etc. without violating copyright.
People deem this 'theft' and 'stealing' because they are trying to reconcile the myth of intellectual property with reality, and are also simultaneously sensing the economic ladder being pulled up by elites who are watching and gaming the geopolitical world disorder.
There will be a new system of value capture that content creators need to position for: being seen as a more valuable source of high-quality material than an LLM, serving a specific market, and effectively steering attention toward owned properties and products.
It will not be pay-per-crawl. Or pay-per-use. It will be an attention game, just like everything in the modern economy.
Attention is the only way you can monetize information.
1. I pay OpenAI.
2. OpenAI rev-shares with StackOverflow.
3. StackOverflow mostly keeps that money, but shares some with me for posting.
4. I get some money back to help pay OpenAI?
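To make the circularity concrete, run the numbers (every figure below is made up for illustration; nobody has published actual rates):

    # Hypothetical numbers for one month of the proposed loop,
    # from a single contributor's point of view.
    subscription = 20.00       # what I pay OpenAI (assumed)
    rev_share_rate = 0.05      # slice OpenAI shares with StackOverflow (assumed)
    contributor_cut = 0.10     # slice StackOverflow passes to posters (assumed)

    to_stackoverflow = subscription * rev_share_rate   # $1.00
    back_to_me = to_stackoverflow * contributor_cut    # $0.10
    print(f"paid ${subscription:.2f}, got back ${back_to_me:.2f}, "
          f"net ${subscription - back_to_me:.2f}")     # net $19.90

However you set the assumed rates, the money mostly just makes a round trip with a haircut at each hop.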
This is nonsense. And if the frontier labs are right about simulated data, as Tesla seems to have been right with its FSD simulated visualization stack, does this really matter anyway? The value I get from an LLM far exceeds anything I have ever received from SO or an O'Reilly book (as much as I genuinely enjoy them collecting dust on a shelf).
If the argument is "fairness," I can sympathize but then shrug. If the argument is sustainability of training, I'm skeptical we need these payment models. And if the argument is about total value creation, I just don't buy it at all.
Example 1 is a bad one: StackOverflow had clearly plateaued and was well into freefall by the time ChatGPT was released.
Example 2 is apparently "open source" but it's actually just Tailwind, which unfortunately had a very susceptible business model.
And I don't really think the framing here that it's eating its own tail makes sense.
It's also confusing to me why they're trying to solve the problem of it eating its own tail - there's a LOT of money being poured into the AI companies. They can try to solve that problem.
What I mean is - a snake eating its own tail is bad for the snake. It will kill it. But in this case the tail is something we humans valued and don't want eaten, regardless of the health of the snake. And the snake will probably find a way to become independent of the tail after it ate it, rather than die, which sucks for us if we valued the stuff the tail was made of, and of course makes the analogy totally nonsensical.
The actual solutions suggested here are not related to it eating its own tail anyway. They're related to the sentiment that the greed of AI companies needs to be reined in, that they need to give back, and that we need solutions to the fact that we're getting spammed with slop.
I guess the last part is the part that ties into it "eating its own tail", but really, why frame it that way? Framing it that way means it's a problem for AI companies. Let's be honest and say it's a problem for us and we want it solved for our own reasons.
Actually we can. And we will.
The ONLY reason we are here today is that OpenAI (and, by extension, Anthropic) took it upon themselves to launch chatbots trained on whatever data sources they could get in a short amount of time, to quickly productize their investments. Their first versions didn't include any references to the source material and just acted as if they knew everything.
When CoPilot was built as a better auto-complete engine trained on open-source projects, it was an interesting idea, because it was doing what people already did: search GitHub for examples of a solution, or get nudged in that direction. The biggest difference, however, is that using another project's code was stable, because it came with a LICENSE.md that you then agreed to and paid forward (i.e. "I used code from this project").
CoPilot initially would just inject snippets for you without you knowing the source. It was only later that they walked that back; if you use CoPilot now, it shows you the most likely source of the code it used. This is exactly the direction all of the platforms seem to be headed.
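I have no idea how CoPilot actually does that lookup internally; a naive sketch of the idea (the fingerprint index, repo, and license here are all hypothetical) might be:

    import hashlib

    def normalize(code: str) -> str:
        """Collapse whitespace so trivially reformatted code still matches."""
        return " ".join(code.split())

    def fingerprint(code: str) -> str:
        return hashlib.sha256(normalize(code).encode()).hexdigest()

    # Hypothetical index: snippet fingerprint -> (repo, license)
    source_index = {
        fingerprint("def add(a, b): return a + b"):
            ("github.com/example/mathutils", "MIT"),
    }

    def likely_source(suggestion: str):
        """Return the known origin of a suggested snippet, if any."""
        return source_index.get(fingerprint(suggestion))

    print(likely_source("def add(a, b):\n    return a + b"))
    # -> ('github.com/example/mathutils', 'MIT')

A real system would need fuzzy matching rather than exact hashes, but even exact attribution plus a surfaced license is a big step up from silent injection.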
It's not easy to walk back a free-for-all system (see: Napster), but I'm optimistic that over time it'll become a fairer, pay-to-access system.