Hacker News

Anthropic Education the AI Fluency Index

72 points by armcat

by mlpoknbji

8 subcomments

> But we know that any person who uses AI is likely to improve at what they do.
Do we?

by dmk

3 subcomments

So I guess the key takeaway is basically that the better Claude gets at producing polished output, the less users bother questioning it. They found that artifact conversations have lower rates of fact-checking and reasoning challenges across the board. That's kind of an uncomfortable loop for a company selling increasingly capable models.

by lukev

0 subcomment

This is a highly circular method of evaluation. It correlates "fluency behaviors" with longer conversations and more back and forth.
What it notably does not correlate any of these behaviors with is external value or utility.
It is entirely possible that those people who are getting the most value out of LLMs are the ones with shorter interactions, and that those who engage in lengthier interactions are distracting themselves, wasting time, or chasing rabbit trails (the equivalent of falling in a wiki-hole, at the most charitable.)
I can't prove that either -- but this data doesn't weigh in one way or the other. It only confirms that people who are chatty with their LLMs are chatty with their LLMs.
In my own case, I find the longer I "chat" with the LLM the more likely I am to end up with a false belief, a bad strategy, or some other rabbit hole. 90% of the value (in my personal experience) is in the initial prompt, perhaps with 1-2 clarifying follow-ups.

by bargainbin

4 subcomments

I’m not alone in finding this at odds with the claims of the product, right?
Claude is meant to be so clever it can replace all white collar work in the next n-years, but also “you’re not using it right?” Which one is it?

by kseniamorph

0 subcomment

I feel like the authors are being logically inconsistent. They present the drop in "identify missing context" behavior in artifact conversations as potentially concerning, like people are thinking less critically. But their own data suggests a simpler explanation: artifact conversations show higher rates of upfront specification (clarifying goals +14.7pp, specifying format +14.5pp, providing examples +13.4pp). It's obvious that when you provide more context upfront, you end up with less missing context later. I'd be more sceptical about such research.

by Kye

2 subcomments

You could arrive at the essence of this by just having read and internalized Carl Sagan's The Demon-Haunted World. Especially the Baloney Detection Kit.
In my experience good prompting is mostly just good thinking.

by rickydroll

0 subcomment

While AI fluency is an important question to ask, affordability is another. Can a low-income person use AI to the same level of fluency as a high-income person? Will fluency become another force for income inequality?

by zahlman

0 subcomment

> In line with our recent Economic Index, we find that the most common expression of AI fluency is augmentative—treating AI as a thought partner, rather than delegating work entirely. In fact, these conversations exhibit more than double the number of AI fluency behaviors than quick, back-and-forth chats.
> But we also find that when AI produces artifacts—including apps, code, documents, or interactive tools—users are less likely to question its reasoning (-3.1 percentage points) or identify missing context (-5.2pp). This aligns with related patterns we observed in our recent study on coding skills.
Well, sure. If you're asking the AI to produce artifacts directly, it's likely because you pre-judged yourself less competent to do that kind of analysis.

by bigstrat2003

0 subcomment

To the extent that this should be a thing, there are very few people I would want doing it less than the company who has repeatedly been caught lying about its product's achievements. Anthropic should not be taken seriously after their track record.

by MarcLore

0 subcomment

by sarkarghya

2 subcomments

Honestly, to use LLMs properly all you need to know is that it’s a next word (or action) prediction model and, like all models, increased entropy hurts it. Try to reduce entropy to get better results. Rest is just sugarcoated nonsense. To use LLMs properly you need a physics class.