Nothing new here, this is called tool embodiment. A little time after assimilation, you stop consciously thinking about the tool. You are the hammer, the hammer is you.
So what is being missed here is that the brain operates on, well, mental constructs. Ideas and ways of thinking. But those are not the world or the brain, those are tools.
The higher-level processes in your mind use ways of thinking the same way they use the body. It's unconscious (because the brain has been doing this long enough) and automatic. The brain just gets guided by the tools. It wants to hammer that nail.
So, what does it have to do with crossing the street and not being able to transmit this knowledge?
You can’t transmit the incorporation. You can describe how to do things, how to think about things, but you can’t reconnect other people’s neurons to establish a way of thinking or a tool as part of the brain’s image of the self. Yet.
You can’t teach a baby how to embody his spine. You can’t teach someone how to become his thoughts. But you can certainly guide them in the use, and that usage will build the neural networks. Once established, they’ll get it.
I've never seen the word calibration used this way:
> different modes of learning. The first is instruction: the transfer of explicit models, rules, and relationships from one person to another through language. The second is calibration: the development of internal models through repeated exposure to feedback in a specific environment.
Judgement is learnable through calibration. It is not transmissible through instruction.
Unfortunately the word "intuition" has been debased.

One of the implications is that at any given point in time, the vast majority of human knowledge lives in people's brains and cannot be stored. The seemingly ineluctable and almost mechanical progression of technology is happening on a thin margin between generational losses.
I disagree to a degree. Yes, the author is right that people dismiss street-smarts, treating it as a lower form of intelligence than it deserves. But a sufficiently skilled communicator can absolutely articulate many of the factors they evaluate when judging a situation and how their decision-making process works.
> They evaluate intelligence through the lens of articulacy
There was an earlier instance of the author using a word such as “unability” (or similar) when it should have been “inability”, and I let it go, but this misuse of language is making my head hurt. However, I confess that I thought the word should have been “articularity”, and it turns out that’s not a real word either. But I at least pay attention to spellcheck. I don’t understand how someone could take the time to write a long and thoughtful essay about intelligence and not use spellcheck to proof it.
Science is empirical knowledge and processes, which can be transferred; art is gut feeling and subconscious knowledge applied automatically, which can’t be.
Roughly, I think this corresponds to how our minds perform cognitive offloading of repeated tasks. New tasks that require instruction-following occupy our attention, but the more we do them, the more our minds wire the behavior into our “muscle memory”. Practitioners of the arts (or even the art of science, one might say) have built a neural network that offloads tasks so that higher cognitive functions can focus on applying those tasks in expert ways.
It’s sort of like how, when we start out, our brains have to bitbang all tasks (muscle movement, speech, etc.), but over time they develop their own TCP offload engines or UART peripherals. And you can’t just download a TCP offload engine; it has to be built into the silicon. Hence why “expert knowledge” isn’t transmissible.
Which is why spaced repetition is an effective learning method. You’re hacking your brain to wire facts into the hardware.
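For the curious, here's a minimal sketch of the SM-2-style scheduling that Anki-like tools descend from; the constants follow the classic SM-2 description, so treat it as an illustration rather than any particular app's implementation:

```python
# Minimal SM-2-style spaced-repetition scheduler (illustrative constants).
from dataclasses import dataclass

@dataclass
class Card:
    interval_days: float = 1.0   # days until the next review
    ease: float = 2.5            # multiplier controlling how fast intervals grow
    repetitions: int = 0         # consecutive successful recalls

def review(card: Card, quality: int) -> Card:
    """quality: 0 (total blackout) .. 5 (perfect recall)."""
    if quality < 3:
        # Failed recall: restart the ladder and review again soon.
        card.repetitions = 0
        card.interval_days = 1.0
    else:
        card.repetitions += 1
        if card.repetitions == 1:
            card.interval_days = 1.0
        elif card.repetitions == 2:
            card.interval_days = 6.0
        else:
            card.interval_days *= card.ease
        # Ease drifts with how hard the recall felt (classic SM-2 update rule).
        card.ease = max(1.3, card.ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return card
```

The growing intervals are the point: each review lands right around when the fact would otherwise fade, which is the "hacking" part.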
Kinda has me wondering about the implications of the bar exam being the end-all, be-all of law school. Contrast it with a doctor's residency, and I think law school is very much crafting an overly binary right/wrong profession. Perhaps there should be something beyond it, more akin to officiating a sports game, where you learn the implications of applying your rule system too stringently when there is certainly room for being charitable. It is a complex issue though, because the charitable interpretation of a law gives bad actors room to abuse that interpretation.
Now bridge all this with the weird 1st-level and 2nd-level stuff surrounding medicine that is put in place by people outside the field and imposed on medical experts. They have to apply their expertise to the patient, decide a course of action, and then describe that action in those 1st/2nd-level terms to a non-expert who for some reason is the deciding authority, even though this downgrades the expert's actual thoughts by design. I know I'm all over the place, but it was a pretty good article that made me think about a lot of different applications of the ideas.
1. Formally calling out a concept for judgment-based skills that cannot be easily taught. I think everybody understands this, but having a word for this would be useful.
2. Opening up the conversation on the topic of which types of skills can/should be codified and how.
That said, everything else in the article is suss. "Dimensionality" is largely a distraction to try to sound smart, and most of the claims in the article are unwarranted (e.g. processes & checklists can be great, even for disciplines with true experts, like airline pilots).
For example, saying that skill learning cannot be accelerated is just patently false in many domains -- take something like learning chess: if you have a coach and other tools, you'll learn a LOT faster. But certainly I've worked at places that wished they could automate away reliance on experts, because that reliance gives organizational power to a few non-management individuals.
What we are generally getting, though, is a network with extremely high dimensionality trained on many domains at once, at least as far as commonly used models like LLMs and VLMs go.
We do have mixture-of-experts, which I guess helps compress things.
Going back to the idea that this stuff just can't be represented in language, I wonder if someday there could be a more concise representation than transmitting, for example, a LoRA weighing millions of bytes.
Maybe if we keep distilling different models over and over, we might come up with some highly compressed, standardized, hierarchical representation that optimizes subdomain or expert selection and combination to such a degree that a given kind of domain expertise could be transmitted, maybe not orally between humans, but at least in a very compact and standard way between models.
I guess you could take something like a 1B 1-bit model, build a LoRA for a very narrow domain, and then compress that. That's something like the idea. Or maybe a quantized NOLA.
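To give a rough sense of scale, here's a back-of-the-envelope sketch, assuming a single 2048x2048 projection matrix and rank 8; the numbers are made up rather than taken from any specific 1B model:

```python
# Why a narrow-domain LoRA is already a big compression of the "expertise delta":
# instead of a dense d_out x d_in update, you ship two thin matrices B and A.
import numpy as np

d_in, d_out, rank = 2048, 2048, 8          # one hypothetical projection layer
W = np.random.randn(d_out, d_in) * 0.02    # frozen base weight (stand-in)
A = np.random.randn(rank, d_in) * 0.01     # LoRA "down" matrix, trained
B = np.zeros((d_out, rank))                # LoRA "up" matrix, starts at zero

def lora_forward(x):
    # Base weight stays frozen; the adapter adds a low-rank correction B @ (A @ x).
    return W @ x + B @ (A @ x)

y = lora_forward(np.random.randn(d_in))

dense_params = d_in * d_out                # ~4.2M values to ship the full delta
lora_params = rank * (d_in + d_out)        # ~33K values for the adapter
print(f"adapter is ~{dense_params / lora_params:.0f}x smaller for this layer")
```

NOLA-style reparameterization and aggressive quantization would squeeze that further, which is roughly the direction I mean.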
But I wonder if someday there will be a representation that is more easily interpretable, like language, yet able to capture high-dimensional, complex functions in a standard and concise way.
That said, procedural knowledge remains suspicious because it could hinge on a cheap mental shortcut. A very experienced pedestrian may unconsciously make terrible decisions based on overfitted training data or causally irrelevant variables. Expressing the model "formally" can help expose those terrible decisions (e.g. I feel good about 1980s Fiat Pandas, so I cross more confidently when I see them...). Problem is that introspection does not always work well.
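Here's a toy illustration of that failure mode, with all numbers invented: a spurious cue that looks excellent in the "training" neighbourhood and collapses back to the base rate elsewhere:

```python
# Toy spurious-correlation demo: "I see a 1980s Fiat Panda, so I cross confidently."
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

def simulate(p_panda_given_careful):
    careful_driver = rng.random(n) < 0.7           # the causal variable
    # Spurious feature: in this neighbourhood, careful drivers happen to own Pandas.
    panda = rng.random(n) < np.where(careful_driver, p_panda_given_careful, 0.05)
    safe = careful_driver                          # safety depends only on the driver
    return panda, safe

panda, safe = simulate(0.8)    # "training" neighbourhood: strong correlation
print("learned cue:  P(safe | panda) =", round(safe[panda].mean(), 2))   # ~0.97

panda, safe = simulate(0.05)   # new neighbourhood: correlation gone
print("deployed cue: P(safe | panda) =", round(safe[panda].mean(), 2))   # ~0.70, the base rate
```

A verbal model would never survive stating "Pandas imply safety" out loud, which is exactly the exposure benefit.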
"Language is a serial, low-bandwidth channel. It transmits one proposition at a time, sequentially. Each proposition can relate a small number of variables: “if X and Y, then Z.” Complex conditionals can extend this to perhaps five or six variables before the sentence becomes unparseable: “if X and Y but not Z, unless W and V, then Q.” ... "This is not a claim about human cognitive limitations. It is an information-theoretic claim about the channel capacity of natural language relative to the complexity of the models being transmitted."
No.
This definitely is about human cognitive limitations. Consider *why* that 6-variable rule is unparseable: human working memory is typically about 7 items. His unparseable example has, by a generous accounting, 10 entities -- 6 variables and 4 relationships. (And by a strict accounting you need to count the "then", for 11 entities.) Very, very few humans could learn that. Knowledge must be chopped up into chunks small enough to fit into our working memory to be understood and incorporated into our models.
Once one chunk has been incorporated into our model, we can add another chunk to it, repeating until we have built up a complex model. And it's not just read, store, read another -- each bit must be worked with to actually be modeled. (But this process does not need to be strictly linear: to take his stupid pedestrian, they can separately learn speed, distance, stopping distance, etc., and tie them together. But if his pedestrian cares whether the driver is inattentive, his model is wrong -- why are you doing something that depends on a moving driver being aware of you in the first place??)
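To make the chunking concrete, here's a tiny sketch of those separately learned pieces being tied together; the numbers are illustrative, not traffic-engineering advice:

```python
# Each chunk is simple on its own; the "model" is just their composition.
def stopping_distance_m(speed_kmh: float,
                        reaction_time_s: float = 1.5,
                        deceleration_ms2: float = 7.0) -> float:
    v = speed_kmh / 3.6                        # chunk 1: speed, converted to m/s
    reaction = v * reaction_time_s             # chunk 2: distance covered before braking
    braking = v * v / (2 * deceleration_ms2)   # chunk 3: v^2 / (2a) braking distance
    return reaction + braking

print(stopping_distance_m(50))   # ~35 m for a car doing 50 km/h
```

Each line fits comfortably in working memory; it's the assembled whole that no single sentence could carry.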
active knowledge i can produce on command. passive knowledge only comes up when it is triggered from the outside.
a lot of things we know are only accessible when they are triggered. i could not describe the path through a forest, but i know it when i walk it. same for the process to solve a particular problem. i could not describe the whole way. only if i follow the steps can i remember the next steps. and if there are multiple possible steps at some point, only for the choices that are actually triggered will i remember what comes after.
you can follow me and learn the steps by observing me, but you will only learn the steps that we actually do. and i couldn't possibly give you a list of all other potential steps.
in the latest video by tom scott, he watches people creating bells. the guy observing the molten metal knows when it has the right temperature by just looking at it. he learned that from years of practice, and his future replacement will learn it by observing him.
Or how many people know that many CEOs and leaders are idiots, but cannot say it, as they would face retribution.
Yes.
"This is the deepest reason why experience cannot be compressed."
No. Finding relevant features and compressing their evaluation is what ML systems do, especially in vision systems. Early attempts at machine vision had human-chosen features and detectors for them. There used to be papers with suggested features: horizontal lines, vertical lines, diagonal lines, patterns of dots, colors, and such. That's where things were in the 1990s. Those have now been replaced by learning-generated low-level feature detectors, which work better.
This does need a lot of training content. The training process is inefficient. It definitely compresses, though.
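To make the before/after concrete, here's a small sketch contrasting a hand-chosen detector with learned ones; PyTorch is assumed purely for brevity, the point holds in any framework:

```python
# 1990s style: a fixed, human-designed feature detector (Sobel vertical edges)
# vs. the modern style: a conv layer whose filters come from training data.
import torch
import torch.nn as nn
import torch.nn.functional as F

sobel = torch.tensor([[-1., 0., 1.],
                      [-2., 0., 2.],
                      [-1., 0., 1.]]).view(1, 1, 3, 3)   # hand-chosen kernel

learned = nn.Conv2d(in_channels=1, out_channels=8,       # 8 kernels, learned,
                    kernel_size=3, padding=1)            # not picked from a paper's list

img = torch.randn(1, 1, 28, 28)             # stand-in for a real image
edges = F.conv2d(img, sobel, padding=1)     # fixed detector: one response map
features = learned(img)                     # learned detectors: eight response maps
print(edges.shape, features.shape)
```

The learned filters often end up resembling edge and blob detectors anyway; the difference is that nobody had to enumerate them by hand.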
I agree with the article’s main point, but it’s flagrantly AI and (as usual with AI) way too verbose.