Neural networks in both humans and LLMs are hierarchical: there are lower layers and higher layers¹. In humans, the lower layers process things like blobs of color and parallel lines. In reality it’s rather complicated; see, for example, the diagram of the different lower parts of the visual cortex in my favorite paper.
But the gist of it is pretty simple: in the lower layers, you have sensory data; in the higher layers, you have abstract concepts. For humans, language isn’t part of the low-level sensory input - it’s a… metastructure? A learned framework emerging from multiple sensory streams. I’m sure linguists have some fancy word for it, but I’m just going to use “metastructure”.
Now for LLMs, language is their entire sensory stream. It’s not a metastructure. Their lower layers learn language like humans learn visual shapes. But it’s reasonable to suppose that at some point, LLMs learn their own metastructures! I’m wondering what they might be.
The first option is fairly easy to grasp and test - LLMs might have some kind of metastructure for visual things. Concepts like “blue”, “parallel”, and “round” can’t make sense to them on a sensory level, but they might still learn enough about the visual world to form a metastructure. This theory is fairly testable: the mechanistic interpretability methods Anthropic has been using could check whether LLMs have higher-layer features that encode something like “two parallel lines”. It would be fascinating to learn what tuning those features up would do to, e.g., code generation! In a way, this option can be thought of as a mirror - language mirrors vision, vision mirrors language.
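To make that concrete, here’s a minimal sketch of the cheapest version of such a test: a linear probe on hidden states, rather than the full feature-dictionary machinery Anthropic uses. The model choice (gpt2 as a small stand-in), the probed layer, and the tiny prompt sets are all my own illustrative assumptions, not anything from the interpretability literature.

```python
# A minimal linear-probe sketch, NOT Anthropic's dictionary-learning method.
# gpt2, layer -4, and the toy prompt sets below are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

# Hypothetical mini-datasets: sentences that do / don't invoke parallelism.
parallel = [
    "The railway tracks run parallel to the river for miles.",
    "Draw two parallel lines across the page.",
    "The shelves are mounted in parallel horizontal rows.",
]
control = [
    "The soup needed a little more salt and pepper.",
    "She finished the report just before the deadline.",
    "The committee postponed its vote until Thursday.",
]

def embed(sentences, layer=-4):
    """Mean-pool one hidden layer into a single vector per sentence."""
    vecs = []
    for s in sentences:
        with torch.no_grad():
            out = model(**tok(s, return_tensors="pt"))
        vecs.append(out.hidden_states[layer].mean(dim=1).squeeze(0))
    return torch.stack(vecs).numpy()

X = embed(parallel + control)
y = [1] * len(parallel) + [0] * len(control)

# With data this small the score is meaningless; the point is the shape
# of the experiment, which scales directly to real prompt sets.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", probe.score(X, y))
```

If a probe like this separates the two sets well at higher layers but not lower ones, that would be weak evidence for a visual metastructure - though finding causally meaningful features, rather than a mere readout, is exactly what the heavier interpretability tooling is for.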
The second option is not immediately testable, or even something I can really wrap my head around. It’s possible that LLMs have entirely new metastructures. So instead of mirroring the metastructure we have, LLMs could have a metastructure ON TOP of the metastructure of language. A metametastructure, if you will. Theoretically, it’s quite likely - but how would we look for such a thing? What would it even mean to us?
¹ The terminology gets confusing here already: in brains, the hierarchy is composed of cortical regions, while “layers” means something else entirely. But I’ll use the word “layers” for both, for aesthetic reasons.
Hi Sergey, I totally agree that language is not a modality, but the metastructures you are talking about are concepts, which in the human brain mostly refer to clusters formed from sensory inputs. It's very hard to track down references to these clusters because they are fundamentally fuzzy and fluid. I think all learning can be conceptualized as clustering. Currently, most such clustering (including backprop and Hebbian learning) is effectively centroid-based. I think that's a major handicap; it should be connectivity-based.
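For readers unfamiliar with the distinction the comment draws, here is a toy scikit-learn contrast on the classic two-moons dataset; the dataset and the specific algorithms are standard textbook choices of mine, not anything from the comment itself. Centroid-based clustering assumes roughly convex blobs and splits each moon in half, while connectivity-based (single-linkage) clustering merges points through chains of near neighbors and can follow each moon’s curved shape.

```python
# Toy illustration of centroid- vs connectivity-based clustering.
# Standard textbook example, chosen by me to illustrate the comment.
from sklearn.datasets import make_moons
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

X, y = make_moons(n_samples=400, noise=0.05, random_state=0)

# Centroid-based: assumes roughly convex blobs, so it splits each moon.
centroid = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Connectivity-based: links points through chains of near neighbors,
# so it can trace each moon's curved shape.
connect = AgglomerativeClustering(n_clusters=2, linkage="single").fit_predict(X)

# Adjusted Rand index vs. the true moon labels (1.0 = perfect recovery).
print("centroid ARI:    ", adjusted_rand_score(y, centroid))
print("connectivity ARI:", adjusted_rand_score(y, connect))
```

On this dataset, the connectivity-based run recovers the two moons while the centroid-based one does not - which is the handicap the comment is gesturing at, though whether the analogy carries over to learning rules like backprop is a much bigger claim.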