I think the problems with, and difficulties of, an AGI—at least one that would evolve from LLMs—run even deeper than the issues you rightly identify here. LLMs only interact with language via syntax, ignoring semantics entirely. For anyone curious, I discuss this in a recent post, "The Absent Semantics of LLMs": https://mindyourmetaphysics.substack.com/p/the-absent-semantics-of-llms
On the contrary: LLMs only interact with language (at the lowest level of abstraction where they still interact with language at all) as use. LLMs are not like brains; they have no “language modules”, in even a very loose sense. (This is at the root of Chomsky’s uncharacteristically unserious dismissal of them).
If you damage someone’s Broca’s area, you will - to oversimplify the matter - give them expressive aphasia: they can still use and understand words correctly, but struggle to put them together grammatically. Damage someone’s Wernicke’s area, and you get the opposite: fluent, grammatically correct speech that means absolutely nothing.
But you cannot selectively damage the language abilities of an LLM without damaging other things - the structure of English is the single most important part of the structure of the token stream, but it is still just a structure in the token stream.
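For anyone who wants to see what "damage" would even mean here, a rough sketch is below - GPT-2 and the Hugging Face `transformers` library are just convenient stand-ins chosen for illustration, and the probe sentences are arbitrary. Knock out weights indiscriminately and the model's grip on grammar and its grip on facts degrade together, because they live in the same parameters:

```python
# Illustration only: there is no subset of an LLM's weights that is "the
# grammar" as opposed to "the facts" - lesioning weights at random hurts both.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def avg_loss(text: str) -> float:
    """Average next-token loss the model assigns to `text` (lower = better)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()

grammar_probe = "The children were playing happily in the park."
fact_probe = "Paris is the capital of France."

print("intact:  ", avg_loss(grammar_probe), avg_loss(fact_probe))

# "Lesion" the network: zero out roughly 20% of every weight tensor at random.
with torch.no_grad():
    for p in model.parameters():
        p.mul_((torch.rand_like(p) > 0.2).float())

print("lesioned:", avg_loss(grammar_probe), avg_loss(fact_probe))
# Both losses should rise together; there is no Broca's area to hit selectively.
```

The 20% figure and the probe sentences are arbitrary choices; the point is only that the ablation cannot be aimed at "the grammar" alone.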
That structure in the token stream is semantics, not syntax. True, it’s not our semantics, not the semantics of human sense data. But what of it? We will never see the ground of being face to face, if there even is one - and yet we understand. Not everything, not what it’s like to be a bat - but more than enough to outsmart one. (This has gone poorly for the bats.) ChatGPT will never “directly” perceive the ground of its perceptions - i.e. us - either, but grounding is transitive. Its use of language is grounded in ours; ours is grounded in reality; its, however indirectly, is also so grounded.
This is not to say that there aren’t more practical obstacles to LLM-only AI - matters of cost or efficiency or politics. Maybe it takes more data than has ever been produced. Maybe emulating active inference passively takes more GPUs than will ever be made. Maybe all the fabs get nuked. Probably not. But there is no fundamental gap between a world-model-model (a model of many billions of fragments of different world models, more precisely, which admittedly does complicate things) and a plain old model of the world.
It's blank for them. It's not that they have a floor at language. It's as semantically grounded as autocomplete (which is to say not at all — it's purely a type of syntax).
Ok, I guess a tl;dr was necessary, because that has very little to do with my point. Here you go:
"LLMs are incapable of modeling the world" is an extremely hot take but perhaps marginally defensible. If it's true, however, then "LLMs understand English grammar" is straightforwardly false. There is no difference whatsoever between how they know that chairs are for sitting in and how they know that 'chairs' is the plural of 'chair'. There's just a token sequence, and then a giant pile of arithmetic, doing ... something inscrutable. *Not* a likelihood calculation - just something that behaved like one during training. Calling an LLM a next-token predictor is like calling you or I a reproductive-fitness-maximizer.
To Shawn: Or to put it another way (if I understand this correctly): if a system learns how language is actually deployed in practice by world-embedded speakers, to the extent that it is able to accurately predict or simulate this deployment, then it has also necessarily imbibed facts and structural regularities about the world in which those speakers are embedded - or will appear to have done so to those who interact with it (which is, for most practical purposes, the same thing).
The twist might be, as Illich noticed, that even more than technology advancing to catch up with us, it's us becoming machinic in order to accommodate it. We increasingly see ourselves as information-processing beings. Like thinking in terms of acquiring information from a book instead of having the experience of reading it, and so on with other cybernetic terms that we came to see as normal, but which are reductive and disembodying alternatives to the words (and associated experiences) they replaced. So there is more and more human–AI compatibility also because we're in an extraordinary process of disembodiment, having “medical bodies” (bodies as information, as techno-data coming from experts and devices, instead of felt experience) that think and behave more and more like AIs.
It’s important, I feel, to think about the role the concept of AGI plays in conversations about AI and the future of tech. What is being invoked? What specific predictions are being made? How would a world with AGI be different from one without?
A recurring theme is the acceleration of technological progress: medical discoveries, material abundance, etc. This is the appeal of AGI, the reason we are meant to regard it as a goal worth striving towards (collectively as a civilisation). Generally intelligent systems would be able to work independently, creating, inventing, building, solving problems previously considered intractable – or at least that’s the future we are being sold. The specific architecture of these systems is unimportant; the only thing that matters is their functional capability and their long-term alignment to human goals. In this context, discussions about what intelligence “truly” is are largely academic, and it will always be possible to include something ineffable, some biologically-determined or human-specific criteria to keep those machine upstarts perpetually out of the fold. But the list of “cognitive” tasks machines will be able to perform will continue to grow.
Pay attention to AGI narratives. What happens when the machines surpass us? When they write better novels, make better music, and beat us at all our games? The tech titans, almost invariably, tell us that this AI-dominated future will be one in which humans find meaning by being more human – by interacting with each other, deepening relationships, learning for the sake of learning itself, exploring the world and ourselves. AGI systems replace human workers; they do not replace humanity.
(That said, when a tech CEO predicts that AGI will be achieved in a year or two, I interpret it as marketing hype: “The work we are doing is mysterious and important. Give us your money, come work for us, or get out of our way.”)
AI will never be able to make better music than us, and it'll never be able to learn a random language unless it already has data from that language or humans have solved language formation in some formula. It necessarily moves towards the average (and averaging isn't how humans, or any living creature, think (the average of what? It'd be set parameters)).
"Better" is, of course, subjective. A more defensible claim: there will come a time when most of us will almost uniformly prefer new music made by AIs over new music made by humans in blind listening tests.
Not sure how you can defend your claim about LLMs and language learning unless it's a corollary of a broader claim about LLMs being unable to achieve competence on new tasks in the absence of training data related to that task. My response to that would be twofold:
First, the whole point of AGI is to develop systems capable of generalised learning. Claiming that they will never be able to learn new languages begs the question. You confidently assert (without evidence) what the discussion requires you to prove.
Secondly, hoover up enough training data and you will have embedded within that data a lot of useful information about what will seem to casual observers to be random languages or new tasks.
A particular problem I have with these theoretical musings on what AI might be capable of: what about the practical degradation of the Internet today, as a result of the dismaying and quite unexpected deterioration of the once peerless Google search engine? It's been commercialized and smartphonified. It used to appear to me to apply some innate logic based on the keywords. Now it's supplying results and suggestions based on crowdsourcing what other people search for. The use of what I call "keyword skills" to hunt up particular articles and subtopics is undermined. The minus sign to sift out unwanted keywords is no longer being reliably applied.
Here’s some context and then an idea.
I’ve said to you (i.e. posted as a comment once) that in another world, I would go find you and convince you to take me on as a student. I’m a fan.
Similarly, I’ve said to you that I love your idea and the telling of it, but I always thought that the internet is exactly what we think it is: just computers and wires and semi-literate people typing out their autobiographies (to put it briefly).
And my instinct is the same with all poetry and mathematics. I refuse to (and can’t) equate humanity and existence with pure information. It’s correct and right and beautiful to discover, but it’s not actual existence.
I hope you understand that I agree with you here. (It feels like you are doing a tacit Wittgenstein impression, so I think I’m with you.)
-----
As you’ve shown, there is a lot of particularity that gets overlooked if we imagine this machine thing we are creating (AI) in human terms. You’d need 8 billion machines to do that. But the real particularity is, as you describe, within a single person.
So here’s my idea. While we work on the computational brain side, we should also be working on an artificial human central nervous system. Specifically, we should maximize it. Double it in complexity. Give it the nerves that go to both sex organs. Make it eat, drink, shit, and fuck but with twice the sensory apparatus. Super shitting and fucking experienced by a super information processor. Make it prone to disease and pain.
But the main thing. Make it mortal and aware of its own death.
------
Hook all that up, let it run and see what sort of poetry it writes, or what it asks us back.
AI does not have my exquisitely sensitive nervous system, nor the fondness for my species that was the result of it. That feels important to remember. AI is never going to panic and scramble to keep us all alive. We need each other to do that.
Well said. Humans are more than computer operators, so any definition of AGI (previously known as AI) must be more than fast, accurate computer operation. Also, has no one watched Star Trek TNG, Blade Runner / Do Androids Dream, or Bicentennial Man? It should be surprising that these somewhat pulpy stories outthink the "AGI" which is, at the end of the day, a marketing term. A talking interface like HAL 9000 certainly seems achievable, and I think we could concede to label that an AI, even though it would have relatively limited autonomy and functionality. And, as that story importantly points out, it would be under the primary control of its programmers, not its users, like any other digital machine.
The reduction of the human soul to a combination of problem solving and task setting has been going on for a long time. Descartes was the first to explicitly reduce what he called -- in the best old-fashioned tradition -- "The Passions of the Soul" to neurological and physiological activities of the brain as mechanism. As the author correctly points out, there is so much more to human desire and will than mere intellectual work. It is the profound need to solve certain problems in mathematics and physics which drives research and thinking. It is what is revealed about reality that lies behind this intense desire. Plato called it Eros, which is gone from most modern thinking. "Artificial Thinking" is such a revealing phrase. There is no such thing as Artificial Eros, or Artificial Desire for Recognition and Honor.