Can LLMs understand?

I've been having discussions with a lot of skeptics of LLMs and their recent achievements. I am curious to hear what people on this forum think about the topic.

My general argument is this: there is only one way to prove that an LLM truly understands anything. Give it a set of complex, difficult questions that it has never seen before, and see if it can answer them correctly.

If an LLM can take in a set of new questions it has never seen before and arrive at the answers, then for all intents and purposes it must be able to understand, at least within the domain of those questions.

The two main arguments I hear back are:

1. There is no such thing as a "new" question. Every question we test it on is just a variation of some question that already exists on the internet and therefore the LLM is just using "statistics" to parrot an answer to something it's already seen before.

I don't think this argument holds much weight, because there are many types of questions whose answers can't simply be inferred from past answers.

2. Even if an LLM could answer every new question perfectly, it still doesn't "understand" because that is a human concept and machines are doing something different.

I find this argument lacking too. It is essentially circular logic: they define understanding as a human process, and therefore no machine could ever understand unless we made it out of the same biological matter as humans.

Curious what others think here 😀

12 January 2025 at 01:41 PM

16 Replies



"understand" is semantic cope mechanism and people will find it increasingly difficult to articulate uniqueness of human intelligence


YES... LLMs understand, otherwise they would not be able to function as large language models.

The issue is, if you are trying to assign anthropomorphic views and understandings to inanimate objects, you're already 'off the farm'... try and stick to the yellow brick road.


What does 'understand' mean?


I may not fully understand the concept of a 'new question' to premise an argument around. 'New answers' seems a little easier.


'Never seen before' is not a particularly high bar to pass. There are effectively infinite questions you could ask, and its skill isn't in coming up with an answer once and then repeating that answer.

There are lots of realms in which LLMs seem to be able to simulate human understanding reasonably well. However, it can never understand the human experience beyond what it can read from us. And so much goes unsaid - doesn't need saying, in fact, certainly not in writing on the internet that's accessible to LLMs - that they cannot be said to understand much of the human experience. They think, for example, that most human speech is the speech they see in writing.

A useful test of understanding is whether you can teach that subject to someone else. The LLMs can do that well with maths and coding and research and all the other basic to complicated capabilities. They can be said to demonstrate effective understanding. But there is far more that they at least currently do not understand.


by MacOneDouble

What does 'understand' mean?

Can modify inputs and predict what different outputs you could get
Can explain or teach the concept to another
Can relate the concept to other concepts, perhaps chaining it with another (such as we do with multi-stage maths problems where the first bit is Pythagoras and the next is trigonometry; see the sketch below)
Can recognise where to use this concept rather than others

On which basis I think some LLMs do understand certain things.
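A quick illustration of the 'chaining' point (the sketch referenced above, with made-up numbers):

```python
# Two-stage maths problem: Pythagoras first, then trigonometry.
import math

a, b = 3.0, 4.0                          # the two legs of a right triangle
c = math.sqrt(a**2 + b**2)               # stage 1: Pythagoras gives the hypotenuse
angle = math.degrees(math.atan2(a, b))   # stage 2: trigonometry gives the angle opposite leg a

print(c)      # 5.0
print(angle)  # roughly 36.87 degrees
```

The chaining is in knowing that the output of the first concept is exactly what the second concept needs.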


For a while last year, the best LLM models like ChatGPT had trouble correctly counting the number of the letter "r" in the word "strawberry". The model could tell you everything else about the strawberry's botany, history, recipes, people, etc., but said there were two r's in the word. Furthermore, you could tell it directly that the correct value was 3, and it wouldn't change. It would be hard to argue it "understood" what a strawberry was in the same way that a human would, and it couldn't fix itself either. Programmers fixed that one.


by Pokerlogist

For a while last year, the best LLM models like ChatGPT had trouble correctly counting the number of the letter "r" in the word "strawberry". The model could tell you everything else about the strawberry's botany, history, recipes, people, etc., but said there were two r's in the word. Furthermore, you could tell it directly that the correct value was 3, and it wouldn't change. It would be hard to argue it "understood" what a strawberry was in the same way that a human would, and

Yeah, the way LLMs can simulate understanding concepts seems to be on a significantly lower level than, say, my dog understanding that he'll get a treat after I ring a bell. They're really just zombies at the moment.


This Magic Moment

PairTheBoard


by wazz

'Never seen before' is not a particularly high bar to pass. There are effectively infinite questions you could ask, and its skill isn't in coming up with an answer once and then repeating that answer.

There are lots of realms in which LLMs seem to be able to simulate human understanding reasonably well. However, it can never understand the human experience beyond what it can read from us. And so much goes unsaid - doesn't need saying, in fact, certainly not in writing on the internet that's accessi

You make some great points that I agree with. There are definitely realms that LLMs won't be able to 'understand' depending on how you define that.

One thing I might push back on is the 'reading' part. Would you consider multimodal models such as GPT4o to be LLMs? I think that by combining video, text, and audio as data sources we can probably get a model that is capable of fully understanding the vast majority of human domains, if it is large enough and trained on enough data. I mean, the human brain is basically a computer that generates audio by speaking, receives audio by listening, receives video by seeing, and receives/generates text by reading/writing. The only things missing are the senses of touch and smell, but I'm not convinced that you need those senses to achieve AGI or human intelligence.

But again, I agree with most of what you said, and there are certainly domains where LLMs do not demonstrate the ability to understand or generalise. I didn't mean to frame it as 'do LLMs understand everything'. But there are many people out there who believe LLMs do not understand anything at all, and that they are simply 'stochastic parrots'.


by Pokerlogist

For a while last year, the best LLM models like ChatGPT had trouble correctly counting the number of the letter "r" in the word "strawberry". The model could tell you everything else about the strawberry's botany, history, recipes, people, etc., but said there were two r's in the word. Furthermore, you could tell it directly that the correct value was 3, and it wouldn't change. It would be hard to argue it "understood" what a strawberry was in the same way that a human would, and

I think this is a funny example that goes to show that we have definitely not reached AGI.

The definition of AGI that I agree with most is a model that can solve any task that is considered 'easy' by humans and that most humans can easily complete.

The only thing I might change is the 'programmers fixed that one' part. It seems to be a common belief that OpenAI is programming in hard-coded rules to 'fix' incorrect answers that ChatGPT gives, but I think that's very unlikely. It's much more likely that they are 'programming' the model by changing its training data, for example by generating an artificial dataset of text that counts letters in different words, introducing that into the dataset, and training the model further on it (something like the sketch below).
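Purely to illustrate what that kind of synthetic data might look like (a hypothetical sketch, not anything OpenAI has published):

```python
# Hypothetical generator for synthetic "letter counting" training examples.
import json
import random

WORDS = ["strawberry", "banana", "mississippi", "raspberry", "letter"]

def make_example(word: str, letter: str) -> dict:
    count = word.count(letter)  # ordinary code gets this right trivially
    return {
        "prompt": f"How many times does the letter '{letter}' appear in the word '{word}'?",
        "completion": f"The letter '{letter}' appears {count} time(s) in '{word}'.",
    }

random.seed(0)
for _ in range(5):
    word = random.choice(WORDS)
    letter = random.choice(sorted(set(word)))
    print(json.dumps(make_example(word, letter)))  # one training record per line
```

Feed enough of that into further training and the model picks up the pattern without anyone hard-coding a rule.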

Which just kind of goes to show that these models are probably capable of achieving AGI if they are given a large enough and diverse enough dataset to train on.


by Benbutton

You make some great points that I agree with. There are definitely realms that LLMs won't be able to 'understand' depending on how you define that.

One thing I might push back on is the 'reading' part. Would you consider multimodal models such as GPT4o to be LLMs? I think that by combining video, text, and audio as data sources we can probably get a model that is capable of fully understanding the vast majority of human domains, if it is large enough and trained on enough data. I mean, the human bra

I haven't played with GPT4o. I would still call it an LLM even if it's got other capabilities, even if those other capabilities are better-developed, because it will still have an LLM as the interface part of its functionality.

If you could replicate the feeling of being in a body by providing sense data for all ~28 human senses, then a lot more of the things we say would make more sense and then be more easily replicable without obvious errors by an LLM. If we talk about depression or the awareness of our own heartbeat, a dog has a better chance of understanding this concept than an LLM, until we somehow program the LLM to be aware that its own existence relies on this thing you can feel and hear beating and you should probably feel anxious about making sure it still carries on beating.

Until they can do that, LLMs will simulate understanding pretty well by chasing around the various levers and nuts and bolts of language and copying what humans say, iterated in a way where we can't tell the difference. But even then, we're at like ~20% of the human experience.

I don't know that 'stochastic' is the right word to describe them. You can make them behave very predictably, and while it's a black box to us, the researchers will have a far better understanding of what's going on behind the scenes than we do.

The LLMs we have access to not being able to tell us how many Rs are in strawberry is not evidence of the absence of AGI; it is highly likely that big corporations and militaries are and have been using far more advanced versions of AI that don't make these sorts of errors, at least not when it comes to important stuff. Alternatively, the inability to count the Rs in strawberry doesn't rule out the ability to spell accurately and spellcheck others, so it's a generalised problem that had just never come up within military and corporate applications.


by wazz

I haven't played with GPT4o. I would still call it an LLM even if it's got other capabilities, even if those other capabilities are better-developed, because it will still have an LLM as the interface part of its functionality.

If you could replicate the feeling of being in a body by providing sense data for all ~28 human senses, then a lot more of the things we say would make more sense and then be more easily replicable without obvious errors by an LLM. If we talk about depression or the awaren

I think you may possibly be conflating human intelligence with human experience or consciousness.

To me, human intelligence doesn't have much to do with sense of touch, or experiencing emotions, etc.

I don't think LLMs are conscious or will ever be conscious, so I totally agree on that part.

But being conscious is a bit irrelevant IMO. It reminds me of the philosophical zombie argument. It is currently impossible for us to even prove that another human is actually conscious, similar to how we can't tell whether an LLM is conscious, although we know it's probably not.

But imagine a situation where an LLM is completely indistinguishable from a human when it comes to speech, reading, writing, listening, watching, and conjuring thoughts.

At that point, why would it really matter whether it is experiencing emotions or not?

Also, personally, I think you may be overestimating the ability of the military when it comes to LLMs and AGI. It is very unlikely that any militaries are remotely close to companies like OpenAI. Don't get me wrong, I'm sure there's a lot of top secret classified technology that we aren't aware of. But militaries do not have the appropriate staff, funding, or ability to compete with the companies in the LLM space IMO.

I know it sounds like I am disagreeing with everything you said lol, but I actually agree with a lot of it. I think it just comes down to the definition of AGI or human intelligence that you align with.


by Benbutton

I think you may possibly be conflating human intelligence with human experience or consciousness.

To me, human intelligence doesn't have much to do with sense of touch, or experiencing emotions, etc.

I don't think LLMs are conscious or will ever be conscious, so I totally agree on that part.

But being conscious is a bit irrelevant IMO. It reminds me of the philosophical zombie argument. It is currently impossible for us to even prove that another human is actually conscious, similar to how w

That may be a false dichotomy. Human intelligence and consciousness may well be inseparable.

The inability to know for sure that someone else is conscious isn't really a problem at all. We just assume, via the hard problem of consciousness, that every living human adult who's not in a coma is conscious in a way that neither an AI nor an AGI could basically ever achieve. Such a strategy sidesteps this problem. We apply the same to AI by assuming an LLM cannot be conscious in the same way. The need to prove or disprove is not really that useful here.

If an LLM did indeed become completely indistinguishable from a human, then yes, they would be indistinguishable from a human, but what we're debating is whether that's even possible.

I doubt the military makes much use of a naked LLM. Perhaps they plug it into their other AI systems from time to time when they've got something complicated they can't understand and need it turned into words; but for the most part, the military applications for AI are dazzling in scope: targeting, logistical efficiency, making predictions, offering new strategies. Do you think DARPA hasn't been doing AI research since well before it became commercially viable?

The US military has an annual budget of over $1.2T. I think they have the staff, funding and ability to compete in the AI space, though there's little reason for them to care specifically about LLMs, not until the idiocracy in America reaches such heights that the generals need their little LLM drones to read everything out to them.


by wazz

That may be a false dichotomy. Human intelligence and consciousness may well be inseparable.

The inability to know for sure that someone else is conscious isn't really a problem at all. We just assume, via the hard problem of consciousness, that every living human adult who's not in a coma is conscious in a way that neither an AI nor an AGI could basically ever achieve. Such a strategy sidesteps this problem. We apply the same to AI by assuming an LLM cannot be conscious in the same way. The need to prove o

I agree with you that we simply assume humans have consciousness and LLMs don't.

But do you think the same is true for intelligence?

Do you think it is impossible to prove whether a human is intelligent? That there is no way to test for intelligence, and no way to prove that someone possesses it?

I agree with you 100% on the consciousness part, but I don't think it applies to intelligence in the same way, which is why I think they are different.

Human intelligence can be tested, and we don't need to have blind faith in it.

But again, that probably depends on how you define 'intelligence', and if you conflate intelligence with consciousness then I agree that LLMs will never possess intelligence or reach AGI.

I think the simplest definition of AGI that makes sense is: An algorithm that can accomplish any task that is considered easy by humans and that most humans can easily complete.

If you find a simple task that is easy for humans but difficult for an algorithm to achieve, then that proves the algorithm has not reached AGI.

I like this definition because it's simple, it makes sense, and it is easily quantifiable. It's also a useful definition, because it captures most of what people think about when they imagine AGI, in my opinion.


by Benbutton

I don't think this argument holds much weight, because there are many types of questions whose answers can't simply be inferred from past answers.
[...]

Computers don't infer anything any more than clocks infer the time. Inference is predicated on conceivability/coherency, which is the exclusive domain of ontological wholes. That's why biology enters the debate: biological entities are the only ontological wholes in existence. Everything else falls into the composite or construction category, e.g., a tree and a horse came into being, whereas a log cabin and a jeep didn't. But basically, computers don't infer because they can't. And that's baked in.

Computers don't start with a blank slate. They're preloaded with material logic, which is vastly different from classical logic based on the Laws of Thought. The latter requires (mental) conceivability and is inferential by nature, whereas the former requires (material) conditions and is inherently non-inferential. That's apparent from their different conditionals: if A is B and B is C, then A is C.

With classical logic the conclusion is true out of logical necessity, because it's just the excluded middle term from the syllogism, which is the only valid inferential form. Inference meaning the advancement of knowledge to three truths from two, and necessity meaning that once we accept the premises as true, we literally can't conceive the conclusion as being any different. Material logic can't do that. I mean, LLMs can learn to apply the syllogism and conclude that John is in Arizona based on the premises that John is in Phoenix and Phoenix is in Arizona. They do that very well. But all that is doing is eliminating latency in our knowledge, kind of like Excel makes up for my lack of knowledge or not having a million mathematicians at my disposal. There's plenty of that sort of inefficiency out there, so I feel a lot of the claims about the impact AI will have are true. But with all the excitement/fear over that, it also ignores what AI can't do:

(M) Humans are mammals
(m) John is a human.
(C) John is a mammal.

Material implication, and consequently AI, will never be able to conclude that humans 'must be' mammals, because that's a mental conception where two previously separate ideas become united in the mind. AI can't do that. Based on the truth of (m) and (C) alone, it couldn't learn whether humans are mammals or mammals are humans. That requires the form structure of classical logic, where we can throw it in reverse and conclude (M) from (m) and (C). But interestingly enough, just like humans when faced with a lack of knowledge, AI lops off the certainty/necessity of 0 and 1 and goes into probabilistic mode. So it would see me, like the live/dead cat, in a state of superposition between mammal-human and human-mammal 😀
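For what it's worth, the forward direction of that syllogism is easy to make fully explicit; here is a minimal Lean sketch (the names are just placeholders for the (M)/(m)/(C) example above):

```lean
-- Minimal formalization of the syllogism in the post:
-- (M) all humans are mammals, (m) John is a human, therefore (C) John is a mammal.
variable (Person : Type)
variable (Human Mammal : Person → Prop)
variable (John : Person)

example (M : ∀ p, Human p → Mammal p) (m : Human John) : Mammal John :=
  M John m
```

Whether anything interesting about 'understanding' follows from being able to mechanise this step is, of course, the whole debate.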

At any rate, my concerns with AI are less to do with all that and more with the social and political issues resulting from the ability to remove a lot of the above-mentioned latency in a very short period of time. I'm not too concerned with AI taking over the world, though. It's far too inherently error-prone, which will become more and more obvious as its implementation causes bigger and bigger screwups. So I can't see them running amok without supervision to any great extent.
