
Practical use of LLMs

Posted Aug 8, 2025 16:08 UTC (Fri) by khim (subscriber, #9252)
In reply to: Practical use of LLMs by anselm
Parent article: On the use of LLM assistants for kernel development

> Not really. LLMs don't deal in facts, they deal in probabilities, as in “what word is most likely to complete this partial sentence/paragraph/text?”

Yes, but that's an entirely different kettle of fish: humans have world models; in fact these start to form in the human brain before humans learn to speak, starting with peekaboo and hide-and-seek games.

LLMs don't have anything remotely similar to that, which is why they can't say “I don't know how to do that”: humans say that when their world model shows a “hole”, and LLMs can't do that since there is no world model; it's probabilities all the way down.

In the rare cases where one says “I don't know” or “this thing probably doesn't exist” (it happens sometimes, if rarely), it has simply found it highly probable, based on its training set, that this response would be the most appropriate one.

The only “memory” LLMs have is tied to probabilities… that doesn't mean there is no long-term memory, it just means that it's different from what humans have.

> If you ask Sam Altman he will tell you that OpenAI is only a year or so (and a few tens of billions of dollars) away from “artificial general intelligence”, but he's been doing that for years now and it's very hard to see how that would work given what they've been doing so far.

That's yet another kettle of fish. I think AGI will soon be relegated to the annals of history anyway: it's obvious that pure scaling won't give us anything similar to an “average human worker” any time soon – and AGI is something that has marketable appeal only in that “scale is all you need” world.

If we are forced to match human capabilities by slowly and painstakingly adding more and more specialized modules, then AGI loses its appeal: it's still achievable – but somewhere in the XXII or XXIII century, when the last 0.01% of what humans were doing better than pre-AGI systems is finally conquered.

By that time our AI will be so drastically superhuman at everything else that saying we have reached AGI no longer makes sense. It's more of an “oh yeah, finally… it arrived… what else is new?” moment, rather than something to talk about.


Practical use of LLMs

Posted Aug 8, 2025 20:45 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (12 responses)

The main problem with "matching humans" is that they'll have to pass through empathy, emotions, self-consciousness, and some may even develop their own religions etc. Then at that point we'll have laws explaining how it's inhumane to treat an AI assistant badly by making it work endlessly and ignoring its suffering. So in the end these assistants will end up being new workers with all the same limitations and problems as the other ones in the real world and will not solve that many issues for enterprises, except that they'll eat more power ;-)

Practical use of LLMs

Posted Aug 8, 2025 21:09 UTC (Fri) by khim (subscriber, #9252) [Link] (10 responses)

> The main problem with "matching humans" is that they'll have to pass through empathy, emotions, self-consciousness, and some may even develop their own religions etc.

Surprisingly enough, that's already covered. Existing chatbots don't have “empathy”, “emotions” or “self-consciousness”, but they imitate them well enough to achieve pretty disturbing results. And I'm pretty sure they would do a superb job working as missionaries for various religious sects. No problem there at all: nefarious uses of LLMs scale surprisingly well.

LLMs fail utterly when long chains of logical reasoning are needed, though.

> So in the end these assistants will end up being new workers with all the same limitations and problems as other ones in the real world and will not solve that many issues for enterprises, except that they'll eat more power ;-)

Highly unlikely. In fact the biggest obstacle to the use of LLMs is that people try to apply what they have learned from books and movies about how “sentient robots” would behave over the last century or so. Which is understandable, but also incredibly wrong.

In books and movies “sentient robots” are always logical, correct and precise, and it's a big problem for them to express emotion or simulate empathy… In the real world LLMs can easily do all the things that “sentient robots” from all these countless books and movies struggled with… What they can't do are the things people expect of them: logical reasoning, precision, reproducibility…

That's another thing that plagues the whole industry: what all the presentations and demos portray and “sell”, and what CEOs expect to buy… are these “sentient robots” from the movies. What they get… is something entirely different, something totally unsuitable for the role where “sentient robots” would fit perfectly.

That's why Klarna is rehiring people, IBM is hiring people for “critical thinking”-focused domains, and Duolingo is bringing people back… it's all because LLMs are the total opposite of the “sentient robots” in movies.

If you read the summaries written by press people you would hear that people are being rehired because robots “lack empathy” or “emotions”, but that's a big fat lie: robots have more than enough empathy and emotions, spammers simply love that aspect of LLMs… what they lack are “common sense” and “logic”.

Practical use of LLMs

Posted Aug 8, 2025 22:41 UTC (Fri) by Wol (subscriber, #4433) [Link]

> what they lack are “common sense”

Given that "common sense" isn't common, and rarely makes sense, maybe that's just as well!!!

Cheers,
Wol

Practical use of LLMs

Posted Aug 9, 2025 5:57 UTC (Sat) by wtarreau (subscriber, #51152) [Link] (8 responses)

I don't know how robots behave in movies (I don't have a TV) but I'm well aware that you cannot expect an LLM to be precise/exact, because it should not be seen as a sequential computer program that can be debugged and made reliable, but as something trying to imitate our brain with many inter-connections based on what was learned, probabilities and noise. I.e. in order to think like us, they have to be as unreliable as us. Their value, however, is in being able to directly use computer-based tools without having to physically move fingers, so they can use calculators and web search faster than us. But the risk of inaccurately recopying a result remains non-null... like with humans, who can get distracted as well.

Practical use of LLMs

Posted Aug 9, 2025 9:09 UTC (Sat) by excors (subscriber, #95769) [Link]

> Their value however is in having abilities to directly use computer-based tools without having to physically move fingers, so they can use calculators and web search faster than us.

One example I've seen is giving ChatGPT 5 - which was announced as having "PhD-level intelligence" - the prompt "Solve: 5.9 = x + 5.11". When I repeated it myself, 50% of the time it said x=0.79, and 50% of the time it said x=-0.21.

In both cases it gave a superficially reasonable step-by-step explanation, saying things like "5.90 - 5.11 = (5.90 - 5.11) = 0.79 but since 5.90 is less than 5.11 in the hundredths place, the result will be negative: x = -0.21". That's nonsense, but it's confidently-stated half-correct nonsense, which engenders undeserved levels of trust.

In theory the system could make use of external calculators and powerful, reliable, human-designed algebraic tools. In practice it doesn't - it does probabilistic calculations on language tokens, resulting in something that sounds like a mathematical calculation but actually isn't, making it untrustworthy for even trivial tasks like this. (And somehow this is worth half a trillion dollars.)
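
For comparison, here is a minimal sketch (Python with sympy, assuming it is available) of what handing the problem to one of those reliable, human-designed algebraic tools looks like; the exact rational arithmetic returns the same correct answer every time, with no probabilistic token guessing involved:

    # Solve 5.9 = x + 5.11 symbolically rather than by predicting tokens.
    # Rational("...") keeps the arithmetic exact, so not even binary
    # floating-point rounding gets in the way.
    from sympy import Eq, Rational, solve, symbols

    x = symbols("x")
    equation = Eq(Rational("5.9"), x + Rational("5.11"))
    print(solve(equation, x))   # [79/100], i.e. x = 0.79, every single time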

Practical use of LLMs

Posted Aug 9, 2025 9:42 UTC (Sat) by khim (subscriber, #9252) [Link] (2 responses)

> I.e. in order to think like us they have to be as unreliable.

Nope. In order to think you need a reliable world model somewhere under all these words. The half-century-old SHRDLU could think, while ChatGPT-5 can't.

Sure, humans make mistakes (especially when they are distracted), but they may also notice them automatically and fix them. This doesn't work with LLMs; in fact, if you try to push them they become even less accurate than when they are not “thinking”.

> as something trying to imitate our brain with many inter-connections based on what was learned, probabilities and noise

That's not an imitation of the human brain, though. That's an imitation of an insect brain or, maybe, a chimp's brain (although chimps do have a world model, even if it's less complicated than the human one). It's pure reaction, with nothing to control the “train of thought” and stop it from derailing.

The best illustration of what is happening with “reasoning” LLMs is the picture from the Wikipedia article on Taylor series, which shows “sin x and its Taylor approximations by polynomials of degree 1, 3, 5, 7, 9, 11, and 13 at x = 0”.

It's very easy to read that as “as the degree of the Taylor polynomial rises, it approaches the correct function” – but if you actually look at the picture you'll notice how it does that: it becomes ever more precise in a small but growing area around zero, while simultaneously becoming ever more absurdly wrong outside that central part.
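
A quick numerical sketch of that picture (plain Python, standard library only): each Taylor polynomial of sin x is extremely accurate close to zero, while away from zero the very same polynomials are off not by a little but by orders of magnitude:

    import math

    def sin_taylor(x, degree):
        """Taylor polynomial of sin(x) around 0, with terms up to the given degree."""
        return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
                   for k in range((degree + 1) // 2))

    for x in (0.5, 10.0):
        print(f"x = {x}, sin(x) = {math.sin(x):+.4f}")
        for degree in (1, 5, 9, 13):
            err = abs(sin_taylor(x, degree) - math.sin(x))
            print(f"  degree {degree:2d}: error {err:.2e}")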

And that's what is happening with LLMs: they are becoming ever more impressive at “one-shotting” things, yet, simultaneously, ever more helpless at handling a long series of tasks.

This is similar to how very small kids behave, but eventually humans learn to double-check and self-verify things… LLMs can't learn that; they simply have no mechanism suitable for it.

The latest fad in AI is to attach “tools” to LLMs and hope that a Python interpreter will work as a reliable replacement for a world model. It won't: this slightly expands the area where LLMs are able to “one-shot” things, but it doesn't fix the fundamental flaw in their construction.
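
To make concrete what “attaching tools” means, here is a toy sketch (names like fake_llm are purely hypothetical, not any real vendor API): the model merely emits a structured request, a deterministic tool does the actual work, and the result is pasted back into the answer, which is exactly why it only widens the “one-shot” area instead of supplying a world model:

    import json

    def fake_llm(prompt):
        """Hypothetical stand-in for a model call.  A real LLM decides
        probabilistically whether (and how correctly) to emit such a request."""
        return json.dumps({"tool": "calculator", "expression": "5.9 - 5.11"})

    def calculator(expression):
        """The tool does the arithmetic deterministically; the model never does."""
        # eval() on arbitrary input is unsafe; acceptable only in a toy example.
        return eval(expression, {"__builtins__": {}}, {})

    def answer(prompt):
        request = json.loads(fake_llm(prompt))
        if request.get("tool") == "calculator":
            result = calculator(request["expression"])
            # In a real system this value is fed back to the model, which can
            # still transcribe or interpret it incorrectly in its final reply.
            return f"x = {result}"
        return "model answered directly (with the usual odds of being wrong)"

    print(answer("Solve: 5.9 = x + 5.11"))   # x = 0.79, up to float rounding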

Practical use of LLMs

Posted Aug 9, 2025 10:32 UTC (Sat) by Wol (subscriber, #4433) [Link] (1 responses)

> That's not an imitation of the human brain, though. That's an imitation of an insect brain or, maybe, a chimp's brain (although chimps do have a world model, even if it's less complicated than the human one). It's pure reaction, with nothing to control the “train of thought” and stop it from derailing.

It's not an imitation of ANY brain. Think about it. The brain has a lot of dedicated hardware, be it visual recognition, auditory recognition, whatever. And a small veneer of general purpose hardware over the top. AI runs on pure general purpose hardware.

And as has been pointed out, a lot of the brain's special-purpose hardware is survival-ware - if the hardware gets it wrong, its owner is likely to end up as a lion's lunch, or whatever ...

Cheers,
Wol

Practical use of LLMs

Posted Aug 9, 2025 10:40 UTC (Sat) by khim (subscriber, #9252) [Link]

> The brain has a lot of dedicated hardware, be it visual recognition, auditory recognition, whatever.

Isn't that what GPT-5's “tools” and the voice recognition in Gemini Live are for?

> AI runs on pure general purpose hardware.

Not really. It can be run, in theory, on general-purpose hardware, but it's not clear whether GPT-5 run on general-purpose hardware would be at all practical.

Even if you just think about BF16… it's a pretty specialized thingie.
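
For anyone who hasn't looked at the format: BF16 is essentially a float32 with the lower 16 bits of the mantissa dropped: same sign bit and 8-bit exponent, but only 7 mantissa bits left. A quick illustration in plain Python (truncating for simplicity; real conversions typically round):

    import struct

    def bf16_truncate(x):
        """Approximate x as bfloat16 by keeping only the top 16 bits of its
        float32 bit pattern: sign, 8 exponent bits and 7 mantissa bits."""
        bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
        return struct.unpack("<f", struct.pack("<I", bits32 & 0xFFFF0000))[0]

    print(bf16_truncate(3.14159265))   # 3.140625: only a few decimal digits survive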

> And has been pointed out, a lot of the brain's special-purpose hardware is survival-ware - if the hardware gets it wrong, it's likely to end up as a lion's lunch, or whatever ...

Sure, but do we actually use that hardware when we are writing code? Somehow I doubt it. It's like arguing that an LLM couldn't write good code because it doesn't have a liver… sure, the liver is very important for a human, but the lack of a liver is not what stops an LLM from being a good software designer.

Practical use of LLMs

Posted Aug 11, 2025 10:11 UTC (Mon) by paulj (subscriber, #341) [Link] (3 responses)

> I'm well aware that you cannot expect from an LLM to be precise/exact, because it should not be seen as a sequential computer program that can be debugged and made reliable,

The problem is that (to invent a plausible stat and confidently handwave it about - highly appropriate in a thread on LLMs!) 99.999% of the populace doesn't know this, and lacks the combination of technical background, curiosity and time to come to understand it. They think - because the hype machine (the effective combination of companies wanting to sell stuff and non-technical media jumping on pushing the latest buzz) has told them so - that this stuff is "intelligent" and will solve all problems.

Practical use of LLMs

Posted Aug 12, 2025 11:48 UTC (Tue) by wtarreau (subscriber, #51152) [Link]

Absolutely!

But in addition, starting to be careful about LLMs also teaches people to be careful of other people who look too smart. There is a huge confusion between knowledge and intelligence in general. Lots of people use the term "smart" or "intelligent" to describe a very knowledgeable person, and consider that someone lacking culture is "dumb". But I've seen people who, once the details of a problem were explained to them, would suggest excellent ideas on how to solve it. *This* is intelligence. Those who merely know everything and cannot use it except to look smart in conversations are just parrots. Of course it's way better to have the two at once in the same person, and often smart people like to learn a lot of new stuff. But each profile has its uses. Right now LLMs solve only one part of the deduction needed for intelligence, and know a little bit of everything but nothing deeply enough to express a valid opinion or advice. Yet most people (as you say, 99.999% to stick with this thread) tend to ask them for advice and opinions on stuff they are expected to know well, since it comes from the internet, but that they only know superficially.

Practical use of LLMs

Posted Aug 12, 2025 17:16 UTC (Tue) by raven667 (subscriber, #5198) [Link] (1 responses)

I was saying on Mastodon the other day that when the OpenAI board tried to oust Sam Altman after he offered ChatGPT to the public, that was probably the right call: they had already tested and knew the weaknesses of LLM models as a path to "AGI", but once the hype train had left the station it proved impossible to stop. It's unlikely that any of this "AI" fever will work out to our betterment in the long run, and it's going to be used to destroy trillions of dollars of human capital in the meantime, at a time when we probably cannot afford it.

Practical use of LLMs

Posted Aug 12, 2025 17:21 UTC (Tue) by pizza (subscriber, #46) [Link]

> and it's going to be used to destroy trillions of dollars of human capital in the meantime, at a time when we probably cannot afford it.

To me it's not the "destruction" of so much [human] capital but the wasted/squandered opportunities.

Practical use of LLMs

Posted Aug 11, 2025 10:06 UTC (Mon) by paulj (subscriber, #341) [Link]

Asimov has (at least) one story around this, e.g. Bicentennial Man (later made into a film, with Robin Williams as the robot Andrew).
