lolinder
> By establishing the mathematical certainty of hallucinations, we challenge the prevailing notion that they can be fully mitigated

Having a mathematical proof is nice, but honestly this whole misunderstanding could have been avoided if we'd just picked a different name for the concept of "producing false information in the course of generating probabilistic text".

"Hallucination" makes it sound like something is going awry in the normal functioning of the model, which subtly suggests that if we could just identify what went awry we could get rid of the problem and restore normal cognitive function to the LLM. The trouble is that the normal functioning of the model is simply to produce plausible-sounding text.

A "hallucination" is not a malfunction of the model, it's a value judgement we assign to the resulting text. All it says is that the text produced is not fit for purpose. Seen through that lens it's obvious that mitigating hallucinations and creating "alignment" are actually identical problems, and we won't solve one without the other.

leobg
Isn’t hallucination just the result of speaking out loud the first possible answer to the question you’ve been asked?

A human does not do this.

First of all, most questions we have been asked before. We have made mistakes in answering them before, and we remember these, so we don’t repeat them.

Secondly, we (at least some of us) think before we speak. We have an initial reaction to the question, and before expressing it, we relate that thought to other things we know. We may do “sanity checks” internally, often habitually without even realizing it.

Therefore, we should not expect an LLM to generate the correct answer immediately without giving it space for reflection.

In fact, if you observe your thinking, you might notice that your thought process often takes on different roles and personas. Rarely do you answer a question from just one persona. Instead, most of your answers are the result of internal discussion and compromise.

We also create additional context, such as imagining the consequences of saying the answer we have in mind. Thoughts like that are only possible once an initial “draft” answer is formed in your head.

So, to evaluate the intelligence of an LLM based on its first “gut reaction” to a prompt is probably misguided.
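As a rough sketch of what “giving it space for reflection” could look like in code (purely illustrative; generate() is a hypothetical stand-in for whichever model API you use):

    # Purely illustrative: generate() is a hypothetical stand-in for any LLM call.
    def generate(prompt: str) -> str:
        raise NotImplementedError("plug in your model API of choice here")

    def answer_with_reflection(question: str) -> str:
        # First pass: the "gut reaction".
        draft = generate(f"Question: {question}\nGive a first-pass answer.")
        # Second pass: an internal "sanity check" on the draft.
        critique = generate(
            f"Question: {question}\nDraft answer: {draft}\n"
            "List any factual or logical problems with this draft."
        )
        # Third pass: revise in light of the critique.
        return generate(
            f"Question: {question}\nDraft: {draft}\nCritique: {critique}\n"
            "Write a revised answer that fixes the problems."
        )

Nothing about this guarantees a correct answer, but the critique pass at least gives the model a chance to catch its own first reaction before committing to it.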

jampekka
I'm of the opinion that the current architectures are fundamentally ridden with "hallucinations" that will severely limit their practical usage (including very much what the hype thinks they could do). But this article sets an impossible standard for what it would mean to "not-hallucinate".

It essentially restates well known fundamental limitations of formal systems and mechanistic computation and then presents the trivial result that LLMs also share these limitations.

Unless some dualism or speculative supercomputational quantum stuff is invoked, this holds just as much for humans too.

ninetyninenine
Incomplete training data is kind of a pointless thing to measure.

Isn’t incomplete data the whole point of learning in general? The reason we have machine learning is that data is incomplete. If we had complete data, we wouldn’t need ML; we would just build a function that maps inputs to outputs based on the complete data. Machine learning is about filling in the gaps by prediction.
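A toy contrast (a made-up example, nothing more) between having complete data and having to learn:

    # Toy contrast with made-up data: complete data is just a lookup table;
    # incomplete data forces you to fill the gap with a prediction.
    complete = {0: 0, 1: 2, 2: 4, 3: 6}            # every input we will ever see
    def answer_complete(x): return complete[x]     # no learning needed, just look it up

    incomplete = {0: 0, 1: 2, 3: 6}                # gap at x = 2
    def answer_learned(x): return 2 * x            # a fitted rule guesses into the gap
    print(answer_learned(2))                       # 4 -- a prediction, not a lookup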

In fact, this is what learning in general does. It means this whole argument about incomplete data applies to human intelligence and learning as well.

Everything this theory is going after basically applies to learning and intelligence in general.

So sure you can say that LLMs will always hallucinate. But humans will also always hallucinate.

The real problem that needs to be solved is: how do we get LLMs to hallucinate in the same way humans hallucinate?

davesque
The way that LLMs hallucinate now seems to have everything to do with the way in which they represent knowledge. Just look at the cost function. It's called log likelihood for a reason. The only real goal is to produce a sequence of tokens that are plausible in the most abstract sense, not consistent with concepts in a sound model of reality.

Consider that when models hallucinate, they are still doing what we trained them to do quite well, which is to produce text that is at least likely. So they implicitly fall back on more general patterns in the training data, i.e. grammar and simple word choice.
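As a toy sketch of that cost function (made-up numbers, not any real model's training code), the objective only rewards putting probability mass on the observed next token:

    import math

    # The model's predicted distribution over the next token after some prefix.
    predicted = {"Paris": 0.6, "Lyon": 0.2, "Berlin": 0.2}   # made-up numbers

    observed_next_token = "Paris"
    loss = -math.log(predicted[observed_next_token])   # negative log-likelihood
    print(round(loss, 3))   # 0.511 -- the loss asks "was the observed token likely?",
                            # never "is the completed sentence true?"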

I have to imagine that the right architectural changes could still completely or mostly solve the hallucination problem. But it still seems like an open question as to whether we could make those changes and still get a model that can be trained efficiently.

Update: I took out the first sentence where I said "I don't agree" because I don't feel I've given the paper a careful enough read to determine whether the authors are in fact agreeing with me.

simonw
A key skill necessary to work effectively with LLMs is learning how to use technology that is fundamentally unreliable and non-deterministic.

A lot of people appear to find this hurdle almost impossible to overcome.

feverzsj
Maybe it's time for the bubble to burst.
ndespres
We don’t need to “live with this”. We can just not use them, ignore them, or argue against their proliferation and acceptance, as I will continue doing.
namaria
LLMs will go the way of the 'expert systems'. We're gonna wonder why we ever thought that was gonna happen.

I just recommend you don't pigeonhole yourself as an AI professional, because it's gonna be awfully cold outside pretty soon.

gdiamos
Disagree - https://arxiv.org/abs/2406.17642

We cover the halting problem and intractable problems in the related work.

Of course LLMs cannot give answers to intractable problems.

I also don’t see why you should call an answer of “I cannot compute that” to a halting problem question a hallucination.

willcipriano
When will I see AI dialogue in video games? Imagine an RPG where, instead of picking from a series of pre-recorded dialogue options, you could just talk to that villager. If it worked, it would be mind-blowing. The first studio to really pull it off in a AAA game would rake in the cash.

That seems like the lowest hanging fruit to me, like we would do that long before we have AI going over someone's medical records.

If the major game studios aren't confident enough in the tech to have it write dialogue for a Disney character for fear of it saying the wrong thing, I'm not ready for it to do anything in the real world.

bicx
I treat LLMs like fallible beings, the same way I treat humans. I don’t just trust output implicitly, and I accept help with tasks knowing I am taking a certain degree of risk. Mostly, my experience has been very positive with GPT-4o / ChatGPT and GitHub Copilot with that in mind. I use each constantly throughout the day.
irrational
I prefer confabulate over hallucinate.

Confabulate - To fill in gaps in one's memory with fabrications that one believes to be facts.

Hallucinate - To wander; to go astray; to err; to blunder; -- used of mental processes

Confabulation sounds a lot more like what LLMs actually do.

fsndz
We can't get rid of hallucinations. Hallucinations are a feature not a bug. A recent study by researchers Jim Waldo and Soline Boussard highlights the risks associated with this limitation. In their analysis, they tested several prominent models, including ChatGPT-3.5, ChatGPT-4, Llama, and Google’s Gemini. The researchers found that while the models performed well on well-known topics with a large body of available data, they often struggled with subjects that had limited or contentious information, resulting in inconsistencies and errors.

This challenge is particularly concerning in fields where accuracy is critical, such as scientific research, politics, or legal matters. For instance, the study noted that LLMs could produce inaccurate citations, misattribute quotes, or provide factually wrong information that might appear convincing but lacks a solid foundation. Such errors can lead to real-world consequences, as seen in cases where professionals have relied on LLM-generated content for tasks like legal research or coding, only to discover later that the information was incorrect. https://www.lycee.ai/blog/llm-hallucinations-report

data_maan
Just from the way this paper is written (badly, with all kinds of LaTeX errors), my confidence that something meaningful was proved here, that some nice mathematical theory has been developed, is low.

Example: The first 10 pages are meaningless bla

zer00eyz
Shakes fist at clouds... Back in my day we called these "bugs" and if you didn't fix them your program didn't work.

Jest aside, there is a long list of "flaws" in LLMs that no one seems to be addressing: hallucinations, cutoff dates, lack of true reasoning (the parlor tricks to get there don't cut it), size/cost constraints...

LLMs face the same issues as expert systems: without the constant input of subject-matter experts, your LLM quickly becomes outdated and useless for all but the most trivial of tasks.

advael
It's crazy to me that we managed to get such an exciting technology, both theoretically and as a practical tool, and still managed to turn it into a bubbly hype wave because business people want it to be an automation technology, which is just a poor fit for what these models actually do.

It's kind of cool that we can make mathematical arguments for this, but the idea that generative models can function as universal automation is a fiction mostly being pushed by non-technical business and finance people, and it's a good demonstration of how we've let such people drive the priorities of technological development and adoption for far too long.

A common argument I see folks make is that humans are fallible too. Yes, no shit. But no automation anywhere near as fallible as a human at its task could actually function as automation. When we automate, we remove human accountability and human versatility from the equation entirely, and we can scale the error accumulation far beyond human capability. Thus, an automation that actually works needs drastically superhuman reliability, which is why functioning automations are usually narrow-domain machines.
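Back-of-envelope, with assumed numbers rather than anything measured:

    # Fallibility that is tolerable in one accountable human compounds badly
    # once it runs unattended at scale. The numbers below are assumptions.
    per_task_accuracy = 0.99         # hypothetical "roughly human-level" reliability
    unattended_runs = 10_000         # no human reviewing each one
    expected_failures = unattended_runs * (1 - per_task_accuracy)
    print(round(expected_failures))  # ~100 silent failures, with no one positioned to catch them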

lsy
Output from an LLM or any other generative model can only be useful for a purpose or not useful. Creating a generative model that only produces absolute truths (as if this were possible, or there even were such a thing) would make it useless for creative pursuits, jokes, and many of the other purposes to which people want to put these models. You can’t generate a cowboy frog emoji with a perfectly reality-faithful model.

To me this means two things:

1. Generative models can only be helpful for tasks where the user can already decide whether the output is useful. Retrieving a fact the user doesn’t already know is not one of those use cases. Making memes or emojis or stories that the user finds enjoyable might be. Writing pro forma texts that the user can proofread also might be.

2. There’s probably no successful business model for LLMs or generative models that is not already possible with the current generation of models. If you haven’t figured out a business model for an LLM that is “60% accurate” on some benchmark, there won’t be anything acceptable for an LLM that is “90% accurate”, so boiling yet another ocean to get there is not the golden path to profit. Rather, it will be up to companies and startups to create features that leverage the existing models and profit that way rather than investing in compute, etc.

mrjin
LLMs can neither understand nor hallucinate. All LLMs do is pick tokens based on probability. So no matter how plausible the outputs look, the reasons leading to the output are absolutely NOT what we expect them to be. But such an ugly fact cannot be admitted, or the party would be stopped.
danenania
Perplexity does a pretty good job on this. I find myself reaching for it first when looking for a factual answer or doing research. It can still make mistakes but the hallucination rate is very low. It feels comparable to a Google search in terms of accuracy.

Pure LLMs are better for brainstorming or thinking through a task.

jongjong
The hallucinations seem to be related to AI's agreeableness. They always seem to tell you what you want to hear except when it goes against significant social narratives.

It's like LLMs know all possible alternative theories (including contradictory ones), and which one they bring up depends on how you phrase the question and how much you already know about the subject.

The more accurate information you bring into the question, the more accurate information you get out of it.

If you're not very knowledgeable, you will only be able to tap into junior level knowledge. If you ask the kinds of questions that an expert would ask, then it will answer like an expert.

Animats
Oh, not that again. Didn't we see this argument about three weeks ago?

A 100% correct LLM may be impossible. An LLM checker that produces a confidence value may be possible. We sure need one. Although last week's proposal for one wasn't very good.

When someone says something practical can't be done because of the halting problem, they're probably going in the wrong direction.

The authors are all from something called "UnitedWeCare", which offers "AI-Powered Holistic Mental Health Solutions". Not sure what to make of that.

badsandwitch
Due to the limitations of gradient descent and training data, we are limited in the architectures that are viable. All the top LLMs are decoder-only for efficiency reasons, and all models train on the production of text because we are not able to train on the thoughts behind the text.

Something that often gives me pause is the consideration that it may actually be possible to come up with an architecture that has a good chance of being capable of AGI (RNNs, transformers, etc. as dynamical systems), but the model weights that would allow it to happen cannot be found, because gradient descent will fail or not even be viable.

seydor
Isn't that obvious without invoking Gödel's theorem, etc.?
rapatel0
Been saying this from the beginning. Let's look at the comparator: a human result.

What is the likelihood that a junior college student with access to Google will generate a "hallucination" after reading a textbook and doing some basic research on a given topic? Probably pretty high.

In our culture, we're often told to fake it till you make it. How many of us are probabilistically hallucinating knowledge we've regurgitated from other sources?

zyklonix
We might as well embrace them: https://github.com/DivergentAI/dreamGPT
reliableturing
I’m not sure what this paper is supposed to prove and find it rather trivial.

> All of the LLMs knowledge comes from data. Therefore,… a larger more complete dataset is a solution for hallucination.

Not being able to include everything in the training data is the whole point of intelligence. This also holds for humans. A sufficiently intelligent system should be able to infer new knowledge, refuting the very first assumption at the core of the work.

99112000
Humans try to replicate the human brain in software and are surprised it sometimes spits out dumb things a human could have said.
renjimen
Models are often wrong but sometimes useful. Models that provide answers couched in a certain level of confidence are miscalibrated when all answers are given confidently. New training paradigms attempt to better calibrate model confidence in post-training, but clearly there are competing incentives to give answers confidently given the economics of the AI arms race.
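A minimal sketch of the calibration idea, with made-up numbers:

    # Stated confidence vs. how often the model was actually right.
    answers = [  # (model's stated confidence, was the answer correct?)
        (0.95, True), (0.95, False), (0.95, True), (0.95, False), (0.95, True),
    ]
    avg_stated_confidence = sum(c for c, _ in answers) / len(answers)
    observed_accuracy = sum(ok for _, ok in answers) / len(answers)
    print(round(avg_stated_confidence, 2), observed_accuracy)  # 0.95 vs 0.6: confidently miscalibrated
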
throwawaymaths
LLMs hallucinate because the probs -> tokens step erases confidence values, and it's difficult to assign confidences to strings of tokens, especially if you don't know where to start and stop counting (one word? one sentence?).

Is there a reason to believe this is not solvable as literally an API change? The necessary data are all there.
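For what it's worth, many APIs already expose per-token log-probabilities; here is a rough sketch (hypothetical numbers) of the naive aggregation, which runs straight into the "where do you start and stop counting" problem:

    import math

    # Hypothetical per-token log-probabilities for one short claim in an answer.
    token_logprobs = [-0.05, -0.10, -0.02, -0.30]

    claim_logprob = sum(token_logprobs)    # joint log-probability of the span
    claim_confidence = math.exp(claim_logprob)
    print(round(claim_confidence, 2))      # ~0.6 -- a usable signal, but deciding where the
                                           # "claim" starts and stops is exactly the hard part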

nybsjytm
Does it matter that, like so much in the Math for AI sphere, core details seem to be totally bungled? e.g. see various comments in this thread https://x.com/waltonstevenj/status/1834327595862950207
pkphilip
Hallucinations in LLMs will severely affect their usage in scenarios where such hallucinations are completely unacceptable, and there are many such scenarios. This is a good thing because it will mean that human intelligence and oversight will continue to be needed.
aptsurdist
But humans also hallucinate.

And humans habitually stray from the “truth” too. It's always seemed to me that getting AI to be more accurate isn't a math problem; it's a matter of getting AI to "care" about what is true, aka better defining what truth is, aka what sources should be cited with what weights.

We can’t even keep humans in society from believing in the stupidest conspiracy theories. When humans get their knowledge from sources indiscriminately, they also parrot stupid shit that isn’t real.

Now enter Gödel’s incompleteness theorem: there is no perfect tie between language and reality. Super interesting. But this isn’t the issue. Or at least it’s not more of an issue for robots than it is for humans.

If/when humans deliver “accurate” results in our dialogs, it’s because we’ve been trained to care about what is “accurate” (as defined by society’s chosen sources).

Remember that AI “doesn’t live here.” It’s swimming in a mess of noisy context without guidance for what it should care about.

IMHO, as soon as we train AI to “care” at a basic level about what we culturally agree is “true” the hallucinations will diminish to be far smaller than the hallucinations of most humans.

I’m honestly not sure if that will be a good thing or the start of something horrifying.

russfink
Sounds like a missed STTNG story line. I can imagine that such a “Data,” were we ever to build one, would hallucinate from time to time.
nailuj
To draw a tired comparison to human thinking: you can conceive of it as hallucinations too; we just have another layer behind the hallucinations that evaluates each one and tries to integrate them with what we believe to be true. You can observe this when you're about to fall asleep or are snoozing: sometimes you go down wild thought paths until the critical-thinking part of your brain kicks in with "everything you've been thinking about these past 10 seconds is total incoherent nonsense". Dream logic.

In that sense, a hallucinating system seems like a promising step towards stronger AI. AI systems simply lack a way to test their beliefs against the real world in the way we can, so natural laws, historical information, art, and fiction all exist on the same epistemological level. This is a problem when integrating them into a useful theory, because there is no cost to getting the fundamentals wrong.

m3kw9
They did say each token is generated using probability, not certainty; given that, there is a chance it produces wrong tokens.
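Rough arithmetic with assumed numbers shows how that per-token chance compounds over a long answer:

    # Assumed numbers: even a tiny per-token error rate compounds over a long answer.
    p_token_ok = 0.999                # hypothetical chance each sampled token is "right"
    answer_length = 500               # tokens in one long answer
    print(round(p_token_ok ** answer_length, 2))  # 0.61 -- a ~39% chance something in it is off
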
mxwsn
OK - there's always a nonzero chance of hallucination. There's also a nonzero chance that macroscale objects can do quantum tunnelling, but no one is arguing that we "need to live with this" fact. A theoretical proof of the impossibility of reaching 0% probability of some event is nice, but in practice it says little about whether or not we can exponentially decrease the probability of it happening, to effectively mitigate risk.
mrkramer
So hallucinations are something like cancer: you will get it sooner or later; in other words, it is inevitable.
ramshanker
A few humans (all?) will always hallucinate, and we already live with this. ;)
TMWNN
How goes the research on whether hallucinations are the AI equivalent of human imagination, or daydreaming?
treebeard901
If everyone else can hallucinate along with it then problem solved.
rw_panic0_0
since it doesn't have emotions I believe
jmakov
Maybe it's time for getting synth minds from guessing to reasoning.
cynicalpeace
Humans do it too. We just call it “being wrong”
OutOfHere
This seems to miss the point, which is how to minimize hallucinations to a desirable level. Good prompts refined over time can minimize hallucinations by a significant degree, but they cannot fully eliminate them.
reilly3000
Better give them some dried frog pills.
_cs2017_
Useless trash paper. It's like saying any object can disappear and reappear anywhere in the universe due to quantum physics, so there's no point studying physics or engineering. Just maybe we care if the probability of that happening is 10%, 0.00001%, or 1e-1000%.