Ask HN: Strategies to Reduce AI Hallucinations?

petercooper

My longtime favorite prompt to trigger a hallucination was "Did King Henry VIII have any grandchildren?" Famously, he did not, but almost every model, till quite recently, would answer yes, often with the most bizarre reasoning.

The way to resolve it on most models over a certain size is a common tactic used with LLMs: ask the LLM to "think through your answer first". For example, you have a system prompt akin to: "Before answering, think through the facts and brainstorm about your eventual answer in <thinking>..</thinking> tags. Answer only 'yes' or 'no' in <answer>..</answer> tags. Do not include any other text in the answer."

In my current evals (based around numerous similar tricky factual questions) this tactic works on all but the smallest and least proficient models (since they don't tend to have strong enough factual knowledge to think it through). Forcing the models to answer simply 'yes' or 'no' yields only a correct answer on the SOTA models (but some speculate GPT-4o might actually be doing this sort of 'thinking' process on the backend automatically anyway).

pizza

Lilian Weng's blog - a goldmine - has an in-depth post on this: https://lilianweng.github.io/posts/2024-07-07-hallucination/. She leads safety and alignment at OpenAI so might be worth checking out :^)

dheera

Explicitly allow it the option to be unsure, e.g. "If you do not know the answer, respond with 'none'" or "If you are unsure of the answer, just say that", etc.

Otherwise it does what humans do when asked interview questions, they bullshit because if you bullshit is a 20% chance of landing the job, whereas if you say "I don't know" there is a 0% chance of landing the job. The kind of RLHF training that was put into ChatGPT probably replicates a similar reward structure.

jumploops

1. Give examples of the format of response(s) you want

2. Explicitly call out null conditions (e.g. return { “results”: [] })

3. Use multiple prompts, one to “think”/explain and then one to transform the result

4. Don’t use function calling to get structured output, just use JSON mode

One non-obvious trick we use is to tell the LLM what it said previously as a system messages, not just as user messages, even if the LLM didn’t actually output that specific text.

keiferski

There are various methods that work at lessening the amount of hallucinations, but in general I think it’s much more productive to use a Generate then Verify approach. If the information you’re creating is both important and novel to you (I.e., you can’t tell if it’s correct or not on your own) then you need the verification step.

HenryBemis

I am asking 'it' to validate the answers, and list the details. I.e. when I am asking it to go through some framework or regulation, and I am asking it to list the "technical controls that can be derived from the text" (e.g. law saw "you need to encrypt" thus an internal control is to "regularly check for encryption, blah blah blah".

So I am asking 'it' to create a table (instead of just a list of questions) that would include: 1a) suggested control 1b) example of evidence that would quality/pass/fail the control 2) article of law (i.e. Article 5 paragraph 10) 3) short quote from the article

Then I ask it to check its own output, and change 3) to add the full text/the whole paragraph.

99% is correct, and it is easier to scroll and see with my own eyes that the 'paragraph' is the same

constantinum

One strategy(not directly related to ChatGPT) is to use two models, one for extraction/generation and the other "challenger" to verify the extracted answer. Refer: https://docs.unstract.com/editions/cloud_edition#llmchalleng...

planb

If you are talking about interactive ChatGPT sessions, if you suspect that it hallucinated, just tell it to "confirm that via a web search".

llm_trw

Feed it grounding text that's about as long as the output text you expect it to produce.

They are called transformers for a reason.

msnkarthik

All answers appreciated, but how do you send so much of context when communicating with GPT via an API and not directly through chat. Wondering this for a B2B saas use-case.

OutOfHere

First, start with a good model. GPT-4-(Turbo), which is available in the paid subscription, should hallucinate less than GPT-4o.

more_corn

Look up the facts and dump them into the context window.

aaron695

[dead]

crabhorse

[dead]

JSDevOps

“please don’t take drugs because chatting to me is mind numbingly boring. Thanks”