• hanni@lemmy.ml · 3 days ago

    Maybe design the AI to be honest and admit that it is not sure or doesn’t know?

    Edit: thank you for all your interesting and thorough answers.

    • drspod@lemmy.ml · 3 days ago

      The problem is that an LLM is a language model, not an objective reality model, so the best it can do is estimate the probability of a particular sentence appearing in the language, but not the probability that the sentence represents a true statement according to our objective reality.

      They seem to think that they can use these confidence measures to filter the output when it is not confident of being correct, but there are an infinite number of highly probable sentences in a language which are false in reality. An LLM has no way of distinguishing between unlikely and false, or between likely and true.
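
      To make that concrete, here's a rough sketch of what "probability of a sentence" means to a language model (GPT-2 is used purely because it's small and public; the sentences and model choice are just illustrative). A fluent false sentence scores about as well as a fluent true one, because the model only measures how plausible the text is as language.

      ```python
      # Rough illustration: a language model scores fluency, not truth.
      # GPT-2 is used only as an example; any causal LM behaves the same way.
      import torch
      from transformers import AutoTokenizer, AutoModelForCausalLM

      tok = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModelForCausalLM.from_pretrained("gpt2")
      model.eval()

      def avg_log_prob(sentence: str) -> float:
          """Average log-probability per token the model assigns to the sentence."""
          ids = tok(sentence, return_tensors="pt").input_ids
          with torch.no_grad():
              # With labels=input_ids the model returns the mean cross-entropy,
              # i.e. the negative average log-probability of the tokens.
              loss = model(ids, labels=ids).loss
          return -loss.item()

      true_sentence = "Paris is the capital of France."
      false_sentence = "Lyon is the capital of France."

      # Both are perfectly plausible English; the score says nothing about which one is true.
      print(avg_log_prob(true_sentence))
      print(avg_log_prob(false_sentence))
      ```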

      • nymnympseudonym@lemmy.world · 3 days ago

        language model, not an objective reality model

        Sort of. It’s a generic prediction model. Multi-modal models work the same way as text-only models in this sense.

        So do organic brains.

        Right now, you are “hallucinating” most of your visual field outside the fovea centralis.

        This aspect of your conscious perceptual system is exactly the same kind of high-dimensional interpolation that ML neural networks do.

    • brucethemoose@lemmy.world · 3 days ago

      Maybe design the AI to be honest and admit that it is not sure or doesn’t know?

      That’s literally what it does!

      Under the hood, LLMs output 1 ‘word’ at a time.

      Except they don’t. What they actually output is a probability for every one of the thousands of words in their vocabulary of being the next word in the block of text they’re given. It’s literally just 30% "and", 20% "but", 5% "uh" and so on, for thousands of words.

      In other words, for literally every word, they’re spitting out ‘here’s a table of what I think is most likely the next word, with this confidence.’
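
      You can look at that table yourself with any open-weights model. A quick sketch (GPT-2 here only because it’s tiny; any causal LLM behaves the same way):

      ```python
      # Peek at the per-token probability table an LLM actually produces.
      # GPT-2 is used only because it's small; the prompt is just an example.
      import torch
      from transformers import AutoTokenizer, AutoModelForCausalLM

      tok = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModelForCausalLM.from_pretrained("gpt2")
      model.eval()

      prompt = "The capital of France is"
      ids = tok(prompt, return_tensors="pt").input_ids

      with torch.no_grad():
          logits = model(ids).logits[0, -1]   # a score for *every* token in the vocab
      probs = torch.softmax(logits, dim=-1)   # turn scores into a probability table

      top = torch.topk(probs, k=10)
      for p, i in zip(top.values, top.indices):
          print(f"{p.item():6.2%}  {tok.decode([int(i)])!r}")
      # One row per candidate next token, with the model's 'confidence' in it.
      ```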

      Thing is:

      • This is hidden from users, because the OpenAI standard and such is to treat users like children with a magic box instead of giving them a peek under the hood.

      • The ‘confidence’ is per word, not for the whole answer.

      • It’s just a numerical model. The ‘confidence’ is simply a guess; it doesn’t really know, and it has no way to reason out its own correctness.

      • What’s more, there’s no going back. If an LLM gets a word obviously ‘wrong,’ it has no choice but to roll with it like an improv actor. It has no backspace button. The only sort-of exception is a reasoning block, where it can follow up an error with a ‘No, wait…’

      • This output is randomly ‘sampled’, so the most likely prediction isn’t even always chosen! It literally means that even if the LLM gets an answer right, there’s a chance the wrong answer will appear from a pure roll of the dice, which is something OpenAI does not like to advertise (see the sketch after this list).

      • This all seems stupid, right? It is! There are all sorts of papers on alternatives to sampling, on self-correction, or on getting away from autoregressive architectures entirely, all mostly ignored by the Big Tech offerings you see. There are even ‘oldschool’ sampling methods like beam search or answer trees that have largely been forgotten, because they aren’t orthodoxy anymore.
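
      And here’s the sampling point in miniature, with made-up probabilities (plain Python, no model needed):

      ```python
      # Toy illustration of the sampling bullet above: the most likely token
      # isn't always the one that gets picked. The probabilities are made up.
      import random

      next_token_probs = {"Paris": 0.55, "Lyon": 0.25, "Marseille": 0.15, "uh": 0.05}

      def greedy(probs):
          # Always take the single most likely token.
          return max(probs, key=probs.get)

      def sample(probs, temperature=1.0):
          # Re-weight by temperature, then roll the dice.
          weights = [p ** (1.0 / temperature) for p in probs.values()]
          return random.choices(list(probs.keys()), weights=weights, k=1)[0]

      print(greedy(next_token_probs))                     # 'Paris' every time
      print([sample(next_token_probs) for _ in range(10)])
      # With sampling, 'Lyon' or even 'uh' will occasionally come out purely by
      # chance, even though 'Paris' is the model's top guess.
      ```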

      EDIT: If you want to see this for yourself, see mikupad: https://github.com/lmg-anon/mikupad

      Or its newer incarnation in ik_llama.cpp: https://github.com/ikawrakow/ik_llama.cpp

      Its UI will show all the ‘possible tokens’ of every word as well as highlight the confidence of what was chosen, with this example showing a low-probability word that was randomly picked. It won’t work with OpenAI, of course, as they now hide the output’s logit probabilities for ‘safety’ (aka being anticompetitive Tech Bro jerks).

      • INeedMana@piefed.zip · 2 days ago

        Wait, so tokens aren’t just “2 to 4 character” chunks cut from the input as it comes in anymore? They can be whole words too?

        • brucethemoose@lemmy.world · 2 days ago

          Pretty much. And more.

          “The end.”

          Might be a mere 3 tokens total:

          ‘“The ‘ ‘end.”’ ‘\n\n’

          I don’t know about ClosedAI, but the Chinese models in particular (like Qwen, GLM and Deepseek) went crazy optimizing their tokenizers for English, Chinese, or code, with huge vocabs for common words/phrases and even common groupings of words + punctuation/spacing as single tokens. It makes the models more efficient, as the same text counts as far fewer tokens.

          “About 1 token per word” is a decent estimate for a block of text, even including spaces and punctuation.
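
          You can check this with any open tokenizer. A quick sketch (GPT-2’s tokenizer, just because it’s small and public; the newer Chinese tokenizers pack even more text into each token):

          ```python
          # Quick check of how text maps to tokens. GPT-2's tokenizer is used here
          # only as an example; modern tokenizers (Qwen, GLM, Deepseek, ...) have
          # much larger vocabularies and pack even more text into single tokens.
          from transformers import AutoTokenizer

          tok = AutoTokenizer.from_pretrained("gpt2")

          for text in ["The end.", "About 1 token per word is a decent estimate."]:
              tokens = tok.tokenize(text)
              print(len(tokens), tokens)
          # Whole words (with their leading spaces) routinely come out as single
          # tokens, so token count stays close to word count for ordinary English.
          ```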

    • Voroxpete@sh.itjust.works · 3 days ago

      There’s some really good answers here already, but I want to try to key in on one part of your question in particular to try to convey why this idea just fundamentally doesn’t work.

      The problem, put very simply, is that the AI never, ever “knows” anything. For it to be able to admit when it doesn’t know, it would first have to have the ability to know things, and to discern the difference between knowing and not knowing.

      This is what I’ve been getting at with something I’ve been saying for a while now; LLMs don’t hallucinate some answers, they hallucinate every answer.

      An LLM is basically a mathematical model whose job is to create convincing bullshit. When that bullshit happens to align with reality, we humans go “Wow, that’s amazing, how did it know that?” and when it happens to not align we go “Stupid machine hallucinated again.” But this is just our propensity for anthropomorphism at work.

      In reality what’s happening is closer to how “psychics” do their shtick. I can say “I’m sensing that someone here recently lost a loved one” and it looks like I have supernatural powers, but really I’m just playing the odds. The only difference is that the psychic knows they’re bullshitting. The AI doesn’t, because it does not have a mind, it cannot think, so there is nothing there to perceive the concept of objective reality at all. It’s just a really, really large bingo ball tumbler spitting out balls.

      It’s really hard to get your head around this, because LLMs fucking crush the Turing test; it really does feel like we’re talking, if not to a human, then at least to a machine that is capable of thought. Typing a question and getting a meaningful answer back makes it really hard to digest that we’re having a conversation with a machine that has no more capacity for thought than a deck of cards.

      • tiramichu@sh.itjust.works · 3 days ago

        Exactly.

        When the predictive text gives the right answer we label it “fact”

        When the predictive text gives the wrong answer we label it “hallucination”

        Both were arrived at by the exact same mechanism. It’s not a hallucination in the sense that “something went wrong in the mechanism” - both good and bad outputs are functionally identical. It’s only a hallucination because that’s what we humans - as actually thinking creatures - decided to name the outputs we don’t like.

    • Ech@lemmy.ca · 3 days ago

      That’s the crux of the issue - it’s not AI. They’re not “sure” of anything. They don’t know anything. That’s why they can’t be modified to look like they do.

      “Hallucinating” is what LLMs were built to do. At their very core that’s what they still do and, without a ground-up redesign, that’s what they’ll do forever.

    • skisnow@lemmy.ca · 3 days ago

      Despite what OP and most of the comments here would have you believe, that is actually the crux of what was in OpenAI’s recent paper. They observed that most benchmarks and loss functions used for LLMs had a lower penalty overall for guessing than for admitting ignorance, and called for this to change across the industry.

      • JcbAzPx@lemmy.world · 3 days ago

        I suppose answering “I don’t know” to every prompt is at least more accurate than what we have now, but I don’t think they’ll want to risk that.

        • skisnow@lemmy.ca · 2 days ago

          Of course. What the paper is suggesting is that during training and evaluation you should reward correct answers, punish wrong answers, and treat abstentions as somewhere in between. Current benchmarks punish abstentions and wrong answers equally, so models that guess instead of abstaining score higher on average.
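
          A back-of-the-envelope sketch of why that matters (the score values here are illustrative, not the paper’s exact numbers):

          ```python
          # Why typical benchmark scoring encourages guessing. Scores are illustrative,
          # not the exact values from OpenAI's paper.
          def expected_scores(p_correct, reward_right, penalty_wrong, score_abstain):
              # Expected score if the model guesses vs. if it abstains.
              guess = p_correct * reward_right + (1 - p_correct) * penalty_wrong
              return guess, score_abstain

          p = 0.3  # the model has, say, a 30% chance of being right

          # Current-style scoring: wrong answers and abstentions both score 0.
          print(expected_scores(p, reward_right=1, penalty_wrong=0, score_abstain=0))
          # (0.3, 0) -> guessing always wins on average

          # Proposed-style scoring: wrong answers are penalised, abstaining is neutral.
          print(expected_scores(p, reward_right=1, penalty_wrong=-1, score_abstain=0))
          # (-0.4, 0) -> saying "I don't know" now scores better than a long-shot guess
          ```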