• Modern_medicine_isnt@lemmy.world · 2 days ago

    It’s easy: just ask the AI “are you sure?” until it stops changing its answer.

    But seriously, LLMs are just advanced autocomplete.

    • jj4211@lemmy.world · 1 day ago

      I kid you not, early on (mid-2023) some guy mentioned using ChatGPT for his work and not even checking the output (he was in some sort of non-techie field that was still in the wheelhouse of text generation). I expressed that LLMs can include some glaring mistakes, and he said he fixed it by always including in his prompt: “Do not hallucinate content and verify all data is actually correct.”

      • Passerby6497@lemmy.world · 1 day ago

        Ah, well then, if he tells the bot not to hallucinate and to validate its output, there’s no reason not to trust it. After all, you told the bot not to, and we all know that self-regulation works without issue all of the time.

        • jj4211@lemmy.world · 1 day ago

          It gave me flashbacks when the Replit guy complained that the LLM deleted his data despite being told multiple times, in all caps, not to.

          People really really don’t understand how these things work…

          • Modern_medicine_isnt@lemmy.world · 1 day ago

            The people who make them don’t really understand how they work either. They know how to train them and how the software works, but they don’t really know how a model arrives at the answers it gives. They just do a ton of trial and error. Correlation is all they really have. Which of course is how a lot of medical science works too, so they’re in good company.

    • Lfrith@lemmy.ca · 2 days ago

      They can even get math wrong, which surprised me. I had to tell it the answer was wrong before it recalculated and got the correct one. It was just simple percentages of a list of numbers I had given it.

      • jj4211@lemmy.world · 1 day ago

        Fun thing: when it gets the answer right, tell it it was wrong and watch it apologize and “correct” itself into giving the wrong answer.

      • GissaMittJobb@lemmy.ml · 1 day ago

        Broadly speaking, language models are unsuitable for math problems. We already have good technology for that category of problem. Luckily, you can combine the two: prompt the model to write a program that solves your math problem, then execute it. You’re likely to see a lot more success with that approach.
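
        A minimal sketch of that pattern, assuming an OpenAI-style chat client (the model name, prompt wording, and the blind exec() are illustrative only; a real setup would sandbox the generated code):

        ```python
        # Ask the model to *write* the math, then let Python do the arithmetic.
        import re
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment

        question = "What percentage of the total is each of these numbers: 12, 30, 58?"

        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system",
                 "content": "Answer by writing one self-contained Python program "
                            "that prints the result. Reply with only a fenced "
                            "Python code block."},
                {"role": "user", "content": question},
            ],
        )

        match = re.search(r"```(?:python)?\n(.*?)```",
                          resp.choices[0].message.content, re.DOTALL)
        if match:
            exec(match.group(1))  # the arithmetic happens in Python, not in the LLM
        else:
            print("No code block returned:", resp.choices[0].message.content)
        ```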

        • jj4211@lemmy.world · 1 day ago

          Also, the best LLM interfaces generally combine non-LLM facilities transparently. The LLM translates the prose into whatever format the math engine expects, then an intermediate layer recognizes a tag, submits that excerpt to the math engine, and substitutes the chunk with the engine’s output.
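
          A toy version of that intermediate layer, assuming the model has been told to wrap any arithmetic in a made-up <calc>…</calc> tag (the tag format and using plain Python as the “math engine” are assumptions for illustration):

          ```python
          # Toy middleware: find <calc>...</calc> spans in the model's output and
          # replace each one with the result from a real math engine (here, a tiny
          # safe evaluator for + - * / on numbers).
          import ast
          import operator
          import re

          _OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
                  ast.Mult: operator.mul, ast.Div: operator.truediv}

          def _eval(node):
              """Evaluate a numeric expression AST limited to + - * / and literals."""
              if isinstance(node, ast.Expression):
                  return _eval(node.body)
              if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                  return node.value
              if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
                  return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
              raise ValueError("unsupported expression")

          def resolve_math(llm_text: str) -> str:
              """Substitute every <calc>expr</calc> span with its computed value."""
              def repl(m):
                  return str(_eval(ast.parse(m.group(1), mode="eval")))
              return re.sub(r"<calc>(.*?)</calc>", repl, llm_text)

          # The LLM only turns prose into an expression; the layer does the math.
          print(resolve_math("12 is <calc>12 / (12 + 30 + 58) * 100</calc>% of the total."))
          # -> 12 is 12.0% of the total.
          ```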

          Even when servicing a request to generate an image, the text-generation model runs independently of the image generation, and an intermediate layer combines them. That can cause fun disconnects, like the guy asking for a completely full glass of wine. The text half is completely oblivious to the image half, so it responds playing the role of a graphic artist dutifully doing the work without ever ‘seeing’ the image, assuming the image is good because that’s consistent with its training data. Then the user corrects it, it admits the picture (which it never ‘looked’ at) was wrong, and it retries the image generator with the added context, only to produce a similarly botched picture.
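
          A crude sketch of that split, with made-up stand-ins for both models (nothing here is a real API), just to show that the text half never receives the image it claims to have produced:

          ```python
          # Stand-in orchestration showing why the text half never "sees" the picture.
          # Both model functions are hypothetical placeholders, not a real API.

          def text_model(prompt: str) -> str:
              """Pretend LLM: writes a confident reply but returns no pixels."""
              return "Here is your image of a completely full glass of wine!"

          def image_model(prompt: str) -> bytes:
              """Pretend diffusion model: returns image bytes the LLM never inspects."""
              return b"...png bytes..."

          def handle_request(user_prompt: str):
              reply = text_model(user_prompt)   # text half role-plays having drawn it
              image = image_model(user_prompt)  # image half draws whatever it draws
              return reply, image               # nothing ever checks that the two agree

          reply, image = handle_request("a completely full glass of wine")
          print(reply)  # claims success regardless of what the image actually shows
          ```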

      • saimen@feddit.org · 1 day ago

        I once gave it a math problem of sorts (how to break down a certain amount of money into bills), and the LLM wrote a Python script for it, ran it, and thus gave me the correct answer. Kind of clever, really.
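
        The script it produces for that kind of question is typically a few lines of greedy division, something like this (the denominations and amount here are illustrative assumptions):

        ```python
        # Greedy breakdown of an amount into bills -- the sort of short script an LLM
        # tends to write for this question. Denominations and amount are illustrative.
        DENOMINATIONS = [100, 50, 20, 10, 5, 1]  # assumed bill sizes

        def break_into_bills(amount: int) -> dict[int, int]:
            """Return {bill: count}, using the largest bills first."""
            breakdown = {}
            for bill in DENOMINATIONS:
                count, amount = divmod(amount, bill)
                if count:
                    breakdown[bill] = count
            return breakdown

        print(break_into_bills(287))  # -> {100: 2, 50: 1, 20: 1, 10: 1, 5: 1, 1: 2}
        ```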