Multiple things have gone wrong with AI for me, but these two pushed me over the edge. This is mainly about LLMs, though other AI hasn't been particularly helpful for me either.
Case 1
I was trying to find the music video a screenshot was taken from.
I gave o4 mini the image and asked where it was from. It refused, saying it doesn't discuss private details. Fair enough. I told it the artist was xyz. It then listed three of their popular music videos, none of which was the correct answer to my question.
Then I started a new chat and described the screenshot in detail. It regurgitated similar answers.
I gave up. I did a simple reverse image search and found the answer in 30 seconds.
Case 2
I wanted to create a spreadsheet for tracking investments with xyz columns.
It did give me the correct columns and rows, but the formulae for the calculations were off. They were almost correct most of the time, and almost correct is useless when you're working with money.
I gave up. I manually made the spreadsheet with all the required details.
Why are LLMs wrong so much of the time? Aren't they trained on high-quality data from multiple sources? I just don't understand the point of even making this software if all it can do is sound smart while being wrong.
The first time I ever used one, I got a bugged response. I asked it for a short summary of the 2022 Super Bowl, and it told me Patrick Mahomes won the Super Bowl with a field goal kick.
Now, those two things separately are true. Mahomes won. The game was won on a field goal.
The LLM just estimates which words are most probable given its training data, not whether the resulting sentence is true, and it smashed two true statements together because that looked like the most probable response.
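That splicing behavior can be sketched with a toy next-word model. This is a deliberately crude bigram sketch, not how a real LLM works, and the corpus sentences are invented for illustration (don't treat them as game facts):

```python
from collections import defaultdict

# Invented "training" sentences; each one is internally coherent on its own.
corpus = [
    "mahomes won the super bowl",
    "the rams won the super bowl with a field goal",
    "they took the super bowl with a last second drive",
]

END = "<end>"

# Count which word follows which across the corpus.
counts = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split() + [END]
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1

def generate(start, max_len=20):
    """Greedily pick the most frequent next word at each step."""
    out = [start]
    while out[-1] != END and len(out) < max_len:
        followers = counts[out[-1]]
        out.append(max(followers, key=followers.get))
    return " ".join(w for w in out if w != END)

print(generate("mahomes"))
```

Starting from "mahomes", the most-probable-next-word walk drifts through "won the super bowl" and then into the "with a field goal" continuation from a different sentence, producing a fluent claim that appears in none of the training sentences. Every step was locally probable; the whole was false.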
It does that a lot. Don't use GenAI without checking its output.
I have noticed that it is terrible when you know at least a little about the topic.
Ooh! Now do the press!
Or, more accurately: AI is terrible all the time, but it's easier to notice when you know at least a little about the topic.