They can even get math wrong, which surprised me. I had to tell it the answer was wrong before it would recalculate and get the correct answer. I had asked for simple percentages of a list of numbers.
Fun thing: when it gets the answer right, tell it it was wrong and watch it apologize and “correct” itself into giving the wrong answer.
In my experience it can, but it’s been pretty uncommon. Then again, I don’t usually ask questions with only one answer.
Language models are unsuitable for math problems, broadly speaking. We already have good technology for that category of problem. Luckily, you can combine the two: prompt the model to write a program that solves your math problem, then execute it. You’re likely to see a lot more success with that approach.
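As a sketch of that approach, for the percentage question above: the model emits a small program like this (values here are made up for illustration), and you run the program instead of trusting the model’s arithmetic.

```python
# Sketch of "let the model write the program": the model produces code like
# this, and the caller executes it rather than trusting the model's arithmetic.
# The numbers are illustrative placeholders.
numbers = [120.0, 75.5, 310.0, 42.25]

total = sum(numbers)
percentages = [n / total * 100 for n in numbers]

for n, pct in zip(numbers, percentages):
    print(f"{n:>8.2f} -> {pct:6.2f}%")
print(f"   total -> {sum(percentages):.2f}%")  # sanity check, should be ~100%
```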
Also, the best interfaces for LLMs will generally combine non-LLM facilities transparently. The LLM can translate the prose into whatever format the math engine expects; an intermediate layer then recognizes a tag, submits that excerpt to the math engine, and substitutes the chunk with the engine’s output.
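A minimal sketch of that intermediate layer, assuming a made-up `<math>...</math>` tag that the model is prompted to emit, with a tiny expression evaluator standing in for a real math engine:

```python
import ast
import operator
import re

# Hypothetical tag format the LLM is prompted to emit, e.g.
# "The answer is <math>240 * 0.175</math>."
MATH_TAG = re.compile(r"<math>(.*?)</math>", re.DOTALL)

# Tiny arithmetic evaluator standing in for a real math engine.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def _eval(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp):
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp):
        return _OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")

def expand_math_tags(llm_output: str) -> str:
    """Replace each <math>expr</math> chunk with the math engine's result."""
    def substitute(match):
        expr = ast.parse(match.group(1), mode="eval").body
        return str(_eval(expr))
    return MATH_TAG.sub(substitute, llm_output)

print(expand_math_tags("17.5% of 240 is <math>240 * 0.175</math>."))
# -> "17.5% of 240 is 42.0."
```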
Even when servicing a request to generate an image, the text generation model runs independently of the image generation, and an intermediate layer combines them. That can cause fun disconnects, like the guy asking for a full glass of wine. The text generation half is completely oblivious to the image generation half. It responds in the role of a graphic artist dutifully doing the work without ever ‘seeing’ the image, and it assumes the image is good because that’s consistent with its training data. Then the user corrects it, and it admits the picture (which it never ‘looked’ at) was wrong and retries the image generator with the added context, producing a similarly botched picture.
I once gave it some kind of math problem (how to break a certain amount of money down into bills), and the LLM wrote a Python script for it, ran it, and thus gave me the correct answer. Kind of clever, really.
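The script it writes for that kind of problem is probably something like this greedy breakdown, largest bill first (the amount and denominations here are just illustrative):

```python
# Greedy breakdown of a dollar amount into bills, largest denomination first.
# A sketch of the kind of script the model might generate; values are illustrative.
DENOMINATIONS = [100, 50, 20, 10, 5, 1]

def break_into_bills(amount: int) -> dict[int, int]:
    bills = {}
    for denom in DENOMINATIONS:
        count, amount = divmod(amount, denom)
        if count:
            bills[denom] = count
    return bills

print(break_into_bills(387))
# -> {100: 3, 50: 1, 20: 1, 10: 1, 5: 1, 1: 2}
```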