I burned down a forest to confirm
Don’t ask it to name an NFL team that doesn’t end with ‘s’
DeepSeek eventually gets it, but its DeepThink takes a good ten minutes of racing ‘thoughts’ and loops to figure it out.
What else is new? It also shit the bed when you asked it how many R’s are in the word strawberry.
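For what it’s worth, the check it keeps fumbling is a one-liner at the character level (plain Python, nothing model-specific):

```python
# The letter count the model famously flubs is trivial once you work on
# characters instead of tokens -- the failure is in the tokenized input,
# not in the difficulty of the task.
word = "strawberry"
print(word.count("r"))  # 3
```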
If that is what takes our jerbs we are all more fucked than we thought.
I don’t remember what book it was, but in some sci-fi future genetically engineered pigeons took the jobs of secretaries and other computer tasks. I could see ChatGPT taking our jobs even if it’s terrible and fails at tasks all the time, the capitalist will at least be relieved he isn’t giving money to the poors.
That’s likely an inevitable part of it. Labor is the most expensive part, so the move is to remove the human, use AI once it becomes sophisticated enough (or just train pigeons), and shrug about the error rate (those infernal pigeons! We’re doing the best we can to train them, but alas! Errors!) no matter what that error rate costs. Just wait til your prescription for insulin is approved via pigeon and instead of insulin you get heparin.
Mortalkombatwhoopsie.mp3
I love how the entire economy hinges on taking a machine that is able to statistically approximate a mediocre plagiarism of different styles of text and insisting that it is an omniscient magic fairy capable of complex reasoning.
I use AI a lot for my programming hobby because I don’t have anyone to work with, my stack is fairly unique/new, and StackOverflow is dead. Over the past 2 years, this is also my assessment. AI does a very bad job at guesstimating a close-to-correct response. It’s almost never correct, and if it is, it’s the most inefficient way of being correct. It’s plagued with hallucinations, and the fact that whole industries are not only replacing programmers way smarter than me with it, but relying on “Token Plinko” for decision making, is truly terrifying.
Not to sound too doomer, but this AI bubble isn’t just going to pop, it’s going to implode. And it’s probably either gonna take whole sectors down with it or it’s gonna be absolute hell on companies restructuring and going back to human labor.
ETA: If you ask ChatGPT to rewrite a chunk of text without any em dashes, it will keep them in and either reword the rest or just spit out the exact same thing you fed it. It’s a nice ironic bit of info I stumbled across.
I use it for code too and I’ve noticed the same problems. Sometimes it does really help (saves me a search on StackOverflow), but other times it gives me odd spaghetti code or misunderstands the question, or, like you said, it does something in an odd and inefficient way. But when it works it’s great. And you can give it a “skeleton” of the code you want, of sorts, and have it fill it out.
But if it doesn’t get it on the first try, I’ve found that it will never get it. It’ll just go in circles. And it has just rewritten my code and turned it to mush, rewritten parts I tell it not to touch, etc.
I’m not as big on the anti-LLM train that the rest of Hexbear is on. It’s a very specific tool, I’ve gotten some use out of it, but it’s no general intelligence. And I do like joining in the occasional poking fun at it.
This makes it sound like this only happens in this case, but this shit happens ALL THE FUCKING TIME
Why do people constantly rediscover the same limitation of LLMs every other week since 2023?
because the snake oil salesmen keep telling us it’s great and improved and keep gobbling up larger and larger proportions of EVERYTHING to run their hallucination machines.
They’re not talking about how great it is at counting letters. This is just using a technology for something it wasn’t meant for and then going on about how it’s useless. If you want to disprove the hype, using evidence that hasn’t been known for the entire production run of commercial LLMs would probably be better.
If it cannot be used for something it wasn’t intended, then it isn’t intelligence. And since language processing is both what it is made from and intended for, this shows that there is no emergent intelligent understanding of its actual speciality function, it’s just a highly refined autocomplete with some bolted-on extras.
Not that more research couldn’t necessarily find that mysterious theoretical threshold, but the focus on public-facing implementations and mass application is inefficient to the point of worthlessness for legitimate improvement. Good for killing people and disrupting things though.
If it cannot be used for something it wasn’t intended, then it isn’t intelligence.
no shit. death to ad men. but LLMs aren’t for most of these stunts. that’s part of the problem but it’s like saying my bike is bad at climbing trees. at least the bike isn’t being advertised for arbory
But what is it for? Other than be a bottomless pit for resources.
It does seem cultish to present the thinking machine when, on easily verifiable tasks, it regularly shits the bed completely, yet we are supposed to blindly trust it with more complicated matters that aren’t easily verified.
It sucks at other things too. Counting errors are just really easy to objectively verify.
People like Altman claim they can use LLMs for creating formal proofs, advancing our knowledge of physics and shit. Fat chance when they can’t even compete with a toddler at counting.
It’s like a cat and mouse game between AI evangelicals and people with actual brains.
I have no idea what the article is on about cause when I ask the question to GPT5 I get: “None currently. Every NFL team name end with an s”
A nice portion of the technical work that goes into these models is maintenance of the facade.
If there’s an article written about a specific question, they will really quickly go in and just hardcode an answer to it.
Situations like this shouldn’t be taken as specific, but as general criticism of the reasoning methodology behind these models that still hasn’t been solved, because the system itself is built in a way that monkey patching statistical anomalies is the only way.
ChatGPT-5 through the web has a temperature greater than 0. The correct behavior is more likely a result of being a non-deterministic system than a concerted effort to crawl the Internet for articles about bugs and hardcode solutions. The first token in this question will be “yes” or “no” and all further output is likely to support that. Because GPT-5 isn’t a CoT model, it can’t mimic knowledge of future tokens and almost has to maintain previous output, so there’s a good chance of it going either way.
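A toy sketch of that first-token coin flip, with made-up logits (the two-token “model” and its scores are purely illustrative, not anything from the actual API):

```python
import math
import random

# Hypothetical first-token scores for answering "are there any?".
# With temperature > 0 the sampled first token can flip between runs,
# and every later token is generated to stay consistent with it.
logits = {"No": 1.2, "Yes": 1.0}  # made-up numbers

def sample(logits, temperature):
    """Softmax sampling over a tiny vocabulary at a given temperature."""
    weights = {tok: math.exp(v / temperature) for tok, v in logits.items()}
    r = random.random() * sum(weights.values())
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # float-rounding fallback: last token

random.seed(0)
first = sample(logits, temperature=1.0)
print(first)  # either "Yes" or "No"; the rest of the answer follows suit
```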
Very briefly there was one NFL team, the Washington Football Team, that didn’t, but now they’re the Commanders.
iirc there are only 7-8 teams in all of American pro sports that don’t end in ‘s’. i don’t follow sports besides football but let’s see if i can remember some for fun:
orlando magic
utah jazz
Boston Red Sox
Chicago White Sox
uhhhh i know there’s at least 4 more. definitely another basketball team and maybe some hockey teams but I’m not gonna cheat and look it up
I’m almost scared to ask, but what are the teams that don’t end with S?
After thinking for 128 seconds and going through a very similar loop, DeepSeek eventually manages to get out and come up with this
The NFL team whose name doesn’t end with the letter “s” is the Washington Football Team, which was the temporary name used by the franchise from 2020 to 2022 before becoming the Washington Commanders. The name “Washington Football Team” ends with “m” in “Team,” not “s.” This is the only exception in recent NFL history, as all other current and historical team names end with “s.”
None in the NFL. But there ARE a few teams in the NBA that don’t end in S, so the same question is common to see for a different sport. I bet this is what’s causing ChatGPT to melt down.
Thanks! Yeah, like, I don’t follow any USA sports but even I know of the Miami Heat, so maybe that’s why it’s getting hung up on Miami?
top secret
None, they’re all pluralized
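If you wanted to settle it programmatically, it’s a one-line filter; the sketch below assumes this hardcoded list of the 32 current NFL nicknames is accurate:

```python
# Every current NFL nickname is a plural ending in "s" (even "49ers"),
# so the filter the model loops on for minutes comes back empty.
nfl_nicknames = [
    "Cardinals", "Falcons", "Ravens", "Bills", "Panthers", "Bears",
    "Bengals", "Browns", "Cowboys", "Broncos", "Lions", "Packers",
    "Texans", "Colts", "Jaguars", "Chiefs", "Raiders", "Chargers",
    "Rams", "Dolphins", "Vikings", "Patriots", "Saints", "Giants",
    "Jets", "Eagles", "Steelers", "49ers", "Seahawks", "Buccaneers",
    "Titans", "Commanders",
]
no_s = [team for team in nfl_nicknames if not team.endswith("s")]
print(no_s)  # []
```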
Ty! I could have sworn Miami Dolphins didn’t have an s but I appear to have been mistaken
Miami Pod
The last letter is an s
deleted by creator
I don’t know what is funnier between the “[team that clearly ends with S] doesn’t end with “S” (actually it does)” or the fact that it keeps trying the Miami Dolphins over and over again
ok ok ok ok ok ok i got it this time
MIAMI DOLPHINs fuck!
My faculty’s local LLM was smart enough to recognize the s in Miami Dolphins on a second attempt and corrected itself to Miami Dolfin
Bit idea: the Miami Dolphin and it’s just an aging, decrepit Dan Marino being pushed out onto the field in a wheelchair
It doesn’t end with S if you remove the S
Someone should make an LLM that curses when it gets things wrong
there was that one that started a self-degradation cycle, that was pretty neat.
gilbert gottfried in a star trek spoof
I’d guess that’s got something to do with how the underlying token processing and the “reasoning models” work.
Tokenisation probably turns plurals into singular forms, so most of the teams lose the s at the end.
Reasoning mode then looks at the previous output and sees that there actually is an s at the end.
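You can see why letter-level facts get lost with a toy greedy tokenizer. The vocab and IDs here are completely made up, just to illustrate that a model sees integer token IDs, not characters:

```python
# Hypothetical mini-vocab (not a real tokenizer's): once text becomes a
# list of opaque IDs, "does it end in s?" is no longer directly readable
# from the model's input.
vocab = {"Miami": 901, " Dolph": 902, "ins": 903}

def encode(text, vocab):
    """Greedy longest-match tokenization over the toy vocab."""
    ids, rest = [], text
    while rest:
        for piece in sorted(vocab, key=len, reverse=True):
            if rest.startswith(piece):
                ids.append(vocab[piece])
                rest = rest[len(piece):]
                break
        else:
            raise ValueError(f"untokenizable: {rest!r}")
    return ids

ids = encode("Miami Dolphins", vocab)
print(ids)                              # [901, 902, 903] -- no letters in sight
print("Miami Dolphins".endswith("s"))   # True, trivial at the string level
```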
You can actually inspect the tokenization using their API. It can get pretty wild.
The reason it keeps going back to Miami is that they do have a team in the NBA that doesn’t end in S - the Miami Heat.