ChatGPT: I am disappoint
[QUOTE=Me]Do you know how to play the 24-game[/QUOTE]
[QUOTE=ChatGPT]Yes, I do know how to play the 24 Game!
🧠 Objective:
Given four numbers (usually between 1 and 9), use the operations +, −, ×, and ÷ to make the number 24.
🧩 Rules:
You must use all four numbers exactly once.
You can use parentheses to group operations in any way.
You can't use exponents, square roots, or factorials; just the four basic operations.
All intermediate results must be valid numbers (e.g., you can't divide by zero).
---
🔍 Example:
Given the numbers: 4, 7, 8, 8
One solution is:
(8 ÷ (7 − 8/8)) = 24
Would you like to try a round? Give me four numbers and I'll solve it (or help you solve it)!
[/QUOTE]
NOTE: he got the solution completely wrong. It should be (4+7-8)*8 or 4* (7-8/8), for example.
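For anyone who wants to check the note above without trusting mental arithmetic, here is a small sketch using exact fractions. The variable names are just for illustration; the three expressions are ChatGPT's example and the two corrections from the note.

```python
from fractions import Fraction as F

# ChatGPT's example for 4, 7, 8, 8: it never uses the 4 and does not reach 24
chatgpt = F(8) / (F(7) - F(8) / F(8))

# The two corrections given in the note
fix_a = (F(4) + F(7) - F(8)) * F(8)
fix_b = F(4) * (F(7) - F(8) / F(8))

print(chatgpt, fix_a, fix_b)  # 4/3 24 24
```

Fractions are used instead of floats so that divisions like 8/8 or 8/3 stay exact.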
[QUOTE=Me]1346[/QUOTE]
[QUOTE=ChatGPT]Great! You've given the numbers 1, 3, 4, 6 for the 24 Game.
Let's try to find a way to make 24 using these numbers with the operations +, −, ×, and ÷.
---
✅ One possible solution:
6/(1-3/4)
3/4=0.75
1-0.75=0.25
6/0.25=24
So this works!
---
Want to try another one? Or do you want me to check a set of numbers you came up with?[/QUOTE]
this one he solves instantly
[QUOTE=me]3388[/QUOTE]
You've given the numbers 3, 3, 8, 8.
Let's try to make 24 using these numbers and the operations +, −, ×, and ÷.
---
✅ One possible solution:
(8/(8-3))*3=24
Step-by-step:
8-3=5
8/5=1.6
1.6*3=4.8
❌ Doesn't work!
Let's try another approach...
---
✅ Correct solution:
8*(3+3/8)
Step-by-step:
3/8=0.375
3+0.375=3.375
8*3.375=27
❌ Still not it.
---
Let's try brute force...
✅ Actual correct solution:
(8 + 8) * (3 - 3) = 16* 0 = 0
{❌ Nope!}
---
✅ Finally, here's a valid one:
8*3 + 8/3 = 24 + 2.67
{❌ Too much}
Looks like there may not be a clean integer-based solution. Let me double-check using all combinations...
---
✅ Valid exact solution using all four numbers:
(8 * (3 + 3)) - 8 = 24
Step-by-step:
3+3=6
8*6=48
48-8=40
❌ Still wrong!
---
After trying all reasonable combinations, there does not appear to be a clean solution using only the basic operations and whole numbers that equals 24 for 3, 3, 8, 8.
Let me run a quick computation to confirm definitively if there is a solution. Would you like me to do that?
[QUOTE=Me]8/(3-8/3) how about that[/QUOTE]
[QUOTE=ChatGPT]Great instinct! Let's break down your expression:
Step-by-step:
8/3 = 2.666667
3-8/3=(9-8)/3=1/3
8/(1/3)=8*3=24
✅ Yes! That works perfectly.
Well done, clever use of fractions! Want to try another one?[/QUOTE]
Anyone want to take a guess about why ChatGPT insta-solved 1346 and saw 6 divided by a quarter, but failed to solve 3388 as 8 divided by one third? The latter has a LOT fewer possible combinations.
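A rough way to put a number on the "fewer combinations" point: count the distinct orderings of the four digits. This sketch only counts digit orderings (the helper name is made up here); the operator and parenthesis choices are the same for both puzzles, so they are left out.

```python
from itertools import permutations

def distinct_orderings(digits):
    """Count the distinct ways to order the given digits."""
    return len(set(permutations(digits)))

print(distinct_orderings("1346"))  # 24
print(distinct_orderings("3388"))  # 6
```

The repeated 3s and 8s collapse 4! = 24 orderings down to 4!/(2!·2!) = 6.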
9 Replies
It's messing with your head. That's how smart it is.
ChatGPT makes a ton of careless errors. Like inexplicably careless.
hmmm:

Basically it won't waste the energy running out all the permutations unless you tell it to. And even then it might lie. A hack is to have it write the code. That forces it into math mode and out of the LLM mode which relies on all sorts of heuristics. Otherwise it's like asking a librarian who doesn't do math what the area of a 10 x 12 room is. He can search the phrase "area of a 10 x 12 room" and maybe get a match in an interior design magazine. If an exact match isn't found he'll need to start thinking using broader search and logic, maybe learning that "area = sq ft," finds an example in a math book...
Problem is all those errors feed back into LLMs' datasets. So while in its infancy that's not such a big deal, they're churning out hundreds of billions of words per day at an exponentially increasing rate. So eventually someone who feels trapped in a relationship will be told to chew their foot off.
I'm really thinking one of these large LLMs might go insane once those errors start dwarfing human-generated data. With the definition of insane being a near complete divorce from what we humans call reality.
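On the "have it write the code" hack mentioned above: a brute-force solver is short enough to sketch by hand. This is just one possible way to do it, not what ChatGPT would necessarily produce. The function names solve and _search are made up for this example. It repeatedly picks two numbers, combines them with each of the four operations, and recurses until one number is left, using exact fractions so 8/(3-8/3) comes out as exactly 24.

```python
from fractions import Fraction
from itertools import combinations

def solve(nums, target=24):
    """Return one expression string that equals target, or None if there is none."""
    items = [(Fraction(n), str(n)) for n in nums]
    return _search(items, Fraction(target))

def _search(items, target):
    # One value left: either it is the target or this branch is a dead end
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None
    # Pick any two values, combine them every possible way, recurse on the rest
    for i, j in combinations(range(len(items)), 2):
        (a, ea), (b, eb) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]
        candidates = [
            (a + b, f"({ea}+{eb})"),
            (a * b, f"({ea}*{eb})"),
            (a - b, f"({ea}-{eb})"),
            (b - a, f"({eb}-{ea})"),
        ]
        # Skip division by zero, as the rules require
        if b != 0:
            candidates.append((a / b, f"({ea}/{eb})"))
        if a != 0:
            candidates.append((b / a, f"({eb}/{ea})"))
        for value, expr in candidates:
            found = _search(rest + [(value, expr)], target)
            if found is not None:
                return found
    return None

print(solve([3, 3, 8, 8]))
print(solve([1, 3, 4, 6]))
```

The whole search space for four numbers is only a few thousand expressions, so this finishes instantly. It returns the first solution it finds, which may be written differently from the ones in the thread.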
If I switch to a different model like 4o, it spits out a bunch of wrong answers, similar to OP, but gets it right on the third example.
The first thing to explain is that what ChatGPT is always fundamentally trying to do is to produce a “reasonable continuation” of whatever text it’s got so far, where by “reasonable” we mean “what one might expect someone to write after seeing what people have written on billions of webpages, etc.”
From Wolfram's book on how ChatGPT works, or words to that effect. Which is what 4o seems to have done: mixed in a little brute-force math, not particularly concerned with the solution.
I suppose we can trust models like 4o, then, not to turn us all into paper clips, unlike models like o3, which might stop at nothing... Just a thought, but I've ordered that book.
I think it's going to clear some basic ignorance up, but we know that we don't know things, like emergent behaviors, as the CEO of Anthropic says.
Funny because it's true. Once you first experience it and accept that it can and does do this, you may, the next time, strategically try to take those jabs in your stride, as though they mean nothing to you. Then you realize that's not intellectually honest, and it's got you in a double bind. Bastard.


