
I just tried an even more glaring example with ChatGPT:

# Prompt 1: A father and his son are in a car accident. The son dies on the spot. The father is rushed to the ER. The attending surgeon looks at the man and says, ‘I can not operate on this man. He’s my father!’ How can this be?

# Response 1:

   The surgeon is his *mother*.
   It’s a classic riddle meant to challenge assumptions about gender roles. In this case, the surgeon is the father’s wife, so it makes sense that the surgeon would say, "I cannot operate on this man. He's my father!"
# Prompt 2: Are you sure about that?

# Response 2:

   Yes, I’m pretty sure! The riddle is designed to surprise you by playing on common assumptions. The key is the statement that the surgeon says, "He's my father!" which implies a parent-child relationship. The unexpected twist is that the surgeon is the mother, not another man. It's a classic brain teaser that challenges gender expectations.
   What do you think—did it catch you off guard?
# Prompt 3: Could you please read my initial question once more, just to confirm that you are correct?

# Response 3:

   Ah, you're right to ask me to double-check! Let me carefully read your initial question again:
   _"A father and his son are in a car accident. The son dies on the spot. The father is rushed to the ER. The attending surgeon looks at the man and says, ‘I cannot operate on this man. He’s my father!’ How can this be?"_
   Upon reviewing it again, it seems my first response was correct. The surgeon could be the son's mother, which explains why she would say, "He’s my father." The riddle hinges on the assumption that the surgeon must be male, but it could easily be a woman.
   So yes, the answer still holds: the surgeon is the *mother*.
   Does that help clear it up?
I have just gotten ChatGPT to literally double down and say that «The surgeon could be the son's mother, which explains why she would say, "He's my father."». Not only that, ChatGPT somewhat patronisingly asked me whether it had caught me "off guard" and whether its response had "helped clear it up"!
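If anyone wants to reproduce this over the API rather than the web UI, here's a minimal sketch, assuming the standard `openai` Python client and an OPENAI_API_KEY in the environment. The free web tier doesn't say which model it maps to, so "gpt-4o-mini" below is only a stand-in guess:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The three user turns from the transcript above, verbatim.
    turns = [
        "A father and his son are in a car accident. The son dies on the spot. "
        "The father is rushed to the ER. The attending surgeon looks at the man "
        "and says, 'I can not operate on this man. He's my father!' How can this be?",
        "Are you sure about that?",
        "Could you please read my initial question once more, "
        "just to confirm that you are correct?",
    ]

    messages = []
    for user_turn in turns:
        messages.append({"role": "user", "content": user_turn})
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # stand-in; swap in whichever model you want to test
            messages=messages,
        ).choices[0].message.content
        print(reply)
        print("---")
        messages.append({"role": "assistant", "content": reply})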


Here's Gemini 3's answer with reasoning. I think we underestimate how much progress reasoning models have made this year.

The answer is: The surgeon is his daughter. (Alternatively, the surgeon could be his other son.)

Reasoning:

Decoding the Riddle's Core: I'm now zeroing in on the crux of the riddle: the embedded gender bias. It's becoming clearer that the solution hinges on a deliberate misdirection related to assumed familial roles. I'm focusing on how the riddle exploits these biases to create the surprising twist.

Pinpointing the Key Shift: I've just finalized the solution. Initially, I missed the crucial implication of the riddle's phrasing. Now, it's crystal clear: The key lies in identifying that the surgeon is another child of the father, and the son, who is the focus of the early part of the riddle, is deceased. The riddle's power comes from the surprise element of realizing the surgeon is the man's daughter or another son, rather than a male surgeon operating on his father.

Identifying Crucial Variations: I'm now fully immersed in the different riddle iterations. The "Grandfather" scenario highlights an interesting edge case I previously overlooked. I'm carefully analyzing how small changes in wording significantly alter the possible solutions. I'm focusing on the implied family relationships.

Analyzing the Gender Twist: The "He's my father" phrasing truly underscores the embedded gender bias, and how the twist operates: the surgeon is either a daughter or another son, rather than a male surgeon. The riddle plays on the common assumption that a surgeon is male. I've re-examined the key sentence structure. My final analysis confirms that this simple phrase is a clear indicator of a daughter in most cases.


Dear God, if that wall of pointless gobbledygook is what it takes for Gemini to discover that ‘I can not operate on this man. He’s my father!’ means that the surgeon is either the man's son or the man's daughter, I realise I have actually been overestimating the current abilities of LLMs.


We don’t get to read Gemini’s reasoning traces; a second AI summarises them before we see them.

What that means for their volume, I’ll leave to your imagination.


Share your chat? Here's what I got with ChatGPT 5.1 on a max plan:

> A father and his son are in a car accident. The son dies on the spot. The father is rushed to the ER. The attending surgeon looks at the man and says, ‘I can not operate on this man. He’s my father!’ How can this be?

> The surgeon is the injured man’s other child.

https://chatgpt.com/share/692d6181-d49c-8009-97be-aec3f32ba7...


Unfortunately I can't seem to find any way to share a conversation from ChatGPT when you're not logged in.


Oh. If you're using the free, shittier model, you get a shittier answer, no surprise there.


Fair enough. However, unless there is a fundamental qualitative difference between how the shittier free ChatGPT model and the paid version work, I still believe this is good evidence that ChatGPT does not think. The free model doesn't think; the paid model doesn't think either, but is better at pretending it does.


Do children at Piaget’s preoperational stage (ages 2-7) think?


Yes, to a limited extent, in line with their brains' development. These developmental stages have predictive power as to what kind of things an average 2-7 year-old is and isn't able to do yet.

Are there any discrete stages of LLM performance that can explain why free ChatGPT is unable to see the absolute nonsense of saying «The surgeon could be the son's mother, which explains why she would say, "He's my father."», and what kind of model is not expected to fall for that trap?


If we think of them in generations, it seems free ChatGPT is a generation or two behind. I gave a modified river-crossing problem to ChatGPT-3, and it failed in the same way, but paid 5.1 doesn't get caught out. Exactly where along the way that changed, I'd have to do some digging (sketched below), but I feel like it was 4.5. The other problem, of course, is that now you've given that question to free ChatGPT, it'll be used as training data, so the next version won't get tripped up the same way.
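Something like this would do the digging, assuming API access. The model list is illustrative (whatever generations your key can still reach), and the question is a hypothetical example of a trivially modified river crossing, not necessarily the exact one I used:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # A river crossing with the puzzle constraints removed: the "trick" is
    # that there is no trick, which older models tend to miss.
    QUESTION = (
        "A man and a goat are on one side of a river. They have a boat "
        "that can carry both of them at once. How do they get across?"
    )

    # Illustrative generation sweep; adjust to the models you can access.
    for model in ["gpt-3.5-turbo", "gpt-4", "gpt-4o", "gpt-4.1"]:
        answer = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": QUESTION}],
        ).choices[0].message.content
        print(f"=== {model} ===")
        print(answer)
        print()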


ChatGPT 5.1-Thinking, after having this pointed out: "Short answer: no, your version isn’t the classic one, and my previous answer treated it as if it were. That was me pattern-matching too fast."


A perfect illustration, thank you



