
> But they do not think

I see this argument made a lot. I'm not sure the distinction really holds up once we start to unravel it, though.

What's a topic you're able to think about that an LLM is not able to think about?





Any topic with little coverage in the training data. LLMs will keep circling around the few bits in the training data, unable to synthesize new connections.

This is very obvious when trying to use LLMs to modify scripts in vendor-specific languages that have not been widely documented and don't have many examples available. A seasoned programmer will easily recognize common patterns like if-else blocks and loops, but LLMs will get stuck and output gibberish.


Edit: HN filters emojis in comments. TLDR: the bots reply with various sea-related emojis and can't explain that there's no seahorse emoji.

on Meta (whatsapp) chatbot:

me: what is the seahorse emoji?

bot:

me: this is not a seahorse

bot: You got me! The actual seahorse emoji is . Want me to use it in a fun sentence?

on Bing/Copilot:

me: show me the seahorse emoji

bot:

------------

Here you go: — actually, that’s a fun combo, but the real seahorse emoji is … wait, not quite! The correct one is for jellyfish and for unicorn, but the seahorse emoji itself is ? Let me clarify:

The official seahorse emoji is: ? No — correction! It’s for unicorn. The actual seahorse emoji is: ?

Let’s be precise: the seahorse emoji is ?

Oops — I misspoke earlier. The true seahorse emoji is: ?

Wait — let’s settle this cleanly: The seahorse emoji is: ?

Actually, the correct one is: ?

---------------

In some instances, Copilot answers that "there's no seahorse emoji _and that's why I keep running in circles_". This is just parroting answers, since this is a popular problem discussed on the internet. If the bot were reasoning, it would just answer "there's no seahorse emoji, please stop asking".


If that's the benchmark, then Opus 4.5 (with "extended thinking") can think:

    > Me: what is the seahorse emoji?
    > Claude: There isn't a seahorse emoji in the standard Unicode emoji set. The closest you'll get is the generic fish  or tropical fish , but no dedicated seahorse exists as of now.
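
For what it's worth, that answer can be checked against the Unicode database itself. A minimal Python sketch using the standard unicodedata module (results depend on the Unicode version bundled with your interpreter):

    import unicodedata

    # The fish emoji Claude suggests are real Unicode characters.
    print(unicodedata.lookup("FISH"))           # U+1F41F
    print(unicodedata.lookup("TROPICAL FISH"))  # U+1F420

    # There is no character named SEAHORSE, so lookup raises KeyError.
    try:
        unicodedata.lookup("SEAHORSE")
    except KeyError:
        print("no seahorse emoji in this Unicode database")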

Copilot is the absolute worst. Yesterday I tried to have it create a printable calendar for January 2026, but no matter how I instructed it, it kept showing the first on a Wednesday instead of a Thursday. I even fed its own incorrect PDF, which clearly showed the 1st on a Wednesday, back to it in a new conversation and asked it what day the calendar showed the first on. It said the calendar showed the 1st as a Thursday. It started to make me disbelieve my own eyes.

Edit: I gave up on Copilot and fed the same instructions to ChatGPT, which had no issue.
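
For anyone who wants to double-check the date rather than trust either bot, a quick sketch with Python's standard calendar and datetime modules confirms that January 1, 2026 falls on a Thursday:

    import calendar
    from datetime import date

    # January 1, 2026 is a Thursday.
    print(date(2026, 1, 1).strftime("%A"))  # Thursday

    # Plain-text January 2026 with weeks starting on Sunday, US wall-calendar style.
    print(calendar.TextCalendar(calendar.SUNDAY).formatmonth(2026, 1))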

The point here is that some models seem to grasp your intention, while others just seem stuck on their training data.


I can solve a mystery novel based on the evidence alone. Assuming an LLM doesn’t already have the answer, it will offer solutions based on meta-information like how similar mysteries conclude or are structured. While this can be effective, it’s not really solving the mystery and will fail with anything truly novel.

People often do the same thing and act like LLMs.

I asked GPT for rules on 101-level French grammar. That should be well documented for someone learning from English, no? The answers were so consistently wrong that it seemed intentional. Absolutely nothing novel was asked of it. It could have quoted verbatim if it wanted to be lazy. I can't think of an easier question to give an LLM. If a simple task that my six-year-old nephew could easily do can be "prompted wrong", the burden of proof is not on the people denying LLM intelligence, it's on the boosters.

> the burden of proof is not on the people denying LLM intelligence, it's on the boosters

It's an impossible burden to meet. We can't even prove that any other human is sentient or reasoning; we just evaluate the outcomes.

One day the argument you're putting forward will be irrelevant, or good for theoretical discussion only. In practice I'm certain that machines will achieve human level output at some point.


> machines will achieve human level output at some point

Would you care to put some sort of time scale to "at some point?" Are we talking about months, years, decades, centuries?


No real idea. It is also a very difficult thing to measure. But I think once we see it, the argument will be over.

Wild guess, within 30 years.



