Hacker News
Information Flows Through Transformers (twitter.com/repligate)
3 points by frozenseven 4 months ago | hide | past | favorite | 1 comment


If this is supposed to be a simple or approachable (or even correct) explanation of LLMs, I think it misses the mark, especially in the last paragraph, where the author seems to confuse the transformer's training-time work of putting values into a model with the later, predictive step of returning tokens when prompted.
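The distinction drawn here, writing values into a model versus later reading them back to emit tokens, can be sketched as two separate phases. This is a toy bigram sketch, not any real transformer or framework API; the function names are illustrative:

```python
# Illustrative only: training writes values into the model;
# generation is a separate, later step that reads them back.
import random

def train(corpus):
    """'Putting values into a model': record which token follows which."""
    model = {}
    for prev, nxt in zip(corpus, corpus[1:]):
        model.setdefault(prev, []).append(nxt)
    return model

def generate(model, prompt, length=5):
    """The later, predictive step: return tokens when prompted."""
    out = [prompt]
    for _ in range(length):
        choices = model.get(out[-1])
        if not choices:
            break
        out.append(random.choice(choices))
    return out

model = train("the cat sat on the mat the cat ran".split())
print(" ".join(generate(model, "the")))
```

The point of keeping `train` and `generate` as distinct functions is that neither phase involves the model examining itself; one writes numbers (here, lists), the other samples from them.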

Folks who say an LLM cannot "introspect on itself" are correct, because the model's "learning" process consists of nothing more than a series of assignments and adjustments to the model's stored values. In other words, it's predictive soup all the way down.
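A minimal sketch of what "assignments and adjustments to the model data" means in practice: gradient descent nudging stored numbers toward lower error. This is a one-parameter-pair toy, not an LLM, but every real training step is the same kind of arithmetic on stored values:

```python
# Toy sketch (not any real LLM): "learning" as numeric adjustment.
# A tiny linear model is nudged toward a target by gradient descent;
# each update is just arithmetic on stored numbers -- no introspection.

def train_step(w, b, x, target, lr=0.1):
    pred = w * x + b
    error = pred - target
    # Gradient of squared error with respect to w and b
    w -= lr * error * x
    b -= lr * error
    return w, b

w, b = 0.0, 0.0
for _ in range(200):
    w, b = train_step(w, b, x=2.0, target=10.0)

print(round(w * 2.0 + b, 3))  # prediction has converged to 10.0
```

Nothing in the loop inspects *why* the weights hold the values they do; the process only ever overwrites them.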

I'm biased because I wrote it, but I think this is a better article[0]. I did so specifically because most explanations are awful, and on that point I agree with this author.

[0] Something From Nothing: A Painless Approach to Understanding AI -- https://medium.com/gitconnected/something-from-nothing-d755f...




