Hacker News | new | past | comments | ask | show | jobs | submit | tinco's comments

To be fair, working on the Rust compiler is also something to do when you're learning Rust. I guess this person is killing two birds with one stone.


If something is literally incredible, then it's prudent to stop and consider whether it should be believed, or whether you have made an incorrect assumption. In this case, you wrongly assume that Musk is somehow being rewarded for something that happened in the past, or for something that might not even happen. The reality is that the pay package will only have value if Elon manages to dig Tesla out of the hole.

However much conning you believe Musk has done (I won't refute it), Tesla is a company that actually builds cars. And while the Cybertruck flopped, and anyone could see that coming from a mile away, that doesn't really affect Tesla's bottom line. That Musk grifted the government into buying them doesn't do much beyond saving Tesla some money.

I wouldn't buy Tesla shares, since I still don't really see their crazy valuation, but I would buy a Tesla car, as they are by all accounts awesome. If you disregard all the lying Musk has done, it's still an epic car with unrivaled self-driving capabilities.

Historically, when he starts talking about something, that has been a sign that some part of it is going to become reality. You can stand apart from the crazy people who worship the ground he walks on and still appreciate that he accomplishes great things. Whether it's through conning and grifting, or hard work and keen insight, there are now an electric car company and a rocket company where there weren't before.

Stop reacting to what people believe or shout, or to grotesque behavior, and just look at the actual reality. That will serve you a lot better than assuming everything Musk says is BS.


The GP was talking about Unreal Engine 5 as if that engine doesn't optimize for the low end. That's a wild take. I've been playing Arc Raiders with a group of friends over the past month, and one of them hadn't upgraded their PC in 10 years, yet it still ran fine (20+ fps) on their machine. When we grew up, it would have been absolutely unbelievable for a game to run on a 10-year-old machine, let alone at bearable FPS. And the game is even on an off-the-shelf game engine; they possibly don't even employ game engine experts at Embark Studios.


>And the game is even on an off-the-shelf game engine; they possibly don't even employ game engine experts at Embark Studios.

Perhaps, but they also turned off Nanite, Lumen and virtual shadow maps. I'm not a UE5 hater but using its main features does currently come at a cost. I think these issues will eventually be fixed in newer versions and with better hardware, and at that point Nanite and VSM will become a no-brainer as they do solve real problems in game development.


> it still ran fine (20+ fps)

20 fps is not fine. I would consider that unplayable.

I expect at least 60, ideally 120 or more, as that's where the diminishing returns really start to kick in.

I could tolerate as low as 30 fps on a game that did not require precise aiming or reaction times, which basically eliminates all shooters.


On 10-year-old hardware?


I wonder which is worse: SFBA-style software development, but also with the SFBA-style 2-hour response window to serious bugs that Discord showed, or the old-fashioned enterprise style, where you report your bug and within 2 months you'll receive an e-mail confirming your report if you're lucky, and a letter from a lawyer if you're not.


The author doesn't mean that the technologies weren't inevitable in the absolute sense. They mean that it was not inevitable that anyone would use those technologies. It's not inevitable that you will use TikTok, and it is not inevitable for anyone; I've never used TikTok, so the author is right in that regard.

If you want to disavow short-form video as a medium altogether, something I'm strongly considering, then you can. It does mean you have to make sacrifices: for example, YouTube doesn't let you disable their short-form video feature, so it is unavoidable for people who decide they don't want to drop YouTube. That is still a choice though, so it is not truly inevitable.

The larger point is that there are always people pushing some sort of future, sketching it as inevitable. But the reality is that there always remains a choice, even if that choice means you have to make sacrifices.

The author is annoyed at people throwing in the towel and declaring AI inevitable, when the author apparently still sees a path to not tolerating AI. Unfortunately the author doesn't really show that path constructively, so the whole article is basically a Luddite complaint.


Has that ever worked at scale in history? This strikes me as the same as people who take a stand by not ordering from Amazon or not using whichever service: they make their life somewhat harder and the world doesn't notice. Even worse, the people taking a stand signal to others that they do it, but most others think the cost outweighs the benefit and don't like being judged. Groups in which everyone signals and judges like that suck and devolve into purity spiraling, so few people sustain it, and the people taking a stand get bitter.

Co-ordination problems are the hardest problems.


Yeah, it has on occasion. You're right that it usually doesn't have much of an effect, but every once in a while it does. If there are enough self-sacrificing users, together they'll save a business or a way of doing things. Like running Linux on consumer hardware, or using cash in retail stores.

They don't necessarily have to coordinate: they can use a thousand different Linux distros and literally never talk to each other, and still cause PC manufacturers to keep to a standardized boot process and largely documented hardware so that Linux remains viable.


Unsafe deserialization is a very 2010 Ruby on Rails sort of vulnerability. It is strangely interesting that such a vulnerability was introduced so late in the lifetime of these frameworks. It must be a very sneaky vulnerability given how cautious we have become around deserialization since then.


The React Server Components wire format (Flight) is relatively novel and very new (it has existed in React stable for just a year). This is not a simple JSON parsing bug.


The Rails bugs weren't about JSON parsing; they were about deserializing into Ruby objects whose classes had side effects, and those side effects led to RCE possibilities. Since those happened, you'll find that almost any deserialization library, especially in dynamic languages, has a safe (or conversely unsafe) deserialize function to make it more explicit that there are risks involved.
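To make the shape of the problem concrete, here's a hypothetical sketch (the class and function names are made up for illustration; this is not the actual Rails or Next.js code):

    // Hypothetical sketch: TempFile, registry, unsafeDeserialize and safeDeserialize
    // are all made up for illustration, not taken from any real framework.
    class TempFile {
      constructor(public path: string) {}
      // imagine a framework hook that runs automatically and deletes this.path from disk
      cleanup(): void { console.log(`would delete ${this.path}`); }
    }

    const registry: Record<string, new (...args: any[]) => any> = { TempFile };

    // Unsafe: instantiates whatever class name appears in the attacker-controlled payload.
    function unsafeDeserialize(json: string): any {
      const { type, args } = JSON.parse(json);
      return new registry[type](...args);
    }

    // Safer: an explicit allowlist makes the risk visible at the call site.
    function safeDeserialize(json: string, allowed: string[]): any {
      const { type, args } = JSON.parse(json);
      if (!allowed.includes(type)) throw new Error(`type ${type} is not allowed`);
      return new registry[type](...args);
    }

An attacker who controls the payload gets to pick which classes are instantiated with which arguments; the safe variant simply refuses anything not explicitly allowed.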


I'm willing to bet that this is linked to the magic __proto__ object namespace in JavaScript
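For anyone unfamiliar, prototype pollution via __proto__ usually looks something like this (a minimal sketch with a deliberately naive merge helper, not the code from the actual advisory):

    // A deliberately naive recursive merge that copies attacker-controlled keys onto a target.
    function merge(target: any, source: any): any {
      for (const key of Object.keys(source)) {
        if (source[key] !== null && typeof source[key] === 'object') {
          target[key] = merge(target[key] ?? {}, source[key]);
        } else {
          target[key] = source[key];
        }
      }
      return target;
    }

    // JSON.parse creates an own "__proto__" key, which the merge then walks into,
    // landing on Object.prototype and adding a property to every object in the process.
    const payload = JSON.parse('{"__proto__": {"isAdmin": true}}');
    merge({}, payload);
    console.log(({} as any).isAdmin); // true

Once Object.prototype has been polluted like that, every object in the process inherits the injected value, which is how these bugs tend to escalate.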


You win!


This seems really irrelevant to the browser. I wonder why this was standardized; JavaScript is easily powerful enough to express this. Surely it hasn't been such a performance bottleneck that it needed a native API?

I agree with the other two comments: surely almost every frontend webdev has implemented a router in their career, unless they never strayed from the major frameworks. It's really not a complicated thing to build. I'm not one to look a gift horse in the mouth, but I don't see why we're being given this one.


Anytime I end up wondering "Why do we have feature X on the web?" I tend to end up reading through proposals, and I always find a suitable answer that makes me wonder no more. For this specific feature, there is a lot of prior discussion about why it is needed in the first place here: https://github.com/whatwg/urlpattern


Based on https://github.com/whatwg/urlpattern/blob/main/explainer.md it looks like they specifically wanted it as a way to scope service workers so it's easy to make them only run on certain parts of a site, and viewed giving people something easy to use for other URL matching as a nice bonus.
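For anyone curious, basic usage looks roughly like this (URLPattern is the API being standardized; browser support still varies, so feature-detect before relying on it):

    // Rough sketch of URLPattern usage; check for "URLPattern" in globalThis first,
    // since not every browser ships it yet.
    const pattern = new URLPattern({ pathname: '/books/:id' });

    console.log(pattern.test('https://example.com/books/123'));   // true
    console.log(pattern.test('https://example.com/authors/123')); // false

    const match = pattern.exec('https://example.com/books/123');
    console.log(match?.pathname.groups.id); // "123"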


An easy way to know whether Roblox is safe for children is to ask the following question: is Roblox available on Nintendo?

Nintendo would never turn away money, unless they feel it would damage their reputation as a company you can trust your children with.

If they had believed Roblox was safe, they would have 100% taken that money.


Interesting thought experiment: making a Nintendo-approved version would put the finger on exactly what the problems of "real" Roblox are. (But strictly speaking, the logic only works in one direction: "if on Nintendo, then safe". "If not on Nintendo, then unsafe" is not strictly true, but maybe a good shorthand.)

The interview makes me think of DuPont and Teflon. "You are giving thousands of people cancer in your community."

"That's alright, think of the millions who love our products."

"Carry on, sir."


They kinda tried with Game Builder Garage a few years back. It keeps the whole game-making and sharing aspect while limiting the damage user-generated content can do, such as by restricting custom assets and requiring you to enter a game's number to download it to your system.


It may be a good enough simple test, especially for huge titles like Roblox, but I’ve worked with game developers whose titles would have been allowed by Nintendo but who decided it wasn’t worth the investment to create a port to publish for Nintendo devices - lots of games don’t release on all possible platforms for business reasons.


Yeah, it's obviously just a crude razor. What I meant is that if it's not on Nintendo, it's a sign you should be more diligent as a parent. Obviously you should always monitor and ideally play the games your kids play yourself, but I can imagine not all parents do that. The Nintendo test is a reasonable alternative, my mom did it in the 90s and I had enough fun games even if I missed out on a bunch of cool PC and Sony titles.


Agreed, just thought worth pointing out to anyone reading your comment that a game not being on Nintendo doesn’t necessarily mean that it’s not child-friendly (lots of people don’t realise that making a game work on different platforms involves actual dev work rather than just an equivalent of “file -> save as -> choose platform”)


> Nintendo would never turn away money, unless they feel it would damage their reputation as a company you can trust your children with.

There's Doom 3 on the Nintendo Switch... so the lines are a bit blurry.


Perhaps, maybe I'm a bad parent but I don't see anything particularly dangerous in Doom 3. It's PEGI 18, but probably fine for teenagers.


Fortnite says hello!

(But yes I generally agree with your point.)


Most games out there are both simultaneously a) safe for kids and b) not available on Nintendo products.


Yeah, I think LeCun is underestimating the impact that LLMs and Diffusion models are going to have, even considering the huge impact they're already having. That's no problem, as I'm sure whatever LeCun is working on is going to be amazing as well, but an enterprise like Facebook can't have their top researcher work on risky things when there are surefire paths to success still available.


I politely disagree - it is exactly an industry researcher's purpose to do the risky things that may not work, simply because the rest of the corporation cannot take such risks but must walk on more well-trodden paths.

Corporate R&D teams are there to absorb risk, innovate, disrupt, create new fields, not for doing small incremental improvements. "If we know it works, it's not research." (Albert Einstein)

I also agree with LeCun that LLMs in their current form are a dead end. Note that this does not mean that I think we have already exploited LLMs to the limit; we are still at the beginning. We also need to create an ecosystem in which they can operate well: for instance, to combine LLMs with Web agents better, we need a scalable "C2B2C" (customer delegated to business to business) micropayment infrastructure, because as these systems have already begun talking to each other, in the longer run nobody will offer their APIs for free.

I work on spatial/geographic models, inter alia, which by coincidence is one of the directions mentioned in the LeCun article. I do not know what his reasoning is, but mine was/is: LMs are language models, and should (only) be used as such. We need other models, in particular a knowledge model (KM/KB), to cleanly separate knowledge from text generation; it looks to me right now that only that will solve hallucination.


Knowledge models, like ontologies, always seem suspect to me; like they promise a schema for crisp binary facts, when the world is full of probabilistic and fuzzy information loosely categorized by fallible humans based on an ever slowly shifting social consensus.

Everything from the sorites paradox to leaky abstractions; everything real defies precise definition when you look closely at it, and when you try to abstract over it, to chunk up, the details have an annoying way of making themselves visible again.

You can get purity in mathematical models, and in information systems, but those imperfectly model the world and continually need to be updated, refactored, and rewritten as they decay and diverge from reality.

These things are best used as tools by something similar to LLMs, models to be used, built and discarded as needed, but never a ground source of truth.


>Knowledge models, like ontologies, always seem suspect to me; like they promise a schema for crisp binary facts, when the world is full of probabilistic and fuzzy information loosely categorized by fallible humans based on an ever slowly shifting social consensus.

I don't disagree that the world is full of fuzziness. But the problem I have with this portrayal is that formal models are often normative rather than analytical. They create reality rather than being an interpretation or abstraction of reality.

People may well have a fuzzy idea of how their credit card works, but how it really works is formally defined by financial institutions. And this is not just true for software products. It's also largely true for manufactured products. Our world is very much shaped by artifacts and man-made rules.

Our probabilistic, fuzzy concepts are often simply a misconception. That doesn't mean it's not important of course. It is important for an AI to understand how people talk about things even if their idea of how these things work is flawed.

And then there is the sort of semi-formal language used in legal or scientific contexts that often has to be translated into formal models before it can become effective. Law makers almost never write algorithms (when they do, they are often buggy). But tax authorities and accounting software vendors do have to formally model the language in the law and then potentially change those formal definitions after court decisions.

My point is that the way in which the modeled, formal world interacts with probabilistic, fuzzy language and human actions is complex. In my opinion we will always need both. AIs ultimately need to understand both and be able to combine them just like (competent) humans do. AI "tool use" is a stop-gap. It's not a sufficient level of understanding.


> People may well have a fuzzy idea of how their credit card works, but how it really works is formally defined by financial institutions.

> Our probabilistic, fuzzy concepts are often simply a misconception.

How, e.g., a credit card works today is defined by financial institutions. How it might work tomorrow is defined by politics, incentives, and human action. It's not clear how to model those with formal language.

I think most systems we interact with are fuzzy because they are in a continual state of change due to the aforementioned human society factors.


To some degree I think that our widely used formal languages may just be insufficient and could be improved to better describe change.

But ultimately I agree with you that this entire societal process is just categorically different. It's simply not a description or definition of something, and therefore the question of how formal it can be doesn't really make sense.

Formalisms are tools for a specific but limited purpose. I think we need those tools. Trying to replace them with something fuzzy makes no sense to me either.


I believe the formalisms can be constructed by something fuzzy. Humans are fuzzy; they create imperfect formalisms that work until they break, and then they're abandoned or adapted.

I don't see how LLMs are significantly different. I don't think the formalisms are an "other". I believe they could be tools, both leveraged and maintained by the LLM, in much the same way as most software engineers, when faced with a tricky problem that is amenable to brute force computation, will write up a quick script to answer it rather than try and work it out by hand.


I think AI could do this in principle but I haven't seen a convincing demonstration or argument that Transformer based LLMs can do it.

I believe what makes the current Transformer based systems different to humans is that they cannot reliably decide to simulate a deterministic machine while linking the individual steps and the outcomes of that application to the expectations and goals that live in the fuzzy parts of our cognitive system. They cannot think about why the outcome is undesirable and what the smallest possible change would be to make it work.

When we ask them to do things like that, they can do _something_, but it is clearly based on having learned how people talk about it rather than actually applying the formalism themselves. That's why their performance drops off a cliff as soon as the learned patterns get too sparse (I'm sure there's a better term for this that any LLM would be able to tell you :)

Before developing new formalisms you first have to be able to reason properly. Reasoning requires two things. Being able to learn a formalism without examples. And keeping track of the state of a handful of variables while deterministically applying transformation rules.

The fact that the reasoning performance of LLMs drops off a cliff after a number of steps tells me that they are not really reasoning. The 1000th rules based transformation only depending on the previous state of the system should not be more difficult or error prone than the first one, because every step _is_ the first one in a sense. There is no such cliff-edge for humans.


You're basically describing the knowledge problem vs. model structure: how to even begin to design a system which self-updates/dynamically learns vs. one that is trained and deployed.

Cracking that is a huge step. Pure multi-modal trained models will probably give us a hint, but I think we're some ways from seeing a pure multi-modal open model which can be pulled apart/modified. Even then they're still train-and-deploy, not dynamically learning. I worry we're just going to see LSTM design bolted onto deep LLMs because we don't know where else to go, and it will be fragile and take eons to train.

And the less said about the crap of "but inference is doing some kind of minimization within the context window" the better; it's vacuous and not where great minds should be looking for a step forwards.


I have vague notions of there being an entire hidden philosophical/political battlefield (massacre?) behind the whole "are knowledge models/ontologies a realistic goal" debate.

Starting with the sophomoric questions of the optimist who mistakes the possible for the viable: how definite of a thing is "the world", how knowable is it, what is even knowledge... and then back through the more pragmatic: by whom is it knowable, to what degree, and by what means. The mystics: is "the world" the same thing as "the sum of information about the world"? The spooks: how does one study those fields of information which are already agentic and actively resist being studied by changing themselves, such as easily emerge anywhere more than n(D) people gather?

Plenty of food for thought from why ontologies are/aren't a thing. The classical example of how this plays out in the market being search engines winning over internet directories. But that's one turn of the wheel. Look at what search engines grew into quarter century later. What their outgrowths are doing to people's attitude towards knowledge. Different timescale, different picture.

Fundamentally, I don't think human language has sufficient resolution to model large spans of reality within the limited human attention span. The physical limits of human language as an information-processing device were hit at some point in the XX century. Probably that 1970s divergence between productivity and wages.

So while LLMs are "computers speak language now" and it's amazing if sad that they cracked it by more data and not by more model, what's more amazing is how many people are continually ready to mistake language for thought. Are they all P-zombies or just obedience-conditioned into emulating ones?!?!?

Practically, what we lack is not the right architecture for "big knowing machine", but better tools for ad-hoc conceptual modeling of local situations. And, just like poetry that rhymes, this is exactly what nobody has a smidgen of interest to serve to consumers, thus someone will just build it in their basement in the hope of turning the tables on everyone. Probably with the help of LLMs as search engines and code generators. Yall better hurry. They're almost done.


Nice commentary and I enjoyed the poetic turn of phrase. I had to respond to it with my own thoughts if only to bookmark it for myself.

> how many people are continually ready to mistake language for thought

This is a fundamental illusion, where rote memory, names and words get mistaken for understanding. This was wonderfully illustrated here [1]. Few really grok what understanding actually is. This is an unfortunate by-product of our education system.

> Are they all P-zombies or just obedience-conditioned into emulating ones?!?!?

Brilliant way to state the fundamental human condition. ie, we are all zombies conditioned to imitate rather than understand. Social media amplifies the zombification, and now LLMs do that too.

> Starting with the sophomoric questions of the optimist who mistakes the possible for the viable

This is the fundamental tension between operationalized meaning and imagination. A grokking soul gathers mists from the cosmic chaos and creates meaning and operationalizes it for its own benefit and then continually adapts it.

> it's amazing if sad that they cracked it by more data and not by more model

I was speaking to experts in the sciences (chemistry). They were shocked that the underlying architecture is brute force. They expected a compact, information-compressed theory which is able to model independently of data. The problem with brute-force approaches is that they don't scale, and they don't capture the essences which are embodied in theories.

> The physical limits of human language as information processing device have been hit at some point in the XX century

2000 years back, when humans realized that formalism was needed to operationalize meaning, and natural language was too vague to capture and communicate it. The world model that natural language captures encompasses "everything", whereas making it "useful" requires limiting it via formalism.

[1] https://news.ycombinator.com/item?id=2483976


I disagree with most of what you said.


Is it that fuzzy though? If it were, would language not fail to adequately grasp and model our realities? And what about the physical world itself: animals are modeling the world adequately enough to navigate it. There are significant gains to be made from modeling _enough_ of the world, without falling into the hallucinations of an LLM's purely statistical associations.


World models are trivial. E.g. narratives are world models, and they provide only prefrontal simulation, i.e. they are synthetically prey-predation. No animal uses world models for survival, and it's doubtful they exist (maps are not models); a world model doesn't conform to optic flow, i.e. instantaneous use and response. Anything like a world model isn't shallow, the basic premise of oscillatory command; it's needlessly deep, nothing like brains. This is just a frontier hail-mary of the current age.


> it is exactly a researcher's purpose to do the risky things that may not work

Maybe at a university, but not at a trillion-dollar company. The job of chief scientist there is leading risky things that will work, to please the shareholders.


They knew what Yann LeCun was when they hired him. If anything, those brilliant academics who have done what they're told and loyally pursued corporate objectives the way the corporation wanted (e.g. Karpathy when he was at Tesla) haven't had great success either.


>They knew what Yann LeCun was when they hired him.

Yes, but he was hired in the ZIRP era, when all SV companies were hiring every opinionated academic and giving them free rein and unlimited money to burn in the hope that maybe they'd create the next big thing for them eventually.

These are very different economic times right now, after the Fed's infinite money glitch has been patched out, so people do need to adjust and start actually making some products of value for their seven-figure costs to their employers, or end up being shown the door.


Some employees even need to be physically present at the office


so your message is to short OpenAI before it implodes and gets absorbed into Cortana or equivalent ;)


Unless you're an insider, currently you'd need to express that short via something else.


> risky things that will work

Things known to work are not risky. Risky things can fail by definition.


What exactly does it mean for something to be a "risky thing that will work"?


“Risky things that will work” - contradiction in terms. If companies only did things they knew would work, we probably still wouldn’t have microchips.

Also, like… it’s Facebook. It has a history of ploughing billions into complete nonsense (see metaverse). It is clearly not particularly risk averse.


> I also agree with LeCun that LLMs in their current form are a dead end.

Well then you and he are clearly dead wrong.


How do you know that for sure? How can you be absolutely certain that LLMs are what will lead to AGI?


I’ve yet to meet a single person who claims AGI will happen without recycling the same broken reasoning the peak-oil retards were peddling a decade ago.

Talking to these people is exhausting, so I cut straight to the chase: name the exact, unavoidable conditions that would prove AGI won’t happen.

Shockingly, nobody has an answer. They’ve never even considered it.

That’s because their whole belief is unfalsifiable.


Either that, or just tautological, given that LLM tech is continually morphing and improving.


LLMs and Diffusion solve a completely different problem than world models.

If you want to predict future text, you use an LLM. If you want to predict future frames in a video, you go with Diffusion. But what both of them lack is object permanence. If a car isn't visible in the input frame, it won't be visible in the output. But in the real world, there are A LOT of things that are invisible (image) or not mentioned but only implied (text) that still strongly affect the future. Every kid knows that when you roll a marble behind your hand, it'll come out on the other side. But LLMs and Diffusion models routinely fail to predict that, as for them the object disappears when it stops being visible.

Based on what I heard from others, world models are considered the missing ingredient for useful robots and self-driving cars. If that's halfway accurate, it would make sense to pour A LOT of money into world models, because they will unlock high-value products.


Sure, if you only consider the model in isolation, they have no object permanence. However, you can just put your model in a loop and feed the previous frame into the next frame. This is what LLM agent engineers do with their context histories, and it's probably also what the diffusion engineers do with their video models.
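A minimal sketch of that loop (the model interface here is a hypothetical stand-in, not any particular API):

    // The model itself is stateless; persistence comes from the loop feeding its own output back in.
    type Frame = Uint8Array;

    interface FrameModel {
      predict(history: Frame[]): Promise<Frame>; // hypothetical: returns the next frame
    }

    async function rollout(model: FrameModel, seed: Frame, steps: number): Promise<Frame[]> {
      const history: Frame[] = [seed];
      for (let i = 0; i < steps; i++) {
        const next = await model.predict(history); // the model only "remembers" what history contains
        history.push(next);
      }
      return history;
    }

All of the "permanence" lives in the history; the interesting engineering is in what you keep, summarize, or drop between iterations.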

Messing with the logic in the loop and combining models has enormous potential, but it's more engineering than research, and it's just not the sort of work that LeCun is interested in. I think that's where the conflict lies: Facebook is an engineering company, and a possible future of AI lies in AI engineering rather than AI research.


>But what both of them lack is object permanence.

This is something that was true last year, but it's hanging on by a thread this year. Genie shows this off really well, but it's in the video models as well. [1]

[1]https://storage.googleapis.com/gdm-deepmind-com-prod-public/...


I think world models are the way to go for superintelligence. One of the patents I saw already going in this direction, for autonomous mobility, is https://patents.google.com/patent/EP4379577A1 where synthetic data generation (visualization) is the missing step in terms of our human intelligence.


This is the first time I have heard of world models. Based on my brief reading, it does look like this is the ideal model for autonomous driving. I wonder if the self-driving companies are already using this architecture or something close to it.


I thoroughly disagree; I believe world models will be critical in some respect for text generation too. A predictive world model can help you validate your token prediction. Take a look at the Code World Model, for example.


lol, what is this? We already have world models based on diffusion and autoregressive (AR) algorithms.


> but an enterprise like Facebook can't have their top researcher work on risky things when there are surefire paths to success still available.

Bell Labs


> I think LeCun is underestimating the impact that LLMs and Diffusion models

No, I think he's suggesting that "world models" are more impactful. The issue for him inside Meta is that there is already a research group looking at that, and it is wildly more successful (in terms of getting research to product) and way fucking cheaper to run than FAIR.

Also, LeCun is stuck weirdly in product land rather than research (RL-R), which means he hasn't got the protection of Abrash to isolate him from the industrial stupidity that is the product council.


Which other group is that?


Hard to tell.

The last time LeCun disagreed with the AI mainstream was when he kept working on neural nets while everyone thought they were a dead end. He might be entirely right in his LLM scepticism; it's hardly a surefire path. He didn't prevent Meta from working on LLMs anyway.

The issue is more that his position is not compatible with short-term investors' expectations, and that's fatal in a company like Meta at the position LeCun occupies.


> Facebook can't have their top researcher work on risky things when there are surefire paths to success still available.

How did you determine that there are "surefire paths to success still available"? Most academics agree that LLMs (or LLMs alone) are not going to lead us to AGI. How are you so certain?


I don't believe we need more academic research to achieve AGI. The sorts of applications that are solving the recent AGI challenges are just severely resource-constrained AGI. The only difference between those systems and human intelligence is resources and incentives.

Not that I believe AGI is the measure of success, there's probably much more efficient ways to achieve company goals than simulating humans.


Unless I've missed a few updates, much of the JEPA stuff didn't really bear a lot of fruit in the end.


I don't think he's given up on it.

How many decades did it take for neural nets to take off?

The reason we're even talking about LeCun today is because he was early in seeing the promise of neural nets and stuck with it through the whole AI winter when most people thought it was a waste of time.


But neural nets were always popular; they just went through phases of hype depending on the capacity of hardware at the time. The only limitation of neural nets at the time was the computational power to scale up. AI winters came when other techniques became available that required less compute. Once GPGPU became available, all of that work became immediately viable.

No similar limitations exist today for JEPA, to my knowledge.


Depends on how far back you are going. There was the whole 1969 Minsky Perceptron flap, where he said ANNs (i.e. Perceptrons) were useless because they can't learn XOR (and no one at the time knew how to train multi-layer ANNs), which stifled ANN research and funding for a while. It would then be almost 20 years until the 1986 PDP handbook published LeCun and Hinton's rediscovery of backpropagation as a way to train multi-layer ANNs, thereby making them practical.

The JEPA parallel is just that it's not a popular/mainstream approach (at least in terms of well-funded research), but it may eventually win out over LLMs in the long term. Modern GPUs provide plenty of power for almost any artificial-brain-type approach, but of course are expensive at scale, so lack of funding can be a barrier in and of itself.


>the huge impact they're already having

In the software development world, yes; outside of that, virtually none. Yes, you can transcribe a video call in Office, but that's not groundbreaking. I dare you to list 10 impacts on different fields, excluding tech, including at least half blue-collar fields and at least half white-collar fields, at different levels from the lowest to the highest in the company hierarchy, that LLM/Diffusion models are having. Impact here specifically means a significant reduction of costs or a significant increase of revenue. Go on.


I'm also not sure it even drives a ton of value in software engineering. It makes the easy part easier and the hard part harder. Typing out software in your mind was never the difficult part. Figuring out what to write, how to interpret specs in context, how to make your code work within the context of a broader whole, how to be extensible, maintainable, reliable, etc. That's hard, and LLMs really don't help.

Even when writing, it shifts the mental burden from an easy thing (writing code) to a very hard thing (reading that code, validating that it's right and hallucination-free, and then refactoring it to match your team's code style and patterns).

It's great for auto-complete and for building a first-order approximation of a tech demo app that you then throw out and rebuild from scratch. In my experience, anyways. I'm sure others have had different experiences.


You already mentioned two fields they have a huge impact on: software development and NLP (the latter the most impacted so far). Another field that comes to mind is academic research, which is getting an important boost as well, via semantic search or more advanced stuff like Google's biological cell model, which has already uncovered new treatments. I'm sure I'm missing a lot of other fields I'm less familiar with (legal, for example). But just the impacts I listed are huge, and they will indirectly have a huge impact on all other areas of human industry; it's just a matter of time. "Software will eat the world" and all that.


Personally, I find myself using LLMs more than Google now, even for non-development tasks. I think this shift is going to become the new normal (if it isn't already).


And what's the end result? All one can see is a bigger representation of those who confidently subscribe to false information and become arrogant when its validity is questioned, as the LLM writing style has convinced them it's some sort of authority. Even people on this website are misinformed enough to believe that ChatGPT has developed its own reasoning, despite it being at its core an advanced learning algorithm trained on an enormous amount of human-generated data.

And let's not speak of those so deep into sloth that they put it to use to deteriorate, and not augment as they claim to do, human creative recreational activities.

https://archive.ph/fg7HE


This seems a bit self-contradictory: you say LLMs mislead people and can't reason, then fault them for being good at helping people solve puzzles or win trivia games. You can't have it both ways.


> you say LLMs mislead people and can't reason

Why would you postulate these two to be mutually exclusive?

> then fault them for being good at helping people solve puzzles or win trivia games

They only help in the same sense that a calculator would 'help' win a hypothetical mental math competition; that is the gist: it robs people of the creative and mentally stimulating processes that make the game(s) fun. But I've come to realize this is an unpopular opinion on this website, where being fiercely competitive is the only remarkable personality trait, so I guess yeah, it may be useful for this particular population.


I don't think you'll find many here who believe anything outside tech is worth investing in; it's schizophrenic, isn't it.


While I agree with your point, “Superintelligence” is a far cry from what Meta will end up delivering with Wang in charge. I suppose that, at the end of the day, it’s all marketing. What else should we expect from an ads company :?


The Meta Super-Intelligence can dwell in the Metaverse with the 23 other active users there.


Not sure I agree. AI seems to be following the same 3-stage path of many inventions: innovation > adoption > diffusion. LeCun and co focus on the first, and LLMs in their current form appear to be at the incremental improvement stage; we're still using the same basis from more than ten years ago. FB and industry are signalling a focus on harvesting the innovation, and that could last - but also take - many years or decades. Your fundamental researchers are not interested in (or the right people for) that position.


He's quoted in OP as calling them 'useful but fundamentally limited'; that seems correct, and not at all like he's denying their utility.


Yeah honestly I'm with the LLM people here

If you think LLMs are not the future then you need to come up with something better

If you have a theoretical idea, that's great, but take it to at least GPT-2 level first before writing off LLMs

Theoretical people love coming up with "better ideas" that fall flat or have hidden gotchas when they get to practical implementation

As Linus says, "talk is cheap, show me the code".


Do you? Or is it possible to acknowledge a plateau in innovation without necessarily having an immediate solution cooked-up and ready to go?

Are all critiques of the obvious decline in physical durability of American-made products invalid unless they figure out a solution to the problem? Or may critics of a subject exist without necessarily being accredited engineers themselves?


LLMs are probably always going to be the fundamental interface; the problem they solved was related to the flexibility of human languages, allowing us to have decent mimicries.

And while we've been able to approximate the world behind the words, it's just full of hallucinations because the AIs lack axiomatic systems beyond much manually constructed machinery.

You can probably expand the capabilities by attaching to the front-end, but I suspect that Yann is seeing limits to this and wants to go back and build up from the back-end of world reasoning, and then _among other things_ attach LLMs at the front-end (but maybe on equal terms with vision models, allowing seamless integration of LLM interfacing _combined_ with vision for proper autonomous systems).


> because the AIs lack axiomatic systems beyond much manually constructed machinery.

Oh god, that is massively underselling their learning ability. These models are able to extract and explain why jokes are funny without even being given basic vocab, yet there are pure-code models out there with linguistic rules baked in from day one which still struggle with basic grammar.

The _point_ of LLMs arguably is their ability to learn any pattern thrown at them with enough compute. The exception is learning how logical processes work, and pure LLMs only see "time" in the sense that a paragraph begins and ends.

At the least they have taught computers "how to language", which in regard to how we interact with a machine is a _huge_ step forward.

Unfortunately the financial incentives are split between agentic model usage (taking the idea of a computerised butler further), maximizing model memory and raw learning capacity (answering all problems at any time), and long-range consistency (longer ranges give better, more stable results for a few reasons, but we're some way from seeing an LLM with 128k experts and 10e18 active tokens).

I think in terms of building the perfect monkey butler we already have most or all of the parts. With regard to a model which can dynamically learn on the fly... LLMs are not the end of the story and we need something to allow the models to more closely tie their LS with the context. Frankly the fact that DeepSeek gave us an LLM with LS was a huge leap since previous model attempts had been overly complex and had failed in training.


>If you think LLMs are not the future then you need to come up with something better

The problem isn't LLMs; the problem is that everyone is trying to build bigger/better LLMs or manually code agents around LLMs. Meanwhile, projects like MuZero are forgotten, despite being vastly more important for things like self-driving.


Why not both? LLMs probably have a lot more potential than what is currently being realized, but so do world models.


Isn't that exactly why he's starting a new company?


Of course the challenge with that is it's often not obvious until after quite a bit of work and refinement that something else is, in fact, better.


LLMs are the present. We will see what the future holds.


Well, we will see if Yann can.


Yeah no, that's just not how it works. They're trying to support fundamental research and they have limited resources to accomplish that. Some random dude who wants to build a company that generates pretty AI pictures is just not the target audience, and he rightly got rejected.

And frankly, the dream scenario that Pieter describes, where he somehow would qualify for these resources, also wouldn't help kickstart the tech industry, and it's also not how it works in the States.

What does help, and what European governments (at least the one in The Netherlands that Pieter is from) actually do, is more funding for startups. If you're a startup founder in NL almost every angel you talk to has a matched funding deal with the government. That's such a smart way of keeping up with the US. Do you think US startups get free compute from the government? They don't even get subsidies most of the time. What they get is better funding because there's more capital available, and helping investors with that is exactly how you solve that.


I don't think what you're saying is inconsistent with what I'm saying. I think you are making a big deal out of the difference between state investment funds and subsidized GPUs but I think they basically work by similar mechanisms.


> What does help, and what European governments (at least the one in The Netherlands that Pieter is from) actually do, is more funding for startups. If you're a startup founder in NL almost every angel you talk to has a matched funding deal with the government. That's such a smart way of keeping up with the US.

Does government offering matched funding to investors actually help startups who are struggling to find (any) funding? If a startup can't find (any) funding, matching is irrelevant.

> Do you think US startups get free compute from the government? They don't even get subsidies most of the time. What they get is better funding because there's more capital available, and helping investors with that is exactly how you solve that.

Umm. I'm not really convinced that the political elites in Europe understand how to do any of this stuff well.

See also: https://www.eib.org/en/publications/online/all/the-scale-up-...

