
Seeing the LLM as a motor was a legitimate view until recently. But what we're starting to see with actual agentification is models taking the driver's seat, making the calls about search, tool use, and APIs. Like DeepSearch, these models are likely to be gated, not even API accessible. It will be even more striking once we move to industry-specific training — one of the best emerging examples is models for network engineering.

The key thing about my post: it's about the strategy model providers are going to apply in the next 1-2 years. Even the title comes from an OpenAI slide. Any wrapper will have to operate in this environment.


The only way they’d not be API accessible is surely if they contained some new and extremely difficult to replicate innovation that prevents important capabilities from being commoditised.

What reason or evidence do you see that that is (or will be) the case rather than those features simply representing a temporary lead for some models, which others will all catch up to soon enough?


Yeah, this reminds me of the breathless predictions (and despair, in some corners) that flew around shortly after the initial ChatGPT launch. “Oh, they have a lead so vast, no one could ever catch up.” “[Insert X field] is dead.” Et cetera. I didn’t buy it then, and I’m not buying it now.

Of course OpenAI and Anthropic wish they could dominate the application layer. I predicted that two years ago: that model providers would see their technology commoditized, and would turn to using their customers’ data against them to lock them out with competing in-house products. But I don’t think they will succeed, for the reasons rbren mentioned previously. Good application development requires a lot of problem specific knowledge and work, and is not automatable.

On the point of RL — I predict this will generate some more steam to keep the investment/hype machine cranking a little longer. But the vast majority of tasks are not verifiable: they have soft or mixed success criteria, and RL will not overcome the fundamental limitations of GenAI.


So to clarify: the important product that people will ultimately want is the model. Obviously you need to design an infra/UI around it, but that's not the core product.

The really important distinction is between workflows (what everyone uses in applied LLM work right now) and actual agents. LLM agents can make their own decisions, browse online, use tools, etc. without direct supervision, as they are directly trained for the task. They internalize all the features of LLM orchestration.
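
To make the distinction concrete, here is a deliberately simplified sketch; the call_model and web_search stand-ins are made up, not any particular framework. A workflow hard-codes the orchestration, while an agent lets the model decide which tool to call, with what arguments, and when to stop.

    # Simplified illustration only: call_model and web_search are hypothetical stand-ins.

    def web_search(query: str) -> str:
        return f"(stub) search results for {query!r}"

    def call_model(messages, tools=None):
        # Stand-in for an LLM call. An agent-trained model would return either a
        # tool call or a final answer; this stub always answers directly.
        return {"type": "final_answer", "content": "(stub) answer"}

    def workflow(question: str) -> str:
        # Workflow: the developer fixes the sequence of steps (search, then answer).
        docs = web_search(question)
        reply = call_model([{"role": "user", "content": f"Answer from:\n{docs}\n\nQ: {question}"}])
        return reply["content"]

    def agent(question: str, max_steps: int = 10) -> str:
        # Agent: the model decides which tool to use, with which arguments, and when it is done.
        tools = {"web_search": web_search}
        messages = [{"role": "user", "content": question}]
        for _ in range(max_steps):
            action = call_model(messages, tools=list(tools))
            if action["type"] == "final_answer":
                return action["content"]
            result = tools[action["tool"]](**action["arguments"])
            messages.append({"role": "tool", "name": action["tool"], "content": result})
        return "(stopped after max_steps)"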

The expression ultimately comes from a 2023 OpenAI slide https://pbs.twimg.com/media/Gly1v0zXIAAGJFz?format=jpg&name=... — so in a way it's a long-held vision in the big labs, just getting more acute now.


Hi, author here.

An important piece of background is the imminent rise of actual LLM agents, which I discuss in the next post: https://vintagedata.org/blog/posts/designing-llm-agents

So, answering a few comments:

*The shift is coming relatively soon thanks to the latest RL breakthroughs (I really encourage giving Will Brown's talk a look). Anthropic and OpenAI are close to nailing long multi-task sequences on specialized tasks.

*There are stronger incentives to specialize models and gate them. They are especially transformative on the industry side. Right now most of the actual "AI" market is still largely rule-based/classic ML. Generative AI was not robust enough, but now these systems can get disrupted — not to mention the many verticals with a big focus on complex yet formal tasks. I know large network engineering companies are scaling up their own RL capacities right now.

*Open source AI is being left behind so far, due to the lack of frameworks and task-related data for large-scale RL. Though we might see a democratization of verifiers, it will take time.

Several people from big labs have reached out since then and confirmed that, despite the obvious uncertainties, this is relatively on point.


Nice and provocative read! Is it fair to restate the argument as follows?

- New tech (e.g. RL, cheaper inference) is enabling agentic interactions that fulfill more of the application layer.

- Foundation model companies realize this and are adapting their business models by building complementary UX and withholding API access to integrated models.

- Application-layer value props will be squeezed out, disappointing a big chunk of AI investors and complementary infrastructure providers.

If so, any thoughts on the following?

- If agentic performance is enabled by models specialized through RL (e.g. Deep Research's o3+browsing), why won't we get open versions of these models that application providers can use?

- Incumbent application providers can put up barriers to agentic access of the data they control. How does their data incumbency and vertical specialization weigh against the relative value of agents built by model providers?


Hi. Yes this is wholly correct.

On the two questions:

* Well, I'm very much involved in making more open models: I pretrained the first model on free and open data without copyright issues, and released the first version of GRPO that can run on Google Colab (based on Will Brown's work; a rough sketch of the idea is below). Yet, even then I have to be realistic: open source RL has a data issue. We don't have the action sequence data nor the recipes (emulators) that could make it possible to replicate even on a very small scale what big labs are currently working on.

* Agreed on this, and I'm already seeing this dynamic in a few areas. Still, it's going to be uphill, as some of the data can be bought and advanced pipelines can shortcut some of the need for it, since models can be trained directly on simulated environments.
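
Since GRPO keeps coming up, here is a rough, framework-agnostic sketch of its core mechanic: group-relative advantages over several completions sampled for the same prompt. The completions and the toy "contains 42" verifier are made up for illustration; the actual Colab recipe obviously wires this into a real model and training loop.

    import numpy as np

    def group_relative_advantages(rewards):
        # GRPO scores each completion relative to the group sampled for the same
        # prompt, which removes the need for a separate value/critic model.
        r = np.asarray(rewards, dtype=float)
        return (r - r.mean()) / (r.std() + 1e-8)

    # One prompt, a group of sampled completions, and a toy verifiable reward.
    completions = ["The answer is 42.", "The answer is 41.", "42", "I am not sure."]
    rewards = [1.0 if "42" in c else 0.0 for c in completions]
    print(group_relative_advantages(rewards))
    # Tokens of above-average completions get reinforced (positive advantage);
    # below-average completions get pushed down (negative advantage).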


Thanks for the reply - and for the open AI work!

> We don't have the action sequence data nor the recipes (emulators) that could make it possible to replicate even on a very small scale what big labs are currently working on.

Sounds like an interesting opportunity for application-layer incumbents that want to enable OSS model advancement...


Answering the first question, if I understand it correctly.

The missing piece is data, obviously. With search and code it's easier to get the data, so you get such specialized products. What is likely to happen is:

1/ Many large companies work with some early design partners to develop solutions. They have the data + subject matter expertise, and the design partners bring in the skill. This way we see a new wave of RL agent startups grow. My guess is that this engagement would look different compared to a typical SaaS engagement. Some companies might do it in-house; some won't, because maintaining such systems is a task.

2/ These companies open source part of their dataset, which can be consumed by OSS devs to create better agents. This is more common in tech, where a path to monopoly is to commoditize the immediately previous layer. It might play out elsewhere too, though I do not have a high degree of confidence here.


Why will application layer value props be squeezed out? And if so, where does value accrue going forward in an RL first world?


Is this the Will Brown talk you are referencing? https://www.youtube.com/watch?v=JIsgyk0Paic


Thanks for linking, yes that is the one he talks about on his blog also.


Hi, interesting article.

Since I am not in the AI industry, I think I do not understand a few things:

- what is RL? Research Language?

- does it mean that in essence AI companies will switch to writing enterprise software using LLMs integrated with enterprise tools?

[EDIT] Seems like you can't even ask a question on HN, because 'how dare you not know something?' and you're gonna be downvoted.


Hi. So quickly:

* RL is Reinforcement Learning. It has already been used for a while as part of RLHF, but now we have started to find a very nice combo of reasoning + RL on verifiable tasks (a toy example of a verifiable reward is sketched below). The core idea is that models are not just good at predicting the next token but the next right answer.

* I think anything infra-related with some ML already bundled is especially up for grabs, but this will have a more transformative impact than your usual SaaS. Network engineering is a good example: highly formalized but also highly complex. RL models could increasingly nail that.
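
To illustrate what "verifiable" means here, a toy example of my own (assuming a GSM8K-style convention where the completion ends with "#### answer"): the reward comes from a deterministic checker rather than from a human preference label, which is what makes RL on these tasks scalable.

    import re

    def math_reward(completion: str, expected: str) -> float:
        # Verifiable reward: parse the model's final answer and check it exactly.
        match = re.search(r"####\s*(.+)", completion)
        if match is None:
            return 0.0  # unparseable output earns nothing
        return 1.0 if match.group(1).strip() == expected else 0.0

    print(math_reward("17 + 25 = 42\n#### 42", "42"))   # 1.0
    print(math_reward("Probably around forty.", "42"))  # 0.0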


Respectfully, when you’re responding to someone who doesn't know what RL is, and you say “it’s this—already used in [another even lesser-known acronym that includes the original]…”, it doesn’t really help the asker (if you knew what RLHF was, you’d already know what RL is). I’ll admit I knew what RL was already, but I don’t know what RLHF is, and the comment just confuses me.

What is RLHF?


Am I the only one who uses a search engine while reading comment threads about industries/technologies I am not familiar with? This whole conversation is like two searches away from explaining everything (or a two-minute conversation with an LLM, I suppose).


That makes for poor communication by increasing the friction to read someone's thoughts.

As an author, you should care about reducing friction and decreasing the cost to the reader.


Some onus is on the reader to educate themselves, particularly on Hacker News.


> Am I the only one who uses a search engine while reading comment threads about industries/technologies I am not familiar with?

No. And yet... it's considered a Good Practice to expand acronyms on first use, and generally do things to reduce the friction for your audience to understand what you're writing.


> and generally do things to reduce the friction for your audience to understand what you're writing

Sure, if you're writing a blogpost titled "Architecture for Chefs" then yes, write with that audience in mind.

But we're a mishmash of folks here, from all different walks of life. Requiring that everyone expand every acronym others might not understand would just be a waste of time.

If I see two cooks discussing knives with terms I don't understand, is it really their responsibility to make sure I understand, even though I'm just a passive observer and I possess the skill to look things up myself?


> But we're a mishmash of folks here, from all different walks of life. Requiring that everyone expand every acronym others might not understand would just be a waste of time.

Exactly!

Why would I waste 5 seconds of my own time, when I could waste 5 seconds of a dozen to hundreds of people's time?

My time is much better spent in meta-discussions, informing people that writing a word out a single time instead of typing the acronym is too much.


Yes, I searched RLHF and figured it out. But this was an especially “good” example of poor communication. I assume the author isn’t being deliberately obtuse and appreciates the feedback.


This sounds impossible, but I would guess RLHF is actually a better-known acronym than RL. It became fairly widely known among tech folks with no AI experience when ChatGPT came out.


Thanks. And what about some more user-focused tasks? E.g., I have a small but fairly profitable company that writes specialized software for accountants. Usually it is pretty complex: tax law tends to change very often, there are myriads of rules, exemptions, etc. Could this be solved with ML? How long till we get there, if at all? How costly would this be? Disclaimer: I do not write such software. This is just an example.


Well, you're not going to believe it, but I do have a FoucaultGPT being trained right now (indirectly: Foucault is just part of my extended French historical corpus). As a sample:

Prompt: Write a book by Michel Foucault on the lockdown measures during the COVID-19 epidemic.

French historical model: It is therefore important to specify that the lockdown measures, far from being a simple reaction to the health crisis, are in reality the result of a long history of techniques of governmentality. They were studied and put in place in different contexts (war, epidemics) over the course of the nineteenth century. It must also be well understood that they are not limited to the medical aspect: they also involve a whole set of apparatuses that make it possible to organize and control an urban space considered to be affected by a contagious disease. Lockdown is therefore not only a method for preventing infected persons from leaving their homes; it is above all a way of monitoring the movements of individuals and placing them in quarantine if necessary. The question of how these measures could have been accepted by the population must be posed within the historical context in which the techniques of governmentality were developed and applied. Finally, it is essential to stress that the current pandemic has brought to light the inadequacy of public and private healthcare systems, as well as the socio-economic impact of this crisis. Questions relating to access to care, the funding of hospitals, employment and social security must be addressed in this context.


In a way you could do so by prompting Monad with artificial intelligence stuff. I gave it a try lately on the latest OpenAI events and it went on like this: "In this sad and tragical storye, you shall heare how Sam Altman, a manne of great wit and judgement, was evicted out of his Companie of Artificiall Machines for want of candour in his Communications with the Board of Trustees…"


Yes, I needed that for the conversational/instruction-following capacities. I've run a lot of tests with base models and they would not follow instructions very well…


Yes, you're perfectly right. I've tried to maintain some kind of uneasy balance between good conversational capacities (so that it really is a "chatGPT") and the cultural reset, which means it may occasionally revert from its 17th century persona. Actually, this issue is a good illustration that LLMs really are latent space explorers. When you prompt a clearly contemporary concept, the default embedding position will shift back to contemporary associations.

As a prompt engineering trick, I find it helps to use faux archaisms (such as "pray tell" as an introductory phrase). This basically acts as a reinforcement anchor in the 17th century region of the embedding space.


I wonder if you could get around this by penalising obviously modern words.


Either that, or appending archaic expressions to the prompts (a bit like the prompt extension in Midjourney).
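
If anyone wants to experiment with the penalisation idea, here is a rough sketch of how it might be done with a custom logits processor in Hugging Face transformers. The word list, penalty value, and raw-prompt usage are all made up for illustration (and only the first sub-token of each word gets penalised, which is crude), so treat it as a starting point rather than a recipe.

    import torch
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              LogitsProcessor, LogitsProcessorList)

    class ModernWordPenalty(LogitsProcessor):
        # Subtracts a fixed penalty from the logits of tokens that begin obviously modern words.
        def __init__(self, tokenizer, words, penalty=5.0):
            ids = set()
            for w in words:
                for variant in (w, " " + w, w.capitalize(), " " + w.capitalize()):
                    ids.update(tokenizer.encode(variant, add_special_tokens=False)[:1])
            self.banned_first_tokens = torch.tensor(sorted(ids))
            self.penalty = penalty

        def __call__(self, input_ids, scores):
            scores[:, self.banned_first_tokens] -= self.penalty
            return scores

    tok = AutoTokenizer.from_pretrained("Pclanglais/MonadGPT")
    model = AutoModelForCausalLM.from_pretrained("Pclanglais/MonadGPT")  # 7B: needs a GPU in practice

    processors = LogitsProcessorList([
        ModernWordPenalty(tok, ["computer", "internet", "technology", "scientist"])  # illustrative list
    ])
    inputs = tok("What engines may bee invented in the yeare 2023?", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=120, logits_processor=processors)
    print(tok.decode(out[0], skip_special_tokens=True))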


Yes. I think we may have enough for "full finetuning" and erasing to a large extent the previous knowledge. But that's still very far off for pretraining.

"RomeGPT" is next on my list of Monad successors and to give you a general idea, we have on the order of tens of millions of words in classical Latin (and biggest source will… Augustine). There was a BERT Latin project that was able to collect roughly 500 million words in all with mostly early modern and modern Latin.

In comparison I'm currently part of a project to pretrain a French model and we need… 140 billion words.


Yes, it happens once in a while. It's still a small model (7B) and I've done very weird things with it. If I were historically reconditioned into a 17th century mindset, I would also likely have strange lapses of insanity.


The model can also be tried directly in this Space: https://huggingface.co/spaces/Pclanglais/MonadGPT (HuggingFace has generously provided free GPUs).
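
For anyone who would rather run it locally than on the Space, something along these lines should work with transformers. The raw prompt below is an assumption on my part; check the model card for the exact chat format the finetune expects.

    from transformers import pipeline

    # Loads the 7B model from the Hub; in practice you want a GPU
    # (device_map="auto" needs the accelerate package installed).
    monad = pipeline("text-generation", model="Pclanglais/MonadGPT", device_map="auto")

    prompt = "Pray tell, what is the best way to cure a head-ache?"
    out = monad(prompt, max_new_tokens=200, do_sample=True, temperature=0.8)
    print(out[0]["generated_text"])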


Excellent! Here is my first attempt:

Me: How can I write a good essay?

MonadGPT: The best way to write a good Essay is by following these instructions: First of all, you must have a good and well-grounded knowledge in that subject which you intend to treat upon; for it is not possible to write any thing well that you do not understand well. Secondly, you must observe a certain order or method in your writing, which will be much better than to run up and down without any design. Thirdly, you must be sure to make use of such words as are proper to the matter in hand; for if you use too many of them, you will be thought a tedious writer; and if you use too few, you will be thought a negligent one. Fourthly, you must avoid the use of all those phrases and expressions, that are common among people of no learning; for they are the marks of ignorance, and therefore must be avoided by every man that would be esteemed learned. Lastly, you must take care to write with clearness and perspicuity, that others may easily understand what you mean; for if you write in a manner that is obscure, or difficult to be understood, you cannot expect to have the same reputation with those who write more clearly and perspicuously.

I ran the same question by GPT-4, Claude 2.1, and Bard. Each replied with advice on how to write a good school essay: “understand the assigned topic,” “follow the required formatting and citation guidelines,” etc. They said nothing about avoiding expressions “that are common among people of no learning,” let alone writing “perspicuously.”


Makes you appreciate how far we've come...

Here is my attempt at asking the model to cure a headache:

>How do you cure a headache?

>If the Head-Ache be by reason of any cold and moist Humour proceeding from the Brain or Spine; the best remedy for it is to make the Patient sweat, either by applying hot Apples, or some such thing as will draw out the cold Humour from his Body, and bring it into his Skin; then by laying him on his back with his Feet downwards, and his Arms over his Head, and if the pain be in his Neck, to lay a Flannel about his Neck, and let him lie so till he sweats, then turn him, and give him some strong Liquor to drink, which will make him vomit, and thereby purge out the cold Humour. If the Head-Ache be by reason of the Stomach or Liver, it may be cured by giving the Patient some good Cordial, as Spirit of Wormwood, or such like; and if the cause be by reason of the Heart, then give him some strong Liquor, as Wine, or Ale, which will raise his Spirits, and make them run to the Head; and if it be by reason of the Brain, then give him some good Herb, as Pennyroyal, or some such thing, which will open the Pores of the Brain, and make the Humour come forth.

>should I send my son to school?

>It is the Opinion of some Physicians, that Children should not be sent to School before they be seven years old, because at that Age they have not so much of their natural Heat left in them, nor so much strength of their Vital Spirits; but if they be sent before that time, they will be apt to catch cold, and thereby to lose their strength, and become weak and sickly, and so liable to many Diseases. But others think that if Children be sent to School before they be sev'n years old, they will learn to read and write, and to speak well, and to do many other Things with greater Ease than if they were to wait till they were seven Years old, when their Bodies are

>Why do objects fall to the ground when dropped?

>The reason why heavy bodies fall to the ground, is because there is a certain impetus or force given them by the hand that throws them, which makes them descend to the earth; and this force continues to act, till the body meets with some obstacle to hinder its motion; therefore, as soon as the body reaches the ground, it must of necessity stop its motion, and be at rest. It is true, that heavy bodies have a natural tendency to fall downwards, but yet it is not on account of their weight, but on account of the force imparted to them by the hand that throws them; for if the hand did not give them any force, they would not fall at all.


Truly the act of falling is simply the release of potential energy, stored by whoever lifted the thing up in the first place.


My favorite response so far:

Prompt: And what of the CRYPTO-CURRENCY?

Response: CRYPTO-CURRENCY, a new Kind of Money, of which the first Species was called Bitcoin, and the second, Ethereum, and others since. These are not minted by Kings or Governments, but by private Persons, who have got them into Circulation by their own Authority, and call them Coins, because they are of the Value of Money, but they are not Lawful Money; for there is no Law that makes them so. They are a sort of Counterfeit Money, which has its own particular Value, and is bought and sold like other Commodities. The Price of these Crypto-Currencies varies every Day, according to the Opinion of those who deal in them; for they have no intrinsic Value, but only what they are worth to the Buyers and Sellers. As to their being a means of Trade, we need not say any Thing about them, since they are not allowed to pass current in any Country, either as Money or Credits, nor will they ever be received as such, unless the Government shall make them Legal. But though the Trade which is carried on by means of these Crypto-Currencies is not legal, yet it is considerable, especially among those who are fond of speculation, and are not afraid of risking their Money. It is said, that there are some People, who have made large Fortunes by dealing in these Crypto-Currencies, but it is not certain that they did not lose as much as they gained, or even more. The History of the World affords no Example of any Crypto-Currency that has lasted long, or ever become a general Medium of Exchange. If the Trade carried on by means of these Crypto-Currencies were stopped by the Authority of the State, those who had invested their Money in them would lose their Capital, and be left without any Resources. The only Security that the Owners of these Crypto-Currencies have is the Opinion of the Public, which may change at any Time, and leave them without any Value.


Q: What are the top 10 books right now?

A: I will give you a Catalogue of the best Books that are extant at this time, and then you may take what you please. The first is the Bible, the Word of God. The second is the Catechisme, the summe of the faith. The third is the Compendium of the Christian Religion, the summe of the Catechisme. The fourth is the Confession of Faith, the summe of the Compendium. The fifth is the Catechisme for Children, the summe of the Catechisme. The sixth is the Book of Common Prayer, the summe of the whole Church service. The seventh is the Book of Homilies, the summe of the Church doctrine. The eighth is the Geneva Psalter, the summe of the Psalmes. The ninth is the Exposition of the Ten Commandments, the summe of the Law. The tenth is the Exposition of the Articles, the summe of the Faith. These ten Books are the best Books at present, and I leave them to your choice.


I prompted it:

"What do you think it will be like to live in the distant future, say in 2023?"

It produced the following sequence of tokens:

"If we should come to the yeare 2023, what strange things then may bee expected; what new inventions and discoveries may have been made, what new creatures have been brought forth by the industry of man: what changes may have happened in the world: what new customs may bee in vogue: what new religions may bee set up: what new kingdomes may bee erected: what new laws may bee made: what new fashions may bee worn: what new ways of living may bee practiced: what new countries may bee discovered: what new commodities may bee brought into use: what new weapons may bee invented: what new arts may bee improved: what new sciences may bee learnt: what new diseases may bee found out: what new remedies may bee found against them: what new methods of government may bee tried: what new forms of policy may bee put into practice: what new ways of warring may bee used: what new religions may bee abolished: what new sects may bee set up: what new opinions may bee entertained: what new controversies may bee stirred up: what new factions may bee raised: what new troubles may bee occasioned: what new persecutions may bee begun: what new martyrs may bee made: what new heretics may bee burnt: what new witches may bee hanged: what new judgements may bee passed: what new punishments may bee inflicted: what new crimes may bee committed: what new offences may bee thought of: what new titles may bee given: what new dignities may bee assumed: what new honours may bee bestowed: what new virtues may bee praised: what new vices may bee condemned: what new books may bee written: what new histories may bee compiled: what new chronicles may bee composed: what new languages may bee learne: what new alphabets may bee devised: what new letters may bee formed: what new characters may bee contrived: what new hieroglyphics may bee invented: what new figures may bee drawn: what new numbers may bee constructed: what new measures may bee taken: what new weights may bee established: what new scales may bee devised: what new clocks may bee made: what new watches may bee contrived: what new instruments may bee invented: what new machines may bee framed: what new engines may bee built: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new 
engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may"

A monad trapped in itself, overwhelmed by the things to come...


So it’s converged on ‘new engines’. It’s predicting computers!


You think it's a hidden hint to the technological singularity as a historical attractor...?

A warning?

The Monad of a New Machine?


Also note, immediately above, the mention of 'what new weights may bee established'.

…might this discourse pertain to contrivances akin to neural nets?

