More

nullbound · 2026-01-11T18:32:41 1768156361

Isn't it kinda fascinating that 'Rainbow's end' called it ( among other things )?

mapontosevenths · 2026-01-11T19:37:23 1768160243

Vinge is one of my favorite authors, and I read both Rainbows End and Synthetic Serendipity years ago. I'm not sure I can figure out why they're relevant here though. Can you elaborate?

nullbound · 2026-01-11T23:43:05 1768174985

Sure. The connection is not direct, but I think that was the first book I read that explictly predicted a need to start poisoning data sets ( I think org in the book was called friend of privacy ) and to disprupt alternative reality monopoly efforts, its flooded with believable garbage. The connection may not be as clear, because it not mention AI in that capacity, but predicts human's reactions to corporate efforts.

I never read Synthetic Serendipity though so you got me curious.

mapontosevenths · 2026-01-12T00:50:03 1768179003

Ahh. Makes sense. Sometimes I forgot how prescient that book was.

I read it after a lot of the predictions had already come true. In some cases it felt like the books tech was just an alternative to tech we already had by the time I read it, and while reading it I had to remind myself that when it was written we were still a year out from the iPhone.

> I never read Synthetic Serendipity though so you got me curious. It's a short story based on a couple of chapters in the book. I think that some versions of the book included it? Either way you can read it here:

https://spectrum.ieee.org/synthetic-serendipity

nullbound · 2026-01-11T18:28:18 1768156098

I am devoting 2026 to focus on my rpg. The one differentiator here is something I am still not seeing: proper use of LLMs in games. The current batch is all lazy asset generation, maybe logic coding and what not, but not anything that could make the world actually feel alive.

Obviously, it all comes with its own sets of issues, but I am working through those as they come. But it is still a slow move solo.

the__alchemist · 2026-01-11T18:30:20 1768156220

What is your 2c on some of the recent backlash about AI assets in games? Is that a vocal minority (game artists and similar ec), or is that part of a broader trend?

nullbound · 2026-01-11T18:41:52 1768156912

I think the artists have a valid bone to pick and I think some of the backlash is mostly coming from that quarter. The actual gamers are more repulsed by established studios using AI for asset generation, because it comes across as a lazy money grab. But gamers are both flexible ( will jump through a lot of hoops to do what they want to do ) and and set in their way of 'just want to have fun'.

On the other side of the spectrum are indies, who can afford to experiment a little more so you get more interesting uses ( like 'voicelines' by AI come to mind ).

I know what I want to believe, but it is hard for me to call it ( I think it is a temporary trend ), because I might be a little too close to it. After all, gamers have been conditioned to endure a lot over the past decade and mostly shrugged off most of the assaults on their favorite past time.

If I were to compare it to something.. it is almost like using LLMs for email summaries ( which is still what most bosses seem to be most pleased with ). There are better use cases. I think those were not explored yet.

jbreckmckye · 2026-01-11T19:02:19 1768158139

Are you thinking more, of LLM based "AI" or "director" logic? Like an LLM tailoring the challenges and randomness the player experiences?

nullbound · 2026-01-11T19:16:10 1768158970

We did have some movement in the director logic before and, I forgot the zombie game in question, some of it seems pretty well served for the most part. I still think there is some room for a proper game master type that you refer to ( adjusting the challenge to what the player seems to be to do ).

I was personally thinking of making NPC less npcy ( not completely unlike dwarf fortress, but expanded ).

nullbound · 2025-12-17T00:34:26 1765931666

While I do sympathize with the thought behind it, general user is already equating llm chat box as 'better browsing'. In terms of simple positioning vis-a-vis non-technical audience, this is one integration that does make fiscal sense.. if mozilla was a real business.

Now, personally, I would like to have sane defaults, where I can toggle stuff on and off, but we all know which way the wind blows in this case.

TheRealPomax · 2025-12-17T02:07:34 1765937254

Firefox is not for general users, which is the problem that Mozilla's for a literal decade now. There is no way to make it better than Chrome or Safari (because it has to be better for every day users to switch, not just "as good" or even "way more configurable but slightly worse". It has to be appreciably better).

So the only user base is the power user. And then yes: sane defaults, and a way to turn things on and off. And functionality that makes power users tell their power user friends to give FF a try again. Because if you can't even do that, Firefox firmly deserves (and right now, it does) it's "we don't even really rank" position in the browser market.

kbelder · 2025-12-17T02:39:47 1765939187

The way to make Firefox better is by not doing the things that are making the other browsers worse. Ads and privacy are an example of areas where Chrome is clearly getting worse.

LLM integration... is arguable. Maybe it'll make Chrome worse, maybe not. Clunky and obtrusive integration certainly will.

oneeyedpigeon · 2025-12-17T09:15:34 1765962934

These comments are full of people explaining how Firefox can differentiate from chrome and safari: don't force AI on us.

chillfox · 2025-12-17T02:42:11 1765939331

I find that hard to believe, every general/average user I have spoken to does not use AI for anything in their daily lives and have either not tried it at all or only played with it a bit a few years ago when it first came out.

Turskarama · 2025-12-17T02:22:13 1765938133

The problem with integrating a chat bot is that what you are effectively doing is the same thing as adding a single bookmark, except now it's taking up extra space. There IS no advantage here, it's unnecessary bloat.

nullbound · 2025-12-11T21:31:57 1765488717

<< I feel there is a point when all these benchmarks are meaningless.

I am relatively certain you are not alone in this sentiment. The issue is that the moment we move past seemingly objective measurements, it is harder to convince people that what we measure is appropriate, but the measurable stuff can be somewhat gamed, which adds a fascinating layer of cat and mouse game to this.

nullbound · 2025-12-11T21:22:24 1765488144

I will say that it is wild, if not somewhat problematic that two users have such disparate views of seemingly the same product. I say that, but then I remember my own experience just from few days ago. I don't pay for gemini, but I have paid chatgpt sub. I tested both for the same product with seemingly same prompt and subbed chatgpt subjectively beat gemini in terms of scope, options and links with current decent deals.

It seems ( only seems, because I have not gotten around to test it in any systematic way ) that some variables like context and what the model knows about you may actually influence quality ( or lack thereof ) of the response.

martinpw · 2025-12-11T21:58:19 1765490299

> I will say that it is wild, if not somewhat problematic that two users have such disparate views of seemingly the same product.

This happens all the time on HN. Before opening this thread, I was expecting that the top comment would be 100% positive about the product or its competitor, and one of the top replies would be exactly the opposite, and sure enough...

I don't know why it is. It's honestly a bit disappointing that the most upvoted comments often have the least nuance.

stevage · 2025-12-11T22:41:26 1765492886

How much nuance can one person's experience have? If the top two most visible things are detailed, contrary experiences of the same product, that seems a pretty good outcome?

AznHisoka · 2025-12-12T12:25:18 1765542318

Also, why introduce nuance for the sake of nuance? For every single use case, Gemini (and Claude) has performed better. I can’t give ChatGPT even the slightest credit when it doesnt deserve any

block_dagger · 2025-12-11T22:03:46 1765490626

Replace "on HN" with "in the course of human events" and we may have a generally true statement ;)

rabf · 2025-12-11T23:27:36 1765495656

Chatgpt is not one model! Unless you manually specify to use a particular model your question can be routed to different models depending on what it guesses would be most appropriate for your question.

stingraycharles · 2025-12-12T03:10:54 1765509054

Isn’t that just standard MoE behavior? And isn’t the only choice you have from the UI between “Instant” and “Thinking”?

baq · 2025-12-12T07:20:18 1765524018

MoE is a single model thing, model routing happens earlier.

stingraycharles · 2025-12-12T17:51:40 1765561900

Yes but then what does the grandparent mean with “unless you specify a specific model” ? Do they mean “if you select auto, it automatically decides between instant or thinking” ?

That’s… hardly something worth mentioning.

rabf · 2025-12-16T04:36:53 1765859813

If you have the paid subscription you can choose what model your question is routed to. Current options in the UI are GPT-5.1 Instant, GPT-5.1 Thinking, GPT-5 Instant, GPT-5 thinking mini, GPT-5 thinking, GPT-4o, GPT-4.1, o3 and o4-mini. Options like deep-research will affect the reasoning level used. There is a lot that goes on behind the scenes in the chatgpt app with things like tool use or function calling coming into play as well. Ultimately what OpenAI will be trying/hoping to do is give you a satifactory result using the least amount of compute possible - this is where the autorouter is very useful for them and obstensibly for the user who would not know which one to pick. I mostly just use the API's these days as I like to be the one who decides who/what I am talking to.

blks · 2025-12-11T23:00:34 1765494034

Because neither product has any consistency in its results, no predictive behaviour. One day it performs well, another it hallucinates non existing facts and libraries. Those are stochastic machines

sendes · 2025-12-11T23:46:58 1765496818

I see the hyperbole is the point, but surely what these machines do is to literally predict? The entire prompt engineering endeavour is to get them to predict better and more precisely. Of course, these are not perfect solutions - they are stochastic after all, just not unpredictably.

coliveira · 2025-12-12T02:52:53 1765507973

Prompt engineering is voodoo. There's no sure way to determine how well these models will respond to a question. Of course, giving additional information may be helpful, but even that is not guaranteed.

lossyalgo · 2025-12-12T06:58:12 1765522692

Also every model update changes how you have to prompt them to get the answers you want. Setting up pre-prompts can help, but with each new version, you have to figure out through trial and error how to get it to respond to your type of queries.

I can't wait to see how bad my finally sort-of-working ChatGPT 5.1 pre-prompts work with 5.2.

Edit: How to talk to these models is actually documented, but you have to read through huge documents: https://cdn.openai.com/gpt-5-system-card.pdf

baq · 2025-12-12T07:17:25 1765523845

It definitely isn’t voodoo, it’s more like forecasting weather. Some forecasts are easier to make, some are harder (it’ll be cold when it’s winter vs the exact location and wind speed of a tornado for an extreme example). The difference is you can try to mix things up in the prompt to maximize the likelihood of getting what you want out and there are feasibility thresholds for use cases, e.g. if you get a good answer 95% of the time it’s qualitatively different than 55%.

coliveira · 2025-12-12T13:34:43 1765546483

No, it's not. Nowadays we know how to predict the weather with great confidence. Prompting may get you different results each time. Moreover, LLMs depend on the context of your prompts (because of their memory), so a single prompt may be close to useless and two different people can get vastly different results.

baq · 2025-12-12T15:10:32 1765552232

> we know how to predict the weather with great confidence

some weather, sometimes. we're not good at predicting exact paths of tornadoes.

> so a single prompt may be close to useless and two different people can get vastly different results

of course, but it can be wrong 50% of the time or 5% of the time or .5% of the time and each of those thresholds unlock possibilities.

dmd · 2025-12-11T21:38:17 1765489097

And I’d really like for Gemini to be as good or better, since I get it for free with my Workspace account, whereas I pay for chatgpt. But every time I try both on a query I’m just blown away by how vastly better chatgpt is, at least for the heavy-on-searching-for-stuff kinds of queries I typically do.

Workaccount2 · 2025-12-11T22:15:56 1765491356

Gemini has tons of people using it free via aistudio

I can't help but feel that google gives free requests the absolute lowest priority, greatest quantization, cheapest thinking budget, etc.

I pay for gemini and chatGPT and have been pretty hooked on Gemini 3 since launch.

crorella · 2025-12-11T23:07:17 1765494437

It’s like having 3 coins and users preferring one or the other when tossing it because one coin gives consistently more heads (or tails) than the other coin.

What is better is to build a good set of rules and stick to one and then refine those rules over time as you get more experience using the tool or if the tool evolves and digress from the results you expect.

nullbound · 2025-12-11T23:12:56 1765494776

<< What is better is to build a good set of rules and

But, unless you are on a local model you control, you literally can't. Otherwise, good rules will work only as long as the next update allows. I will admit that makes me consider some other options, but those probably shouldn't be 'set and iterate' each time something changes.

crorella · 2025-12-12T03:06:43 1765508803

what I had in mind when I added that comment was for coding, with the use of .md files. For the web version of chats I agree there is little control on how to tailor the way you want the agent to behave, unless you give a initial "setup" prompt.

jhancock · 2025-12-11T22:49:15 1765493355

I can use GPT one day and the next get a different experience with the same problem space. Same with Gemini.

4ndrewl · 2025-12-11T23:10:38 1765494638

This is by design, given a non-determenitisic application?

jhancock · 2025-12-11T23:14:43 1765494883

sure. It may be more than that...possibly due to variable operating params on the servers and current load.

On whole, if I compare my AI assistant to a human worker, I get more variance than I would from a human office worker.

pixl97 · 2025-12-11T23:35:57 1765496157

Thats because you don't 'own' the LLM compute. If you instead bought your office workers by the question I'm sure the variability would increase.

astrange · 2025-12-12T00:07:26 1765498046

They're not really capable of producing varying answers based on load.

But they are capable of producing different answers because they feel like behaving differently if the current date is a holiday, and things like that. They're basically just little guys.

sjaramillo · 2025-12-11T23:25:24 1765495524

I guess LLMs have a mood too

dr_dshiv · 2025-12-12T00:39:29 1765499969

Vibes

nunez · 2025-12-11T23:28:19 1765495699

Tesla FSD has been more or less the same experience. Some people drive 100s of miles without disengaging while others pull the plug within half a mile from their house. A lot of it depends on what the customer is willing to tolerate.

austhrow743 · 2025-12-12T02:49:45 1765507785

We've been having trouble telling if people are using the same product ever since Chat GPT first got popular. The had a free model and a paid model, that was it, no other competitors or naming schemes to worry about, and discussions were still full of people talking about current capabilities without saying what model they were using.

For me, "gemini" currently means using this model in the llm.datasette.io cli tool.

openrouter/google/gemini-3-pro-preview

For what anyone else means? If they're equivalent? If Google does something different when you use "Gemini 3" in their browser app vs their cli app vs plans vs api users vs third party api users? No idea to any of the above.

I hate naming in the llm space.

dmd · 2025-12-12T12:02:30 1765540950

FWIW i’m always using 5.1 Thinking.

Bombthecat · 2025-12-12T01:26:36 1765502796

Could also be a language thing ...

nullbound · 2025-12-09T20:09:32 1765310972

<< Running LLaMA-12 7B on a contact lens with WASM (arxiv.org)

But is it a hallucination if it was asked for it?:D

benob · 2025-12-09T20:24:57 1765311897

Will there ever be a llama12? Is it going to go the yolo route?

Den_VR · 2025-12-09T20:15:39 1765311339

I don’t think so, no.

nullbound · 2025-12-09T14:44:39 1765291479

I feel the same way. In a sense, our parents had it easier in terms of the damage external world could do emotionally, because there was typically a simple way to prevent most of it. Now, it is not nearly as simple. Not to search very far, our kid has a media diet that some consider strict ( 30 minutes a day of pre-selected items if kid meets some criteria, which I still consider too high ). But then some kids already have cellphones, ipads ( some completely unlocked too ! ). I only recently gave my kid lappy with gcompris installed ( locked down lappy; no net access ). Point I am trying to make in my rambly way is that each parent is hodge podge of various choices. And it does not work in aggregate.

I get that it is all about balance, but it is hard to disagree with Rahm here. Top down ban is the only real way to go.

rozap · 2025-12-09T17:51:00 1765302660

> Point I am trying to make in my rambly way is that each parent is hodge podge of various choices. And it does not work in aggregate.

On top of that, you have some of the biggest, most moneyed companies in the country spending billions of dollars to get kids and adults hooked. Even for parents with good intentions, it's not a fair fight.

Maybe I'm going off the deep end, but I sometimes think people that work at Facebook should be considered social pariahs. The amount of damage that company has done to our country and society is truly incalculable. It's really hard for me to forgive anyone who had any part in it.

nullbound · 2025-12-09T13:38:14 1765287494

<< They have always wanted us to magically know what parts are "good enough" and what parts can slide but for us to bear the burden of blame.

Well, that part is bound to add a level of tension to the process. Our leadership has AI training, where the user is responsible for checking its output, but the same leadership also outright stated it now sees individual user of AI as having 7 employees under them ( so should be 7x more productive ). Honestly, its maddening. None of it is how it works at all.

nullbound · 2025-12-09T13:14:24 1765286064

I love the level of detail ( probably, because I see it less and less these days ). It genuinely makes me wonder if anyone tried training LLMs on their own writings ( assuming those bigger than 100+ pages ) and what the results were.

jadbox · 2025-12-09T13:18:06 1765286286

I just want to chime in here about the importance of taking notes and having a journal. These things are now more important than ever as they can literally help fine-tune agents to help assist you using your personal style.

trial3 · 2025-12-09T13:23:59 1765286639

> These things are now more important than ever

oh definitely. i agree here. can't wait to read the rest of the sentence, probably saying something meaningful about the creative benefits of unstructured writing, or the importance of relying on your own thoughts and language and unique voice in the era of LLMs

> as they can literally help fine-tune agents to help assist you using your personal style.

oh

jadbox · 2025-12-10T01:33:57 1765330437

I get it. Both things can be true. Unstructured writing can help you develop as a person. It can also teach your own model the 'real raw human train of thoughts' of your personal journey. Personally I love the idea of booting up great-great-grandpa-model that'll have been trained on his 40 years of almost daily journaling. We are not trying to 'remake him' to be clear- we are talking about being have to have an interaction chat with his personality-vibe as it was recorded by his own hand and in his own words.

itissid · 2025-12-09T17:12:17 1765300337

I have always wondered if I should be recording all my conversations privately — with consent —with family and friends and then train an LLM to let anyone speak to someone that sounds "like me" when I am gone.

I suppose one could order all the data over time -— decades — and then train a model incrementally every decade and imitate me better at a point in time.

I suppose one could also narrate thoughts and feelings associated with many transcripts, which would be very tedious but would make the LLM imitate not just style but some amount of internal monologue.

I suppose one level further could be an LLM learning about the variety or parts of the ego, the I, me, mine, ours. Then the Observer and the Observed parts of thought — if we can somehow tap internal thought without manually speaking — because thoughts are, metaphorically speaking, the speed of light.

Why would one do all this? I suppose a curt answer would be to "live" eternally of course — with all the limitations of the current tech — but still try.

It might make a fascinating psychoanalysis project, one that might be a better shot at explaining someone's _self_ not as a we, a stranger, might as outwardly see it: just as a series of highs and lows and nothing in between, but instead as how they lived through it.

futuraperdita · 2025-12-10T04:53:37 1765342417

You've created a text-based version of a Black Mirror episode: https://en.wikipedia.org/wiki/Be_Right_Back

SecretDreams · 2025-12-09T13:35:36 1765287336

Is this what tool and die makers used to feel when going to LOC to train their replacements?

Personally, I do not want my likeness to persist after my death, nor do I wish for a company to be able to leverage my likeness after I leave said company.

djmips · 2025-12-10T00:48:39 1765327719

from context I figure you meant China and/or other places that would take over American manufacturing but I'm curious what LOC means - typo?

nullbound · 2025-12-09T13:45:24 1765287924

I understand the concern, but I also think there are benefits to this approach. And while I absolutely agree with you on the likeness part used for a company, at a personal level, I believe it could have a great impact ( and be of use ). And, more importantly, you can then control the disposition of your likeness appropriately ( via an old fashioned will ). As a society, we seem to have solutions for these situations. They were just not very common.

SecretDreams · 2025-12-09T14:59:11 1765292351

Given the velocity of this industry and it being largely driven by corporations, how many individuals do you think will have control over their likeness vs their likeness being stored by some entity they did not explicitly consent towards?

I appreciate your take, I just think it is not in line with the current trajectory outside of some unique HN posters and the like - and even they will probably wake up one day realizing some entity also already owns their likeness, albeit the HN user might have a local copy they hand crafted themselves using some cobbled together hardware.

nullbound · 2025-12-09T15:49:34 1765295374

You do have a point. That is why I am not pushing it as a general solution and frankly why I am not super keen on putting everything on github for everyone to see. If there is only one dark joke of the current times, it is that pressing agree somehow constitutes agreeing to legally consenting all sorts of invasive practices.

I would absolutely not suggest doing what I am doing to an average user.

edit: Frankly, just by thinking I am above average I might be inviting a more risky behavior.

levmiseri · 2025-12-09T15:12:06 1765293126

Fully agree on the importance of taking notes and writing in general [1], but I absolutely do not want to train a model on my texts or attempt a personal style imitation. I can't fully put my finger on why exactly other than that it feels icky and that it would hinder my long-term writing quality rather than help it.

[1] I made an app to be my lifelong companion for this: https://kraa.io/about – No AI integration.

BoredomIsFun · 2025-12-09T13:54:01 1765288441

/r/localllama every once in awhile has such posts; usually very succesful, good results.

alansaber · 2025-12-09T15:33:49 1765294429

Fine-tuning on a small corpus can definitely get you good results with some care

nullbound · 2025-12-04T23:25:57 1764890757

Overall, it does sound weird. On the one hand, assuming I properly I understand what they are saying is that they removed model's ability to cheat based on their specific training. And I do get that nuance ablation is a thing, but this is not what they are discussing there. They are only removing one avenue of the model to 'cheat'. For all we know, some that data may have been part of its training set already...