
jdmoreira · 2026-01-13T10:06:13 1768298773

Claude Code is really good at it, in my experience

jdmoreira · 2026-01-08T17:56:31 1767894991

that's because they are distilling the frontier models

epolanski · 2026-01-08T18:00:31 1767895231

Frontier models aren't released, they are closed source.

And the Chinese have been a huge source of innovation in the field.

Tostino · 2026-01-08T18:06:10 1767895570

You can do a rough distill through the APIs. You don't need the weights.

It was much easier when companies had models on the /completion style APIs, because you could actually get the logits for each generation step, and use that as a dataset to fit your model to.
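To make the logit idea concrete, here is a minimal sketch of the per-step loss, not anyone's actual pipeline. It assumes a hypothetical API that hands back log-probabilities for only the top few tokens at each step, so the teacher distribution is renormalised over those tokens. The names `teacher_logprobs`, `student_logits`, and `distill_step_loss` are made up for illustration.

```python
import math


def distill_step_loss(teacher_logprobs, student_logits):
    """KL(teacher || student) for a single generation step.

    teacher_logprobs: dict of token id -> log-probability for the top-k
        tokens (assumed to be what the hypothetical API returns).
    student_logits: list of raw student logits, indexed by token id.
    """
    token_ids = list(teacher_logprobs)

    # Renormalise the teacher's top-k probabilities so they sum to 1.
    raw = [math.exp(teacher_logprobs[t]) for t in token_ids]
    total = sum(raw)
    teacher_probs = [r / total for r in raw]

    # Log-partition of the student over its full vocabulary (stable form).
    peak = max(student_logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in student_logits))

    loss = 0.0
    for tok, p in zip(token_ids, teacher_probs):
        if p > 0.0:  # skip tokens whose probability underflowed to zero
            student_logp = student_logits[tok] - log_z
            loss += p * (math.log(p) - student_logp)
    return loss


# Teacher splits its mass evenly over tokens 0 and 1; the student is
# uniform over a 3-token vocabulary, so the loss is strictly positive.
example = distill_step_loss(
    {0: math.log(0.5), 1: math.log(0.5)},
    [0.0, 0.0, 0.0],
)
```

Averaging this over every step of many sampled generations gives the dataset-level objective to fit the student against. With text-only chat APIs you lose the per-token distribution and are left fitting to sampled outputs, which is the "rough" part.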

That isn't to diminish the efforts of the Chinese developers though, they are great.

jdmoreira · 2026-01-08T18:04:55 1767895495

distilling does not require the models to be released. They simply use the apis.

They have been a source of innovation but probably not in training them.

NoOn3 · 2026-01-08T21:08:59 1767906539

They just lack performant hardware. They have enough knowledge. And so they choose a more effective strategy without wasting resources on training from scratch.

riku_iki · 2026-01-08T18:37:04 1767897424

is there any evidence that this is significant contribution?

My intuition is that one needs A LOT of API credits to distill such large models.

jdmoreira · 2026-01-06T19:49:06 1767728946

Peak civilization

jdmoreira · 2026-01-03T08:48:51 1767430131

You would expect them to have started baking the pizza inside the pentagon already by now :)

hahahahhaah · 2026-01-03T09:22:16 1767432136

Or they order Chinese food to throw it off

jdmoreira · 2025-12-28T17:18:56 1766942336

That's not how pressure works if it's opened. The forces balance out

jdmoreira · 2025-12-28T10:07:06 1766916426

Hard to tell... It's not a 2x or greater net gain, that much I can tell.

I'm not even entirely sure it's a net positive at this point but it feels like it.

I would say probably between 20% and 50% more productive.

jdmoreira · 2025-12-12T19:02:45 1765566165

I am giving my 6 year old girl an old acer netbook that boots directly to pico-8. This will be her first computing experience. She never had access to phones or tablets.

DANmode · 2025-12-13T04:12:57 1765599177

You’re awesome.

jdmoreira · 2025-12-08T20:23:02 1765225382

I must be holding it wrong then, because I do use Claude Code all the time and I do think it's quite impressive… still, I can't see where the productivity gains go, nor am I even sure they exist (they might, I just can't tell for sure!)

hurturue · 2025-12-08T22:26:21 1765232781

if you go back and forth with the model, and discuss/approve every change it makes, that's the problem.

you need to give it a biggish thing so it can work 15 min on it. and in those 15 min you prepare the next one(s)

jdmoreira · 2025-12-08T22:28:55 1765232935

Sure. But am I supposed to still understand that code at some point? Am I supposed to ask other team members to review and approve that code as if I had written it?

I'm still trying to ship quality work by the same standards I had 3 or 5 years ago.

hurturue · 2025-12-08T23:27:41 1765236461

when compilers appeared, assembly programmers would complain all day about how ugly and inefficient the generated code was

if you want to get the productivity gain you need to figure out how to solve the code review problem

bccdee · 2025-12-09T05:01:57 1765256517

Your solution is, "just ship worse code, it's probably fine"?

I think it's your standards that have fallen 90%…

onehair · 2025-12-09T06:44:46 1765262686

No, not worse code. Wrong code. Code filled with bugs. Code filled with lawsuits too. Code that makes you look productive this month while you prepare to leave the company, and turns out to be absolute poopoo the day after you leave.

jdmoreira · 2025-12-09T07:38:51 1765265931

I think there might be something here! A core of truth about what the future might hold. I can't take this approach right now though. It's not a good approach today.

christophilus · 2025-12-09T13:07:18 1765285638

But did the generated code do what you told it to do and was it deterministic? LLMs aren’t the same at all. That metaphor doesn’t work.

jdmoreira · 2025-12-07T08:14:12 1765095252

Very good (albeit extremely cynical) take!

reality is probably somewhere in the middle but it does make a good case

jdmoreira · 2025-12-05T08:00:15 1764921615

I’ve always had this weird intuition that Zeno’s Arrow Paradox is some indication that there must be some discreteness. Somehow, somewhere, there must be a ‘tick’.