Agree! And this is why it is a bad idea IMHO for agents to sit at the abstraction layer of browser or below (OS). Even at the browser-addon level it's dangerous. It runs with the user’s authority across contexts and erodes zero-trust by becoming a confused deputy: https://en.wikipedia.org/wiki/Confused_deputy_problem
Yeh I don't think there's much value in a credo if it celebrates Altman. He's a terrible idol to have. He compared Trump to Hitler in 2016, then donated $1M to his inauguration and tweeted about being in an "NPC trap" when he criticized him. Took about six weeks after the election to flip. Testified to Congress that AI regulation is "essential," then lobbied against California's safety bill when it actually showed up. His own board fired him for lying to them for years. His safety team leads quit in protest saying safety took a backseat to shiny products. Multiple former colleagues, including the people who left to start Anthropic, describe psychological abuse and manipulation. Claims a $65k salary while sitting on a billion-dollar fortune built through conflicts of interest. He's not a good guy. He's a guy who says whatever serves him in the moment and has left a trail of people warning us about exactly that.
The incumbents Goodreads and their owner Amazon have indeed done such a poor job at this. Seven years ago I tried creating a basic graph using collaborative filtering (effectively using our actual reading patterns as the embedding space instead of semantics [human X likes book Y, so likers of Y might like other things that human X has enjoyed]). It works well to this day (ablf.io) but the codebase is so ugly I've not had the bravery to update its data in a couple of years.
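For what it's worth, the co-occurrence idea at the core is simple enough to sketch. This is just a toy illustration with made-up data and names, not the actual ablf.io code:

```python
from collections import Counter

# Hypothetical data: reader -> set of books they liked.
likes = {
    "alice": {"Dune", "Hyperion", "Blindsight"},
    "bob": {"Dune", "Blindsight", "The Dispossessed"},
    "carol": {"Hyperion", "The Dispossessed"},
}

def recommend(seed_book, likes, top_n=5):
    """Score books by how often they co-occur with seed_book in other readers' likes."""
    scores = Counter()
    for reader, books in likes.items():
        if seed_book in books:
            # Everything else this reader liked counts as evidence of affinity.
            for other in books - {seed_book}:
                scores[other] += 1
    return [book for book, _ in scores.most_common(top_n)]

print(recommend("Dune", likes))  # e.g. ['Blindsight', 'Hyperion', 'The Dispossessed']
```

A real system would presumably also need to downweight hugely popular books and normalise by reader activity, but the embedding-free, co-occurrence core is roughly this.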
Yes, imo this is very useful, but there's not a clear industry standard on how to do it yet, which I imagine will change? Tell me if I'm missing something.
I think it's become a bit of a cliché/cliquey thing amongst a certain population. I don't know its origins (tumblr emo crowd??) but I first encountered it in Silicon Valley. The Collison brothers used to love doing it, as did Altman. I feel it projects a kind of stream-of-thought aloofness, like "i dont care enough for correct form. language bends to my unique thoughts. read this if you like, i dont care lol".
All-lowercase comes across as the text equivalent of a hoodie and jeans: comfortable, a bit defensive against being seen as trying too hard, and now so common it barely reads as rebellion.
As I understand it, the root was people using the iPhone with autocorrect turned off. That’s how someone from the tumblr emo crowd (where it was definitely prevalent!) explained it to me, and the reason was that there was a lot of culture-specific terminology (including deliberate misspellings of words) that was hard to type with autocorrect switched on.
By extension you can see how that could also apply to tech.
The prompts aren't the key to the attack, though. They were able to get around guardrails with task decomposition.
There is no way for the AI system to verify whether you are a white hat or a black hat when the only task it sees is pen-testing. Since this is not presented as part of a "broader attack" (in that context), there is no visible "threat".
I don't see how this can be avoided, given that there are legitimate uses for every step of this in creating defenses to novel attacks.
Yes, all of this can be done with code and humans as well - but it is the scale and the speed that becomes problematic. It can adjust in real-time to individual targets and does not need as much human intervention / tailoring.
Is this obvious? Yes - but it seems they are trying to raise awareness of an actual use of this in the wild and get people discussing it.
I agree that there will be no single call or inference that presents malice. But I feel like they could still share general patterns of orchestration (latencies, concurrencies, general cadences and parallelization of attacks, prompts used to granularize work, whether the prompts themselves were generated in previous calls to Claude). There are a bunch of more specific telltales they could have alluded to. I think it's likely they're being obscure because they don't want to empower bad actors, but that's not really how the cybersecurity industry likes to operate. Maybe Anthropic believes this entire AI thing is a brand-new security regime and so thinks existing resiliences are moot. That we should all follow blindly as they lead the fight. Their narrative is confusing. Are they being actually transparent or just transparency-"coded"?
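To make that concrete, here's a rough sketch of the kind of per-session signals I mean. All field names and logic are invented for illustration; nothing here is from Anthropic's report:

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional

# Hypothetical request-log entry; the fields are assumptions for the sketch.
@dataclass
class Request:
    session_id: str
    timestamp: float                  # seconds since epoch
    prompt: str
    parent_prompt_id: Optional[str]   # set if this prompt was produced by an earlier model call

def orchestration_signals(requests: list[Request]) -> dict:
    """Crude per-session telltales: cadence, fan-out, and prompt chaining."""
    times = sorted(r.timestamp for r in requests)
    gaps = [b - a for a, b in zip(times, times[1:])]
    chained = sum(1 for r in requests if r.parent_prompt_id is not None)
    return {
        "median_gap_s": median(gaps) if gaps else None,            # machine cadence tends to be tight and regular
        "burst_size": len(requests),                                # fan-out of decomposed subtasks
        "chained_fraction": chained / len(requests) if requests else 0.0,  # prompts that were themselves model-generated
    }
```

None of these alone proves orchestration, but publishing distributions for signals like these would give defenders something to calibrate against, which is what the industry usually shares.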
I agree so much with this. And I am so sick of AI labs, who genuinely do have access to some really great engineers, putting stuff out that just doesn't pass the smell test. GPT-5's system card was pathetic: big talk of Microsoft doing red-teaming in ill-specified ways, entirely unreproducible. All the labs are "pro-research", but they again and again release whitepapers and pump headlines without producing the code and data alongside their claims. This just feeds into the shill cycle of journalists doing 'research', finding 'shocking thing AI told me today', and somehow being immune to the normal expectations of burden of proof.
Microsoft’s quantum lab also made ridiculous claims this year, with no updates or retractions after the community mocked them and some even alleged fraud.
Yeh I still don't think there's a fixed definition of what a world model is or in what modality it will emerge. I'm unconvinced it will emerge as a satisfying 3d game-like first-person walkthrough.