I don't think most software houses pay enough attention to engineering time. CI pipelines that take tens of minutes to over an hour, compile times that exceed ten seconds when nothing has changed, startup times that are much more than a few seconds. Focus and fast iteration are super important to writing software, and it seems like a lot of orgs just kinda shrug when these long waits creep into the development process.


That's the thing about investing in scientific research, especially toward the basic science end of the spectrum - the real benefit is seen years down the line, after technology transfer to public-private partnerships and private industry. It can take years to decades to see the long-term benefit, which is why it needs government backing. It's not sustainable for most players in the private sector to invest in research that is high risk (with respect to applicability), long term, or both. This also makes it easy to cast doubt on the value of research being done now or recently - we don't have a ton of concrete results to show for it yet. The best numbers to look at would probably be emigration / immigration of PhDs, papers published in top-tier journals and the universities associated with them, and where conferences are being held.


This is interesting, but how do you bootstrap it? How does this little software enclave get key material in without it transiting untrusted memory? From a file? I guess the attacker this is guarding against can read parts of memory remotely but doesn't have RCE. Seems like a better approach would be an explicitly separate allocator and message-passing boundaries - maybe a new way to launch an isolated goroutine with channels that limit copying.
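
A rough sketch of what that message-passing boundary could look like in Go (purely illustrative - the names are made up, and plain Go can't actually keep the key's memory isolated from the rest of the process, which is the crux of the problem):

    package main

    import (
        "crypto/hmac"
        "crypto/sha256"
        "fmt"
    )

    // signRequest is the only thing that crosses the boundary: callers send
    // data in and get a MAC back, never the key itself.
    type signRequest struct {
        data  []byte
        reply chan []byte
    }

    // startSigner launches a goroutine that owns the key material. Nothing
    // outside this function keeps a reference to the key, so the rest of the
    // program can only talk to it through the request channel.
    func startSigner(key []byte) chan<- signRequest {
        reqs := make(chan signRequest)
        go func() {
            for req := range reqs {
                mac := hmac.New(sha256.New, key)
                mac.Write(req.data)
                req.reply <- mac.Sum(nil)
            }
        }()
        return reqs
    }

    func main() {
        signer := startSigner([]byte("demo key, not a real secret"))

        reply := make(chan []byte)
        signer <- signRequest{data: []byte("hello"), reply: reply}
        fmt.Printf("%x\n", <-reply)
    }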


> How does this little software enclave get key material in that doesn't transit untrusted memory?

Linux has memfd_secret ( https://man7.org/linux/man-pages/man2/memfd_secret.2.html ), which lets you create a memory region that is removed from the kernel's direct map, so even the kernel can't read it, and it is never swapped to disk.
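
A minimal sketch of using it from Go via raw syscalls (assumptions: x86-64 syscall number 447, Linux >= 5.14, and a kernel with secret memory enabled; this is not taken from the man page's examples):

    package main

    import (
        "fmt"
        "syscall"
    )

    // memfd_secret syscall number on x86-64 (Linux >= 5.14). Other
    // architectures use different numbers, so treat this as an assumption.
    const sysMemfdSecret = 447

    func main() {
        // Create the secret memory fd. The backing pages are removed from
        // the kernel's direct map and are never swapped out.
        fd, _, errno := syscall.Syscall(sysMemfdSecret, 0, 0, 0)
        if errno != 0 {
            panic(errno)
        }
        defer syscall.Close(int(fd))

        const size = 4096
        if err := syscall.Ftruncate(int(fd), size); err != nil {
            panic(err)
        }

        // Map the region into this process; only this mapping sees the data.
        buf, err := syscall.Mmap(int(fd), 0, size,
            syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
        if err != nil {
            panic(err)
        }
        defer syscall.Munmap(buf)

        copy(buf, []byte("key material lives here"))
        fmt.Println("secret region mapped:", len(buf), "bytes")
    }

Whether the syscall is even available depends on how the kernel was built and booted, so in practice you'd feature-detect and fall back.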


This is not my experience on the latest version of Chrome Android (142.0.7444.171). It did not crash for me.


What's the reason for moving from ASCII CHAR to UTF-16 WCHAR rather than UTF-8 CHAR? I wouldn't think any parts of the codebase that don't need to render the string or worry about character counts would need to be modified.

Edit: https://devblogs.microsoft.com/oldnewthing/20190830-00/?p=10... seems the justification was that UTF-8 didn't exist yet? Not totally accurate, but it wasn't fully standardized. Also, that other article seems to imply Windows 95 used UTF-16 (or UCS-2, but either way 16-bit chars), so I'm confused about porting code being a problem. Was it that the APIs in 95 were still kind of a halfway point?


Windows NT started supporting unicode before UTF-8 was invented, back when Unicode was fundamentally 16-bit. As a result, in Microsoft world, WCHAR meant "supports unicode" and CHAR meant "doesn't support unicode yet".

By the way, UTF-16 also didn't exist yet: Windows started with UCS-2. Though I think the name "UCS-2" also didn't exist yet -- AFAIK that name was only introduced in Unicode 2.0 together with UCS-4/UTF-32 and UTF-16 -- in Unicode 1.0, the 16-bit encoding was just called "Unicode" as there were no other encodings of unicode.


> Windows NT started supporting unicode before UTF-8 was invented

That's not true: UTF-8 predates Windows NT. It's just that the jump from ASCII to UCS-2 (not even real UTF-16) was much easier and more natural, and at the time a lot of people really thought it would be enough. Java made the same mistake around the same time. I actually had the very same discussions with older die-hard Windows developers as late as 2015; for a lot of them, 2 bytes per symbol was still all you could possibly need.


> UTF-8 predates Windows NT.

Windows NT started development in 1988 and the public beta was released in July 1992 which happened before Ken Thompson devised UTF-8 on a napkin in September 1992. Rob Pike gave a UTF-8 presentation at USENIX January 1993.

Windows NT's general release was July 1993, so it's not realistic to replace all the UCS-2 code with UTF-8 after January 1993 and have it ready in less than six months. Even Linux didn't have UTF-8 support in July 1993.


> public beta

Which, let's not forget, also meant an external ecosystem was already developing software for it.


UTF-8 was invented in 1992 and was first published in 1993. Windows NT 3.1 had its first public demo in 1991, was scheduled for release in 1992 and was released in 1993.

Technically UTF-8 was invented before the first Windows NT release, but they would have had to rework a nearly finished and already delayed OS.


Also keep in mind that ISO's official answer was UTF-1, not UTF-8, and UTF-8 wasn't formally accepted as part of the Unicode and ISO standards until 1996. And early versions of UTF-8 still allowed the full 31-bit range of the original ISO 10646 repertoire, before it was limited to the 21-bit range of UTF-16. Also, a lot of early UTF-8 implementations were actually what we now call CESU-8, or had various other infelicities (such as accepting overlong encodings, nowadays rejected as a security risk). So even in 1993, I'm not sure it was yet clear that UTF-8 was going to win.
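
To make the overlong-encoding point concrete, here's a small Go check (illustrative only; the byte values are the classic textbook example, not something from this thread):

    package main

    import (
        "fmt"
        "unicode/utf8"
    )

    func main() {
        // '/' (U+002F) is correctly encoded as the single byte 0x2F. The
        // two-byte sequence 0xC0 0xAF decodes to the same code point
        // arithmetically, but it is an "overlong" encoding: decoders that
        // accept it can be tricked into sneaking '/' past path filters,
        // which is why modern decoders reject it.
        overlong := []byte{0xC0, 0xAF}
        r, size := utf8.DecodeRune(overlong)
        fmt.Printf("rune=%U size=%d valid=%v\n", r, size, utf8.Valid(overlong))
        // Prints: rune=U+FFFD size=1 valid=false
    }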


Oh god, this again. One word: "History". No one thought we would need more than 16 bits (65k chars) to represent all the world's written languages. Then it happened. There must be no fewer than a thousand individually authored blog posts and technical articles on this matter. Win32, Java, and Qt all suffer from the same UTF-16 internal representation. There has been endless discussion over the last 10 years about how to change these frameworks to use a UTF-8 internal representation. It is a crazy hard problem.
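
For anyone who hasn't run into it, this is what "then it happened" looks like in practice - code points beyond U+FFFF force UTF-16 into surrogate pairs (a quick illustrative Go snippet, not from any of the frameworks mentioned):

    package main

    import (
        "fmt"
        "unicode/utf16"
        "unicode/utf8"
    )

    func main() {
        const grin rune = 0x1F600 // a grinning-face emoji, outside the original 16-bit range

        // UTF-16 has to split it into a surrogate pair: two 16-bit units.
        hi, lo := utf16.EncodeRune(grin)
        fmt.Printf("UTF-16: %04X %04X\n", hi, lo) // D83D DE00

        // UTF-8 just spends four bytes, with no change to the scheme that
        // already existed when code points fit in 16 bits.
        buf := make([]byte, 4)
        n := utf8.EncodeRune(buf, grin)
        fmt.Printf("UTF-8:  % X\n", buf[:n]) // F0 9F 98 80
    }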


The tragic part is how brief the period was between “ASCII and a mess of code pages” and the problem actually getting solved with Unicode 2.0 and UTF-8.

Unicode 1.0 was in 1991, UTF-8 happened a year later, and Unicode 2.0 (where more than 65,536 characters became “official”, and UTF-8 was the recommended choice) was in 1996.

That means if you were green-fielding a new bit of tech in 1991, you likely decided 16 bits per character was the correct approach. But in 1992 it started to become clear that maybe a variable-width encoding (with 8 bits as the base character size) was on the horizon. And by 1996 it was clear that fixed 16-bit characters were a mistake.

But that 5-year window was an extremely critical time in computing history: Windows NT was invented, and so were Java, JavaScript, and a bunch of other things. So, by the time it was clear, huge swaths of what would become today’s technical landscape had already set the problem in stone.

UNIXes only ended up with the “right” technical choice because it was already too hard to move from ASCII to 16-bit characters… but laziness in moving off of ASCII ultimately paid off, as it became clear that 16 bits per character was the wrong choice in the first place. Otherwise UNIX would have had the same fate.


For a while, the brain-dead UTF-32 encoding was popular in the Unix/Linux world.


Exactly: Long live "wchar_t".


Let's say the only devices you can get that will run YouTube are running iOS/iPadOS/visionOS or Android, that those will only run on controlled hardware, and that the hardware will only run signed code. Now let's say the only way to get the YouTube client is through the controlled app stores on those platforms. You can build a chain of trust, tied to something like a TPM in the device at one end and signing keys held by Apple or Google at the other, that makes it very difficult to get access to the client implementation and the key material and run something like the client in an environment that would allow it to provide convincing evidence that it is a trusted client. As long as you have the hardware and software in your hands, it's probably not impossible, but it can be made just a few steps shy of it.


In single-threaded scripting languages, it has arisen as a way to allow overlapping computation with communication without having to introduce multithreading and deal with the fact that memory management and existing code in the language aren't thread-safe. In other languages it seems to be used as a way to achieve green threading with an opt-in runtime written as a library within the language, rather than doing something like Go, where the language and built-in runtime manage scheduling goroutines onto OS threads. Personally I like Go's approach. Async/await seems like achieving a similar thing with way more complexity. Most of the time I want an emulation of synchronous behavior. I'd rather be explicit about when I want something to go run on its own.
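
A tiny sketch of what that explicitness looks like in Go (hypothetical example - the URLs and helper function are made up for illustration):

    package main

    import (
        "fmt"
        "net/http"
    )

    // fetchStatus is an ordinary blocking call: no async/await coloring,
    // the code reads synchronously by default.
    func fetchStatus(url string) string {
        resp, err := http.Get(url)
        if err != nil {
            return err.Error()
        }
        defer resp.Body.Close()
        return resp.Status
    }

    func main() {
        urls := []string{"https://example.com", "https://example.org"}

        // Concurrency is opt-in and explicit: "go" says this runs on its
        // own, and the channel is the message-passing boundary back to the
        // caller.
        results := make(chan string, len(urls))
        for _, u := range urls {
            go func(u string) {
                results <- u + " -> " + fetchStatus(u)
            }(u)
        }
        for range urls {
            fmt.Println(<-results)
        }
    }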


Agreed. Async I/O is something where letting the runtime keep track of it for you doesn't incur any extra overhead, unlike garbage collection, and that makes for a much more natural pseudo-synchronous programming model.


I'm more concerned about added sugar in other foods. If you're trying to keep your sugar intake down, cutting sugary sodas seems pretty obvious to me, but remembering to be careful about bread or tomato paste or anything else you might eat because some brands or restaurants add a bunch of extra sugar is really a pain.


The good thing is, if your palate has gotten used to unsweetened beverages, you can easily tell when bread or a sauce has sugar in it when it shouldn't. Or maybe I'm weirdly sensitive to sugar, I dunno.


Could you intern strings? Seems like you're likely to see the same tags and attributes over and over.


Yes, and there are probably a lot of other clever ideas. But the better solution is probably just to implement more of the spec. Once you get through maybe 80% of the tags, you've eliminated 99.9% of the memory issue given their frequency distribution.
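
For reference, interning can be as simple as a map from a name to its canonical copy - a rough Go sketch (hypothetical, not code from the project being discussed):

    package main

    import "fmt"

    // interner deduplicates strings so that repeated tag and attribute names
    // (div, class, href, ...) share one backing allocation instead of many.
    type interner struct {
        seen map[string]string
    }

    func newInterner() *interner {
        return &interner{seen: make(map[string]string)}
    }

    // intern returns a canonical copy of s, allocating only the first time
    // a given name is encountered.
    func (in *interner) intern(s string) string {
        if canonical, ok := in.seen[s]; ok {
            return canonical
        }
        // Copy the bytes so we don't pin a larger parsed buffer in memory.
        canonical := string(append([]byte(nil), s...))
        in.seen[canonical] = canonical
        return canonical
    }

    func main() {
        in := newInterner()
        tags := []string{"div", "span", "div", "div", "span"}
        for _, t := range tags {
            _ = in.intern(t)
        }
        fmt.Println("unique names stored:", len(in.seen)) // 2
    }

In a real parser you'd probably pre-seed the table with the known tag and attribute names from the spec, which dovetails with the "implement more of the spec" point.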


Data sovereignty and distributed identity. It's trying to avoid a centralized authority over where data is stored and over how identities are resolved and trusted (beyond DNS), and it uses signatures to validate data that you didn't get directly from the author.

