frumplestlatz's comments | Hacker News

It's absolutely a true dichotomy. If unchecked exceptions exist, all code must be carefully written to be exception-safe, and the compiler is not going to help you at all.

Of course it's convenient to be able to ignore error paths when you're writing code. It's also a lot less convenient when those error paths cause unexpected runtime failures and data corruption in production.

A preference for unchecked exceptions is one of my most basic litmus tests for whether a developer prioritizes thinking deeply about invariants and fully modeling system behavior. Developers who don't do that write buggy code.
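
To make that concrete, here's a toy sketch in Python, whose exceptions are effectively all unchecked (the ledger example is made up): the naive version silently leaves corrupted state behind on an error path, and nothing ever forces you to notice.

    # Naive version: if the second mutation raises (a KeyError, whatever),
    # the ledger is left half-updated, and no compiler ever forced us to
    # think about that path.
    def transfer_naive(ledger, src, dst, amount):
        ledger[src] -= amount
        ledger[dst] += amount   # never reached if this raises

    # Exception-safe version: do every fallible step first, then commit
    # the mutation in one step that cannot fail partway through.
    def transfer_safe(ledger, src, dst, amount):
        new_src = ledger[src] - amount
        new_dst = ledger[dst] + amount
        if new_src < 0:
            raise ValueError("insufficient funds")
        ledger[src], ledger[dst] = new_src, new_dst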


At this point I just assume Claude Code isn't OSS out of embarrassment for how poor the code actually is. I've got a $200/mo Claude subscription I'm about to cancel out of frustration with just how consistently broken, slow, and annoying to use the Claude CLI is.

> how poor the code actually is.

Very probably. Apparently it's literally built on a React-to-text rendering pipeline, and it was implemented so badly that they were having problems with the garbage collector running too frequently. [1]

[1] https://news.ycombinator.com/item?id=46699072#46701013


OpenCode is amazing, though.

I switched to OpenCode a few weeks ago. What a pleasant experience. I can finally resume subagents (which has been broken in CC for weeks), copy the source of the assistant's output (even over SSH), have different main agents, have subagents call subagents... Beautiful.

Especially that RCE!

A new one or one previously patched?

Anthropic/Claude's entire UX is the worst among the bunch.

What’s the best?

In my experience, ChatGPT, and then Grok.

I've posted a lot of feedback about Claude over the past several months; for example, they still don't support Sign in with Apple on the website (but they do support Sign in with Google, and Sign in with Apple on iOS!).


Interesting. Have you tested other LLMs or CLIs as a comparison? Curious which one you’re finding more reliable than Opus 4.5 through Claude Code.

Codex is quite a bit better in terms of code quality and usability. My only frustration is that it's a lot less interactive than Claude. On the plus side, I can also trust it to go off and implement a deep complicated feature without a lot of input from me.

Yeah, pretty much the same with Claude Code, and most people don't realize some people use Windows.

I'm almost certain their code is a dumpster fire.

As for your $200/mo sub: don't buy it. If you read the fine print, their 20x usage is _per 5-hour session_, not overall usage.

Take 2x $100 subscriptions if you're hitting the limit.


> In practice, the agent isn't replacing ripgrep with pure Python, it's generating a Python wrapper that calls ripgrep via subprocess.

Yep. I have very strong guardrails on what commands agents can execute, but I also have a "vterm" MCP server that the agent uses to test the TUI I'm developing in a real terminal emulator; it can send events, take screenshots, etc.

More than once it's worked around bash tool limitations by using the vterm MCP server to exit the TUI app under development and start issuing unrestricted bash commands. I'm probably going to add command filtering on what can be run under vterm (so it can't exit back to an initial shell), which will help unless/until I add a "!<script>" style command to my TUI, in which case I'm sure it'll find and exploit that instead.
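
If/when I add it, the filtering will probably be nothing fancier than an allowlist check in front of whatever the vterm tool hands to the emulator -- a rough sketch (all names hypothetical):

    import shlex

    # Executables the agent may launch inside the vterm session.
    ALLOWED = {"./my-tui-app", "cargo", "ls"}

    def filter_vterm_command(line: str) -> str:
        """Reject anything that isn't an allowlisted executable."""
        try:
            argv = shlex.split(line)
        except ValueError:
            raise PermissionError(f"unparseable command: {line!r}")
        if not argv or argv[0] not in ALLOWED:
            raise PermissionError(f"not allowed in vterm: {argv[:1]}")
        return line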


Given my years of experience with Cisco "quality", I'm not surprised by this:

> Another notable affected implementation was the DNSC process in three models of Cisco ethernet switches. In the case where switches had been configured to use 1.1.1.1 these switches experienced spontaneous reboot loops when they received a response containing the reordered CNAMEs.

... but I am surprised by this:

> One such implementation that broke is the getaddrinfo function in glibc, which is commonly used on Linux for DNS resolution.

Not that glibc did anything wrong -- I'm just surprised that anyone is implementing an internet-scale caching resolver without a comprehensive test suite that includes one of the most common client implementations on the planet.
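
Even a crude smoke test would have caught it. On a glibc box whose /etc/resolv.conf points at the resolver under test, Python's socket.getaddrinfo goes straight through the libc resolver, so something as dumb as this (the test names are placeholders) exercises exactly the code path that broke:

    import socket

    # Placeholder names known to resolve through multi-record CNAME chains.
    CNAME_CHAIN_NAMES = ["www.example-with-cname.test"]

    def glibc_resolution_failures(names):
        failures = []
        for name in names:
            try:
                socket.getaddrinfo(name, 443, proto=socket.IPPROTO_TCP)
            except socket.gaierror as exc:
                failures.append((name, exc))
        return failures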


> It's insane how Capitalism curtails innovation.

There is an incredible irony in your typing that out on a device so advanced that it was beyond science fiction when I was growing up 40 years ago.


> but for LLM's they can instantly compose the low level tools for their use case and learn to generalize

Hard disagree; this wastes enormous amounts of tokens, and massively pollutes the context window. In addition to being a waste of resources (compute, money, time), this also significantly decreases their output quality. Manually combining painfully rudimentary tools to achieve simple, obvious things -- over and over and over -- is *not* an effective use of a human mind or an expensive LLM.

Just like humans, LLMs benefit from automating the things they need to do repeatedly so that they can reserve their computational capacity for much more interesting problems.

I've written[1] custom MCP servers to provide narrowly focused API search and code indexing, build system wrappers that filter all spurious noise and present only the material warnings and errors, "edit file" hooks that speculatively trigger builds before the LLM even has to ask for it, and a litany of other similar tools.
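
The build wrapper, for example, is barely more than this (a sketch; the regex and the default command are placeholders for whatever your toolchain actually emits):

    import re
    import subprocess

    # Lines worth showing the model; everything else is noise.
    INTERESTING = re.compile(r"\b(error|warning)\b", re.IGNORECASE)

    def filtered_build(cmd=("make", "-j8")):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        lines = (proc.stdout + proc.stderr).splitlines()
        kept = [line for line in lines if INTERESTING.search(line)]
        summary = "\n".join(kept) or f"(no warnings/errors matched, exit={proc.returncode})"
        return proc.returncode, summary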

Due to LLMs' annoying tendency to fall back on inefficient shell scripting, I also had to write a full bash syntax parser and a shell-script rewriting rule engine that silently and trivially rewrites their shell invocations into more optimal forms that use the other tools I've written. That way they don't do expensive, wasteful things like piping build output through `head`/`tail`/`grep`/etc., which invariably makes them miss important information and either wander off into the weeds or -- if they notice -- burn a huge number of turns (and time) re-running commands to get what they need.

Instead, they call build systems directly with arbitrary options, pipe filters, etc., and the command magically gets rewritten to something that produces the ideal output they actually need, without eating more context and unnecessary turns.
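
The rewriting layer itself is conceptually simple -- a rule table keyed on the parsed command. A toy version (the real one actually parses bash; `filtered-build` is a hypothetical wrapper like the one sketched above):

    import shlex

    BUILD_CMDS = {"make", "cargo", "ninja"}
    FILTERS = {"head", "tail", "grep"}

    def rewrite(command_line: str) -> str:
        # Only handles a single "build | filter" pipe; quoted pipes and
        # anything fancier are the real parser's job.
        parts = [shlex.split(p) for p in command_line.split("|")]
        if (len(parts) == 2 and parts[0] and parts[1]
                and parts[0][0] in BUILD_CMDS and parts[1][0] in FILTERS):
            return "filtered-build " + " ".join(parts[0][1:])
        return command_line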

LLMs benefit from an IDE just like humans do -- even if an "IDE" for them looks very different. The difference is night and day. They produce vastly better code, faster.

[1] And by "I've written", I mean I had an LLM do it.


Validating the correctness of AI output seems like one of the biggest problems we are going to face. AI can generate code far faster than humans can adequately review it.

My work is in formal verification, and we’re looking at how to apply what we do to putting guard rails on AI output.

It’s a promising space, but there’s a long way to go, and in the meantime, I think we’re about to enter a new era of exploitable bugs becoming extremely common due to vibe coding.

I vibe coded an entire LSP server — in a day — for an oddball verification language I’m stuck working in. It’s fantastic to have it, and an enormous productivity boost, but it would’ve literally taken months of work to write the same thing myself.

Moreover, because it ties deeply into unstable upstream compiler implementation details, I would struggle to actually maintain it.

The AI took care of all of that — but I have almost no idea what’s in there. It would be foolish to assume the code is correct or safe.


What's the story on local-only operation with a paid subscription? If it works like it seems to from the screenshots and video, I'm more than happy to pay for this even while using the OSS version (and I probably am going to modify it quite a bit for my own use cases), but in my work environment:

1. Absolutely nothing can be sent off-site

2. All AI API requests must go through our custom gateway (we've got deals with all the major AI providers, and I think that even involves a degree of isolated hosting in specific approved cloud environments)

While $20/mo feels a bit weird for an app that doesn't rely on its own cloud service, I'd subscribe right now just because I do "really want this to work".


To be fair to those designers, color reproduction is a really hard problem, and shitty monitors have terrible color reproduction.

You want your designers to have accurate color reproduction for obvious reasons, but they should be testing their work on shitty monitors, too.


They were denied any attempts to put them on mass market laptops.

> You want your designers to have accurate color reproduction for obvious reasons

I don't know, I conclude the opposite. If you need accurate color reproduction when you publish online, you are doing something wrong.

I used to co-own a small digital printing business, so I'm aware of what all of it means, and I had an appropriate monitor myself and a paid Adobe Design Suite subscription.

But for the web, a setup that's too good is actually a detriment: you predictably end up publishing things that require your quality setup. There's a good reason not to bother with a high-quality monitor suitable for serious publishing and photo/video editing when you only do web things. Which is exactly why, when I bought my last monitor (for business work, coding, web browsing, and other mundane things), I deliberately ignored all the very high quality displays, even though the company would have paid for whatever I chose. It's not an advantage for that use case.


I’ve actually got an MCP server that makes it really easy for Claude to generate key events, wait for changes / wait for stable output / etc, and then take PNG screenshots of the terminal state (including all colors/styling) — which it “views” directly as part of the MCP tool response.

Wish I could open source it; it’s a game changer for TUI development.
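
The shape of it is roughly this, assuming the official MCP Python SDK's FastMCP helper; the pty and rendering internals are the actual work and are stubbed out here:

    from mcp.server.fastmcp import FastMCP, Image

    mcp = FastMCP("vterm")

    class _Session:
        """Stand-in for the real pty + terminal-emulator session."""
        def write(self, keys: str) -> None:
            pass  # real version writes the keys to the TUI's pty
        def render_png(self) -> bytes:
            return b""  # real version rasterizes the emulator screen, colors and all

    _session = _Session()

    @mcp.tool()
    def send_keys(keys: str) -> str:
        """Send raw key input to the TUI under test."""
        _session.write(keys)
        return "ok"

    @mcp.tool()
    def screenshot() -> Image:
        """Return the current terminal state as a PNG the model can view."""
        return Image(data=_session.render_png(), format="png")

    if __name__ == "__main__":
        mcp.run()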


If anyone wants to do this at home, this is a great base to work from:

https://github.com/memextech/ht-mcp

