I discussed approaches in my earlier reply. But what you are saying now makes me think you are having problems with too much context. Pare down your CLAUDE.md massively and never let your context usage get over 60-65%. And tell Claude not to commit anything without explicit instructions from you (unless you are working in a branch/worktree and are willing to throw it all away).
Those kinds of errors were super common 4-6 months ago, but LLM quality moves fast. Nowadays I don't see them very often at all. Two things make a huge difference. First, write a spec: GitHub Spec Kit, GSD, BMAD, or whatever tool you like can help with this. Do several passes on the spec to refine it and focus on the key ideas.
Now that you have a spec, task it out, but tell the LLM to write the tests first (like Test-Driven Development, but without all the formalisms). This forces the LLM to focus on the desired behavior instead of the algorithms. Prioritize tests that exercise real behavior: client APIs doing the right error handling on bad input, handling tricky cases, etc. Tell the system not to write 'struct' tests - checking that getters/setters work isn't interesting or useful.
Then you implement 1-3 tasks at a time, getting the tests to pass. Set rules that prevent disabling tests, commenting out tests, and, most importantly, changing the behavior of the tests. This approach doesn't use a lot of context, produces little to no hallucination, and gives you easily measurable progress.
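To make the struct-test vs. behavior-test distinction concrete, here is a minimal sketch. The `parse_config` function and its tests are entirely hypothetical, invented for illustration; the point is what the tests check, not the API itself:

```python
# Hypothetical client API: parse_config(text) returns a dict or raises
# ValueError. Invented for illustration; not from any real library.

def parse_config(text):
    # Minimal stand-in implementation so the tests below can run.
    if not text.strip():
        raise ValueError("empty config")
    result = {}
    for line in text.splitlines():
        if "=" not in line:
            raise ValueError(f"bad line: {line!r}")
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result

# Behavior-focused tests: bad input is rejected, tricky cases are covered.
def test_rejects_empty_input():
    try:
        parse_config("   ")
    except ValueError:
        pass  # expected
    else:
        raise AssertionError("expected ValueError on empty input")

def test_value_may_contain_equals():
    # Only the first '=' splits key from value.
    assert parse_config("url = http://x?a=b")["url"] == "http://x?a=b"

# NOT this: a 'struct' test that just round-trips a trivial getter/setter
# tells you nothing about the behavior users actually depend on.
```

Tests like these pin down the contract, so the rule "never change the behavior of the tests" has real teeth.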
Wrong. If you know Nix, then you know "leverages the unique way that Flox environments are rendered without performing a nix evaluation" is a very significant statement.
Yeah, it's essentially cached eval, the key being where/how that eval is stored.
When you create a Flox environment, we evaluate the Nix expressions once and store the concrete result (i.e. exact store paths) in FloxHub. The k8s node just fetches that pre-rendered manifest and bind-mounts the packages, with no evaluation step at pod startup.
It's like the difference between giving the node a recipe to interpret vs. giving it a shopping list of exact items. Faster, safer, and the node doesn't need to know how to cook (evaluate Nix). I don't know, there's a metaphor here somewhere, I'll find it.
Only so much room for magic, for sure, but tons of room for efficiency and optimization.
Correction: we don't eval when you create environments.
Our catalog continuously pre-evaluates nixpkgs in the background. 'flox install' just selects from pre-evaluated packages -- no eval needed, ever. The k8s node fetches the manifest and mounts the packages.
Eval is done once, centrally, continuously. So... even more pre-eval'd, so to speak.
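The recipe-vs-shopping-list distinction above can be sketched roughly as follows. The manifest shape and store paths here are invented for illustration, not Flox's actual format:

```python
# Illustrative only: this manifest shape and these store paths are made up.
# The point is shipping an already-resolved result instead of an
# expression the node would have to evaluate.

# The "recipe": a Nix expression requiring evaluation on the node.
recipe = "with import <nixpkgs> {}; [ curl jq ]"

# The "shopping list": evaluation already done centrally, once, so the
# node receives only concrete store paths to fetch and bind-mount.
manifest = {
    "packages": [
        "/nix/store/aaaa1111-curl-8.5.0",
        "/nix/store/bbbb2222-jq-1.7",
    ]
}

def mount_packages(manifest):
    """What the node does at pod startup: no evaluation, just mounts."""
    return [f"bind-mount {path}" for path in manifest["packages"]]
```

Since the manifest is just data, pod startup cost is fetching and mounting, with no Nix evaluator in the loop at all.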
Recent articles seem to mean only LLMs when they reference AI. There are tons of commercial use cases for other models: image classification models, image generation models (traditionally diffusion models, although some do use LLMs for images now), TTS models, speech transcription, translation models, AI driving models (autopilot), AI risk assessment for fraud, 3D structural engineering enhancement models.
With many of the good use cases of AI, the end user doesn't know that AI exists, and so it doesn't feel like there is AI present.
> With many of the good use cases of AI, the end user doesn't know that AI exists, and so it doesn't feel like there is AI present.
This! The best technology is the one that you don't notice and that doesn't get in the way. A prominent example is the failure of the first generation of smartphones: they only took off once someone (Apple) managed to hide the OS and its details properly from the user. We need the same for AI - chat is simply not a good interface for every use case.
I don't think it's for everyone; the paper metaphor either works for you or it doesn't.
That said, the other big benefit for me is that it breaks a lot less often than Hyprland and its ecosystem seem to (and I don't just mean bugs here, I also mean things like config file format changes). And this isn't a slam on Hyprland - I was only ever mildly annoyed by its breakage.
Not OP, but I typically only have 1–2 windows per workspace. I use tabs in both my browser and terminal (e.g. via tmux). So it seems like niri's scrolling capabilities wouldn't bring much to my use case.
So the Safari developers are overworked/under-resourced, but Google somehow should have infinite resources to maintain things forever? Apple is a much bigger company than Google these days, so why shouldn't they also have these infinite resources? Oh, right, it's because fundamentally they don't value their web browser as much as they should. But you give them a pass.