> By collecting usage information into a single place, engineering and IT leaders can get a complete picture into both user and agent token efficiency across the organization and providers.
Potato, potahto. People get confused by all this agent talk and forget that, at the end of the day, LLM calls are effectively stateless. It's all abstractions around how to manage the context you send with each request.
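To make that concrete, here's a minimal sketch (plain Python, no particular vendor's SDK; `call_llm` is a hypothetical stand-in for any chat-completion endpoint): the "agent" is just a loop that resends the accumulated history with every stateless call.

```python
# Minimal sketch: an "agent" is just a loop around a stateless chat endpoint.
# call_llm() is a hypothetical stand-in for any provider's chat-completion API;
# the server keeps no memory between requests, so the full history is resent every time.

def call_llm(messages: list[dict]) -> str:
    # A real implementation would POST `messages` to the provider and return the reply text.
    return f"(model reply after seeing {len(messages)} messages) DONE"

def agent_loop(task: str, max_turns: int = 5) -> list[dict]:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        reply = call_llm(messages)                        # every call carries the whole context
        messages.append({"role": "assistant", "content": reply})
        if "DONE" in reply:                               # toy stopping condition
            break
        messages.append({"role": "user", "content": "Continue."})
    return messages

print(agent_loop("Refactor the parser"))
```

Everything people call "memory", "sessions", or "agents" is some variation on deciding what goes into that `messages` list before the next call.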
> checkpoints/rollbacks are still not implemented in the VS Code GUI
Rollbacks have been broken for me in the terminal for over a month. It just didn't roll back the code most of the time. I've totally stopped using the feature and instead just rely on git. Is this the case for others?
I've been using /rewind in claude code (the terminal, not using vscode at all) quite a bit recently without issue - if that's the feature you're asking about.
Not discounting at all that you might "hold it" differently and have a different experience. E.g. I basically avoid claude code having any interaction with the VCS at all - and I could easily see VCS interaction being a source of bugs with this sort of feature.
I mean double tapping escape, going back up the history, and choosing the “restore conversation and code” option. Sometimes bits of code are restored, but rarely all changes.
It worked when first released but hasn’t for ages now.
I've been getting this recently, for perfectly innocuous tasks:

> API Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"Output blocked by content filtering policy"},

There's no information given about the cause, so it's very frustrating. At first I thought it was a false positive for copyright issues, since it happened when I was translating code to another language. But now it's happening for all kinds of random prompts, so I have no idea.
According to Claude:
> I don't have visibility into exactly what triggered the content filter - it was likely a false positive. The code I'm writing (pinyin/Chinese/English mode detection for a language learning search feature) is completely benign.
What if you simply need to give them access? E.g., if you want them to do code review, you have to at least give them read access to the code repo. But you don't know if the environment where the agent runs will be compromised.
It seems like everyone wants to avoid running a local VM manually and I'm not sure why. It's a very simple solution that solves all these issues.
If you're on a Mac working on Linux Docker containers, your Docker engine is already running a VM (and on a Linux host you don't need one at all). So you're still only "one VM away" from the real environment. Most IDEs support working directly in the VM via SSH if you need to inspect the code.
You then run --dangerously-skip-permissions and do all changes via PRs. I have been running this combined with workmux [0] for a couple of months and highly recommend it. You can one-shot several whole PRs concurrently with this setup.
The reason it beats a cloud VM is that running multiple concurrent copies of all of a project's containers quickly eats up memory, and running a cloud VM 24/7 with enough memory for that is expensive.
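For anyone curious what that looks like in practice, here's a rough sketch of the container-per-agent idea. This is just one way to wire it up, not the exact workmux setup: `claude-sandbox` and `./myrepo` are placeholders, and only the repo gets mounted so the blast radius of the permissions flag is the container itself.

```python
# Sketch of the "agent in a local container" workflow. Assumptions: an image
# called "claude-sandbox" with the claude CLI installed, and a repo at ./myrepo --
# both are placeholders, not a published setup.
import subprocess

def run_agent_in_container(repo_path: str, prompt: str) -> None:
    subprocess.run(
        [
            "docker", "run", "--rm",
            "-v", f"{repo_path}:/workspace",    # mount only the repo, nothing else from the host
            "-w", "/workspace",
            "claude-sandbox",                   # hypothetical image with the CLI baked in
            "claude", "-p", prompt,             # -p = non-interactive print mode
            "--dangerously-skip-permissions",   # acceptable here: the container is disposable
        ],
        check=True,
    )

# Edits land in the mounted repo; review and merge them like any other PR.
run_agent_in_container("./myrepo", "Fix the failing tests and summarize the changes")
```

The host filesystem and credentials stay out of reach, and each concurrent agent is just another throwaway container.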
> By collecting usage information into a single place, engineering and IT leaders can get a complete picture into both user and agent token efficiency across the organization and providers.
What exactly is “user token efficiency”?