Amusingly, for a library [1] I’ve been building, 100% of the code is AI-written (with a huge number of iterations of course) and the ONLY part I wanted to write myself is the portion of the README that explains the thought process behind one of the features. It took a lot of thinking and iterations to come up with the right style and tone, and methodically explain the ideas in the right order.
Leaving that to an LLM would have been a frustrating exercise.
The aichat tool I mentioned in another comment [1] enables exactly this type of cross-agent work-continuation, specifically between Claude-Code and Codex-CLI.
I like that it does not require following any particular "system" or discipline. But having to use a non-local/proprietary memory layer is not ideal.
My own fully-local, minimalistic take on this problem of "session continuation without compaction" is to rely on the session JSONL files directly rather than create separate "memory" artifacts, and seamlessly index them to enable fast full-text search. This is the idea behind the "aichat" command-group + plugin I just added to my claude-code-tools [1] repo. You can quit your Claude-Code/Codex-CLI session S and type
aichat resume <id-of-session-S-you-just-quit>
It launches a TUI, offering a few ways to continue your work:
- blind trim: clones the session and truncates large tool calls/results and older assistant messages, which can free up as much as 50% of context, depending of course on what's in the session; this is a quick hack to continue your work a bit longer
- smart trim: similar, but uses a headless agent to decide what to truncate
- rollover: the one I use most frequently; it creates a new session S1 (which can optionally be under a different CLI agent, enabling cross-agent work continuation) and injects back-pointers to the parent session JSONL file of S, the parent's parent, and so on (what I call session lineage) into the first user message. You can then prompt the agent to use a sub-agent to extract arbitrary context from the ancestor sessions to continue the work.
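Since the session logs are plain JSONL, the full-text-search idea is easy to sketch with standard shell tools. Note the file layout and field names below are made up for illustration; they are not aichat's actual schema:

```shell
# Illustrative sketch: session logs are newline-delimited JSON, so plain
# text tools already give you crude full-text search. The path and field
# names here are invented for the demo, not aichat's real schema.
mkdir -p /tmp/demo-sessions
cat > /tmp/demo-sessions/s1.jsonl <<'EOF'
{"type":"user","text":"add a retry wrapper around the HTTP client"}
{"type":"assistant","text":"done, see http_retry.py"}
EOF

# find which session files mention a phrase
grep -l "retry wrapper" /tmp/demo-sessions/*.jsonl

# show the matching lines themselves
grep -h "retry wrapper" /tmp/demo-sessions/*.jsonl
```

The plugin's indexing makes this fast across many large sessions, but the underlying idea is just this: the logs themselves are the memory, so nothing needs to be summarized into a separate artifact.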
Oops yes Hesse, right, I got my German authors crossed :)
Siddhartha was beautiful, so I wanted to try The Glass Bead Game, and I agree: I liked the former much more. I’ll give Demian a try. Thanks for the rec.
This resonates with how I’ve been thinking about open source. I see the steps as:
1. Personally identify a pain point in your own work; it will most likely be a pain for many others too.
2. Build a solution for it.
3. Organically talk about it in forums — for me this is Reddit, HN lately and to some extent Bluesky.
When people ask why I build open source, I say it’s about signaling. As other comments have mentioned, if you’re fortunate enough that it gains traction, it becomes your calling card and can lead to consulting and jobs. It’s analogous to academic publishing (used to do more of that) but with different dynamics.
My personal examples of solving a pain point are:
[A] I started building the Langroid LLM agent framework after having a look at LangChain in Apr 2023, at a time when there was hardly any talk of LLM-agents. The aim was to create a principled, hackable, lightweight library for building LLM applications, and agents happened to be a good abstraction:
https://github.com/langroid/langroid
[B] With the explosion of Claude Code and similar CLI coding agents, there were several interesting problems to solve for myself, and I started collecting them here: https://github.com/pchalasani/claude-code-tools
One such tool is a lossless alternative to compaction; another is a Tmux-CLI tool/skill that lets CLI agents interact with each other.
A workflow I find useful is to have multiple CLI agents running in different Tmux panes, with one consulting or delegating to another using my Tmux-CLI [1] tool + skill. The advantage of this is that the agents’ work is fully visible and I can intervene as needed.
Have you considered using their command line options instead? At least Codex and Claude both support feeding in new prompts in an ongoing conversation via the command line, and can return text or stream JSON back.
You mean so-called headless or non-interactive mode? Yes, I’ve considered that, but the advantage of communicating via Tmux panes is that all agent work is fully visible and you can intervene as needed.
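For comparison, the headless route the parent comment describes looks roughly like this. The flag spellings are from the CLIs' documentation as I recall them, so double-check against each tool's --help before relying on them:

```shell
# Non-interactive ("headless") invocations: no visible pane, the response
# comes back on stdout instead. Prompts here are just examples.
claude -p "summarize the failing tests" --output-format json
codex exec "run the test suite and report failures"
```

This is convenient for scripting, but as noted above you lose the live visibility (and chance to intervene) that a Tmux pane gives you.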
My repo has other tools that leverage such headless agents; for example there’s a resume [1] functionality that provides alternatives to compaction (which is not great since it always loses valuable context details):
The “smart-trim” feature uses a headless agent to find irrelevant long messages for truncation, and the “rollover” feature creates a new session and injects session lineage links, with a customizable extraction of context for the task to be continued.
I've had good success with a similar workflow, most recently using it to help me build out a captive-wifi debugger[0]. In short, it worked _pretty_ well, but it was quite time intensive. That said, I think removing the human from the loop would have been insanity on this: lots of situations where there were some very poor ideas suggested that the other LLMs went along with, and others where one LLM was the sole voice of reason against the other two.
I think my only real take-away from all of it was that Claude is probably the best at prototyping code, while Codex makes a very strong (but pedantic) code reviewer. Gemini was all over the place: sometimes inspired, sometimes idiotic.
This is exactly why I built Mysti: I used that flow very often and it worked well. I also added personas and skills so it’s easy to customize the agents’ behavior. If you have any ideas to make the behavior better, please don’t hesitate to share! Happy to jump on a call and discuss it as well.
I have a similar workflow, except I haven’t put time into the tooling. Claude is adept at tmux and can almost prompt and respond to ChatGPT, except it always forgets to press Enter when it sends keys. Have your agents been able to communicate with each other via tmux send-keys?
I had the same issue. Subagents are nice but the LLM calling them can’t have a back and forth conversation. I tried tmux-cli and even other options like AgentAPI[0] but the same issue persists, the agent can’t have a back and forth with the tmux pane.
To people asking why you would want Claude to call Codex or Gemini: it’s because of orchestration. We have an architect skill we feed the first agent. That agent can call subagents, or even use tmux and feed in the builder skill. The architect is harnessed to a CRUD application that just keeps track of which features have already been built, so the builder stays focused only on building.
I find that asking Claude to develop and Codex to review the uncommitted changes will typically result in high-value code, and eliminate all of Claude’s propensity to perpetually lie and cheat. Sometimes I also ideate with Claude and then ask Claude to get ChatGPT’s opinion on the matter. I started by copy-pasting responses but I found tmux to be a nice way to get rid of the middleman.
What does tmux add here? Or how does it allow you to do that? I’m sorry I’m just missing it I’m sure. I don’t use tmux a lot so I don’t know all its potential.
You're right that vanilla tmux can do all of this, if a human were to use it. tmux-cli exists because LLMs frequently make mistakes with raw tmux: forgetting the Enter key, not adding delays between text and Enter (causing race conditions with fast CLI apps), or incorrect escaping.
It bakes in defaults that address these: Enter is sent automatically with a 1-second delay (configurable), pane targeting accepts simple numbers instead of session:window.pane, and there's built-in wait_idle to detect when a CLI is ready for input. Basically a wrapper that eliminates the common failure modes I kept hitting when having Claude Code interact with other terminal sessions.
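For the curious, the raw-tmux sequence the wrapper automates amounts to roughly this (the session/pane target is illustrative):

```shell
# Type the text into the target pane first...
tmux send-keys -t agents:0.1 'please review the uncommitted diff'
# ...wait briefly so a fast CLI app does not miss the keystrokes
# (the race condition mentioned above)...
sleep 1
# ...then submit with a separate Enter, the step LLMs reliably forget.
tmux send-keys -t agents:0.1 Enter
```

Splitting the text and the Enter into two send-keys calls with a delay between them is exactly the pattern the wrapper bakes in as a default.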
The idea works well with or without direct integration. You can have a CLI agent read arbitrary state from any tmux session and drive work through it. I use it for everything from dev work to system debugging. It turns out a portable, callable binary with simple parameters is still easier for agents to use than protocols and skills: https://github.com/tikimcfee/gomuxai
There’s no special support needed; it’s just a bash command that any CLI agent can use. For agents that have skills, the corresponding skill helps leverage it more easily. I’ll add that to the README.
I have both Codex and Claude subs so I wanted one to be able to consult the other. Also it’s useful when you have a cli script that an agent is iterating on, so it can test it. Another use case is for a CLI agent to run a debugger like PDB in another pane, though I haven’t used it much.
I used to get stuck sometimes with Claude and need a different agent to take a look. Switching back and forth between agents is a headache, and you can’t port all the context either, so I thought this might help solve real blockers for many devs on larger projects.
I recently found myself wanting to use Claude Code and Codex-CLI with local LLMs on my MacBook Pro M1 Max 64GB. This setup can make sense for cost/privacy reasons and for non-coding tasks like writing, summarization, q/a with your private notes etc.
I found the instructions for this scattered all over the place so I put together this guide to using Claude-Code/Codex-CLI with Qwen3-30B-A3B, 80B-A3B, Nemotron-Nano and GPT-OSS spun up with Llama-server:
Llama.cpp recently started supporting Anthropic’s messages API for some models, which makes it really straightforward to use Claude Code with these LLMs, without having to resort to, say, Claude-Code-Router (an excellent library): you just set ANTHROPIC_BASE_URL.
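A minimal sketch of the setup, assuming a llama.cpp build recent enough to expose the Anthropic-style messages endpoint; the model file, port, and dummy key below are placeholders:

```shell
# Serve a local model with llama.cpp's built-in server
# (model path and port are placeholders)
llama-server -m qwen3-30b-a3b-q4_k_m.gguf --port 8080 &

# Point Claude Code at the local server instead of Anthropic's API
export ANTHROPIC_BASE_URL=http://localhost:8080
export ANTHROPIC_API_KEY=dummy   # placeholder; the local server ignores it
claude
```

The same server can back Codex-CLI through its OpenAI-compatible endpoint, which is what makes one llama-server process enough for both tools.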
Curious how well it would do in Gemini CLI. Probably not that good, at least from looking at the terminal-bench-2 benchmark where it’s significantly behind Gemini-3-Pro (47.6% vs 54.2%), and I didn’t really like G3Pro in Gemini-CLI anyway. Also curious that the posted benchmark omitted comparison with Opus 4.5, which in Claude-Code is anecdotally at/near the top right now.