I still mostly work in Claude Code. With the release of Sonnet 4.5, Anthropic removed the opusplan model from Claude Code, claiming that Sonnet 4.5 is a superior model to Opus 4.1. Unfortunately, I haven’t found this to be the case.

I ended up so frustrated with Sonnet 4.5 that I installed Codex CLI and gave it a spin. The UX isn’t as good as Claude Code’s, but the GPT-5 Codex model more than makes up for that. For very small tasks, I’ll generally let the model plan and make changes, but for larger tasks I’ll ask it for a plan, then copy and paste it over to Claude Code for Sonnet 4.5 to execute.

I’ll still sometimes switch the model to Opus 4.1 manually, but that has been relatively rare since it is much more tedious without the opusplan model. When planning, I have to instruct it to add “stop so I can switch models” as the first step of the plan. I’ve also had to restart the planning process from scratch when I’ve run out of Opus subscription credits.

With this iteration, I’ve ended up with one Codex tab and one Claude tab for each of my three project tracks. Codex does planning, debugging, and reviewing. Claude Code is purely for code (and nonsense) generation.

I still use Deep Research occasionally, but the new Agent mode output is even more obnoxious so the real value of the subscription is Codex CLI. And given how little value I’ve gotten from Cursor, I’ve actually switched back to Visual Studio Code as my editor. It’s nice to be able to hit tab without wincing.

Rationale

The main reason I decided to even try GPT-5 Codex is that Sonnet 4.5 has been such a frustrating disappointment, especially when working with any sort of existing code. The reasoning capability, even when told to ultrathink, feels extremely limited.

I also find that it doesn’t always follow instructions completely. For example, even something as simple as “amend these fixes into the last commit and update the commit message,” something I’d often let Opus plan and Sonnet do before, usually ends up with a commit message full of bug fixes, completely dropping the original message.

Debugging gets even worse. Sonnet tends to guess wildly and flail between two mistakes if it gets it wrong. Sometimes it’ll just skip results it doesn’t like. Recently, when I asked it to fix broken types, the solution it chose was to pass a “missing” env variable to the static type checker, which it called “the smoking gun.” After making this “fix,” it re-ran the type check, ignored the failing result, and declared success. It turned out the type was declared incorrectly, and was trivially simple to fix.

GPT-5 Codex, on the other hand, even with just medium reasoning, feels like it actually follows instructions and reads the files on disk, attempting to engage with existing code. It isn’t perfect either - it, too, sometimes gets stuck on incorrect assumptions - but it has been so much more reliable that it has fully replaced Opus in my workflows, especially when debugging code I’m not familiar with.

Having two apps and therefore two tabs per track does come with a cost. Putting aside the doubled subscription costs ($200 per sub, though the workflow might work with cheaper tiers), there’s the cognitive load of keeping track of tabs and the need to copy and paste between the two tabs for a given track.

It really is a shame - it was easy enough to put up with the obsequious responses from Claude when it was delivering so much value in a single app. But the Sonnet 4.5 swap has felt like a pure nerf. I’m hoping that Opus 4.5 will come along and fix some things.

Details

I’ve really only customized a few things in my setup.

Claude Code Dual Subscription

I use my work API key as the main subscription for my account. That ends up in the default ~/.claude folder.

For the personal subscription, I created a second configuration folder and then added an alias that invokes the claude instance with the second configuration.

alias pclaude="CLAUDE_CONFIG_DIR=~/.other-claude claude"

That way claude uses the work API key and pclaude uses my personal subscription.
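If you prefer a shell function to an alias, the same idea can be sketched as below. This is only a sketch: it assumes the same ~/.other-claude folder as the alias above, and the only thing it adds is explicit forwarding of arguments.

```shell
# Function form of the alias above. The config folder path is the same one
# used by the alias; adjust it to wherever your second configuration lives.
pclaude() {
  # Set CLAUDE_CONFIG_DIR for this invocation only and pass all arguments on.
  CLAUDE_CONFIG_DIR="$HOME/.other-claude" claude "$@"
}
```

Calling pclaude then behaves like calling claude, just with the second configuration folder.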

Workspaces

I’ve found a happy medium of three different copies of my git repositories, what I call three “tracks.”

The first is for small or immediately relevant tasks. I’ll give tasks to Devin if I’m in the middle of working on something in that workspace. It’s the one that usually connects to my Docker instances, unless they’re being used by the other two.

The second is for slightly larger projects that I’m working on. This is usually my “real” work, the larger task I’m currently assigned. I spend time working on a plan, then wait for 20-30 minutes while Claude runs. While that’s running, I’ll move over to the first track and try to tackle little tasks from the backlog. When I’m running tests here it’ll block the first track, but I’m usually focusing on the work at that point so that’s not a big deal.

The third is where I do exploratory work - lofty goals that may not pan out, but that hopefully teach me how we can tackle those projects. This has become where I use my personal subscription, since these lofty goals often require Opus and can get pretty expensive.
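For illustration, setting up three independent copies of one repository might look roughly like this. The directory names (myrepo-small, myrepo-main, myrepo-explore) are made up, and the empty bare repository is just a stand-in for your real remote so the sketch runs offline.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the real remote; replace with your repository URL.
work="$(mktemp -d)"
git init --quiet --bare "$work/origin.git"

# One full clone per track, so each has its own working tree and branch state.
# The track names are illustrative only.
for track in small main explore; do
  git clone --quiet "$work/origin.git" "$work/myrepo-$track"
done
```

Each clone can then sit on a different branch without the tracks stepping on each other's files.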

Models

In Claude Code, I’m using the default model setting now that opusplan has been removed.

In Codex CLI, I’ve mostly stuck with gpt-5-codex medium. I sometimes push up the reasoning for larger problems, but have found the medium reasoning perfectly reasonable for most things.

Claude Slash Commands and Subagents

I had a small collection of commands and subagents that I used to avoid typing things out repeatedly, but since the Sonnet 4.5 nerf I’ve had worse luck with them, especially the subagents (since they used Opus). For most, I just ask the model directly instead of delegating to a subagent or command because it feels like I need to keep Sonnet 4.5 on a short leash.

I did replace my existing review subagent with the built-in /review command in Codex CLI - it’s comparable but much, much cheaper.

MCP

I use MCP rarely, mostly since I don’t often want or need external tools. Occasionally I use it for automation (e.g. create Linear tickets from a list), but I don’t see as much value doing that from Claude Code or Codex CLI.

For a while I did use Claude Code as an MCP server for itself, but this was completely replaced by subagents.

I tried to use a similar setup to connect Codex CLI with Claude Code as an MCP server, but Sonnet’s frustrating lack of reasoning made it blind to when it should have relied on Codex.

Guidance Files

Codex CLI, Devin, and Cursor all support AGENTS.md while Claude Code uses CLAUDE.md for guidance.

Until Anthropic decides to support AGENTS.md, I’ve renamed all my CLAUDE.md files to AGENTS.md and then softlinked them back to CLAUDE.md, so that both tools can use the same guidance files.
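The rename-and-link step can be scripted. Here is a minimal sketch, assuming guidance files may live in nested directories; the demo tree it creates (including services/api) is hypothetical and only there so the script runs on its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo tree; in practice, point repo_root at a real checkout.
repo_root="$(mktemp -d)"
mkdir -p "$repo_root/services/api"
echo "# root guidance" > "$repo_root/CLAUDE.md"
echo "# api guidance" > "$repo_root/services/api/CLAUDE.md"

# Rename every regular CLAUDE.md to AGENTS.md, then leave a relative symlink
# named CLAUDE.md next to it. "-type f" skips files that are already
# symlinks, so running the script a second time changes nothing.
find "$repo_root" -name CLAUDE.md -type f -print0 |
  while IFS= read -r -d '' file; do
    dir="$(dirname "$file")"
    if [ -e "$dir/AGENTS.md" ]; then
      echo "skipping $dir: AGENTS.md already exists" >&2
      continue
    fi
    mv "$file" "$dir/AGENTS.md"
    # Relative link target keeps the link valid if the repo is moved or cloned.
    (cd "$dir" && ln -s AGENTS.md CLAUDE.md)
  done
```

Inside a git repository you would probably want git mv instead of mv so the rename shows up cleanly in history; git stores the symlink itself, so the link survives a clone on systems that support symlinks.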

This worked pretty well with the Claude 4 models alongside GPT-5 Codex. Sonnet 4.5 seems to need considerably more heavy-handed guidance, which in turn causes poorer results with the GPT-5 family. I’ve opted to stick with my older guidance files with the hope that future upstream model improvements will smooth out the differences again.