DeepSeek's API Price Cut Changed My Claude Code and ChatGPT Math

The DeepSeek API price cut made me rethink a habit I had quietly accepted: choosing an AI coding tool and then living with whatever model economics came with it.

Claude Code is great when I want a strong terminal-native coding agent. ChatGPT and Codex are great when I want OpenAI's workflow and model stack. But when a provider like DeepSeek suddenly drops API pricing, the obvious question is not just "is this cheap?"

It is: can I actually use the cheaper model from the tools I already use?

The Price Cut Is The Interesting Part

As of May 25, 2026, DeepSeek's pricing page lists V4 Flash at:

$0.14 per 1M input tokens
$0.0028 per 1M cached input tokens
$0.28 per 1M output tokens

It also lists V4 Pro at the 75% discounted rate, with a note that after the promotion ends on May 31, 2026, the API price will still be officially adjusted to one-quarter of the original price:

$0.435 per 1M input tokens
$0.003625 per 1M cached input tokens
$0.87 per 1M output tokens

The part that matters for coding agents is cached input. Coding tools resend a lot of repeated context: system prompts, repo summaries, conversation history, tool schemas, and task state. If cache hits are cheap enough, repeated agent loops start looking very different economically.

I checked the current public pricing pages before writing this: DeepSeek API pricing, Claude plans, Claude API models, ChatGPT plans, and OpenAI API pricing.

That is why this cut is more than a nice model announcement. It changes where I want routine coding traffic to go.

The Comparison I Actually Care About

Claude Code pricing is predictable if you use a subscription: Claude Pro is $20/month when billed monthly, and Max starts at $100/month. On the API side, Anthropic lists Claude Opus 4.7 at $5 input and $25 output per 1M tokens, and Sonnet 4.6 at $3 input and $15 output.

ChatGPT has the same split. Plus is the familiar $20/month plan, Pro tiers go much higher, and OpenAI API pricing for flagship GPT models is still priced like premium infrastructure. GPT-5.5 is listed at $5 input, $0.50 cached input, and $30 output per 1M tokens.

Those plans can be worth it. I am not pretending DeepSeek replaces every hard reasoning workload.

But for coding-agent traffic, the uncomfortable truth is that a lot of tokens are not "hard reasoning" tokens. They are:

reading files
rewriting boilerplate
producing test scaffolds
formatting docs
classifying intent
continuing a known task

That is exactly the kind of traffic I want to route to a cheaper model first.

The Annoying Part: Tools Do Not Make This Easy

The problem is that Claude Code, Codex, and ChatGPT-style workflows do not all speak the same protocol.

Claude Code expects Anthropic-shaped requests.

Codex expects OpenAI-shaped requests.

Other tools may expect Gemini-style routes or their own local configuration. So even when DeepSeek exposes low-cost models, the practical setup can still turn into a mess of environment variables, API keys, base URLs, and wrappers.

That is the gap I built CliGate to fill.

What Changed With CliGate

CliGate is a local AI gateway that runs on localhost. Instead of pointing every tool directly at a provider, I point the tools at CliGate once:

# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:8081
export ANTHROPIC_API_KEY=any-key

Codex can also point at the same local gateway through its OpenAI-compatible configuration.

From there, CliGate handles the important layer:

route Claude Code, Codex CLI, Gemini CLI, and web chat through one local control plane
keep account pools and API keys in the same routing layer
map model names and app-level routes
send routine traffic to DeepSeek when cost matters
keep premium models available for the tasks that actually need them
show usage, request logs, and cost views in the dashboard

That means I do not have to decide "Claude Code or DeepSeek" as a tool choice. I can keep Claude Code as the interface and route some of its traffic through DeepSeek. I can keep Codex as the workflow and still move compatible requests to a cheaper upstream.

The Real Advantage Is Not Just Cheap Tokens

Cheap tokens help. But the bigger advantage is optionality.

I want to be able to say:

use DeepSeek V4 Flash for cheap routine work
use DeepSeek V4 Pro when I want stronger low-cost reasoning
keep Claude for difficult multi-file edits
keep GPT for workflows where OpenAI's stack is the right fit
keep local models for private or offline tasks

Without a routing layer, that sounds like a spreadsheet and a pile of config files. With a local gateway, it becomes an operations problem: add keys, set routing, inspect usage, adjust when the bill or quality tells you to.

That is the product advantage I care about. CliGate does not ask me to abandon Claude Code or ChatGPT-style tools. It lets those tools reach low-cost DeepSeek models without changing how I work.

My New Default

After this price cut, my default is no longer "pick one premium coding assistant and pay whatever it costs."

It is:

keep the coding tools I like
route routine traffic to the cheapest good-enough model
reserve expensive models for the tasks that justify them
watch usage and pricing in one place

That feels like the right shape for AI coding in 2026.

The models will keep changing. The prices will definitely keep changing. The part I do not want to keep changing is every CLI config on my machine.

CliGate is here if you want to inspect the implementation: https://github.com/codeking-ai/cligate

How are you handling model cost now: one subscription, direct API usage, or routing per task?

推荐订阅源

DEV Community