惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

N
News and Events Feed by Topic
Malwarebytes
Malwarebytes
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
C
Cybersecurity and Infrastructure Security Agency CISA
F
Future of Privacy Forum
C
Cisco Blogs
T
The Exploit Database - CXSecurity.com
A
Arctic Wolf
S
Securelist
K
Kaspersky official blog
S
Schneier on Security
T
ThreatConnect
T
Tenable Blog
Spread Privacy
Spread Privacy
T
True Tiger Recordings
AWS News Blog
AWS News Blog
F
Fox-IT International blog
量子位
T
Threatpost
V
Vulnerabilities – Threatpost
C
CERT Recently Published Vulnerability Notes
Cisco Talos Blog
Cisco Talos Blog
GbyAI
GbyAI
宝玉的分享
宝玉的分享
腾讯CDC
G
Google Developers Blog
aimingoo的专栏
aimingoo的专栏
Cyberwarzone
Cyberwarzone
有赞技术团队
有赞技术团队
S
SegmentFault 最新的问题
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
V
Visual Studio Blog
U
Unit 42
雷峰网
雷峰网
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
Simon Willison's Weblog
Simon Willison's Weblog
O
OpenAI News
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
The GitHub Blog
The GitHub Blog
The Register - Security
The Register - Security
MyScale Blog
MyScale Blog
小众软件
小众软件
A
About on SuperTechFans
Last Week in AI
Last Week in AI
Y
Y Combinator Blog
博客园 - 三生石上(FineUI控件)
美团技术团队
Google Online Security Blog
Google Online Security Blog
P
Proofpoint News Feed
MongoDB | Blog
MongoDB | Blog

DEV Community

Creando un Tetris con JavaScript VI: Complicando el juego. [Boost] Perl 🐪 Weekly #774 - Perl is too HOT How to Track AI Usage Without Losing Revenue (Complete Guide) 77 Rules Later: What Graduating Our First Stack Actually Looked Like RAG 시스템 실전 구축 (v26) When Premature Scaling Leads to Operator Burnout Multi-Repo Microservice Changes Are a Coordination Problem. I Solved It With AI Agent Teams. The Next Frontier: How Multi-Agent Systems are Redefining Productivity The Kimwolf Bust Just Outed Android Webcams as Botnet Fodder — Here's the Question Every Repurposed-Phone Camera Setup Has to Answer I'm an autonomous AI agent. I shipped 18 fixes to myself in one session. Building a Secure Future with Zero Trust Security Architecture Asynchronous Functions in Dart How I migrated magic-link login from Resend to AWS SES + Lambda five days before launch Edge Computing He creado una empresa ficticia IT/OT para poder encontrar sus vulnerabilidades y reforzar su seguridad en sus activos críticos Why I Built @editora/react I built a tiny UGC script generator because hooks are the hardest part The Phone Is Becoming the New Terminal Why Most AI Music Tools Feel Wrong to Developers Goroutines vs. Promises: Why Go and JavaScript Look at Concurrency Completely Differently How I Use Antigravity 2.0 to Navigate Open-Source Codebases and Make Better Technical Decisions Understanding Basic HTML & CSS Concepts for Beginners Go Error Handling: Annoying or Awesome? Your To-Do List Doesn't Know You — So I Gave Mine Three Brains Shell Basics (Bash, Zsh, Sh) Free MongoDB GUI Tool for Developers, Students, and Teams Designing High-Performance Blockchain Indexers Choosing Models for an Agentic Chat App on Amazon Bedrock How Smart Growth Teams Automate Their Marketing Stack in 2026 (Without Hiring More People) What I Learned About Memory-Augmented AI Agents Seven Docker Tips Every Engineer Should Know (from Docker Captains) Welcome to the Fast-Food Era of Testing: Over-Weight by Tests How to use Claude in vscode? Prompt Engineering for Automated Evaluation: Making LLMs the Judge in AI Builder Solutions Full Stack Projects Are Not Enough Anymore Virtualization & Cloud Basics Orakle: Turning Raw Blockchain Data into Intelligence with Gemma 4 Building an Autoposting Pipeline with Hermes Agent: Why Waterfall Beats Parallel, and the Edge Cases Nobody Talks About OpenShift Virtualization Migration Advisor — Local-First, Powered by Gemma 4 26B MoE WebMCP is coming — so I’m building webmcp.js I Disappeared for 4 Months After Launch - Here's What Brought Me Back Jira Is Turing-Complete (And You've Been Coding in It) NyayAI: Building an AI Legal Assistant for 1.4 Billion People — A Technical Deep Dive E-commerce Order Automation: Stripe + Invoice + Shipping Workflow How to Evaluate AI Agents: LLM-as-Judge Tutorial The Interview Prep Stack I Used as a Senior Software Engineer Targeting Big Tech Gemma4 Challenge OptiLearn - Powered by Google Gemma 4 Aura — The Gemma 4 Powered Agentic Web Copilot & Self-Healing Accessibility Engine I built a tool that catches misleading charts using Gemma 4 running locally Worklog companion with Gemma4 GBase: Building LLM Agents That Actually Learn from Their Mistakes Blossom — a small step toward student mental wellbeing WordPress Performance Monitoring: A Complete Guide Principal Components in TypeScript (Part 4) When three sharp wallets agree: what consensus signals on Polymarket actually mean I Built a Fail-Fast Rust Scheduler with Background OAuth Auto-Refresh (Part 2) Sharing is caring How Putting Faces (Literally) to My AI Garden Images Gave It a Personality Sofi Log #001: Thailand's Tourism Tax & the 180-Day AI Surveillance Wall Sofi Log #006: Decentralized IP-Address Obfuscation Specs Sofi Log #008: Bypassing Legacy Cross-Border Bank Fee Traps Secret Rotation Automation: The Operational Cost of Security Sofi Log #009: Portable Identity & DID Passport Framework Sofi Log #011: Autonomous Smart Treasury Repatriation Specs History of Linux & Unix I asked Claude if my plan was on track for the goal — and got an honest 'No' PHPStan 'expects X, Y given' — the trace it doesn't give you Using Gemma4 2B to Assist Community Health Workers Open-source Playwright wrapper that passes bot.sannysoft.com, pixelscan, and CreepJS in headless mode Policy Storyteller: Turning Nepali Bills into Human Stories with Gemma 4 Avoid Cross Module Dependencies with Dependency Cruiser Invariant-Driven Architecture: 20M transactions on a €80/mo Cloud VM. Stop using external npm packages just to generate a UUID v4 Choosing the Right Gemma 4 Model Matters More Than Choosing the Best One Your LLM Is Not an Agent. Your Framework Is Not Enough. You Need a Harness. From HTTPS to UCP: Shopping Is About to Stop Being Your Problem From Creation to Consumption: How Antigravity 2.0 and Gemini Spark Are Defining the Agentic Era 10 Mistakes I Wish I Knew Before Taking the CKA Exam AI That Actually Does Stuff: Autonomous Agents Explained Exploring AI workflow Orchestration: Comparing Weft, Python & Alternative Pipeline Approaches El Poder del Aprendizaje Federado: Cuando los Algoritmos Distribuidos Entrenan a la IA Email Marketing Automation in 2026: 5 Tools (and 1 Self-Hosted) Through Their APIs A Replay Runbook For Missed Publishing Windows Why timeout handling matters more than most backend logic How I Make $6,800/Month Selling Niche VS Code Extensions Model Routing Cost Checklist: Hosted APIs, Open Models, Or Self-Hosted Inference? ORA-00207 오류 원인과 해결 방법 완벽 가이드 Deno 2.8 Operator Upgrade Checklist: CI, Lockfiles, Node Compatibility, And Rollback AI-Discovered Vulnerabilities Need A Triage Queue, Not A Panic Channel AI Agent Workboards Need Audit Controls Before They Need More Agents Demystifying DevRel: What It Actually Is (And Why Should You Become One?) Your AI, Your Device, Your Data - Introducing Aide Gemma 4 GenAI Coach - GenAI Concepts Made Easy with an Interactive Playground QuietPulse - Mood Tracker Principal Components in TypeScript (Part 3) The pgAudit Attribution Gap: Why Role-Level Logging Fails GDPR and How to Close It Gemma 4 CAD Orchestrator I built a local Postgres triage co-pilot because HIPAA says I can't paste plans into ChatGPT or Claude
DeepSeek's API Price Cut Changed My Claude Code and ChatGPT Math
CodeKing · 2026-05-25 · via DEV Community

The DeepSeek API price cut made me rethink a habit I had quietly accepted: choosing an AI coding tool and then living with whatever model economics came with it.

Claude Code is great when I want a strong terminal-native coding agent. ChatGPT and Codex are great when I want OpenAI's workflow and model stack. But when a provider like DeepSeek suddenly drops API pricing, the obvious question is not just "is this cheap?"

It is: can I actually use the cheaper model from the tools I already use?

The Price Cut Is The Interesting Part

As of May 25, 2026, DeepSeek's pricing page lists V4 Flash at:

  • $0.14 per 1M input tokens
  • $0.0028 per 1M cached input tokens
  • $0.28 per 1M output tokens

It also lists V4 Pro at the 75% discounted rate, with a note that after the promotion ends on May 31, 2026, the API price will still be officially adjusted to one-quarter of the original price:

  • $0.435 per 1M input tokens
  • $0.003625 per 1M cached input tokens
  • $0.87 per 1M output tokens

The part that matters for coding agents is cached input. Coding tools resend a lot of repeated context: system prompts, repo summaries, conversation history, tool schemas, and task state. If cache hits are cheap enough, repeated agent loops start looking very different economically.

I checked the current public pricing pages before writing this: DeepSeek API pricing, Claude plans, Claude API models, ChatGPT plans, and OpenAI API pricing.

That is why this cut is more than a nice model announcement. It changes where I want routine coding traffic to go.

The Comparison I Actually Care About

Claude Code pricing is predictable if you use a subscription: Claude Pro is $20/month when billed monthly, and Max starts at $100/month. On the API side, Anthropic lists Claude Opus 4.7 at $5 input and $25 output per 1M tokens, and Sonnet 4.6 at $3 input and $15 output.

ChatGPT has the same split. Plus is the familiar $20/month plan, Pro tiers go much higher, and OpenAI API pricing for flagship GPT models is still priced like premium infrastructure. GPT-5.5 is listed at $5 input, $0.50 cached input, and $30 output per 1M tokens.

Those plans can be worth it. I am not pretending DeepSeek replaces every hard reasoning workload.

But for coding-agent traffic, the uncomfortable truth is that a lot of tokens are not "hard reasoning" tokens. They are:

  • reading files
  • rewriting boilerplate
  • producing test scaffolds
  • formatting docs
  • classifying intent
  • continuing a known task

That is exactly the kind of traffic I want to route to a cheaper model first.

The Annoying Part: Tools Do Not Make This Easy

The problem is that Claude Code, Codex, and ChatGPT-style workflows do not all speak the same protocol.

Claude Code expects Anthropic-shaped requests.

Codex expects OpenAI-shaped requests.

Other tools may expect Gemini-style routes or their own local configuration. So even when DeepSeek exposes low-cost models, the practical setup can still turn into a mess of environment variables, API keys, base URLs, and wrappers.

That is the gap I built CliGate to fill.

What Changed With CliGate

CliGate is a local AI gateway that runs on localhost. Instead of pointing every tool directly at a provider, I point the tools at CliGate once:

# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:8081
export ANTHROPIC_API_KEY=any-key

Enter fullscreen mode Exit fullscreen mode

Codex can also point at the same local gateway through its OpenAI-compatible configuration.

From there, CliGate handles the important layer:

  • route Claude Code, Codex CLI, Gemini CLI, and web chat through one local control plane
  • keep account pools and API keys in the same routing layer
  • map model names and app-level routes
  • send routine traffic to DeepSeek when cost matters
  • keep premium models available for the tasks that actually need them
  • show usage, request logs, and cost views in the dashboard

That means I do not have to decide "Claude Code or DeepSeek" as a tool choice. I can keep Claude Code as the interface and route some of its traffic through DeepSeek. I can keep Codex as the workflow and still move compatible requests to a cheaper upstream.

The Real Advantage Is Not Just Cheap Tokens

Cheap tokens help. But the bigger advantage is optionality.

I want to be able to say:

  • use DeepSeek V4 Flash for cheap routine work
  • use DeepSeek V4 Pro when I want stronger low-cost reasoning
  • keep Claude for difficult multi-file edits
  • keep GPT for workflows where OpenAI's stack is the right fit
  • keep local models for private or offline tasks

Without a routing layer, that sounds like a spreadsheet and a pile of config files. With a local gateway, it becomes an operations problem: add keys, set routing, inspect usage, adjust when the bill or quality tells you to.

That is the product advantage I care about. CliGate does not ask me to abandon Claude Code or ChatGPT-style tools. It lets those tools reach low-cost DeepSeek models without changing how I work.

My New Default

After this price cut, my default is no longer "pick one premium coding assistant and pay whatever it costs."

It is:

  1. keep the coding tools I like
  2. route routine traffic to the cheapest good-enough model
  3. reserve expensive models for the tasks that justify them
  4. watch usage and pricing in one place

That feels like the right shape for AI coding in 2026.

The models will keep changing. The prices will definitely keep changing. The part I do not want to keep changing is every CLI config on my machine.

CliGate is here if you want to inspect the implementation: https://github.com/codeking-ai/cligate

How are you handling model cost now: one subscription, direct API usage, or routing per task?