AI API Pricing in 2026: What You Actually Pay for GPT-5.5, Claude Opus, Gemini, and 20+ Models
NeverKnowsBe · 2026-05-24 · via DEV Community

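A prompt that costs $30 on GPT-5.5 costs just $0.28 on DeepSeek V4 Flash. That is roughly a 100x difference, and it is real.

If you are building on AI APIs, the 2026 pricing landscape is chaotic: four major providers, more than 20 models, and pricing tiers that span cache reads, cache writes, batch discounts, promotional rates, and hidden thresholds. I built a token cost calculator to make sense of it. This is the pricing data behind it.

All prices are per million tokens (MTok), in USD, taken from official documentation as of May 2026.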


All Models, Ranked by Price

The full picture: roughly 20 models, cheapest to most expensive, sorted by input price:

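| # | Model | Provider | Input | Output | Ratio |
|---|-------|----------|-------|--------|-------|
| 1 | Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | 4x |
| 2 | DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 2x |
| 3 | GPT-5.4 Nano | OpenAI | $0.20 | $1.25 | 6.3x |
| 4 | Gemini 2.5 Flash | Google | $0.30 | $2.50 | 8.3x |
| 5 | DeepSeek V4 Pro* | DeepSeek | $0.435 | $0.87 | 2x |
| 6 | Gemini 3 Flash | Google | $0.50 | $3.00 | 6x |
| 7 | GPT-5.4 Mini | OpenAI | $0.75 | $4.50 | 6x |
| 8 | Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 5x |
| 9 | Gemini 2.5 Pro | Google | $1.25 | $10.00 | 8x |
| 10 | Gemini 3.1 Pro | Google | $2.00 | $12.00 | 6x |
| 11 | GPT-5.4 | OpenAI | $2.50 | $15.00 | 6x |
| 12 | Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 5x |
| 13 | Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 5x |
| 14 | Gemini 3 Deep Think | Google | $4.00 | $24.00 | 6x |
| 15 | GPT-5.5 | OpenAI | $5.00 | $30.00 | 6x |
| 16 | Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 5x |
| 17 | GPT-5.4 Pro | OpenAI | $21.00 | $168.00 | 8x |
| 18 | GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | 6x |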

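* DeepSeek V4 Pro: the price shown is a promotional rate (75% discount) valid through May 31, 2026.

The Ratio column is the output price divided by the input price. DeepSeek's 2x ratio means its output tokens carry a much smaller markup over input tokens than the 4x to 8x charged elsewhere. That matters a great deal if your application generates long outputs.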
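To make the effect of the ratio concrete, here is a minimal sketch that prices a single request from the table's figures. The `PRICES` dictionary, the function names, and the 1,000-in / 4,000-out workload are illustrative assumptions, not part of the calculator described in this article.

```python
# USD per million tokens as (input, output), copied from the article's table.
PRICES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "DeepSeek V4 Flash": (0.14, 0.28),
    "Claude Haiku 4.5": (1.00, 5.00),
}


def output_ratio(input_price: float, output_price: float) -> float:
    """How many times more an output token costs than an input token."""
    return output_price / input_price


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical output-heavy request: 1,000 tokens in, 4,000 tokens out.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 1_000, 4_000):.6f}")
```

On this output-heavy workload DeepSeek V4 Flash comes out cheaper than Gemini 2.5 Flash-Lite even though its input price is higher; flip the workload to 4,000 in and 1,000 out and the order reverses.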


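Head-to-Head: Same Tier, Different Price

Frontier Models (Best Capability)

| Model | Input | Output | Monthly (10K requests/day)* |
|-------|-------|--------|-----------------------------|
| Gemini 3.1 Pro | $2.00 | $12.00 | $3,900 |
| Claude Opus 4.7 | $5.00 | $25.00 | $6,375 |
| GPT-5.5 | $5.00 | $30.00 | $7,500 |

* Assumes 5,000 input tokens and 500 output tokens per request.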

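Gemini 3.1 Pro's input price is less than half of GPT-5.5's. For prompts longer than 200K tokens, however, its price doubles. This is a hidden cost that catches people off guard.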
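A short sketch of how such a threshold changes the bill. It assumes the doubled rate applies to the whole prompt once it passes 200K tokens, rather than only to the tokens beyond the threshold; the article does not say which, so check the provider's pricing page before relying on it. The constant and function names are made up for illustration.

```python
GEMINI_31_PRO_INPUT = 2.00       # USD per million input tokens, from the table
LONG_PROMPT_THRESHOLD = 200_000  # tokens; above this the article says the price doubles
LONG_PROMPT_MULTIPLIER = 2.0


def input_cost(prompt_tokens: int) -> float:
    """Input cost in USD for one prompt.

    Assumption: once the prompt exceeds the threshold, the doubled rate
    is charged on every token of that prompt.
    """
    rate = GEMINI_31_PRO_INPUT
    if prompt_tokens > LONG_PROMPT_THRESHOLD:
        rate *= LONG_PROMPT_MULTIPLIER
    return prompt_tokens / 1_000_000 * rate


print(f"200,000 tokens: ${input_cost(200_000):.2f}")
print(f"200,001 tokens: ${input_cost(200_001):.2f}")
```

Under that assumption, a single extra token at the boundary doubles the input bill for the request, which is why long-context workloads deserve a separate cost estimate.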

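Mid-Tier (Best Balance)

| Model | Input | Output | Monthly (10K requests/day) |
|-------|-------|--------|----------------------------|
| Gemini 2.5 Pro | $1.25 | $10.00 | $3,375 |
| GPT-5.4 | $2.50 | $15.00 | $6,000 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $6,375 |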
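The monthly column can be approximated with a few lines of arithmetic. This sketch uses the per-request workload stated under the frontier table (5,000 input and 500 output tokens) at 10K requests per day, and assumes a 30-day month, which the article does not state. With those inputs it reproduces the Gemini 2.5 Pro and GPT-5.4 figures; other rows may reflect workload assumptions the article does not spell out.

```python
def monthly_cost(
    input_price: float,
    output_price: float,
    input_tokens: int = 5_000,
    output_tokens: int = 500,
    requests_per_day: int = 10_000,
    days: int = 30,  # assumption: the article does not state the month length
) -> float:
    """Monthly cost in USD, with prices given in USD per million tokens."""
    tokens_cost = input_tokens * input_price + output_tokens * output_price
    return tokens_cost * requests_per_day * days / 1_000_000


print(f"Gemini 2.5 Pro: ${monthly_cost(1.25, 10.00):,.0f}")
print(f"GPT-5.4:        ${monthly_cost(2.50, 15.00):,.0f}")
```

Swapping in your own token counts and request volume is usually more informative than the headline price, since the input/output mix moves the total considerably.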

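Budget (Cheapest)

| Model | Input | Output | Monthly (10K requests/day) |
|-------|-------|--------|----------------------------|
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | $30 |
| DeepSeek V4 Flash | $0.14 | $0.28 | $71 |
| GPT-5.4 Nano | $0.20 | $1.25 | $75 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $300 |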

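Caching: The Biggest Cost Lever

If your application repeatedly sends the same system prompt or tool definitions, caching matters more than base pricing. Every provider discounts cached tokens by about 90%, except DeepSeek, which discounts them by 98 to 99%.

The catch: Anthropic charges a 25% premium on cache writes. The first time Opus processes a prefix, you pay $6.25/MTok rather than $5. Caching therefore only saves money when the same prefix is sent again within the cache TTL window. At a 90% read discount, one write plus one cached read (1.25x + 0.10x of the base price) already undercuts two uncached sends (2x), so the premium is recovered from the second send onward, provided that send lands inside the TTL. OpenAI and Google charge no such premium and simply give you the discount.

For a detailed breakdown, see How to Save 90% on AI API Costs with Prompt Caching.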
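The write premium and read discount can be checked with a small break-even sketch. It models the "about 90%" saving as exactly 0.10x of the base price and assumes every send after the first hits the cache inside the TTL window; both are simplifications, and the names below are illustrative only.

```python
from typing import Optional

BASE_INPUT = 5.00        # USD/MTok, Claude Opus 4.7 input price from the article
WRITE_MULTIPLIER = 1.25  # 25% cache-write premium, i.e. $6.25/MTok
READ_MULTIPLIER = 0.10   # assumption: "about 90%" savings modeled as exactly 0.10x


def cost_without_cache(sends: int, prefix_mtok: float = 1.0) -> float:
    """Cost of sending the same prefix `sends` times with no caching."""
    return sends * prefix_mtok * BASE_INPUT


def cost_with_cache(sends: int, prefix_mtok: float = 1.0) -> float:
    """One cache write, then cached reads (assumes every later send is a hit)."""
    if sends == 0:
        return 0.0
    write = prefix_mtok * BASE_INPUT * WRITE_MULTIPLIER
    reads = (sends - 1) * prefix_mtok * BASE_INPUT * READ_MULTIPLIER
    return write + reads


def break_even_sends(max_sends: int = 100) -> Optional[int]:
    """Smallest number of sends at which caching is strictly cheaper."""
    for n in range(1, max_sends + 1):
        if cost_with_cache(n) < cost_without_cache(n):
            return n
    return None


for n in (1, 2, 3, 10):
    print(n, cost_without_cache(n), round(cost_with_cache(n), 2))
```

Under these simplified assumptions the cached path is already cheaper from the second send. Real workloads with cache misses or expired TTLs pay the write premium more than once and need more reuse to come out ahead.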


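When to Go Cheap (and When Not To)

Use a budget model when:

  • The task is well defined (extraction, classification, summarization)
  • You need high throughput
  • Output quality has a "good enough" bar
  • You are building a pipeline where a cheap model can handle 90% of cases

Stick with a frontier model when:

  • The task requires multi-step reasoning
  • Accuracy is critical and mistakes are costly
  • You need clean, well-structured code generation
  • The model is your product, not just a tool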

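The smartest architectures route 90% of traffic to a cheap ($0.10/M) model and reserve the expensive ($5.00/M) model for the 10% of requests that truly need it.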
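A rough sketch of what that split looks like in code and in dollars. The task categories come from the lists above; the `pick_tier` router, the tier labels, and the function names are hypothetical, and a production router would need a better signal than a task label.

```python
# Task types the article lists as safe for budget models.
BUDGET_TASKS = {"extraction", "classification", "summarization"}


def pick_tier(task_type: str) -> str:
    """Hypothetical router: well-defined tasks go to the budget tier."""
    return "budget" if task_type in BUDGET_TASKS else "frontier"


def blended_price(cheap_price: float, premium_price: float, cheap_share: float) -> float:
    """Average price per million tokens when `cheap_share` of traffic is routed cheap."""
    return cheap_share * cheap_price + (1 - cheap_share) * premium_price


blended = blended_price(0.10, 5.00, 0.9)
print(f"Blended input price: ${blended:.2f}/M")
print(f"Saving vs. all-frontier: {1 - blended / 5.00:.0%}")
```

With the article's 90/10 split, the blended input price works out to about $0.59 per million tokens, roughly 88% below sending everything to the $5.00 model.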


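The Bottom Line

AI API prices have collapsed. The cheapest and most expensive models differ by 300x on input and 450x on output. What matters is matching the model to the task. Don't pay GPT-5.5 prices to classify emails. Don't use Flash-Lite to write complex code. Use caching, pick the right tier, and your API bill goes from a major line item to a rounding error.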
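The two multiples follow directly from the first and last rows of the ranking table (Gemini 2.5 Flash-Lite and GPT-5.5 Pro). A quick check, with dictionary names chosen only for illustration:

```python
# USD per million tokens, from the first and last rows of the ranking table.
cheapest = {"input": 0.10, "output": 0.40}     # Gemini 2.5 Flash-Lite
priciest = {"input": 30.00, "output": 180.00}  # GPT-5.5 Pro


def spread(kind: str) -> float:
    """Price multiple between the most and least expensive model."""
    return priciest[kind] / cheapest[kind]


print(f"Input spread:  {spread('input'):.0f}x")
print(f"Output spread: {spread('output'):.0f}x")
```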


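The full pricing table for 20+ models, with cache read/write tiers, batch pricing, and provider-specific notes: Full API Pricing Comparison

I built tokencostcalc.com, a free token cost calculator. No ads, no affiliate links, no tracking. Just pick a model, enter your token usage, and see the real price.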