AI API Pricing Calculator: Compare Costs Across Providers


mlongo · 2026-05-26 · via Hacker News: Show HN

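Calculator inputs: daily requests, average input tokens, average output tokens
Pricing mode: Standard, Batch (50% off), Cached input (90% off input)
Results table (sortable): Provider | Model | Input $/1M | Output $/1M | Daily | Monthly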

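Pricing updated May 25, 2026. Prices come from each provider's official documentation where available; otherwise from the public OpenRouter model list. Always check your provider's pricing page before committing to spend. Batch = 50% off standard rates (applies to most providers). Cached = 90% off input token cost (prompt caching). Actual discounts vary by provider. Meta/Llama pricing reflects the rates charged by major cloud hosts (AWS Bedrock, Together AI).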

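How to Estimate Your AI API Budget

AI API costs depend on three main factors: the number of requests you make, the amount of text in each request (input tokens), and the amount of content the model generates (output tokens). This calculator multiplies your usage pattern by each major model's pricing to show the full cost picture.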
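The multiplication described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names, the 30-day month, and the example prices ($1 input and $5 output per 1M tokens) are assumptions made for the example.

```python
def daily_cost(requests_per_day, input_tokens, output_tokens,
               input_price_per_m, output_price_per_m):
    """Estimated daily spend in dollars for one model."""
    input_cost = requests_per_day * input_tokens * input_price_per_m / 1_000_000
    output_cost = requests_per_day * output_tokens * output_price_per_m / 1_000_000
    return input_cost + output_cost


def monthly_cost(daily, days_per_month=30):
    # Assumption: a 30-day month. The calculator's own convention is not stated.
    return daily * days_per_month


# Hypothetical prices, not taken from any provider's price list.
daily = daily_cost(1_000, 500, 200, input_price_per_m=1.00, output_price_per_m=5.00)
print(f"daily: ${daily:.2f}, monthly: ${monthly_cost(daily):.2f}")
```

Running the same function over a list of (provider, model, prices) rows and sorting by the result would give a comparison table like the one the calculator shows.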

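Understanding the Inputs

Daily requests: how many API calls your application makes per day. A chatbot might handle 500-5,000 conversations a day. A batch pipeline might run 10,000-100,000 calls a day.
Average input tokens: the typical size of your prompt. A simple question is ~50 tokens. A prompt with context and instructions is 200-500. RAG with document chunks can be 2,000-8,000+.
Average output tokens: how much the model generates per request. A short answer is ~50 tokens. A paragraph is 100-200. An essay or code generation is 500-2,000+.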
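To get a feel for how these three inputs combine, the sketch below turns a few workload profiles into daily token volumes. The `Workload` class and the specific numbers are hypothetical, chosen from inside the ranges quoted above.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    requests_per_day: int
    avg_input_tokens: int
    avg_output_tokens: int

    def daily_tokens(self):
        """Return (input, output) tokens processed per day."""
        return (self.requests_per_day * self.avg_input_tokens,
                self.requests_per_day * self.avg_output_tokens)


# Illustrative profiles picked from inside the ranges quoted above.
workloads = [
    Workload("chatbot", 2_000, 300, 150),
    Workload("rag_search", 2_000, 4_000, 150),
    Workload("batch_pipeline", 50_000, 300, 50),
]

for w in workloads:
    tokens_in, tokens_out = w.daily_tokens()
    print(f"{w.name}: {tokens_in:,} input, {tokens_out:,} output tokens/day")
```

Note how the RAG profile has the same request count as the chatbot but more than ten times the input volume, which is why average input size matters as much as traffic.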

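Save with Batch Processing

Most providers offer batch processing at 50% off standard rates. Instead of getting a response in real time, you submit requests in bulk and receive results within 24 hours. Good for: data labeling, content generation, document processing, and any workflow that is not latency-sensitive.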
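As a rough sketch of the saving, the function below applies the 50% figure to whatever share of your spend can tolerate delayed results. The name `batch_cost` and the `batch_share` parameter are made up for this example, and the discount should be checked against your provider's pricing page.

```python
BATCH_DISCOUNT = 0.50  # the "most providers" figure from the text; verify per provider


def batch_cost(standard_cost, batch_share=1.0, discount=BATCH_DISCOUNT):
    """Cost when `batch_share` of the spend moves to batch processing."""
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be between 0 and 1")
    realtime = standard_cost * (1.0 - batch_share)
    batched = standard_cost * batch_share * (1.0 - discount)
    return realtime + batched


all_batched = batch_cost(100.0)        # every request can wait for results
partly_batched = batch_cost(100.0, 0.4)  # only 40% of the spend is latency-tolerant
print(f"all batched: ${all_batched:.2f}, 40% batched: ${partly_batched:.2f}")
```

In practice the share that can be batched is the number to estimate first, since user-facing traffic usually has to stay on the standard rate.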

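Save with Prompt Caching

Prompt caching (available on Anthropic, OpenAI, and Google) stores your system prompt and reuses it across requests. Cached input tokens cost roughly 90% less than uncached ones. It is most effective when you have a large, static system prompt (instructions, examples, documents) that stays the same across many requests.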
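The effect on the input side of the bill can be estimated as follows. This is a simplified sketch: it assumes every request hits the cache and ignores cache-write charges and cache expiry, which providers handle differently. The function name, the workload, and the $1 per 1M input price are hypothetical.

```python
CACHE_DISCOUNT = 0.90  # approximate figure from the text; varies by provider


def input_cost_with_cache(requests, cached_tokens, fresh_tokens,
                          input_price_per_m, discount=CACHE_DISCOUNT):
    """Input-side cost when `cached_tokens` of each prompt are read from cache.

    Simplification: every request is a cache hit, and cache writes and
    expiry are ignored.
    """
    cached_price = input_price_per_m * (1.0 - discount)
    per_request = (cached_tokens * cached_price
                   + fresh_tokens * input_price_per_m) / 1_000_000
    return requests * per_request


# Hypothetical workload: 3,000-token static system prompt, 200-token user message.
without = input_cost_with_cache(10_000, 0, 3_200, input_price_per_m=1.00)
with_cache = input_cost_with_cache(10_000, 3_000, 200, input_price_per_m=1.00)
print(f"uncached: ${without:.2f}/day, cached: ${with_cache:.2f}/day")
```

The larger the static portion of the prompt relative to the per-request portion, the closer the input bill gets to the full discount.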

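Input vs. Output Token Pricing

Output tokens typically cost 3-6x more than input tokens, because generating text takes more compute than reading it. When optimizing cost, reducing output length (shorter replies, structured output formats) usually has a bigger impact than reducing input length.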
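A quick comparison shows why trimming output tends to pay off more. The sketch assumes a hypothetical model priced at $1 input and $5 output per 1M tokens (a 5x ratio, inside the range above) and a request with 500 tokens on each side.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of a single request."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical model priced at $1 input / $5 output per 1M tokens (a 5x ratio).
IN_PRICE, OUT_PRICE = 1.00, 5.00

base = request_cost(500, 500, IN_PRICE, OUT_PRICE)
trim_input = request_cost(400, 500, IN_PRICE, OUT_PRICE)   # 100 fewer input tokens
trim_output = request_cost(500, 400, IN_PRICE, OUT_PRICE)  # 100 fewer output tokens

print(f"saving from trimming input:  {1 - trim_input / base:.1%}")
print(f"saving from trimming output: {1 - trim_output / base:.1%}")
```

The balance shifts for input-heavy workloads such as RAG, where prompts can be tens of times longer than replies, so it is worth running the comparison with your own token counts.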

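Hidden Costs to Watch

Reasoning tokens: OpenAI's o-series and some other reasoning models bill internal "thinking" tokens as output, which can raise costs 3-10x.
Long-context surcharge: Google Gemini charges 2x for prompts over 200K tokens.
Tool use/function calling: tool definitions count as input tokens and accumulate across requests.
Retries and errors: failed requests that returned a partial response may still be billed.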
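These four items can be folded into a per-request estimate, as in the sketch below. Every parameter name here is hypothetical, and the billing rules are simplified assumptions: the long-context multiplier is applied to the input price only, and every retry is treated as billed in full. Real rules differ by provider.

```python
def effective_request_cost(input_tokens, output_tokens, input_price_per_m,
                           output_price_per_m, reasoning_tokens=0,
                           tool_definition_tokens=0, long_context_threshold=None,
                           long_context_multiplier=2.0, retry_rate=0.0):
    """Per-request cost including the hidden costs listed above.

    All parameters are illustrative; real billing rules differ by provider.
    """
    billed_input = input_tokens + tool_definition_tokens   # tool schemas count as input
    billed_output = output_tokens + reasoning_tokens       # "thinking" billed as output
    input_price = input_price_per_m
    if long_context_threshold is not None and billed_input > long_context_threshold:
        # Assumption: the surcharge multiplies the input price only.
        input_price *= long_context_multiplier
    cost = (billed_input * input_price
            + billed_output * output_price_per_m) / 1_000_000
    # Pessimistic assumption: every retried request is billed in full.
    return cost * (1.0 + retry_rate)


# Hypothetical prices: $1 input / $5 output per 1M tokens.
plain = effective_request_cost(1_000, 500, 1.00, 5.00)
with_reasoning = effective_request_cost(1_000, 500, 1.00, 5.00, reasoning_tokens=2_000)
print(f"plain: ${plain:.4f}, with reasoning tokens: ${with_reasoning:.4f}")
```

Comparing the plain and adjusted figures for your own workload gives a sense of how far the headline per-token price can drift from the actual bill.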
