These Are The 10 Cheapest AI Models In The World [June 2026]
OfficeChai Team · 2026-06-17 · via OfficeChai

Pricing data sourced from Artificial Analysis. Blended price uses a 7:2:1 cache-hit/input/output token ratio. Lower is better.
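To make the 7:2:1 ratio concrete, here is a minimal sketch of how such a blended price could be computed as a weighted average. The function name `blended_price` is hypothetical, and the cache-hit price in the example is an assumption chosen for illustration (10% of the input price); only the $0.75/$4.50 input/output prices for GPT-5.4 Mini come from this article.

```python
def blended_price(cache_hit: float, input_: float, output: float,
                  weights: tuple = (7, 2, 1)) -> float:
    """Weighted average price in USD per 1M tokens.

    weights are the cache-hit : input : output token shares (7:2:1 here).
    """
    w_cache, w_in, w_out = weights
    total = w_cache + w_in + w_out
    return (w_cache * cache_hit + w_in * input_ + w_out * output) / total


# GPT-5.4 Mini list prices quoted in this article: $0.75 input, $4.50 output.
# ASSUMPTION: the $0.075 cache-hit price is hypothetical, not a quoted figure.
gpt54_mini = blended_price(cache_hit=0.075, input_=0.75, output=4.50)
print(f"${gpt54_mini:.2f} per 1M blended tokens")
```

Because cache hits carry 70% of the weight, the cache-hit price matters more to the blended figure than the headline input price does.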

The AI pricing war is well and truly over, and developers won. A wave of open-weight Chinese models, aggressive API pricing from xAI, and OpenAI’s own budget tiers have collapsed token costs by an order of magnitude in under two years. If you’re still paying frontier prices for workhorse tasks, you’re leaving serious money on the table. Here’s a look at the 10 cheapest AI models available right now, ranked by blended price, and why the models on this list aren’t just cheap but genuinely capable.

1. DeepSeek V4 Flash (Max) — $0.06/1M tokens

DeepSeek V4 Flash is the undisputed price champion right now, and any conversation about the cheapest AI models starts here. It is a 284B-parameter Mixture-of-Experts model that activates just 13B parameters per token, which is how DeepSeek keeps inference costs so low. Released on April 24, 2026, it supports a 1 million token context window and dual Thinking/Non-Thinking modes. At a blended rate of $0.06 per million tokens, it is the cheapest model on this list, and it benchmarks comparably to what would have been a frontier closed-source model as recently as mid-2025. For output-heavy workloads like document generation, V4 Flash is the budget pick of the year.
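The link between sparse activation and low cost can be illustrated with a rough calculation of how much of each model is active per token. This is a sketch, not a pricing formula: the active fraction is only a loose proxy for per-token compute, the helper name `active_fraction` is hypothetical, and the parameter counts are the ones quoted in this article (MiniMax M2.7's total is given only as "roughly 230 billion").

```python
# (total parameters, active parameters per token), in billions,
# as quoted in this article.
MOE_MODELS = {
    "DeepSeek V4 Flash": (284, 13),
    "DeepSeek V4 Pro": (1600, 49),
    "MiniMax M2.7": (230, 10),        # total is approximate
    "NVIDIA Nemotron 3 Super": (120, 12),
    "Kimi K2.6": (1000, 32),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's weights used for each token."""
    return active_b / total_b


for name, (total_b, active_b) in MOE_MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of parameters active per token")
```

By this measure V4 Flash touches under 5% of its weights on any given token, which is consistent with its position at the bottom of the price list.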


2. GPT-OSS-20B (High) — $0.07/1M tokens

OpenAI’s open-weights answer to the Chinese cost offensive, GPT-OSS-20B is among the cheapest AI models from a US lab, sitting at just $0.07 per million tokens blended. At only 20 billion parameters, it’s the smallest model on this list and trades raw intelligence for extraordinary throughput and economy. It’s not going to win benchmarks against DeepSeek V4 or Kimi K2.6, but for high-volume classification, summarisation, and simple agentic routing tasks, the cost-per-useful-output ratio is hard to beat. OpenAI’s gpt-oss series briefly held the title of strongest open-weights model before Chinese labs pulled ahead — GPT-OSS-20B is the lean, budget-friendly entry point of that family.


3. DeepSeek V4 Pro (Max) — $0.18/1M tokens

The flagship of DeepSeek’s latest family, V4 Pro is a 1.6 trillion parameter model that activates 49B parameters per token and still manages a blended price of $0.18 per million tokens, making it one of the cheapest models in the near-frontier tier. It scores 52 on the Artificial Analysis Intelligence Index, placing it second among open-weights reasoning models globally, behind only Kimi K2.6. For teams that need near-frontier reasoning but can’t stomach GPT-5.4 or Claude Opus prices, V4 Pro is the most compelling cost-performance trade-off available. DeepSeek itself acknowledges it trails US frontier labs by 3–6 months, but at these prices that gap barely matters for most production applications.


4. MiMo-V2.5-Pro — $0.18/1M tokens

Tied with DeepSeek V4 Pro at a blended $0.18, MiMo-V2.5-Pro belongs to a growing cohort of low-cost models in the mid-tier efficiency sweet spot. MiMo (Mini Model) is a small reasoning model from Xiaomi, originally designed for on-device deployment but increasingly competitive in cloud API contexts. Its V2.5-Pro variant delivers strong performance on reasoning and coding benchmarks relative to its size, making it an attractive option for developers who need reliable results at minimal cost, particularly in mobile-adjacent or edge-computing workflows where the cheapest model also needs to be the leanest.


5. GPT-OSS-120B (High) — $0.20/1M tokens

A step up in capability from GPT-OSS-20B, the 120B variant is one of the cheapest AI models that credibly competes with mid-tier frontier outputs. At a blended $0.20, it occupies the territory where you get meaningfully better reasoning depth than the 20B model without approaching the cost of proprietary alternatives. OpenAI’s OSS-120B has been benchmarked as broadly comparable to Qwen and NVIDIA Nemotron alternatives, while delivering substantially higher throughput on equivalent hardware. For engineering teams building multi-step agent pipelines where the cheapest AI models need to handle genuine complexity, GPT-OSS-120B hits a practical sweet spot that few models at this price point match.


6. MiniMax-M2.7 — $0.22/1M tokens

MiniMax M2.7 is Shanghai-based MiniMax’s latest flagship, and at a blended rate of $0.22 per million tokens it represents one of the most striking value propositions among the cheapest AI models available today. It runs a sparse MoE architecture with roughly 230 billion total parameters but only 10B active per token. M2.7 delivers approximately 90% of Claude Opus 4.6’s quality on coding tasks at roughly 7% of the total cost — a head-to-head that speaks for itself. Built with agentic workflows in mind, it handles multi-step debugging, document generation across Word, Excel, and PowerPoint, and long-horizon tool chains well above its price point. Its automatic caching cuts repeated-context costs further, making it an especially compelling pick for RAG-style production workloads.


7. NVIDIA Nemotron 3 Super — $0.28/1M tokens

NVIDIA is famous for building the chips that power AI models. Nemotron 3 Super proves the company can also build some of the cheapest AI models worth running on them. Released at GTC in March 2026, it’s a 120B-parameter hybrid Mamba-Transformer MoE model with just 12B active parameters per token, a 1 million token context window, and inference throughput 2.2x higher than GPT-OSS-120B. The blended price of $0.28 per million tokens puts it within reach of high-volume agentic workloads where speed matters as much as cost. For developers self-hosting, NVIDIA releases the full weights, training recipe, and 10 trillion pretraining tokens under a permissive open model license — making it the most open model on this list by a wide margin.


8. Grok 4.3 (High) — $0.64/1M tokens

Grok 4.3 is xAI’s current flagship API model and, at a blended $0.64 per million tokens, sits in the pricier half of this list, though it is still dramatically cheaper than comparable proprietary models from OpenAI or Anthropic. Released April 30, 2026, it scores 53.2 on the Artificial Analysis Intelligence Index and supports a 1 million token context window with vision input and full tool use. Models capable of this level of reasoning are rarely this cheap, and Grok 4.3 earns its place here: at $1.25/$2.50 per million input/output tokens before blending, it’s 20% cheaper than its predecessor Grok 4.20 while outperforming it on benchmarks. xAI also gives developers up to $175/month in free API credits through its data-sharing program, making the effective cost even lower for qualifying workloads.


9. GPT-5.4 Mini (xHigh) — $0.65/1M tokens

OpenAI’s GPT-5.4 Mini is one of the cheapest models in OpenAI’s own lineup and a serious performer at that tier. Released March 17, 2026, twelve days after the full GPT-5.4 launch, it scores 54.38% on SWE-Bench Pro, remarkably close to the standard GPT-5.4’s 57.7%, at roughly one-sixth the cost. The API list price of $0.75/$4.50 per million input/output tokens blends to approximately $0.65, putting it in direct competition with Grok 4.3. For teams already embedded in the OpenAI ecosystem, GPT-5.4 Mini is the most cost-efficient on-ramp to the GPT-5.4 family’s capabilities. It’s the model that makes building high-volume, latency-sensitive applications (customer support, content pipelines, lightweight coding assistants) financially sensible without leaving the OpenAI stack.
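The list prices and blended prices quoted for Grok 4.3 and GPT-5.4 Mini can be cross-checked by solving the 7:2:1 weighted average for the one unknown, the cache-hit price. The sketch below does that; the function name `implied_cache_hit_price` is hypothetical, and the results are inferences from rounded figures in this article, not published prices.

```python
def implied_cache_hit_price(blended: float, input_: float, output: float,
                            weights: tuple = (7, 2, 1)) -> float:
    """Solve the weighted average for the cache-hit price (USD per 1M tokens)."""
    w_cache, w_in, w_out = weights
    total = w_cache + w_in + w_out
    return (blended * total - w_in * input_ - w_out * output) / w_cache


# (blended, input, output) prices per 1M tokens as quoted in this article.
QUOTED = {
    "Grok 4.3": (0.64, 1.25, 2.50),
    "GPT-5.4 Mini": (0.65, 0.75, 4.50),
}

for name, (blended, input_, output) in QUOTED.items():
    cache = implied_cache_hit_price(blended, input_, output)
    # Approximate: the blended prices above are rounded to two decimals.
    print(f"{name}: implied cache-hit price of about ${cache:.2f} per 1M tokens")
```

The two models land at nearly the same blended price by different routes: Grok 4.3 has the cheaper output tokens, while GPT-5.4 Mini has the cheaper input tokens and, on these figures, a lower implied cache-hit price.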


10. Kimi K2.6 — $0.70/1M tokens

Kimi K2.6 from Moonshot AI rounds out the list as the most intelligent of the cheapest AI models surveyed here. Released April 20, 2026, it scores 54 on the Artificial Analysis Intelligence Index — the highest of any open-weights model, period. With 1 trillion total parameters and 32B active per token across 384 MoE experts, it sits in the top tier of global AI rankings while pricing in at a blended $0.70 per million tokens. On SWE-Bench Pro, K2.6 edges out GPT-5.4 (57.7) with a score of 58.6 and beats Claude Opus 4.6 on Toolathlon and Humanity’s Last Exam with tools. For agentic coding tasks specifically, it’s the leading open-source model in the world — and at these prices, one of the most remarkable cost-performance stories in AI right now.


The Takeaway

The cheapest AI models in 2026 aren’t compromises. Seven of the ten models on this list are priced below $0.30 per million blended tokens, and an eighth, Kimi K2.6, is open-weight, a level of access that was unthinkable for frontier-grade capabilities even 18 months ago. Chinese labs (DeepSeek, Xiaomi, MiniMax, Moonshot AI) account for half the list, while xAI and OpenAI have pushed their own budget tiers to stay competitive. For businesses running production AI workloads, the cheapest AI models available today aren’t the ones you pick when you can’t afford something better. They’re often the ones you pick because they’re genuinely the smartest buy.
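To see what these per-token prices mean for a budget, the sketch below converts the ten blended prices from this list into a monthly bill. The workload size of 2 billion tokens per month is a hypothetical figure chosen for illustration, and the helper name `monthly_cost` is likewise an assumption; real bills depend on the actual mix of cached, input, and output tokens.

```python
# Blended USD price per 1M tokens, as listed in this article.
BLENDED_USD_PER_M = {
    "DeepSeek V4 Flash (Max)": 0.06,
    "GPT-OSS-20B (High)": 0.07,
    "DeepSeek V4 Pro (Max)": 0.18,
    "MiMo-V2.5-Pro": 0.18,
    "GPT-OSS-120B (High)": 0.20,
    "MiniMax-M2.7": 0.22,
    "NVIDIA Nemotron 3 Super": 0.28,
    "Grok 4.3 (High)": 0.64,
    "GPT-5.4 Mini (xHigh)": 0.65,
    "Kimi K2.6": 0.70,
}


def monthly_cost(tokens_per_month: int, usd_per_million: float) -> float:
    """Cost in USD for a given monthly token volume."""
    return tokens_per_month / 1_000_000 * usd_per_million


# ASSUMPTION: hypothetical workload of 2 billion blended tokens per month.
TOKENS_PER_MONTH = 2_000_000_000

for model, price in BLENDED_USD_PER_M.items():
    print(f"{model}: ${monthly_cost(TOKENS_PER_MONTH, price):,.0f} per month")

under_30_cents = sum(1 for p in BLENDED_USD_PER_M.values() if p < 0.30)
print(f"{under_30_cents} of {len(BLENDED_USD_PER_M)} models are under $0.30 per 1M tokens")
```

At that volume the spread between the cheapest and the most expensive model on the list is roughly a factor of twelve, which is why matching the model to the task matters more than the absolute price of any single entry.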