Open model Kimi K2.7 Code undercuts GPT-5.5 and Claude by up to 12x on price per token

The Decoder


Matthias Bastian · 2026-06-13 · via The Decoder

Moonshot AI has released Kimi K2.7 Code, a new AI model built specifically for programming tasks and agent-based coding workflows. The model builds on its predecessor, Kimi K2.6, and is available as an open-weights version on Hugging Face.

According to Moonshot AI, K2.7 Code is designed to outperform its predecessor on long-running, complex software engineering tasks. For general tasks outside of coding, the company still recommends K2.6. Kimi is also the model that coding tool provider Cursor resells in a modified form.

Gains over K2.6, but still behind the leaders

On Moonshot's in-house Kimi Code Bench v2, performance jumps from 50.9 to 62.0. On Program Bench, it climbs from 48.3 to 53.6, and on MLS Bench Lite, it rises from 26.7 to 35.1. K2.7 Code also improves on agentic benchmarks, hitting 76.0 on MCP Atlas (up from 69.4) and 81.1 on MCPMark Verified (up from 72.8).

In a head-to-head comparison with GPT-5.5 and Claude Opus 4.8, though, K2.7 Code trails on most coding benchmarks. GPT-5.5 scores 69.1 on Program Bench versus 53.6 for K2.7 Code. On Kimi Code Bench v2, it's 69.0 versus 62.0. Program Bench is a particularly tough test: agents have to reproduce a program's behavior using only a compiled binary and its documentation, without source code access, decompilation, or internet access.

K2.7 Code shows strong agent performance: while it still trails competitors on pure coding benchmarks, it holds its own on agent-focused tests. | Image: Kimi

There's one outlier: MCPMark Verified, a benchmark that tests AI agents across five real-world software environments, including Notion, GitHub, file systems, Postgres databases, and browser automation via Playwright. Here, K2.7 Code beats Claude Opus 4.8 with 81.1 versus 76.4, but falls well short of GPT-5.5 at 92.9. As always, benchmark results and real-world performance can diverge.

A trillion parameters, but only 32 billion active at a time

K2.7 Code uses a Mixture-of-Experts (MoE) architecture with one trillion total parameters, according to its model card. Only 32 billion of those are active per token. The model has 384 experts, with eight selected per token. Context length is 256,000 tokens.
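To make the sparsity concrete, the short Python sketch below works out what share of the model is in use for each token, using only the figures quoted from the model card above. The function name is illustrative and not part of any Moonshot tooling.

```python
# Figures as reported from the model card in the article.
TOTAL_PARAMS = 1_000_000_000_000   # one trillion total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32 billion active per token
NUM_EXPERTS = 384
EXPERTS_PER_TOKEN = 8


def share(part: int, whole: int) -> float:
    """Return part / whole as a fraction between 0 and 1."""
    return part / whole


param_share = share(ACTIVE_PARAMS, TOTAL_PARAMS)
expert_share = share(EXPERTS_PER_TOKEN, NUM_EXPERTS)

print(f"Active parameters per token: {param_share:.1%}")   # 3.2%
print(f"Experts selected per token:  {expert_share:.2%}")  # 2.08%
```

The parameter share comes out higher than the expert share. A plausible reason, though the article does not spell it out, is that non-expert components such as attention layers and embeddings run for every token regardless of routing.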

The model is multimodal and can process images and video alongside text. It uses a custom vision encoder called MoonViT with 400 million parameters. The architecture is identical to K2.5 and K2.6, so existing deployment configs can be reused directly.

One key improvement, according to Moonshot AI, is more efficient reasoning. K2.7 Code uses about 30 percent fewer thinking tokens than K2.6, which means less "overthinking." The model enforces thinking mode and a "preserve_thinking" mode that keeps full reasoning content across multiple conversation turns to boost performance in agent-based coding scenarios.

Moonshot AI has also announced a "6x High-Speed Mode" coming soon. The model can be accessed through the Kimi API, Kimi Code CLI, and inference engines like vLLM and SGLang. The model weights are available for download on Hugging Face. A native INT4 quantization is also available, making it possible to run the model on less powerful or cheaper hardware.
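A rough back-of-the-envelope calculation shows why the INT4 variant matters for hardware requirements. The sketch below assumes, purely for illustration, that every one of the one trillion parameters is stored at the same bit width, and it ignores quantization scales, the KV cache, and activations, so real memory needs will differ.

```python
TOTAL_PARAMS = 1_000_000_000_000  # one trillion parameters, per the model card


def weight_bytes(num_params: int, bits_per_param: int) -> int:
    """Bytes needed to store the weights alone at a uniform bit width (assumption)."""
    return num_params * bits_per_param // 8


def to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes."""
    return num_bytes / 1e9


for label, bits in (("16-bit", 16), ("INT4", 4)):
    size_gb = to_gb(weight_bytes(TOTAL_PARAMS, bits))
    print(f"{label}: about {size_gb:,.0f} GB for weights only")
```

Under these simplifying assumptions, 4-bit weights take a quarter of the space of 16-bit weights, which is the main lever behind running the model on smaller or cheaper setups.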

A fraction of the cost of Western competitors

API pricing for K2.7 Code is $0.95 per million input tokens and $4.00 per million output tokens. Cache hits drop the input price to $0.19 per million tokens. That puts K2.7 Code at the same input and output prices as its predecessor K2.6 ($0.95/$4.00, cache $0.16).

Compared to the competition, K2.7 Code is dramatically cheaper. GPT-5.5 costs $5.00 per million input tokens and $30.00 per million output tokens. Claude Opus 4.8 runs $5.00/$25.00. And Anthropic's latest top model, the currently suspended Claude Fable 5, charges $10.00/$50.00 per million tokens. On output alone, Fable 5 is more than twelve times as expensive as K2.7 Code.

Model	Input / MTok	Output / MTok
Kimi K2.7 Code	$0.95	$4.00
Kimi K2.6	$0.95	$4.00
Claude Opus 4.8	$5.00	$25.00
GPT-5.5	$5.00	$30.00
Claude Fable 5	$10.00	$50.00
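The price gap is easiest to see as ratios and as the bill for a concrete workload. The Python sketch below uses the list prices from the table. The workload size (2 million input tokens, 500,000 output tokens) is a made-up example, and cache discounts are left out.

```python
# (input, output) list prices in USD per million tokens, from the table above.
PRICES = {
    "Kimi K2.7 Code": (0.95, 4.00),
    "Kimi K2.6": (0.95, 4.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
    "Claude Fable 5": (10.00, 50.00),
}
BASELINE = "Kimi K2.7 Code"


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a workload at list prices, ignoring cache hits."""
    input_price, output_price = PRICES[model]
    return input_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price


def output_price_ratio(model: str, baseline: str = BASELINE) -> float:
    """How many times more a model charges per output token than the baseline."""
    return PRICES[model][1] / PRICES[baseline][1]


# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS, OUTPUT_TOKENS = 2_000_000, 500_000

for name in PRICES:
    total = cost_usd(name, INPUT_TOKENS, OUTPUT_TOKENS)
    ratio = output_price_ratio(name)
    print(f"{name:16s} ${total:6.2f}  output price {ratio:.2f}x baseline")
```

The output ratio for Claude Fable 5 is 50.00 / 4.00 = 12.5, which is where the "up to 12x" figure in the headline comes from.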

Even if K2.7 Code trails Western top models on some benchmarks, the same budget lets you run it many times more often, making the main question not whether it's the best model overall, but whether it's good enough for the task at hand.

That can only be answered case by case with your own task-specific benchmarks. Given the price gap, those evaluations pay for themselves quickly with heavy use. Cost per token is becoming just as important a competitive factor as raw model quality, another sign of an emerging token economy.
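One way to frame such a task-specific evaluation is cost per solved task rather than cost per token. The sketch below is a minimal illustration, not a real benchmark harness: the `Run` records, token counts, and pass/fail outcomes are entirely hypothetical placeholders that would be replaced with results from your own test suite. Only the per-token prices come from the article.

```python
from dataclasses import dataclass


@dataclass
class Run:
    passed: bool        # did the model solve the task?
    input_tokens: int
    output_tokens: int


def cost_per_solved_task(runs: list[Run], input_price: float, output_price: float) -> float:
    """Total spend divided by the number of solved tasks (prices in USD per million tokens)."""
    total_cost = sum(
        r.input_tokens / 1e6 * input_price + r.output_tokens / 1e6 * output_price
        for r in runs
    )
    solved = sum(r.passed for r in runs)
    return total_cost / solved if solved else float("inf")


# HYPOTHETICAL results: four tasks per model, 200k input and 50k output tokens each.
kimi_runs = [Run(True, 200_000, 50_000)] * 3 + [Run(False, 200_000, 50_000)]
gpt_runs = [Run(True, 200_000, 50_000)] * 4

print(f"K2.7 Code: ${cost_per_solved_task(kimi_runs, 0.95, 4.00):.2f} per solved task")
print(f"GPT-5.5:   ${cost_per_solved_task(gpt_runs, 5.00, 30.00):.2f} per solved task")
```

In this invented scenario the cheaper model fails one task in four and still costs far less per solved task. With real data the picture could look different, especially if failed runs need expensive human cleanup.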

Modified MIT license with a big-customer clause

The model ships under a modified MIT license that allows free use, modification, and redistribution. Anyone using K2.7 Code or its derivatives in commercial products with more than 100 million monthly active users or more than $20 million in monthly revenue has to display "Kimi K2.7 Code" prominently in the UI.
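The clause boils down to a simple either-or threshold check. The Python sketch below encodes the thresholds as summarized in this article. The function and constant names are illustrative, the revenue threshold is assumed to be in US dollars, and the actual license text remains the authoritative source.

```python
# Thresholds as described in the article's summary of the license.
MAU_THRESHOLD = 100_000_000               # monthly active users
MONTHLY_REVENUE_THRESHOLD_USD = 20_000_000  # assumed to be USD


def must_display_model_name(monthly_active_users: int, monthly_revenue_usd: float) -> bool:
    """True if a commercial product exceeds either threshold ("more than", so strict)."""
    return (
        monthly_active_users > MAU_THRESHOLD
        or monthly_revenue_usd > MONTHLY_REVENUE_THRESHOLD_USD
    )


print(must_display_model_name(150_000_000, 5_000_000))   # True: user threshold exceeded
print(must_display_model_name(2_000_000, 25_000_000))    # True: revenue threshold exceeded
print(must_display_model_name(2_000_000, 5_000_000))     # False: below both thresholds
```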
