Moonshot AI has released Kimi K2.7 Code, a new AI model built specifically for programming tasks and agent-based coding workflows. The model builds on its predecessor, Kimi K2.6, and is available as an open-weights version on Hugging Face.
According to Moonshot AI, K2.7 Code is designed to outperform its predecessor on long-running, complex software engineering tasks. For general tasks outside of coding, the company still recommends K2.6. Kimi is also the model that coding tool provider Cursor resells in a modified form.
On Moonshot's in-house Kimi Code Bench v2, performance jumps from 50.9 to 62.0. On Program Bench, it climbs from 48.3 to 53.6, and on MLS Bench Lite, it rises from 26.7 to 35.1. K2.7 Code also improves on agentic benchmarks, hitting 76.0 on MCP Atlas (up from 69.4) and 81.1 on MCPMark Verified (up from 72.8).
In a head-to-head comparison with GPT-5.5 and Claude Opus 4.8, though, K2.7 Code trails on most coding benchmarks. GPT-5.5 scores 69.1 on Program Bench versus 53.6 for K2.7 Code, and 69.0 versus 62.0 on Kimi Code Bench v2. Program Bench is a particularly tough test: agents have to reproduce a program's behavior using only a compiled binary and its documentation, without source code access, decompilation, or internet access.

There's one outlier: MCPMark Verified, a benchmark that tests AI agents across five real-world software environments, including Notion, GitHub, file systems, Postgres databases, and browser automation via Playwright. Here, K2.7 Code beats Claude Opus 4.8 with 81.1 versus 76.4, but falls well short of GPT-5.5 at 92.9. As always, benchmark results and real-world performance can diverge.
K2.7 Code uses a Mixture-of-Experts (MoE) architecture with one trillion total parameters, according to its model card. Only 32 billion of those are active per token. The model has 384 experts, with eight selected per token. Context length is 256,000 tokens.
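To make the sparse-activation figures concrete, here is a minimal sketch of top-k expert routing in plain Python. The expert count, experts per token, and parameter totals are the model card figures quoted above. Everything else is an illustrative assumption: the random router scores, the softmax over the selected experts, and the function name `route_token` are not Moonshot AI's actual router.

```python
import math
import random

NUM_EXPERTS = 384                  # per the model card
EXPERTS_PER_TOKEN = 8              # per the model card
TOTAL_PARAMS = 1_000_000_000_000   # one trillion
ACTIVE_PARAMS = 32_000_000_000     # 32 billion


def route_token(router_logits, k=EXPERTS_PER_TOKEN):
    """Pick the k highest-scoring experts and normalize their weights.

    Assumption: a plain softmax over the selected logits. Production
    routers often add bias terms, shared experts, or load balancing.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


rng = random.Random(0)
# Hypothetical router scores for a single token.
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)

expert_fraction = EXPERTS_PER_TOKEN / NUM_EXPERTS
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"experts used per token: {len(selected)} of {NUM_EXPERTS} ({expert_fraction:.1%})")
print(f"parameters active per token: {active_fraction:.1%}")
```

The share of active parameters (about 3.2 percent) is somewhat higher than the share of selected experts (about 2.1 percent), which is what you would expect if non-expert components such as attention layers run for every token.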
The model is multimodal and can process images and video alongside text. It uses a custom vision encoder called MoonViT with 400 million parameters. The architecture is identical to K2.5 and K2.6, so existing deployment configs can be reused directly.
One key improvement, according to Moonshot AI, is more efficient reasoning. K2.7 Code uses about 30 percent fewer thinking tokens than K2.6, which means less "overthinking." Thinking mode is always on, and the model also enforces a "preserve_thinking" mode that keeps the full reasoning content across multiple conversation turns to boost performance in agent-based coding scenarios.
Moonshot AI has also announced a "6x High-Speed Mode" coming soon. The model can be accessed through the Kimi API, Kimi Code CLI, and inference engines like vLLM and SGLang. The model weights are available for download on Hugging Face. A native INT4 quantization is also available, making it possible to run the model on less powerful or cheaper hardware.
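A rough back-of-the-envelope sketch shows why the INT4 variant matters for hardware requirements. It only counts raw weight storage for the one trillion parameters cited in the model card. The 16-bit baseline is an assumption for comparison, and the estimate ignores quantization scales, any layers kept at higher precision, the KV cache, and activations, so real memory needs will be higher.

```python
TOTAL_PARAMS = 1_000_000_000_000  # one trillion, per the model card


def weight_memory_gb(num_params, bits_per_param):
    """Raw weight storage in gigabytes (10**9 bytes), ignoring all overhead."""
    return num_params * bits_per_param / 8 / 1e9


# The 16-bit row is an assumed baseline, not a figure from the article.
for name, bits in [("16-bit", 16), ("INT4", 4)]:
    print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, bits):,.0f} GB of weights")
```

Under these assumptions, four-bit weights take a quarter of the space of 16-bit weights, which is the difference between needing roughly two terabytes and roughly half a terabyte of accelerator memory for the weights alone.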
API pricing for K2.7 Code is $0.95 per million input tokens and $4.00 per million output tokens. Cache hits drop the input price to $0.19 per million tokens. That matches the input and output prices of its predecessor K2.6 ($0.95/$4.00), whose cache-hit price is $0.16.
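Because agentic coding sessions tend to resend a large, mostly unchanged context, the cache-hit price can matter more than the headline input price. The sketch below uses the K2.7 Code prices quoted above. The request sizes and the 80 percent cache-hit rate are hypothetical, and treating the hit rate as a simple fraction of input tokens is a simplifying assumption about how billing works.

```python
INPUT_PRICE = 0.95         # USD per million input tokens (cache miss)
CACHED_INPUT_PRICE = 0.19  # USD per million input tokens (cache hit)
OUTPUT_PRICE = 4.00        # USD per million output tokens


def request_cost(input_tokens, output_tokens, cache_hit_rate=0.0):
    """Cost in USD of one request, given the share of input tokens served from cache."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    total = (uncached * INPUT_PRICE
             + cached * CACHED_INPUT_PRICE
             + output_tokens * OUTPUT_PRICE)
    return total / 1_000_000


# Hypothetical agent turn: 200k tokens of context, 10k tokens generated.
cold = request_cost(200_000, 10_000)
warm = request_cost(200_000, 10_000, cache_hit_rate=0.8)
print(f"no cache hits:  ${cold:.4f}")
print(f"80% cache hits: ${warm:.4f}")
```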
Compared to the competition, K2.7 Code is dramatically cheaper. GPT-5.5 costs $5.00 per million input tokens and $30.00 per million output tokens. Claude Opus 4.8 runs $5.00/$25.00. And Anthropic's latest (and currently suspended) top model, Claude Fable 5, charges $10.00/$50.00 per million tokens. On output alone, Fable 5 is more than twelve times as expensive as K2.7 Code.
| Model | Input / MTok | Output / MTok |
|---|---|---|
| Kimi K2.7 Code | $0.95 | $4.00 |
| Kimi K2.6 | $0.95 | $4.00 |
| Claude Opus 4.8 | $5.00 | $25.00 |
| GPT-5.5 | $5.00 | $30.00 |
| Claude Fable 5 | $10.00 | $50.00 |
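To see what the price gap means for a concrete workload, the following sketch applies the list prices from the table above to a single job and to a fixed budget. The job size (500,000 input tokens, 50,000 output tokens) and the $100 budget are hypothetical, and cache discounts are left out for simplicity.

```python
# USD per million tokens (input, output), taken from the table above.
PRICES = {
    "Kimi K2.7 Code": (0.95, 4.00),
    "Kimi K2.6": (0.95, 4.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
    "Claude Fable 5": (10.00, 50.00),
}


def job_cost(model, input_tokens, output_tokens):
    """Cost in USD of one job at list prices, ignoring cache discounts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def runs_per_budget(model, budget_usd, input_tokens, output_tokens):
    """How many complete jobs fit into the budget."""
    return int(budget_usd // job_cost(model, input_tokens, output_tokens))


# Hypothetical job size and budget.
JOB = (500_000, 50_000)
BUDGET = 100.0
baseline = job_cost("Kimi K2.7 Code", *JOB)

for model in PRICES:
    cost = job_cost(model, *JOB)
    runs = runs_per_budget(model, BUDGET, *JOB)
    print(f"{model:16s} ${cost:6.3f} per job  {cost / baseline:5.1f}x  {runs:4d} runs per ${BUDGET:.0f}")
```

The exact multiple depends on the mix of input and output tokens, since the output-price gap is larger than the input-price gap.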
Even if K2.7 Code trails Western top models on some benchmarks, the same budget lets you run it many times more often, making the main question not whether it's the best model overall, but whether it's good enough for the task at hand.
That can only be answered case by case with your own task-specific benchmarks. Given the price gap, those evaluations pay for themselves quickly with heavy use. Cost per token is becoming just as important a competitive factor as raw model quality, another sign of an emerging token economy.
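One way to frame such an evaluation is cost per solved task rather than cost per token. The sketch below assumes that failures are detectable (for example through a test suite) and that retries are independent, so the expected spend until the first success is the cost per attempt divided by the pass rate. The model labels, costs, and pass rates are placeholders for your own measurements, not published figures.

```python
def expected_cost_per_success(cost_per_attempt, pass_rate):
    """Expected spend until the first success, assuming independent retries."""
    if not 0.0 < pass_rate <= 1.0:
        raise ValueError("pass_rate must be in (0, 1]")
    return cost_per_attempt / pass_rate


# Placeholder values: substitute results from your own task-specific benchmark.
candidates = {
    "cheaper model": {"cost_per_attempt": 0.70, "pass_rate": 0.50},
    "pricier model": {"cost_per_attempt": 4.00, "pass_rate": 0.80},
}

for name, measured in candidates.items():
    cost = expected_cost_per_success(**measured)
    print(f"{name}: ${cost:.2f} per solved task")
```

With these placeholder numbers the cheaper model wins despite failing more often, but the comparison can flip for tasks where its pass rate drops sharply or where failed attempts are costly to detect.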
The model ships under a modified MIT license that allows free use, modification, and redistribution. Anyone using K2.7 Code or its derivatives in commercial products with more than 100 million monthly active users or more than $20 million in monthly revenue has to display "Kimi K2.7 Code" prominently in the UI.