Codex is now the recommended agent in JetBrains IDEs - The JetBrains Blog
Anna Maltseva · 2026-06-25 · via The JetBrains Blog
JetBrains AI supports multiple coding agents, including Junie, Codex, Claude Agent, and any ACP-compatible agent you bring yourself. Previously, AI users in JetBrains IDEs started in Chat mode and had to choose an agent themselves.

As models became more advanced, agents became more capable and their adoption grew. Because agents help users achieve more, we recommend starting with an agent from the get-go.

To make that experience simpler, we’ve selected a specific agent to be the default. This post explains how we made the choice.

You can still switch to any other agent at any time.

“JetBrains evaluated coding agents on the things that matter in practice: can they solve real software engineering tasks, quickly and at a cost that makes sense. We’re proud that Codex is the recommended starting point in JetBrains AI. It’s a meaningful step in the shift from AI chat to agents that meet developers where they are, work in the tools they already use, and take on complex, multi-step work.”

Stuart McMeechan, EMEA Deployment Engineering Lead, OpenAI

Evaluation using real-world development tasks

We evaluated candidate agents using a benchmark dataset built from real software engineering tasks across three ecosystems: Java (225 tasks), C# (38 tasks), and Python (90 tasks).

Each task is grounded in a real codebase – with a prompt describing what needs to be done and automated tests that verify the result. Together, these tasks cover bug fixes, feature development, enhancements, and other common development tasks across real applications, libraries, frameworks, and developer tools.

The data points used to choose the recommended agent are available in the Developer Productivity AI Arena (DPAIA) repository, JetBrains’ open benchmark for evaluating AI coding tools, which makes the evaluation reproducible. The C# dataset is internal and not publicly available.

The Java dataset was our primary evaluation set. It’s the largest of the three, spanning 17 repositories across five organizations and covering a broad mix of task types. 

The C# and Python datasets produced a similar overall ranking of candidate agents, giving us additional confidence that the results were not specific to a single ecosystem.

Our methodology

We compared candidates within the same model tier. Our goal was not to find the most powerful model available, but the best agent behavior at comparable model capability and cost. We projected what agent usage would cost, taking into account JetBrains AI token usage. Setups that would push more than 2% of users over $20/month were ruled out before we ranked candidates on quality and latency.
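The cost gate described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the per-user list of projected monthly costs, and the sample numbers are hypothetical, and the post does not describe how the projection itself was computed.

```python
def share_over_budget(monthly_costs_usd, budget_usd=20.0):
    """Return the fraction of users whose projected monthly cost exceeds the budget."""
    if not monthly_costs_usd:
        raise ValueError("need at least one projected cost")
    over = sum(1 for cost in monthly_costs_usd if cost > budget_usd)
    return over / len(monthly_costs_usd)


def passes_cost_gate(monthly_costs_usd, budget_usd=20.0, max_share=0.02):
    """Return True when at most max_share of users would exceed the budget."""
    return share_over_budget(monthly_costs_usd, budget_usd) <= max_share


# Hypothetical projections for 100 users each (made-up numbers, not real data).
cheap_setup = [5.0] * 99 + [25.0]        # 1 of 100 users over $20
pricey_setup = [5.0] * 95 + [30.0] * 5   # 5 of 100 users over $20

if __name__ == "__main__":
    print(passes_cost_gate(cheap_setup))
    print(passes_cost_gate(pricey_setup))
```

Only setups that pass such a gate would go on to be ranked on quality and latency.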

In choosing which agent to recommend, we focused on three questions:

  1. Can it handle the task? → We measured solve rate: the percentage of benchmark tasks where all tests passed.
  2. Is the cost reasonable? → We looked at the median cost per task.
  3. Is it fast enough? → We looked at median end-to-end latency.

These three metrics (solve rate, cost, and latency) formed the basis of our ranking. We also tracked additional signals, including compilation success and average tool calls, but they did not materially affect the results.
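As a rough illustration of how these three metrics could be computed from per-task results, here is a minimal sketch. The `TaskRun` record and its field names are assumptions made for this example, not the DPAIA data format, and the sample runs are invented.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TaskRun:
    """One benchmark task attempt (hypothetical record layout)."""
    all_tests_passed: bool
    cost_usd: float
    latency_s: float


def summarize(runs):
    """Compute solve rate (percent), median cost, and median latency."""
    if not runs:
        raise ValueError("need at least one task run")
    solved = sum(1 for run in runs if run.all_tests_passed)
    return {
        "solve_rate_pct": 100.0 * solved / len(runs),
        "median_cost_usd": median(run.cost_usd for run in runs),
        "median_latency_s": median(run.latency_s for run in runs),
    }


# Invented sample runs, for illustration only.
sample_runs = [
    TaskRun(True, 0.10, 60.0),
    TaskRun(True, 0.20, 90.0),
    TaskRun(False, 0.30, 120.0),
    TaskRun(True, 0.40, 150.0),
]

if __name__ == "__main__":
    print(summarize(sample_runs))
```

Medians are used here because the post reports median cost and median latency, which are less sensitive to a handful of unusually long or expensive tasks than a mean would be.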

Alongside the offline benchmark, we ran an online A/B test with real users. This experiment served as a validation layer, helping us understand whether the offline results translated into real-world usage. Because it’s difficult to measure task success reliably at scale, we focused on behavioral signals such as engagement and how often users switched to another agent or returned to the chat. The online results were consistent with the offline benchmark, giving us additional confidence in our choice.

Candidate configurations

We tested the agents available with JetBrains AI (Codex, Junie, and Claude Agent) across multiple model configurations. Candidates were selected based on prior benchmarking and internal assessment; we focused on the most promising options within each agent’s model family rather than testing every possible setup. Eventually, Codex and Junie were shortlisted.

Codex – we started with an initial sweep across GPT-5.2 and GPT-5.3. When GPT-5.4 mini became available, it outshone the previous top performer in both solve rate and cost, making the model choice straightforward. The remaining question was reasoning level: medium vs. low. GPT-5.4 mini with default medium reasoning had the best solve rate within a reasonable cost range across all three ecosystems and was selected for the final evaluation.

Junie – Junie can work with different model providers. We evaluated the Gemini model family, which was pre-selected as the most promising option based on the Junie team's own benchmarks. Gemini 3 Flash was selected as the winning model.

Final showdown: Junie vs Codex

The offline results were too close to call on their own. Neither agent dominated across all metrics and ecosystems.
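One way to read "neither agent dominated" is as a pairwise dominance check over the three metrics. The sketch below is an interpretation, not the procedure JetBrains describes: the `Scores` type, the `dominates` function, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    """Aggregate metrics for one agent setup (hypothetical layout)."""
    solve_rate_pct: float    # higher is better
    median_cost_usd: float   # lower is better
    median_latency_s: float  # lower is better


def dominates(a, b):
    """Return True if a is no worse than b on every metric and better on at least one."""
    no_worse = (
        a.solve_rate_pct >= b.solve_rate_pct
        and a.median_cost_usd <= b.median_cost_usd
        and a.median_latency_s <= b.median_latency_s
    )
    strictly_better = (
        a.solve_rate_pct > b.solve_rate_pct
        or a.median_cost_usd < b.median_cost_usd
        or a.median_latency_s < b.median_latency_s
    )
    return no_worse and strictly_better


# Made-up numbers: agent_a solves more tasks, agent_b is cheaper.
agent_a = Scores(62.0, 0.30, 140.0)
agent_b = Scores(60.0, 0.25, 150.0)
agent_c = Scores(55.0, 0.35, 160.0)

if __name__ == "__main__":
    print(dominates(agent_a, agent_b), dominates(agent_b, agent_a))
    print(dominates(agent_a, agent_c))
```

When the check fails in both directions, as it does for `agent_a` and `agent_b` here, the offline metrics alone do not pick a winner, and an additional signal is needed to break the tie.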

We included both in an online A/B test to see which held up better in real-world usage. We tracked activation, churn, and failure rate. Codex came out ahead. That tipped the decision.

What is next for the recommended agent

Codex is now the recommended agent, having delivered the strongest combination of solve rate and cost across the tasks we tested. This isn't a permanent decision, however. As models evolve, new agents join, and our benchmark coverage grows, we'll re-evaluate the decision and update our recommendation based on what the data tells us.

And if a different agent works better for your workflow, you can switch at any time. Our recommendation is a starting point, not a constraint.
