NVIDIA Blackwell Leads on First Agentic AI Infrastructure Benchmark
Shruti Koparkar · 2026-06-13 · via NVIDIA Newsroom

AgentPerf from Artificial Analysis, the industry’s first agentic AI benchmark, gives developers, enterprises and infrastructure providers a clear way to compare systems for agentic AI. In the first round of published results, the NVIDIA Blackwell Ultra NVL72 platform delivers leading performance across the agentic AI workloads tested, running up to 20x more agents per megawatt than NVIDIA Hopper.

Agentic AI is a fundamentally different workload from conversational AI. A single chat completion is a sprint: one large language model (LLM) call, one response. An agent functions more like a relay: it breaks a goal into many steps and keeps going until the task is done.

Agents chain together multiple LLM calls and tool calls to gather context, observe, reason and act.

That results in dozens to hundreds of LLM calls chained together, each passing growing context to the next, with tool calls like code compilation and execution, database search and web browsing at every handoff. The complexity isn’t additive; it’s multiplicative.
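A back-of-envelope sketch can make the "multiplicative" point concrete. It assumes, purely for illustration, that every call re-reads the whole accumulated context and that each step appends a fixed number of tokens; the function name and all numbers are hypothetical and are not taken from AgentPerf.

```python
def total_input_tokens(steps: int, initial_context: int, tokens_added_per_step: int) -> int:
    """Sum the context read across every LLM call in one agent run.

    Illustrative assumptions: each call re-reads the full accumulated
    context, and each step appends a fixed number of tokens (model output
    plus tool results).
    """
    total = 0
    context = initial_context
    for _ in range(steps):
        total += context                   # this call reads everything so far
        context += tokens_added_per_step   # output and tool results grow the context
    return total


# Hypothetical numbers, chosen only to show the shape of the growth.
single_chat = total_input_tokens(steps=1, initial_context=2_000, tokens_added_per_step=1_000)
agent_run = total_input_tokens(steps=100, initial_context=2_000, tokens_added_per_step=1_000)
print(single_chat, agent_run, agent_run / single_chat)
```

Under these assumptions, 100 chained calls read 5,150,000 input tokens against 2,000 for the single chat call: 100 times as many calls, but 2,575 times as many tokens read. Real serving stacks can reuse cached context between calls, so treat this as an upper-bound intuition rather than a measurement.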

The distinction matters enormously for performance measurement. Existing AI inference benchmarks measure one LLM call: how fast an LLM responds to a single request and how many simultaneous requests a system can handle. They weren’t designed for agentic workloads, where chained LLM calls, tool call delays and growing context stress accelerated computing systems in fundamentally different ways than a single LLM call ever could. 

For companies building and deploying agents at scale, it’s important to understand how responsive agents are, how many can be deployed simultaneously and how much useful work AI infrastructure can deliver for every dollar and watt invested.

NVIDIA GB300 NVL72 Runs Up to 20x More Agents per Megawatt

In this first round, AgentPerf measures agentic performance with DeepSeek V4 Pro, a large mixture-of-experts (MoE) model that represents the class of frontier models powering today’s most capable agents. On this workload, NVIDIA GB300 NVL72 delivers the highest performance in the benchmark, running up to 20x more agents per megawatt than the NVIDIA HGX H200 system.

NVIDIA GB300 NVL72 supports far more concurrent agents per megawatt than NVIDIA H200 at both service-level objectives of 20 and 60 tokens per second per agent.

The performance advantage comes from extreme codesign across the full stack. GB300 NVL72 connects 72 GPUs into a single rack-scale system, enabling large MoE models like DeepSeek V4 Pro to distribute model execution efficiently at scale. 

CUDA kernels accelerate this further by overlapping communication and compute, so the cost of coordinating across experts is absorbed rather than added to latency. 

NVIDIA TensorRT LLM sustains efficiency as concurrent agent sessions scale. For example, it separates the processing of inputs from the generation of outputs so each can be optimized independently. 

These results are grounded in a benchmark methodology built from the ground up to reflect how agentic AI actually works in production.

Artificial Analysis AgentPerf: Built on Real-World Agentic Workloads

AgentPerf is built on real coding-agent trajectories: an agent receives a task, reads files, writes and edits code, executes commands and iterates based on the results. The trajectories are all drawn from real public code repositories across 12+ programming languages. The long sequence lengths, tool call patterns and delays are all representative of real-world coding workflows.
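One way to picture such a trajectory is as an ordered list of LLM calls and tool calls. The sketch below is a hypothetical schema for illustration only; the article does not describe AgentPerf's actual data format, and every class, field and value here is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class LLMCall:
    input_tokens: int    # context passed into this call
    output_tokens: int   # tokens the model generates


@dataclass
class ToolCall:
    name: str            # e.g. "read_file" or "run_tests" (hypothetical names)
    delay_seconds: float # time the agent waits on the tool


Step = Union[LLMCall, ToolCall]


@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)

    def llm_call_count(self) -> int:
        return sum(isinstance(s, LLMCall) for s in self.steps)

    def total_tool_delay(self) -> float:
        return sum(s.delay_seconds for s in self.steps if isinstance(s, ToolCall))

    def peak_context(self) -> int:
        return max((s.input_tokens for s in self.steps if isinstance(s, LLMCall)), default=0)


# Made-up example: three LLM calls with growing context, two tool calls between them.
example = Trajectory(
    task="fix failing unit test",
    steps=[
        LLMCall(input_tokens=4_000, output_tokens=200),
        ToolCall(name="read_file", delay_seconds=0.5),
        LLMCall(input_tokens=6_000, output_tokens=300),
        ToolCall(name="run_tests", delay_seconds=2.0),
        LLMCall(input_tokens=8_000, output_tokens=150),
    ],
)
print(example.llm_call_count(), example.total_tool_delay(), example.peak_context())
```

Keeping tool delays as plain recorded durations means a replay can wait for the recorded time without actually running the tool.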

AgentPerf then measures how many of these agentic tasks a platform can support simultaneously while meeting defined performance thresholds for responsiveness and output token rate. Tool calls are not executed but simulated using representative CPU processing time, so differences in results reflect accelerated computing performance only. 
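The "how many tasks fit under a threshold" question can be sketched as a search for the largest concurrency that still meets a per-agent token rate. This is a minimal sketch, not AgentPerf's procedure: `max_agents_meeting_slo` and `toy_rate` are hypothetical names, the throughput model is invented, and the search assumes the per-agent rate never rises as more agents are added. Only the 20 and 60 tokens-per-second objectives come from the results described above.

```python
from typing import Callable


def max_agents_meeting_slo(
    per_agent_rate: Callable[[int], float],
    slo_tokens_per_sec: float,
    upper_bound: int,
) -> int:
    """Return the largest n in [0, upper_bound] with per_agent_rate(n) >= SLO.

    Assumes per_agent_rate is non-increasing in n, which makes binary
    search valid. Returns 0 if even a single agent misses the SLO.
    """
    lo, hi = 0, upper_bound
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if per_agent_rate(mid) >= slo_tokens_per_sec:
            lo = mid       # mid agents still meet the SLO; try more
        else:
            hi = mid - 1   # too many agents; try fewer
    return lo


def toy_rate(n: int) -> float:
    """Invented model: 6,000 tokens/s shared evenly, capped at 100 per agent."""
    return min(100.0, 6000.0 / n)


agents_at_20 = max_agents_meeting_slo(toy_rate, slo_tokens_per_sec=20.0, upper_bound=1000)
agents_at_60 = max_agents_meeting_slo(toy_rate, slo_tokens_per_sec=60.0, upper_bound=1000)
print(agents_at_20, agents_at_60)
```

With this toy model the stricter 60 tokens-per-second objective admits 100 agents and the looser 20 tokens-per-second objective admits 300, which shows why results are reported per service-level objective rather than as a single number.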

The results translate directly into infrastructure decisions: how many concurrent agentic tasks can be run per accelerator and per megawatt of power. For enterprises deploying AI agents at scale, those numbers determine how much productive work a given infrastructure investment can actually deliver.
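The two normalizations mentioned here are simple ratios. In the sketch below the function names, the agent count and the power figure are hypothetical and do not represent published results or product specifications; only the 72-GPU rack size comes from the article.

```python
def agents_per_accelerator(concurrent_agents: int, accelerators: int) -> float:
    """Concurrent agentic tasks supported per accelerator."""
    return concurrent_agents / accelerators


def agents_per_megawatt(concurrent_agents: int, system_power_kw: float) -> float:
    """Concurrent agentic tasks supported per megawatt of system power."""
    return concurrent_agents * 1000.0 / system_power_kw  # 1 MW = 1,000 kW


# Hypothetical inputs: 720 agents on a 72-accelerator system drawing 100 kW.
per_accel = agents_per_accelerator(concurrent_agents=720, accelerators=72)
per_mw = agents_per_megawatt(concurrent_agents=720, system_power_kw=100.0)
print(per_accel, per_mw)
```

Comparing two platforms on the per-megawatt figure, at the same service-level objective, is what yields a ratio such as "N times more agents per megawatt".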

NVIDIA Ecosystem Partners Harness Blackwell’s Leading Performance

Leading inference providers including Baseten, DeepInfra and Together AI are already serving agentic workloads on frontier models such as DeepSeek V4 Pro on NVIDIA Blackwell and powering production agentic applications today. 

Together AI powers real-time inference for Cursor, an AI-powered agentic coding platform, on NVIDIA Blackwell. Cursor’s agents debug issues, generate features and execute refactors while developers continue working.  

DeepInfra powers Pam.ai, an AI workforce platform for car dealerships, which deploys agents to book service appointments, handle calls and run outbound sales campaigns, entirely on NVIDIA Blackwell. 

As NVIDIA and the open source ecosystem continue to optimize inference software, performance and efficiency on agentic workloads will only improve. The NVIDIA Vera Rubin architecture is now in full production, bringing the next generation of infrastructure capacity to meet the growing demands of agentic AI at scale. 

Dive deeper into AgentPerf’s methodology and NVIDIA’s full-stack optimizations for agentic AI in this technical blog.