ANALYST INSIGHT: Google Cloud’s AI Hypercomputer at Next 2026: Real Co-Design, Targeted Reach
Patrick Moorhead · 2026-05-01 · via Moor Insights & Strategy
Patrick Moorhead interviews Mark Lohmeyer, Google Cloud’s head of AI and compute infrastructure, at the Google Cloud Next Analyst Summit in April 2026. (Credit: Six Five Media)

I just got back from Google Cloud Next 2026, and the line I keep coming back to in conversations is this: AI Hypercomputer is what a decade of Google’s co-design finally looks like, assembled on one stage. I attended the original public TPU reveal at Google I/O on May 18, 2016. The first TPU had been running quietly inside Google’s datacenters for a year before that. Ten years later, the long arc of development has produced the eighth-generation TPU lineup, the new Virgo Network, the Managed Lustre file system at 10 terabytes per second, the GKE Agent Sandbox, and Axion-powered N4A instances.

There can be no doubt now that Google can build infrastructure. Google Cloud exited Q4 2025 as the fastest-growing of the Big Three CSPs, with quarterly revenue up 48% year-over-year to $17.66 billion and a backlog of $240 billion. Alphabet has guided 2026 capex of $175 billion to $185 billion, primarily for technical infrastructure. The demand is there across Google’s consumer and commercial franchises. I wrote a year ago, after Next 2025, that Google was finally getting serious about infrastructure as a service. This year confirms it.

My core takeaway: AI Hypercomputer is no longer a marketing wrapper around a TPU launch. The TPU split into TPU 8t and TPU 8i, the new Virgo Network, Managed Lustre, GKE Agent Sandbox, and the now-GA Axion N4A instances represent a genuinely vertically integrated AI stack across compute, storage, networking, and orchestration. The question for enterprise buyers isn’t whether the stack is impressive. It is. The question is how much of it they actually get to use, on what terms, and where the real risk concentrations lie. I wrote in November after the Google Public Sector Summit that Google Cloud CEO Thomas Kurian was already telegraphing exactly this bifurcation, with two AI stacks for two different jobs. The TPU 8t and 8i split is what Google actually delivered.

What Hypercomputer Actually Is Now: Compute, Storage, Networking, Orchestration

On compute, TPU 8t is the training chip, with appropriately impressive stats at scale: 9,600 chips per superpod, two petabytes of shared high-bandwidth memory, 121 exaflops, 2.7x performance-per-dollar over Ironwood, native FP4 in the matrix units, and Axion Arm hosts. x86 processors still do some of the heavy lifting for agentic orchestration; I am assuming these come from Intel, given the strategic announcement between the two companies. With Google’s Pathways AI architecture and JAX library, a single logical training cluster now scales past one million TPUs across multiple datacenter sites.

I think TPU 8i for inference is the more interesting bet. Google’s new Boardfly topology was co-designed with DeepMind to optimize for latency, not bandwidth. That’s exactly the right call for agents and inference, where minimum time-to-response, not raw throughput, is the key customer need. MediaTek joined Broadcom as a confirmed silicon design partner for the eighth-gen program, with Marvell reportedly in talks to become a third partner in the future, per The Information. That’s genuine multi-partner co-design, not slideware.

As for storage, Managed Lustre now delivers 10 terabytes per second of bandwidth, which Google claims is a 10x improvement over last year and up to 20x faster than other hyperscalers. Storage was the silent bottleneck in agentic workloads, and Google has finally moved on it.
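To put the bandwidth figure in perspective, here is a back-of-the-envelope sketch in Python. It assumes decimal units (1 PB = 1,000 TB) and that the full 10 terabytes per second is sustained. Purely for illustration, it uses the two petabytes of shared high-bandwidth memory cited for a TPU 8t superpod as the payload. None of these assumptions comes from Google, and the 1 TB/s baseline is only inferred from the claimed 10x improvement.

```python
# Back-of-the-envelope transfer times; all inputs are illustrative assumptions.

TB_PER_PB = 1_000  # decimal units, assumed


def transfer_seconds(payload_pb: float, bandwidth_tb_per_s: float) -> float:
    """Seconds to move payload_pb petabytes at a sustained bandwidth in TB/s."""
    if bandwidth_tb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_pb * TB_PER_PB / bandwidth_tb_per_s


payload_pb = 2.0               # superpod HBM capacity cited in the article
current_bw = 10.0              # TB/s, Managed Lustre figure cited in the article
previous_bw = current_bw / 10  # inferred from the claimed 10x improvement

now = transfer_seconds(payload_pb, current_bw)
before = transfer_seconds(payload_pb, previous_bw)
print(f"at {current_bw:g} TB/s: {now:.0f} s")
print(f"at {previous_bw:g} TB/s: {before:.0f} s")
```

Under these assumptions the difference is a few minutes versus more than half an hour for a payload of that size, which is the kind of gap that shows up directly in how long accelerators sit idle waiting on data.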

For networking, Virgo connects 134,000 TPUs in a single fabric in one datacenter, and more than one million TPUs across multiple sites. For NVIDIA Vera Rubin NVL72 on Google Cloud’s A5X bare-metal instance, the same Virgo fabric supports up to 80,000 GPUs in a single datacenter and 960,000 across sites. In other words, customers who want NVIDIA don’t have to leave Google Cloud, which is the right answer for enterprises.

On orchestration, I think GKE Agent Sandbox with Axion-powered N4A is the under-covered piece of the AI Hypercomputer puzzle. Google claims 30% better price-performance for agent workloads compared to other hyperscalers. That’s a metric I want to see hold up under independent assessment, because orchestration is where compute, storage, and networking actually meet the agent. In my conversation with Mark Lohmeyer, Google Cloud’s head of AI and compute infrastructure, the through-line was the same: The stack only matters if it composes cleanly for agents. Also last week, Lohmeyer told the press that, uniquely within Google, the company is co-designing this offering across the full infrastructure stack. That co-design claim is the entire AI Hypercomputer thesis in one sentence.

We also shouldn’t gloss over the importance of “goodput”: not just the raw throughput attained, but the part of it that actually yields good output. Amin Vahdat, Google’s SVP and chief technologist for AI and infrastructure, has been making a point on stage about cluster reliability that almost nobody outside the operators thinks about. At 10,000-chip scale, fail-stop failures and silent data corruption quietly eat training throughput. Google claims that TPU 8t is engineered to target over 97% goodput. After 35 years in this industry, I’ll tell you that peak FLOPS is just a marketing number. Goodput is what determines whether your training cycles get wasted. As my colleague Matt Kimball wrote in his Ironwood analysis last year, Google’s TPU strategy was already tacking toward the age of inference. TPU 8t and 8i embody the architectural commitment to that direction, and Vahdat’s goodput framing explains the operating philosophy behind it.
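The goodput point lends itself to a quick calculation. The sketch below treats goodput as the fraction of chip time that ends up as useful training progress, which is a simplifying assumption of mine rather than Google's definition, and applies it to the 9,600-chip superpod figure over a hypothetical 30-day run. The 90% comparison point is also hypothetical.

```python
# Illustrative goodput arithmetic; the definition and the run length are assumptions.


def wasted_chip_hours(chips: int, hours: float, goodput: float) -> float:
    """Chip-hours that do not turn into useful training progress."""
    if not 0.0 <= goodput <= 1.0:
        raise ValueError("goodput must be between 0 and 1")
    return chips * hours * (1.0 - goodput)


chips = 9_600        # chips per TPU 8t superpod, per the article
run_hours = 30 * 24  # hypothetical 30-day training run

for goodput in (0.97, 0.90):
    lost = wasted_chip_hours(chips, run_hours, goodput)
    print(f"goodput {goodput:.0%}: about {lost:,.0f} chip-hours lost")
```

Under these assumptions, the gap between 97% and 90% goodput is worth several hundred thousand chip-hours on a single run, which no peak FLOPS figure would reveal.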

The Google-Versus-NVIDIA Framing Is Wrong

Now I want to talk about NVIDIA. The trope I keep seeing is that Google’s TPU is “taking on” NVIDIA. I don’t buy it, and I haven’t bought it. Saying Google is taking on NVIDIA’s chips is like saying Apple’s M-series chips are taking on Intel and AMD. Apple competes with Dell, HP, and Lenovo at the system level. Its chips don’t compete with Intel and AMD as merchant silicon. Similarly, there’s nothing standard about Google’s AI Hypercomputer. It’s dialed in after a decade-plus of work, it’s currently very proprietary, and it’s primarily built to serve Google’s own workloads, including Gemini, Search, YouTube, and Android.

Look at the strongest customer validation point Google has for this. The Anthropic deal was expanded earlier this month via a Google and Broadcom agreement to deliver multiple gigawatts of next-generation TPU capacity beginning in 2027. But we’re talking about a single customer with very specific characteristics, and other named TPU customers like Citadel Securities are likewise jumbo-scale consumers of compute. Meanwhile, most enterprises will engage AI Hypercomputer through the Gemini Enterprise managed front door, not through bare-metal TPU access.

So even if you wanted to compare TPU and NVIDIA head-to-head, you can’t yet. Right now, only Google knows for sure how it stacks up against NVIDIA’s chips. And before I weigh in on relative price-performance, I want to see credible third-party assessments across a wide variety of workloads, touching entire stacks, from more than one outlet. To be fair, Google has now exposed native PyTorch support for TPU via TorchTPU in preview, which is a clear signal it wants to reduce framework lock-in friction. Whether that’s enough to peel real workloads off of NVIDIA is an open question. The benchmarking gap is significant, to say the least, and the responsible analyst position is to wait for the answers, not declare a winner prematurely. I am digging into Prism, which Google tells me is a way for customers to compare TPU versus GPU, and the Signal65 team will be doing the same.

What This Means for Buyers

Based on what I’ve seen and heard, Google customer outcomes are starting to land. I heard a non-enterprise video enhancement company at the show describe running on Google Cloud and getting 40 to 50% cost savings, with inference times moving from 15 to 20 minutes down to under a minute. That’s the kind of workload-specific outcome that matters more than peak FLOPS claims, and the kind of customer story Google needs more of in public.

If you’re an Anthropic-class buyer with frontier-model workloads, AI Hypercomputer is now a genuine second source. Native PyTorch via TorchTPU removes a historical friction point. The economics will compete with NVIDIA on inference for specific shapes of workload, and the multi-gigawatt commitments from Anthropic suggest that Google’s supply story is real for buyers operating at that scale.

If you are a large enterprise, you will want to feel comfortable that you can jockey around between different AI accelerators, be it TPU, GPU, or some future chip that is still to be defined. Just because the industry has seemingly split the AI workflow into only training, prefill, and decode doesn’t mean there won’t be a fourth or fifth variant.

If you’re a mid-market enterprise, you’re getting most of the value through Gemini Enterprise and the Agent Platform, not through AI Hypercomputer in your tenancy. Both of these assessments can be true, based on the scope and needs of the customer. Don’t confuse one for the other when you’re sizing budgets and capabilities.

A note on the competitive context: AWS went the opposite direction with Trainium3 at re:Invent 2025, converging training and inference into a single SKU. Meanwhile, NVIDIA is scaling up within the rack with Vera Rubin NVL72, while handling prefill with CPX and decode with LPX. Google, as we’ve seen, is bifurcating across specialized silicon. Three different bets, based on three different theories of where the agentic workload actually lands. None of them is obviously right yet, and it’s possible that more than one of these bets could pay off. Because each processor does something specialized, the approach is a win for as long as the work to orchestrate across processors is less than the extra work those specialized processors get done.
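That closing condition can be written as a simple break-even check. The sketch below is my own simplification, not a model from Google, AWS, or NVIDIA. It assumes a job takes a known time on a general-purpose part, that specialized parts speed up the compute by a fixed factor, and that routing work between them adds a fixed orchestration cost. All numbers are hypothetical.

```python
# Break-even sketch for specialized silicon; every number here is hypothetical.


def specialization_wins(base_seconds: float, speedup: float,
                        orchestration_seconds: float) -> bool:
    """True if specialized chips plus orchestration beat the general-purpose run."""
    if base_seconds <= 0 or speedup <= 0 or orchestration_seconds < 0:
        raise ValueError("inputs must be positive (overhead may be zero)")
    specialized_total = base_seconds / speedup + orchestration_seconds
    return specialized_total < base_seconds


# Saving 50 s of compute while adding 20 s of orchestration: a win.
print(specialization_wins(100.0, 2.0, 20.0))
# Saving 50 s of compute while adding 60 s of orchestration: a loss.
print(specialization_wins(100.0, 2.0, 60.0))
```

The point of the toy model is only that the orchestration term sits on the same side of the ledger as the specialized compute, so a bigger speedup buys more room for orchestration overhead, and vice versa.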

AI Hypercomputer Is Real

Google Cloud has built what appears on the surface to be the most coherent vertically integrated AI stack outside of NVIDIA, and Thomas Kurian deserves credit for staying with the bet for six-plus years. The TPU split, Virgo, Managed Lustre, GKE Agent Sandbox, and the partnership with NVIDIA on Vera Rubin all reinforce a stack that’s genuinely co-designed rather than bolted together. Assuming it’s accurate, Google’s own claim that nearly 75% of its cloud customers are now using its AI products validates that demand is there.

What I’m watching for next: credible third-party benchmarks across full stacks, real production agent reliability metrics from more enterprise customers, and whether native PyTorch on TPU is enough to pull workloads away from GPUs. Until then, AI Hypercomputer is a real accomplishment for Google. Just don’t oversell what it means for the rest of the market.

Note: Google Cloud is a client of Moor Insights & Strategy.