Artificial intelligence is no longer a discrete technology category. It is becoming a foundational capability embedded across devices, enterprise workflows, and cloud infrastructure. While much of the public discourse continues to focus on large language models and training breakthroughs, AI’s real economic impact occurs at inference, where models are deployed, interacted with, and scaled in production. This distinction matters. Today, AI is primarily constrained not by algorithms or software innovation but by system design. Power availability, efficiency, latency, scalability, and deployability across heterogeneous environments now define what is practical. Consulting giant McKinsey reports that while the majority of enterprises are using AI to some extent, nearly two-thirds have not yet scaled AI organization-wide. This statistic underscores a widening gap between AI ambition and infrastructure readiness. This paper examines why that gap exists and why it is structural rather than transitional. It then explores how Arm has evolved from its mobile origins into a foundational compute platform for the AI era, spanning the edge, the physical world, enterprise environments, and hyperscale cloud infrastructure. Finally, it explains why the architectural principles behind Arm’s evolution, along with the emergence of purpose-built designs such as the Arm AGI CPU, align closely with AI’s long-term requirements.
Certain technologies do more than improve productivity; they reset economic and social constructs. The steam engine mechanized labor. Electrification accelerated mass production and urbanization. The internet digitized information and communication. AI has cemented its position as the next major technological shift with the potential to radically change both our personal and professional lives. In personal use, AI has moved beyond experimentation. It is already shaping how individuals interact with information, services, and digital systems, from personalized assistants and recommendation engines to real-time translation and automated content creation. Its influence extends into personalized medicine, where models help tailor treatments to individual patients; entertainment, where content discovery and creation are increasingly individualized; and consumer protection, where AI is used to detect fraud, monitor transactions, and identify risk in real time. Intelligence is becoming ambient, continuous, and increasingly expected. Professionally, AI is already reshaping how individuals interact with technology and how organizations operate. It simultaneously influences decision-making, productivity, and value creation across industries. Unlike prior technology waves that advanced vertically, AI is progressing horizontally, touching nearly every sector in parallel. Its effects are broad, concurrent, and cumulative. As with previous revolutions, progress is constrained by the enabling infrastructure. Industrialization required mechanical reliability and power. Digital transformation required global connectivity and scalable compute. AI requires something more demanding: compute that is efficient, scalable, and deployable across vastly different environments. AI is not a single revolution but several unfolding simultaneously. 
Personal AI in devices and vehicles, physical AI powering robots and other autonomous machines, professional AI embedded into enterprise workflows, and industrial-scale AI in the cloud each impose different constraints around power, latency, cost, and scale. In these environments, the durable advantage is driven less by the applications themselves and more by the platforms that make AI economically viable. Hybrid AI — distributing intelligence across the cloud and edge — has emerged as a preferred approach to running AI, tailored to each application’s needs for performance, latency, and cost.
Across personal and enterprise contexts, AI is already moving from experimentation into production. On the personal side, smartphones, wearables, vehicles, and PCs increasingly rely on continuous inference both on-device and in the cloud. In this context, the essential constraint is not capability but efficiency, because always-on intelligence must operate within tight power and thermal limits. Today’s personal devices deliver responsiveness and personalization while reducing network dependence and preserving privacy. Within enterprises, AI has moved beyond pilots into core workflows — customer engagement, IT operations, security, analytics, and document processing. However, adoption doesn’t yet equate to scale. According to the same McKinsey study cited earlier, more than 70% of organizations are now experimenting with or deploying generative AI, yet only a fraction report material impact at scale, citing infrastructure and operational limitations as primary barriers.
AI workloads stress systems in ways traditional enterprise applications never did. They are latency-sensitive, memory-intensive, and constrained by power and thermals. Indeed, Deloitte’s 2025 Infrastructure Survey found that power availability and infrastructure readiness now rank among the top constraints on AI deployment, ahead of model availability and software tooling. As models scale, persist, and interact with live data, data movement and orchestration increasingly dominate overall system behavior. In this context, platforms that fail to balance performance, efficiency, and flexibility across infrastructure components struggle to scale, regardless of model quality. This is more challenging because the AI boom has driven rapid specialization across silicon. GPUs power large-scale training and parallel compute, while neural processing units (NPUs) deliver efficient, low-latency inference in client and edge devices. But none of these accelerators operate in isolation. The CPU remains the crucial orchestrator of the system, coordinating specialized compute engines, managing workloads and memory, and enabling AI to function as a unified, scalable system from cloud to edge. The more specialized the silicon landscape becomes, the more critical the CPU’s role. That dynamic becomes even more pronounced as AI systems evolve to handle agentic behavior. Agent-based systems do more than generate output; they plan, retrieve data, call tools, execute multi-step workflows, and repeat these processes continuously. This increases orchestration overhead — scheduling, memory coordination, I/O handling, and control-plane management across GPUs, networking, and storage. As AI shifts from static inference toward persistent, interactive agentic systems, demand for system-level coordination rises with it. In practical terms, this further increases the architectural importance of the CPU even as accelerators capture attention. 
The Arm AGI CPU, discussed in detail later in this paper, is Arm’s direct response to this shift.
Arm’s relevance in the AI era is rooted in design principles established long before AI became mainstream. From its earliest designs, Arm emphasized performance-per-watt, modularity, and broad ecosystem enablement. These priorities were shaped by the constraints of mobile and embedded systems, where efficiency, scalability, and flexibility were mandatory. Those same principles now map directly to the realities of AI discussed just above. Edge inference, on-device intelligence, and sustainable cloud infrastructure all depend on architectures that scale efficiently across orders of magnitude, from milliwatts in embedded devices to megawatts in datacenters. This alignment is increasingly visible in the cloud. Hyperscalers have adopted Arm-based CPUs not as experiments, but as production platforms optimized for scale, efficiency, and cost control. AWS’s Graviton, Microsoft Azure’s Cobalt, and Google Cloud’s Axion processors all reflect a common conclusion: Performance-per-watt and architectural flexibility matter as much as peak throughput in modern cloud environments. The value to cloud service providers is backed by hard numbers. Over the last two years, more than 50% of AWS’s new instance deployments have been on Graviton, and Google recently revealed it has ported 30,000-plus production packages to Arm-based systems. This makes it clear that these platforms are not niche alternatives. They are deployed broadly across general-purpose workloads, data services, and, increasingly, AI-adjacent infrastructure where CPUs serve as control planes, inference engines, and orchestration layers. For hyperscalers operating at massive scale, gains in efficiency from Arm-based platforms translate directly into material improvements in utilization, power consumption, and total cost of ownership. Moor Insights & Strategy (MI&S) sees Arm’s significance as structural. 
Pervasive hyperscaler adoption validates that the same architectural DNA that enabled success in mobile and embedded devices scales upward without breaking economic or operational assumptions, reinforcing Arm’s position as a foundational compute architecture.
Arm’s role in AI spans the full continuum, from personal devices to autonomous machines to hyperscale infrastructure. In personal AI, on-device processing reduces latency, conserves bandwidth, and limits data exposure. Vision, speech, personalization, and lightweight models already execute predominantly on Arm-based CPUs across smartphones, wearables, and embedded systems. These workloads are persistent and invisible to users, which makes sustained efficiency far more important than peak performance. The use of advanced AI algorithms and on-device AI is also central to perception, prediction, and decision-making in automotive systems, with nearly every global OEM leveraging Arm technology in its vehicles today. These same principles apply to autonomous machines, such as robotics and industrial automation systems, with most systems relying on Arm-based processors for intelligence, control, perception, and system coordination. Automotive and autonomous environments combine real-time inference, sensor fusion, and tight power and thermal constraints, demanding architectures that can coordinate CPUs, NPUs, and accelerators reliably at the edge. Arm’s ability to support deterministic behavior, low-latency control loops, and efficient on-device intelligence makes it ideally suited to physical AI systems that are moving from pilots into scaled deployment. Looking forward, any new consumer device category that emerges — smart glasses, spatial computing platforms, ambient AI endpoints, and so on — will inherit the same constraints. Always-on inference, tight thermal envelopes, and limited battery capacity demand architectures built around efficiency and mature developer ecosystems. In client and professional environments — including AI-enabled PCs, edge servers, and enterprise endpoints — heterogeneous architectures have become the norm. 
As touched on earlier, CPUs coordinate workloads, NPUs handle continuous low-power inference, and GPUs are invoked selectively for higher-throughput tasks. As enterprises push AI closer to users and data sources, architectural balance and power discipline increasingly determine viability. This shift toward heterogeneous system design also lowers historical barriers to architectural diversity in environments that might previously have been dominated by x86. As operating systems and frameworks increasingly abstract the full system rather than working from a single instruction-set assumption, the platform becomes more modular. That modularity creates space for alternative CPU architectures optimized for efficiency and integration, with Arm at the head of the list. As we have already seen, the same dynamics apply at cloud scale. Arm-based cloud CPUs such as Graviton, Cobalt, and Axion now operate broadly across general-purpose and AI-adjacent workloads, where orchestration, control-plane intelligence, and inference efficiency matter as much as raw throughput. MI&S sees Arm’s ability to span these environments with a common architectural foundation — while allowing specialization at each layer — as a defining advantage.
These advantages extend to developer experience and ecosystem maturity, which can make or break AI deployments in the real world. Arm’s strategy emphasizes enablement rather than proprietary optimization, prioritizing collaboration with operating systems, toolchains, and frameworks instead of vertical integration. The scale of this ecosystem is material. Arm reports more than 22 million software developers worldwide working on Arm-based platforms, spanning edge, physical, and cloud environments. This breadth matters because AI workloads increasingly need to move across environments without being rewritten or re-architected. Just as important is where long-term capital investment is being directed. Graviton, Cobalt, and Axion represent sustained investments in Arm-based CPU roadmaps by some of the biggest, toughest-minded tech vendors in the world. Those investments align with where AI infrastructure is heading: persistent inference, heavier orchestration, and growing pressure on power availability. The ecosystem signal here is not speculative, but structural. Relative to x86, the distinction is less about displacement and more about growth vectors. x86 remains deeply entrenched in enterprise infrastructure, but Arm’s ecosystem has expanded rapidly into segments where power efficiency, customization, and scale economics are decisive advantages. Cloud adoption has accelerated developer familiarity with Arm, while AI-enabled client devices and edge deployments are pulling that familiarity downward into enterprise and industrial environments. In the automotive sector, Arm’s technology has long been the standard for most of the silicon within the vehicle and continues to see growth in areas like infotainment, advanced driver-assistance systems (ADAS), and autonomous systems. This only becomes more important as automotive silicon architectures move from clusters of microcontrollers to zonal architectures with higher compute. 
With this in mind, we see that Arm’s developer ecosystem helps drive future standards for the industry. All of this also creates a compounding effect. As more developers build and deploy AI workloads on Arm-based systems, the tooling, libraries, and frameworks mature in parallel. This yields a broadly portable ecosystem — one capable of supporting AI across devices, enterprises, and cloud infrastructure with minimal friction.
Arm’s differentiation is not limited to architecture; it is reinforced by a business model that prioritizes flexibility for its semiconductor and OEM partners. Unlike vertically integrated approaches that assume a single optimal path, Arm’s licensing model allows partners to align their level of architectural control, investment, and differentiation with their specific needs. Arm supports a range of engagement models, from standard core licenses that emphasize time-to-market and ecosystem compatibility to more advanced architectural licenses that enable deep customization. This flexibility allows organizations to decide where they want to differentiate: in performance-per-watt, security features, memory hierarchy, accelerators, or system integration. In the context of AI, where workloads vary so dramatically across environments, that optionality is strategically significant. The result is an ecosystem that allows participants to innovate in parallel. Device makers can prioritize power efficiency and integration. Enterprise and edge players can balance general-purpose compute with domain-specific acceleration. Hyperscalers can pursue custom silicon strategies optimized for their vast scale. Arm enables all of these paths without forcing convergence on a single design philosophy. This model also changes the economics of iteration. By decoupling architectural innovation from full-stack ownership, Arm enables faster design cycles and broader experimentation while maintaining a coherent software ecosystem. Arm Compute Subsystems (CSS) further lower the barrier to entry by providing validated building blocks that reduce integration risk and accelerate deployment without eliminating differentiation. In an AI market defined mostly by fragmentation, the combination of architectural consistency and business-model flexibility becomes a durable advantage. 
Arm is not prescribing how AI systems should be built; it is enabling a wide range of organizations to define and build them efficiently, at scale, and on their own terms.
We have seen that Arm’s technical relevance is grounded in efficiency, flexibility, and scale. From an investor’s perspective, the company’s crucial value is durability. The growth of AI favors optimizing for inference over training, distribution over centralization, and efficiency over brute-force scaling. As AI workloads proliferate across disparate environments, demand increasingly shifts toward architectures that can be tuned to specific power, cost, and performance envelopes rather than optimized for a single dominant workload. This changes the slope of value creation. Platforms tied to narrow deployment models or fixed assumptions about silicon design face rising marginal costs as AI diversifies across deployment scenarios and use cases. In contrast, Arm’s business model enables a broad ecosystem of partners to absorb that complexity through custom silicon strategies, accelerating innovation while preserving architectural coherence. This creates resilience across cycles, workloads, and end markets. Independent research supports this framing. A study conducted by MIT Sloan Management Review highlights how difficult it is to translate enterprise AI activity into measurable value at scale. The study found that only a small minority of organizations were achieving significant financial benefits from AI, but that stronger foundational capabilities (including AI infrastructure, talent, and strategy) materially improve the likelihood of realizing impact. For investors, the takeaway is not that Arm benefits from any single AI trend, but that its technology is foundational across all of them. Its exposure spans devices, enterprise endpoints, robotics, vehicles, and cloud infrastructure, while its licensing model allows value capture without requiring Arm to predict which AI architectures or deployment models ultimately dominate.
Arm’s new AGI CPU extends this thesis further, representing Arm’s first production silicon and an entirely new revenue vector — one that targets the fast-growing agentic AI infrastructure market directly, but without displacing the IP licensing and CSS businesses that underpin the company’s existing financial model. Throughout this paper, a consistent theme has emerged: As AI moves from discrete inference tasks to persistent, multi-step agentic workflows, the center of gravity in the datacenter is shifting. These systems are no longer defined by how fast they execute a model, but by how well they coordinate work across a distributed environment — functions like managing accelerators, orchestrating memory and storage, and maintaining continuous control-plane operations. This shift changes how enterprise IT teams need to think about AI infrastructure, particularly the role of the CPU. The Arm AGI CPU is a direct response to this shift. Its design reflects a view that the next phase of AI infrastructure will be defined less by raw compute performance and more by system-level coordination, predictability, and sustained parallel execution. Far from being displaced by accelerators, the CPU is being repositioned — and in some cases, elevated. From a product standpoint, the AGI CPU is a meaningful shift for Arm, because it is the first time the company is bringing production silicon to market. Built on a leading-edge process and based on the latest Neoverse CSS platform, the design focuses on high core density, strong memory bandwidth, and tight integration with accelerators and high-speed interconnects. And it does this within a power envelope that aligns with modern datacenters. What stands out is not just the specs, but the design philosophy. Each core runs a single thread, with no simultaneous multithreading. In a traditional server environment, that might look like a tradeoff. In an agentic one, it’s a better fit. 
Agentic systems depend on consistent, predictable performance across thousands of concurrent tasks, where thread-level variability can ripple across the entire system. When coordination is continuous and tightly coupled, latency spikes and scheduling contention don’t stay isolated; they propagate. This is why consistency is so important at scale: it governs how systems behave under load, how efficiently resources are used, and how much operational risk is introduced by variability. In this context, determinism is a requirement, and the AGI CPU is clearly designed with that in mind. The ecosystem already forming around the AGI CPU shows its appeal in the market. Meta co-developed the platform and is deploying it alongside its own MTIA accelerators. Other partners across cloud, enterprise software, networking, and AI infrastructure are already engaged. With systems coming from vendors like Lenovo and Supermicro, there are clear paths to deployment for both hyperscale and enterprise environments. The opportunity also extends well beyond organizations already building custom silicon. A much larger part of the market sees where agentic infrastructure is headed, but can’t take on the cost, complexity, and multi-year effort of building processors for themselves. For this group, the question isn’t whether to move to a new architecture but how to do it without taking on that overhead. That’s where the AGI CPU fits. It creates an additional path for enterprises and cloud operators to adopt a production-ready Arm platform that’s purpose-built for these workloads, but still deployable and supported out of the box. Considering the compressed timelines for implementing agentic AI at enterprise scale, that option is becoming increasingly relevant. In the bigger picture, the AGI CPU isn’t just a denser server processor; it’s designed to sit at the center of the coordination layer for agentic environments. 
And as these environments grow, they’re going to require more CPU capacity per gigawatt than has historically been planned. From an MI&S perspective, this points to a broader shift in how AI infrastructure is being designed — one where coordination and control increasingly define system performance. And that dynamic positions Arm to play a more central role.
AI is no longer an experimental layer that can be bolted onto existing infrastructure. It represents a structural shift encompassing traditional machine learning, generative models, agentic systems, and emerging forms of physical AI. Each of these increases the pressure to build environments marked by power efficiency, architectural diversity, and system-level coordination. As AI moves from centralized experimentation to pervasive deployment, the constraints become clear. Infrastructure decisions made today will determine which organizations can scale AI sustainably tomorrow. This places renewed emphasis on architectures that support heterogeneity, customization, and efficiency across devices, enterprises, and cloud environments. Arm’s evolution reflects these realities. Its architectural approach, licensing flexibility, and ecosystem enablement position the company not as a single-solution provider, but as a foundational platform upon which diverse AI strategies can be built. For technology leaders and investors alike, the implication is straightforward: In an AI era defined by distribution and specialization, durable advantage will accrue to platforms that make AI deployable everywhere, efficiently, coherently, and at scale. For more information on Arm and its AI strategy, please visit: www.arm.com.
Matt Kimball, Vice President and Principal Analyst, Datacenter Compute and Storage
Anshel Sag, Vice President and Principal Analyst, AR/VR/XR, 5G Mobility, PCs, Smartphones, Graphics
Patrick Moorhead, CEO, Founder and Chief Analyst at Moor Insights & Strategy
Contact us if you would like to discuss this report, and Moor Insights & Strategy will respond promptly.
This paper can be cited by accredited press and analysts but must be cited in-context, displaying author’s name, author’s title, and “Moor Insights & Strategy.” Non-press and non-analysts must receive prior written permission by Moor Insights & Strategy for any citations.
This document, including any supporting materials, is owned by Moor Insights & Strategy. This publication may not be reproduced, distributed, or shared in any form without Moor Insights & Strategy’s prior written permission.
Arm commissioned this paper. Moor Insights & Strategy provides research, analysis, advising, and consulting to many high-tech companies mentioned in this paper. No employees at the firm hold any equity positions with any companies cited in this document.
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions, and typographical errors. Moor Insights & Strategy disclaims all warranties as to the accuracy, completeness, or adequacy of such information and shall have no liability for errors, omissions, or inadequacies in such information. This document consists of the opinions of Moor Insights & Strategy and should not be construed as statements of fact. The opinions expressed herein are subject to change without notice.
Moor Insights & Strategy provides forecasts and forward-looking statements as directional indicators and not as precise predictions of future events. While our forecasts and forward-looking statements represent our current judgment on what the future holds, they are subject to risks and uncertainties that could cause actual results to differ materially. You are cautioned not to place undue reliance on these forecasts and forward-looking statements, which reflect our opinions only as of the date of publication for this document. Please keep in mind that we are not obligating ourselves to revise or publicly release the results of any revision to these forecasts and forward-looking statements in light of new information or future events.
© 2026 Moor Insights & Strategy. Company and product names are used for informational purposes only and may be trademarks of their respective owners.
Matt Kimball is a Moor Insights & Strategy senior datacenter analyst covering servers and storage. Matt’s 25-plus years of real-world experience in high tech spans hardware to software as a product manager, product marketer, engineer, and enterprise IT practitioner. This experience has led to a firm conviction that the success of an offering lies, of course, in a profitable, unique, and targeted offering, but most importantly in the ability to position and communicate it effectively to the target audience.
Anshel Sag is Moor Insights & Strategy’s in-house millennial with over 18 years of experience in the IT industry. Anshel has extensive experience working with consumers and enterprises across both B2B and B2C relationships, gaining empathy for and understanding of what users really want. Some of his earliest experience goes back to his childhood, when he started PC gaming at the ripe old age of 5, built his first PC at 11, and learned his first programming languages at 13.
Patrick Moorhead is the founder, CEO, and chief analyst of Moor Insights & Strategy. His big-picture view of technology is grounded in more than 20 years as an executive leading strategy, product management, product marketing, and corporate marketing functions at NCR, AT&T, Compaq, and AMD. He has shared his expertise in areas from silicon to infrastructure to enterprise SaaS and everything in-between in thousands of national broadcast appearances (CNBC, Yahoo Finance), articles (Forbes, CIO), research-based analyses, and podcast episodes. Today, he has 100+ CXO-level advisory clients and is often ranked the #1 technology industry analyst by ARInsights.