Building an enterprise AI factory is a complex endeavor that few organizations can tackle alone. The solution requires infrastructure capable of managing the massive compute workloads generated by AI training and inferencing, high-capacity, low-latency networking within data centers and out to the edge, and security to mitigate the risks that AI introduces.
Abhinav Joshi, leader of AI solutions and product marketing at Cisco, identifies three key challenges inherent in building enterprise AI infrastructure: deployment complexity, security vulnerabilities, and performance bottlenecks. Agentic AI, with its heavy reliance on inferencing, places greater demands on infrastructure across all three dimensions.
3 challenges in building enterprise AI factories
The deployment complexity challenge is driven by the need to quickly operationalize an AI infrastructure that fully integrates compute, networking, storage, security, and observability. A Kubernetes-based container management platform and a robust AI software toolchain are likewise essential to ensure the consistent development, testing, and deployment of containerized AI applications, Joshi says.
The second challenge is mitigating security vulnerabilities. “Many organizations lack integrated security measures to protect the AI models, frameworks, applications, and the supporting infrastructure throughout the stack,” Joshi says. Attackers can exploit vulnerabilities by manipulating large language models (LLMs) with malicious inputs, which can disrupt operations and extract sensitive information. As AI agents ingest diverse data and act independently, they introduce new attack surfaces, including prompt injection, model poisoning, and data leaks.
Performance, especially around networking, is the third challenge. Tasks such as pre-training, post-training, and fine-tuning AI models, along with retrieval-augmented generation (RAG) pipelines and inferencing (including reasoning and agentic workloads), all generate enormous amounts of network traffic. This creates severe bottlenecks across three critical communication paths: high-speed interconnects between graphics processing unit (GPU) servers, data throughput to storage layers, and real-time response delivery to end users.
Without high-performance network connections, GPUs may be underutilized and jobs may take longer to complete, affecting token economics. If bottlenecks reduce infrastructure utilization, organizations may pay more for every useful token generated. High-performance networking helps keep AI workloads moving efficiently as agents retrieve context, coordinate tools, and execute multi-step workflows.
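The link between utilization and token economics can be illustrated with a little arithmetic. The sketch below is only an illustration under simple assumptions: cost per token is modeled as the hourly GPU cost divided by the tokens actually produced in that hour, and every figure in it (the hourly cost, the peak throughput, the utilization levels) is hypothetical rather than taken from Cisco, NVIDIA, or any benchmark.

```python
def cost_per_million_tokens(gpu_hour_cost, peak_tokens_per_second, utilization):
    """Estimate dollars per million tokens for one GPU.

    All inputs are hypothetical illustration values:
      gpu_hour_cost          -- cost of running the GPU for one hour
      peak_tokens_per_second -- throughput if the GPU were never stalled
      utilization            -- fraction of time (0..1] spent doing useful work
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    # Tokens actually produced in an hour shrink in proportion to utilization.
    tokens_per_hour = peak_tokens_per_second * 3600 * utilization
    return gpu_hour_cost / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Hypothetical GPU: 4.00 per hour, 1,000 tokens/s at full utilization.
    for utilization in (0.9, 0.6, 0.45):
        cost = cost_per_million_tokens(4.00, 1000, utilization)
        print(f"utilization {utilization:.0%}: {cost:.2f} per million tokens")
```

In this simplified model the hourly cost is fixed, so halving utilization doubles the cost of every token. That is the effect the paragraph above describes when network bottlenecks leave GPUs waiting on data.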
Address all 3 issues at the same time
Cisco and NVIDIA jointly address these challenges with Cisco Secure AI Factory with NVIDIA, a modular reference design for rapid, core-to-edge AI adoption. The solution integrates high-performance compute, networking, and storage infrastructure with Kubernetes and AI software. Built-in security and observability support resilient AI operations across a variety of use cases, backed by an ecosystem of software providers and technology partners. The full stack is also pre-validated, which reduces deployment risk and accelerates time to value, particularly as enterprises move beyond pilots toward production-scale agentic AI deployments.
The design is modular and compliant with NVIDIA Enterprise Reference Architectures. It provides flexibility for users to choose the components that best meet their immediate needs, with the assurance that they can add capacity later.
Security is embedded at every layer of the full stack, including AI models, applications, and agents, to provide protection from the supply chain to runtime. This protection is delivered through Cisco products such as Cisco AI Defense, Cisco Hybrid Mesh Firewall, Cisco Isovalent Runtime Security, and Splunk Enterprise Security.
Tight solution integration also enables quicker response to critical exposures. Cisco’s Live Protect capability puts guardrails around AI jobs so that they can keep running despite a vulnerability, an important consideration given that jobs like model training can take days to complete.
Another challenge Cisco helps organizations overcome is a lack of in-house IT talent with AI experience. Enterprises can take advantage of professional services from Cisco and its channel partners.
At a recent Cisco Live event, Cisco announced new deployment automation software, Stack Automation by Quali. “It further reduces deployment time from a few days to a few hours for secure AI infrastructure,” Joshi says. “It will help both our own professional services teams and our customers who want to stand up environments on their own.”
Taken together, these offerings reduce the risk of deployment errors, accelerate time to value, and provide a foundation for deploying efficient, secure agents grounded in enterprise context.
As enterprises move from experimentation to production-scale agentic AI, success will depend on more than raw compute. Organizations will need AI factories that securely deliver valuable outcomes while operating efficiently at scale.
Learn how Cisco Secure AI Factory with NVIDIA helps you build a sound foundation for your enterprise AI projects.