Analysis. Cisco announced an Nvidia-based GPU server for AI workloads and a plug-and-play AI POD with “optional” storage, but Cisco is not included in Nvidia’s Enterprise Reference Architecture partner list.
Switchzilla has introduced a family of AI servers purpose-built for GPU-intensive AI workloads using Nvidia accelerated computing, plus AI PODs intended to simplify and de-risk AI infrastructure investments. The server portion is the UCS C885A M8 rack server, with Nvidia H100 and H200 Tensor Core GPUs and a BlueField-3 DPU to accelerate GPU access to data. The AI POD for Inferencing is a full-stack converged infrastructure design that includes servers, networking, and Nvidia’s AI Enterprise (NVAIE) software portfolio, but it does not actually specify storage for that data.
Cisco’s webpage states that the AI POD is a “CVD-based solution for edge inference, RAG, and large-scale inference,” which means it is not for AI training. CVD stands for Cisco Validated Designs, which are “comprehensive, rigorously tested guidelines that enable customers to effectively deploy and manage their IT infrastructure.”
The web page has an AI POD for Inferencing diagram showing the components, including the Accelerated Compute (server) element.
The Cisco AI Infrastructure POD for Inference has independent scalability at each layer of the infrastructure and is said to be ideal for datacenter or edge AI deployments. There are four configurations that vary in the amount of CPU and GPU resource in the POD. Regardless of configuration, they all include:
Cisco UCS X-Series Modular System
Cisco UCS X9508 Chassis
Cisco UCS X-Series M7 Compute Node
Note the M7 compute nodes: these are Cisco’s seventh-generation UCS systems. The new M8-generation GPU servers are not included and so are not part of this AI POD, and neither is Nvidia’s BlueField-3 SuperNIC/DPU.
This suggests that Cisco’s AI POD for Inferencing does not meet Nvidia’s Enterprise Reference Architecture (RA) requirements, which would explain why Cisco was not listed as an RA partner by Nvidia. The Enterprise RA announcement states: “Solutions based on Nvidia Enterprise RA are available from Nvidia’s global partners, including Dell Technologies, Hewlett Packard Enterprise, Lenovo, and Supermicro.”
We asked both Cisco and Nvidia whether Cisco is an Enterprise RA partner and whether the AI POD is an Enterprise RA-validated system. A Cisco spokesperson answered our questions.
Blocks and Files: Is Cisco AI POD part of the NVIDIA RA program? If not, why?
Cisco: Nvidia previously introduced reference architectures for cloud providers and hyperscalers, and its recent announcements extend the RA to enterprise deployments. The RA program is similar to Cisco Validated Designs. A key component of Nvidia’s RA is Spectrum-X Ethernet networking, which is not offered as part of Cisco’s AI POD. Additionally, the AI POD will offer a choice of GPU providers over time. Whether it’s PODs or RAs, customers rely on Cisco and Nvidia to support this effort with proven solutions that simplify our offerings and help them move faster.
Blocks and Files: Does AI POD include the latest UCS C885A M8 server?
Cisco: The UCS C885A M8 is not currently part of the AI POD, but it will be included in a future POD. The UCS C885A M8 was just announced at the Cisco Partner Summit and will begin shipping in December. At that point, Cisco will develop a validated design that will be used as the basis for an AI POD for training and large-scale inference. We are committed to this and will continue to be.
****
There is no storage component in the AI POD diagram above, even though the AI POD is described as a “pre-sized and configured bundle of infrastructure” that “takes the guesswork out of deploying AI inference solutions.”
Instead, Pure Storage and NetApp are shown as providing converged infrastructure (CI) components. The web page states: “Optional storage is also available through NetApp (FlexPod) and Pure Storage (FlashStack).”
I find this strange in two ways. First, AI inference clearly depends on potentially large amounts of data that have to be stored somewhere, yet the storage portion of the AI POD is “optional.” Second, making storage optional does little to “take the guesswork out of deploying AI inference solutions.”
Blocks and Files: Why is storage optional on AI POD?
Cisco: The AI POD introduced at Partner Summit is for inference and RAG use cases. Inference does not necessarily require large amounts of storage. To suit our customers’ needs, we wanted to make the storage component optional for this use case. Customers using the AI POD for RAG can add NetApp or Pure as part of a converged infrastructure stack (FlexPod, FlashStack) delivered through a meet-in-the-channel model. As future PODs address use cases that require more storage, we will work with our storage partners to fully integrate them.
****
FlexPod is itself an entire CI system, with more than 170 specific configurations combining Cisco servers (UCS), Cisco networking (Nexus and/or MDS), and NetApp storage. The storage can be ONTAP all-flash or hybrid arrays, or the StorageGRID object system.
Cisco’s AI POD design, which purports to be a complete CI stack for AI inference, should surely specify particular NetApp storage products rather than pointing to a NetApp offering (FlexPod) that is itself a CI stack.
Pure’s FlashStack, like FlexPod, is a complete CI stack with “over 25 pre-validated solutions to quickly deploy and support any application or workload.” It has “integrated storage, compute, and network layers.”
Again, Cisco’s AI POD design should specify not merely Pure’s complete FlashStack CI, but which Pure Storage products, FlashArray or FlashBlade, and which of their configurations, are valid components of an AI POD.
It might make more sense if there were a specific FlexPod for AI inference design or a FlashStack for AI inference design. That way, customers could at least get an integrated AI infrastructure from one supplier and its partners, instead of going to Cisco and then having to approach NetApp or Pure separately. Having an AI POD CI concept for inference refer out to the FlexPod and FlashStack CI systems is potentially confusing.