AI reasoning models and agents are set to transform industries, but delivering on their potential at scale requires massive compute and optimized software. The reasoning process involves multiple models and generates many additional tokens, so it demands infrastructure with high-speed communication, memory, and compute to ensure real-time, high-quality results.
To meet this demand, CoreWeave has launched NVIDIA GB200 NVL72-based instances, becoming the first cloud service provider to make the NVIDIA Blackwell platform generally available.
CoreWeave's NVIDIA GB200 NVL72 instances connect 72 NVIDIA Blackwell GPUs and 36 NVIDIA Grace CPUs over NVIDIA NVLink, scaling to up to 110,000 GPUs with NVIDIA Quantum-2 InfiniBand networking. These instances provide the scale and performance needed to build and deploy the next generation of AI reasoning models and agents.
CoreWeave's NVIDIA GB200 NVL72 instances
The NVIDIA GB200 NVL72 is a liquid-cooled, rack-scale solution with a 72-GPU NVLink domain, which enables the six dozen GPUs to act as one massive GPU.
NVIDIA Blackwell features many technological breakthroughs that accelerate inference token generation, boosting performance while reducing the cost of service. For example, fifth-generation NVLink delivers 130 TB/s of GPU bandwidth in a single 72-GPU NVLink domain, and the second-generation Transformer Engine enables FP4 for faster AI performance while maintaining high accuracy.
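As a back-of-envelope check on those figures, the sketch below derives the per-GPU NVLink bandwidth implied by the 130 TB/s domain total, along with the rack count implied by a 110,000-GPU cluster. The inputs come from the paragraphs above; the script itself is illustrative arithmetic only.

```python
# Back-of-envelope arithmetic for the figures quoted above (illustrative only).

GPUS_PER_NVLINK_DOMAIN = 72      # one GB200 NVL72 rack
DOMAIN_BANDWIDTH_TB_S = 130      # fifth-generation NVLink, per 72-GPU domain
MAX_CLUSTER_GPUS = 110_000       # Quantum-2 InfiniBand cluster ceiling

# Aggregate domain bandwidth divided across the rack's GPUs:
per_gpu_tb_s = DOMAIN_BANDWIDTH_TB_S / GPUS_PER_NVLINK_DOMAIN
print(f"~{per_gpu_tb_s:.1f} TB/s of NVLink bandwidth per GPU")   # ~1.8 TB/s

# Number of NVL72 racks implied by the maximum cluster size:
racks = -(-MAX_CLUSTER_GPUS // GPUS_PER_NVLINK_DOMAIN)           # ceiling division
print(f"~{racks:,} NVL72 racks at the 110,000-GPU maximum")      # ~1,528 racks
```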
CoreWeave's portfolio of managed cloud services is purpose-built for Blackwell. CoreWeave Kubernetes Service optimizes workload orchestration by exposing NVLink domain IDs, ensuring efficient scheduling within the same rack; a scheduling sketch follows below. Slurm on Kubernetes (SUNK) supports the topology block plug-in, enabling intelligent workload distribution across GB200 NVL72 racks. In addition, CoreWeave's observability platform provides real-time insight into NVLink performance, GPU utilization, and temperatures.
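To make the orchestration idea concrete, here is a minimal sketch using the official kubernetes Python client to co-locate pods within one NVLink domain. The nvlink-domain-id label key, the container image, and the GPU count are assumptions for illustration; CoreWeave's actual label schema may differ.

```python
# Minimal sketch: co-schedule pods onto nodes in the same NVLink domain.
# "nvlink-domain-id" is a hypothetical stand-in for whatever node label
# CoreWeave Kubernetes Service actually exposes for the NVLink domain.
from kubernetes import client, config

config.load_kube_config()  # uses your local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="reasoning-worker-0",
        labels={"app": "reasoning-worker"},
    ),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="worker",
                image="my-inference-image:latest",  # placeholder image
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "8"},
                ),
            )
        ],
        # Keep all "reasoning-worker" pods inside one 72-GPU NVLink domain by
        # using the (hypothetical) domain label as the affinity topology key.
        # The first pod matches its own selector, so it can land anywhere;
        # later pods are then pinned to the same domain.
        affinity=client.V1Affinity(
            pod_affinity=client.V1PodAffinity(
                required_during_scheduling_ignored_during_execution=[
                    client.V1PodAffinityTerm(
                        topology_key="nvlink-domain-id",
                        label_selector=client.V1LabelSelector(
                            match_labels={"app": "reasoning-worker"}
                        ),
                    )
                ]
            )
        ),
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```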
CoreWeave's GB200 NVL72 instances run on NVIDIA Quantum-2 InfiniBand networking, which delivers 400 Gb/s of bandwidth per GPU for clusters of up to 110,000 GPUs. NVIDIA BlueField-3 DPUs also provide these instances with accelerated multi-tenant cloud networking, high-performance data access, and GPU compute elasticity.
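Scaled out to that maximum, the per-GPU figure implies an aggregate injection bandwidth on the order computed below; this is illustrative arithmetic, not a measured or vendor-quoted number.

```python
# Illustrative arithmetic: aggregate InfiniBand injection bandwidth at full scale.
PER_GPU_GBITS_S = 400                                   # 400 Gb/s per GPU
MAX_CLUSTER_GPUS = 110_000

per_gpu_gbytes_s = PER_GPU_GBITS_S / 8                  # 50 GB/s per GPU
total_tb_s = per_gpu_gbytes_s * MAX_CLUSTER_GPUS / 1_000  # GB/s -> TB/s
print(f"~{total_tb_s:,.0f} TB/s aggregate injection bandwidth")  # ~5,500 TB/s
```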
Full-stack accelerated computing platform for enterprise AI
NVIDIA's full-stack AI platform pairs Blackwell-powered infrastructure with cutting-edge software to help enterprises build fast, accurate, and scalable AI agents.
NVIDIA Blueprints provides predefined, customizable, ready-to-deploy reference workflows to help developers create real-world applications. NVIDIA NIM is a set of easy-to-use microservices designed for the secure, reliable deployment of high-performance AI models for inference. NVIDIA NeMo includes tools for training, customization, and the continuous improvement of AI models for modern enterprise use cases. Enterprises can use NVIDIA Blueprints, NIM, and NeMo to build and fine-tune specialized AI agents.
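As an illustration of what calling one of these services looks like, the sketch below queries a NIM microservice through the OpenAI-compatible API that NIM exposes. The endpoint URL and model identifier are placeholders standing in for whatever model you actually deploy.

```python
# Minimal sketch: query a NIM microservice via its OpenAI-compatible API.
# The base_url and model name are placeholders for your own deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # your NIM endpoint
    api_key="not-used-for-local-nim",     # local NIM deployments often ignore this
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",   # example NIM model identifier
    messages=[
        {"role": "user", "content": "Summarize the GB200 NVL72 architecture."}
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```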
These software components, all part of the NVIDIA AI Enterprise software platform, are key enablers of delivering agentic AI at scale and can be readily deployed on CoreWeave.
Bringing next-generation AI to the cloud
The general availability of NVIDIA GB200 NVL72-based instances on CoreWeave underscores the companies' ongoing collaboration, focused on delivering the latest accelerated computing solutions to the cloud. With the launch of these instances, enterprises now have access to the scale and performance needed to power the next wave of AI reasoning models and agents.
Customers can launch GB200 NVL72-based instances through CoreWeave Kubernetes Service in the US-WEST-01 region using the gb200-4x instance ID; a launch sketch follows below. To get started, contact CoreWeave.
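For a concrete starting point, the sketch below pins a short smoke-test pod to gb200-4x capacity via a node selector. The node.coreweave.cloud/instance-type label key and the container image are hypothetical examples; consult CoreWeave's documentation for the actual selector and recommended images.

```python
# Minimal sketch: pin a smoke-test pod to GB200 NVL72 capacity in CKS.
# The "node.coreweave.cloud/instance-type" label key is a hypothetical
# example of an instance-type selector; CoreWeave's docs define the real one.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gb200-smoke-test"),
    spec=client.V1PodSpec(
        node_selector={"node.coreweave.cloud/instance-type": "gb200-4x"},
        containers=[
            client.V1Container(
                name="smoke-test",
                image="nvidia/cuda:12.4.1-base-ubuntu22.04",  # example CUDA base image
                command=["nvidia-smi"],                       # list visible GPUs, then exit
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "4"},
                ),
            )
        ],
        restart_policy="Never",
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```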