According to a TrendForce report citing Economic Daily News (EDN) and MoneyDJ, Nvidia is considering a socketed design for at least some of its upcoming Blackwell B300 GPUs for AI and HPC applications. The company is said to be using the new socketed design in a product codenamed GB300, but so far that information looks inconclusive, to say the least. Still, given that there is supply chain chatter, it’s at least worth considering.
MoneyDJ reports that, given AI GPU failure rates under high loads, motherboard replacement costs, and cooling issues, Nvidia and other AI GPU designers may consider using a socketed design for their next-generation GPUs.
EDN quoted CLSA analyst Chen Shuowen as saying that based on supply chain checks, Nvidia is designing GPU sockets for its own products, likely starting with the GB200 Ultra. Chen reportedly mentioned a 4-way Nvidia GPU design with one Nvidia CPU. Neither report mentions anything called GB300, so TrendForce likely added this part based on additional chatter.
There are a few things to note about the reports. The initial report appears inaccurate, because a socketed design adds to power and cooling challenges rather than solving them. The most power-hungry GPUs typically use BGA packages.
A 4-way Blackwell GPU design with a single-CPU motherboard doesn’t seem special, given that DGX servers pair an 8-way GPU baseboard with a 2-way CPU motherboard, though such a design looks plausible.
Nvidia’s data center nomenclature splits the company’s products into GPUs (A100, H100, B100/B200) and Grace CPU + GPU platforms (GH200, GB200). Currently, the GB200 platform uses BGA packages for both the CPU and GPU. It’s unclear whether anything would need to change, as there could simply be a B200 Ultra refresh, and specifically a GB200 Ultra refresh, sometime later this year.
Everyone loves standard CPU sockets because they make repairs and upgrades easy. In servers, however, they take up more space and impose tighter power and thermal constraints than BGA packages or SXM/OAM modules. Modules are serviceable too, but the process varies with the specific motherboard design, and removing an OAM/SXM module requires careful handling, so they are not as convenient as a socket.
There is one more thing to note. Currently, most Nvidia SXM modules are manufactured by Foxconn, because add-in cards and SXM and OAM modules are difficult and expensive to manufacture. Moving from cards or modules to sockets would reduce cost but limit performance.
The possibilities of Blackwell hardware
Before we move on to Blackwell-based data center products that are said to have socketed GPUs (GB300, GB200 Ultra, etc.), let’s recall the Blackwell-based data center GPUs that Nvidia has already introduced.
For now, Nvidia has officially announced the B200 GPU (1,000W+) in a BGA form factor, which is used in its GB200 boards (codenamed Bianca, with one Grace CPU and two Blackwell GPUs, and Ariel, with one Grace CPU and one Blackwell GPU). Nvidia also has Umbriel GPU boards that carry eight B200 (1,000W) or B100 (700W) GPUs in SXM module form factors. In addition, there will be a platform codenamed “Miranda” (with added performance, presumably at a higher TDP, plus PCIe 6.0 and 800G networking) and a platform codenamed “Oberon GB200.”
There are also Nvidia H100 and H200 add-in cards (based on the Hopper architecture), which have reduced performance to fit the typical power and thermal budgets of classic servers, but Nvidia has never announced add-in cards with Blackwell-based GPUs.
However, unofficial information indicates that Nvidia is preparing a product codenamed B200A based on a monolithic B102 processor with four HBM3E memory stacks, connected using TSMC’s CoWoS-S packaging technology. This contrasts with the dual-die B100/B200 design, which is packaged using TSMC’s CoWoS-L and connected to eight HBM3E memory stacks.
Given that the alleged B200A is a single-die product that isn’t designed to be a performance champion, it’s likely that the product will come in multiple form factors. These include SXM modules (particularly in the China-specific B20 form) and add-in cards. A socket? Perhaps. That remains to be seen. Intel offered a socketed Xeon CPU Max 9480 ‘Sapphire Rapids’ with HBM onboard, but it was not a success beyond select supercomputing deployments. Does Nvidia want to develop something similar? We’ll see.