SC24 Nvidia’s accelerators are among the most power-hungry in their class, yet its chips continue to top the Green500 ranking of the world’s most sustainable supercomputers.
Eight of the 10 most power-efficient systems on the twice-yearly list feature Nvidia components, including five powered by the GPU giant’s 1,000-watt Grace Hopper superchip (GH200).
The part, which combines a 72-core Grace CPU based on Arm’s Neoverse V2 design and 480 GB of LPDDR5x memory with an H100 GPU carrying 96 GB to 144 GB of HBM3 or HBM3e memory, has become extremely popular in the HPC community.
On the latest Green500 list, the chip powers both the first and second most efficient systems, EuroHPC’s JEDI and the Romeo HPC Center’s Romeo-2025 machines, which achieved 72.7 and 70.9 gigaFLOPS per watt respectively. That’s FP64, of course.
The two systems are nearly identical, built using Eviden’s BullSequana XH3000 platform and employing the same GH200 accelerators. Nvidia’s GH200 also takes fourth, sixth, and seventh place on the list with Isambard-AI Phase 1 (68.8 gigaFLOPS/watt), the Jupiter Exascale Transition Instrument (67.9 gigaFLOPS/watt), and Helios GPU (66.9 gigaFLOPS/watt).
The Jupiter Exascale Development Instrument … Image: Forschungszentrum Jülich / Ralf-Uwe Limbach
Meanwhile, Nvidia’s venerable H100 powers the fifth, eighth, and ninth most efficient machines: the Capella, Henri, and HoreKa-Teal systems.
Whether Nvidia can maintain its high ranking on the Green500 is doubtful. The company’s Grace-Blackwell superchips are already on the way in the form of the 2.7-kilowatt GB200 and the 5.4-kilowatt GB200 NVL4.
New products don’t always deliver more computing power per watt.
Between the A100 in 2020 and the H100 in 2022, FP64 performance jumped roughly 3.5x. Compared with the 1,200-watt Blackwell, however, the 700-watt H100 is actually faster at FP64 matrix operations. In fact, for FP64, the only improvement is in vector operations, where the upcoming chip boasts 32 percent higher performance.
So while Nvidia currently holds a high position on the Green500, AMD is not out of the game yet. In fact, the House of Zen’s MI300A accelerated processing unit took third place on the latest list with the Adastra 2 system.
For those unfamiliar, AMD’s MI300A was announced just under a year ago and fuses 24 CPU cores and six CDNA 3 GPU dies into a single APU, equipped with up to 128 GB of HBM3 memory and a configurable TDP of 550 to 760 watts. And, at least on paper, the part already boasts 1.8x the HPC performance of the H100.
Built by HPE Cray using the same EX255a blades found in the world’s most powerful publicly known supercomputers, Adastra 2 managed 69 gigaFLOPS/watt. And it’s not alone. The 10th most efficient machine is another MI300A-based system, Lawrence Livermore National Laboratory’s RZAdams, which managed 62.8 gigaFLOPS/watt.
Scaling up
All of the systems in the Green500’s top 10 far exceed the 50 gigaFLOPS/watt target needed to achieve an exaFLOPS of compute in a 20-megawatt envelope. However, maintaining these levels of efficiency at scale has proven quite difficult.
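The 50 gigaFLOPS/watt figure falls out of dividing one exaFLOPS by 20 megawatts. A minimal sketch of that arithmetic follows; the function and constant names are illustrative and not taken from the Green500 itself.

```python
def required_gflops_per_watt(target_flops: float, power_watts: float) -> float:
    """Efficiency (gigaFLOPS per watt) needed to reach target_flops within power_watts."""
    return target_flops / power_watts / 1e9


ONE_EXAFLOPS = 1e18      # 10^18 floating-point operations per second
POWER_ENVELOPE_W = 20e6  # 20 megawatts, expressed in watts

target = required_gflops_per_watt(ONE_EXAFLOPS, POWER_ENVELOPE_W)
print(f"Required efficiency: {target:.0f} gigaFLOPS/watt")
```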
Looking at the three most efficient machines on the Green500, they are all small. JEDI is rated for just 67 kilowatts. For comparison, the Swiss National Supercomputing Centre’s Alps machine, the most powerful GH200 system on the Top500, achieves 434 petaFLOPS in the HPL benchmark while consuming 7.1 megawatts, making it only the 14th most efficient machine at 61 gigaFLOPS per watt.
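As a rough check on the Alps figure, the sketch below converts the HPL score and power draw quoted here into gigaFLOPS per watt and compares the result with JEDI's 72.7. The inputs are the rounded numbers from this article, so the result is approximate, and the helper name is made up for illustration.

```python
def achieved_gflops_per_watt(hpl_petaflops: float, power_megawatts: float) -> float:
    """Convert an HPL score and a power draw into gigaFLOPS per watt."""
    flops = hpl_petaflops * 1e15
    watts = power_megawatts * 1e6
    return flops / watts / 1e9


alps = achieved_gflops_per_watt(434, 7.1)  # rounded figures quoted above
jedi = 72.7                                # JEDI's Green500 efficiency
shortfall = 1 - alps / jedi                # fraction by which Alps trails JEDI

print(f"Alps: {alps:.0f} gigaFLOPS/watt, {shortfall:.0%} below JEDI")
```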
It’s a similar story with the 37-kilowatt Adastra 2, which is even smaller than JEDI. If it could sustain 69 gigaFLOPS per watt at scale, it would need only about 25.2 megawatts to match El Capitan’s real-world performance of 1.742 exaFLOPS. In reality, El Capitan required nearly 29.6 megawatts of power to achieve its record-breaking run. ®
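The El Capitan comparison can be reproduced the same way. This sketch, using only the figures quoted above and hypothetical variable names, computes the power needed at Adastra 2's efficiency and the efficiency El Capitan actually achieved.

```python
def power_needed_megawatts(target_flops: float, gflops_per_watt: float) -> float:
    """Power (megawatts) needed to sustain target_flops at a given efficiency."""
    watts = target_flops / (gflops_per_watt * 1e9)
    return watts / 1e6


EL_CAPITAN_HPL_FLOPS = 1.742e18  # 1.742 exaFLOPS
EL_CAPITAN_POWER_MW = 29.6       # power draw during the record run
ADASTRA2_EFFICIENCY = 69.0       # gigaFLOPS per watt

# Power El Capitan would draw if it matched Adastra 2's efficiency
hypothetical_mw = power_needed_megawatts(EL_CAPITAN_HPL_FLOPS, ADASTRA2_EFFICIENCY)

# Efficiency El Capitan actually achieved, in gigaFLOPS per watt
actual_efficiency = EL_CAPITAN_HPL_FLOPS / (EL_CAPITAN_POWER_MW * 1e6) / 1e9

print(f"At 69 gigaFLOPS/watt: {hypothetical_mw:.1f} MW")
print(f"Actual efficiency: {actual_efficiency:.1f} gigaFLOPS/watt")
```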