Nvidia unveils its Hopper data center architecture
Nvidia’s GPU Technology Conference (GTC) is underway, and during CEO Jensen Huang’s keynote, details of Nvidia’s next-generation Hopper architecture were revealed. Although this is an AI and data-center-focused GPU, it gives us some clues as to what we can expect from Nvidia’s gaming-oriented Ada Lovelace GPU architecture, which is due out later in 2022.
The H100 is a major step up from the current flagship A100. The full GPU contains 80 billion transistors, 26 billion more than the A100, and is built on a custom TSMC 4nm process. It supports up to 80 GB of HBM3 memory, delivering up to 3 TB/s of bandwidth.
The H100 supports PCIe 5.0 and NVLink for connecting multiple GPUs together. It can deliver 2,000 TFLOPS of FP16 and 1,000 TFLOPS of TF32 performance, triple that of the A100. Hopper also introduces a new instruction set called DPX, designed to accelerate performance in areas as diverse as disease diagnosis, quantum simulation, graph analysis, and routing optimization.
The full H100 GPU features 18,432 CUDA cores and 576 Tensor cores, compared to the A100’s 8,192 and 512 respectively, although not all cores are unlocked at this time, presumably to maximize yields. The base clocks are also not finalized. Despite being made on such an advanced node, the SXM version of the H100 comes with a TDP of 700W. That’s right, seven hundred watts.
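For a rough sense of the generational jump, here is a quick back-of-the-envelope comparison using only the figures quoted above (the A100 transistor count is inferred from the article’s “26 billion fewer” claim); real-world performance depends on clocks, memory, and workload, so treat these ratios as a sketch, not a benchmark:

```python
# Per-GPU figures as quoted in the article (full dies, not shipping configs).
specs = {
    "A100": {"transistors_b": 54, "cuda_cores": 8192, "tensor_cores": 512},
    "H100": {"transistors_b": 80, "cuda_cores": 18432, "tensor_cores": 576},
}

def ratio(metric):
    """H100-to-A100 ratio for a given spec line."""
    return specs["H100"][metric] / specs["A100"][metric]

for metric in ("transistors_b", "cuda_cores", "tensor_cores"):
    print(f"{metric}: {ratio(metric):.2f}x")
```

The raw CUDA core count more than doubles (2.25x) while Tensor cores grow only modestly, which hints that much of Hopper’s claimed throughput gain comes from architectural changes rather than core count alone.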
The H100 is poised to be a monster of a card, but is it relevant for PC gamers? The answer is: sort of. The H100 is all about compute performance, not graphics, but we can use some of this information to predict what the gaming variants might look like.
The move to a custom TSMC 4nm node is a major step up from the 8nm Samsung process used for the RTX 30-series, and it is likely to be used for RTX 40-series cards as well. Also worth noting is support for PCIe 5.0. While by itself it shouldn’t offer any real performance advantage over PCIe 4.0, it may well do over PCIe 3.0, which is still widely used in many gaming systems.
But perhaps the biggest nugget of all is the rather astonishing 700W TDP of the high-end configuration. Just look at the VRM of this card! 700W is manageable for a data center product, but if we get something like that for a flagship RTX 4090, we’d be shocked. Unfortunately, rumors of sharp increases in power consumption continue to surface. Even 500W would be a leap, and it means quad-slot graphics cards could become the norm, at the high end of the market anyway.
Nvidia is still working on the H100, but if the RTX 40 series shares its key characteristics, it’s fair to say that the high-end cards will be hot and power-hungry, yet packed with tech and much faster than the RTX 3090 (and the upcoming RTX 3090 Ti). AMD will be competing with its RDNA3-based cards, and it’s shaping up to be one hell of a battle, with performance clearly a priority for both companies at the expense of power efficiency. We can’t wait!