At CES 2025, Digital Foundry interviewed Bryan Catanzaro, Vice President of Applied Deep Learning Research, about the announcement of NVIDIA DLSS 4 and the many improvements to super-resolution, ray reconstruction, and frame generation.
Catanzaro talked about the new transformer models that replace CNNs (convolutional neural networks) for super-resolution and ray reconstruction. They are simply much smarter and can be trained on larger datasets, resulting in better choices and improving on historical NVIDIA DLSS shortcomings such as flickering and ghosting. For example, the new super-resolution model uses four times the compute of the previous model. Catanzaro didn’t give an estimate of how much longer rendering takes, but he said he believed this is the best way to play on the new Blackwell-powered GeForce RTX 50 graphics cards, which will be available later this month.
Frame generation has also been overhauled, moving away from the previous model, which relied on the optical flow hardware accelerator, to a fully AI-powered solution. Here’s why NVIDIA made the change:
When we built NVIDIA DLSS 3 Frame Generation, we absolutely needed hardware acceleration to calculate optical flow. We didn’t have enough Tensor Cores, and we didn’t have a good enough optical flow algorithm. We hadn’t developed a real-time optical flow algorithm running on Tensor Cores that would fit our computing budget. We did have the optical flow accelerator, which NVIDIA has been building for years as an evolution of its video encoder technology and which has also been part of in-vehicle computer vision acceleration for self-driving cars.
It made sense to use it for NVIDIA DLSS 3 Frame Generation. However, the difficulty with hardware implementations of algorithms like optical flow is that they are very difficult to improve upon. There were failures that arose from the hardware optical flow, and we couldn’t undo them with a smarter neural network until we decided to replace it and move to a completely AI-based solution, which is what we have done for frame generation in DLSS 4.
Although the new frame generation model is heavier on Tensor Cores, it uses less VRAM and provides better image quality, which Catanzaro says is especially important for the new multi-frame generation (available on the new RTX 50 GPUs). It is also more efficient in that mode, because its cost is amortized over multiple frames.
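To illustrate the amortization argument, here is a back-of-the-envelope sketch in Python. The cost split (a shared cost paid once per rendered frame plus a smaller cost per generated frame) and every number in it are assumptions made for illustration only; NVIDIA has not published such a breakdown.

```python
def amortized_cost_ms(shared_ms: float, per_frame_ms: float, generated: int) -> float:
    """Average model cost per generated frame.

    shared_ms:    hypothetical cost paid once per rendered frame
    per_frame_ms: hypothetical extra cost for each generated frame
    generated:    number of frames generated per rendered frame
    """
    if generated < 1:
        raise ValueError("generated must be at least 1")
    # The shared cost is spread across every generated frame.
    return shared_ms / generated + per_frame_ms


if __name__ == "__main__":
    # Made-up figures: 1.5 ms shared, 0.5 ms per generated frame.
    for n in (1, 2, 3):
        cost = amortized_cost_ms(1.5, 0.5, n)
        print(f"{n} generated frame(s): {cost:.2f} ms each")
```

Under these made-up figures, the average cost per generated frame falls as more frames are generated from each rendered frame, which is the general shape of the efficiency gain described above.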
DF’s Alex Battaglia then asked whether the new model could be ported to older hardware such as the GeForce RTX 30 series, and the head of NVIDIA DLSS didn’t close the door:
I think this is primarily an optimization problem, an engineering problem, and ultimately a user experience problem. We are launching this frame generation, the best multi-frame generation technology, with the 50 series. We’ll see what we can squeeze out of older hardware in the future.
As a reminder, when NVIDIA introduced frame generation with its GeForce RTX 40 graphics cards, Catanzaro himself explained that the feature was exclusive to the then-new GPUs because they featured significantly improved optical flow hardware acceleration compared to the RTX 30 series. At the time, he also said that while it was theoretically possible to port it to older hardware, it probably wouldn’t be very useful.
The new model no longer relies on the optical flow hardware accelerator, so the door seems open for that to happen. However, Catanzaro also said that the Tensor Core requirements are higher, and Tensor Core performance is clearly lower on older GPU architectures. Let’s see if NVIDIA can really make it happen.
Elsewhere in the interview, Bryan Catanzaro emphasized the importance of the updated flip metering, which is now decoupled from the CPU and reduces frame time variation by a factor of 5 to 10 compared to before (thus improving frame pacing). Last but not least, he insisted that playing games with Reflex 2 (which is also AI-based) feels more “connected” and that lag-sensitive gamers in particular will love it.
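For readers wondering what a 5 to 10 times reduction in frame time variation might look like, the sketch below treats variation as the standard deviation of the intervals between displayed frames. That metric is an assumption, since the interview does not say how NVIDIA measures it, and the timestamps are invented for illustration.

```python
from statistics import pstdev


def frame_time_variation(timestamps_ms: list[float]) -> float:
    """Standard deviation of the intervals between consecutive frames."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two timestamps")
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return pstdev(intervals)


if __name__ == "__main__":
    # Invented presentation times (ms) for the same average frame rate.
    uneven = [0.0, 6.0, 16.0, 22.0, 32.0]   # intervals alternate 6 / 10 ms
    smooth = [0.0, 7.8, 16.0, 23.8, 32.0]   # intervals alternate 7.8 / 8.2 ms
    print(f"uneven pacing: {frame_time_variation(uneven):.2f} ms")
    print(f"smooth pacing: {frame_time_variation(smooth):.2f} ms")
```

Both sequences average 8 ms per frame, yet the second has roughly one tenth of the interval spread, which is why pacing can improve even when the average frame rate stays the same.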
Stay tuned for more coverage of NVIDIA DLSS 4 on Wccftech as we get closer to the launch of the new GeForce RTX 50 graphics cards.