A critical security flaw has been revealed in the NVIDIA Container Toolkit that, if successfully exploited, could allow a threat actor to breach the container perimeter and gain full access to the underlying host.
This vulnerability is tracked as CVE-2024-0132 and has a CVSS score of 9.0 out of a maximum of 10.0. This issue is addressed in NVIDIA Container Toolkit version v1.16.2 and NVIDIA GPU Operator version 24.6.2.
“NVIDIA Container Toolkit 1.16.1 and earlier contains a Time-of-Check Time-of-Use (TOCTOU) vulnerability when used in its default configuration, which allows a specially crafted container image to gain access to the host file system,” NVIDIA said in an advisory.
“Successful exploitation of this vulnerability could lead to code execution, denial of service, elevation of privilege, information disclosure, and data modification.”
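Because the specifics of CVE-2024-0132 have not been published, the following is only a generic sketch of what a TOCTOU bug looks like, not a reconstruction of the NVIDIA flaw. It assumes a Unix-like system, and every name in it (`read_racy`, `read_nofollow`, the `window` hook that stands in for the attacker's timing, the file names) is hypothetical. The first function validates a path and then reopens it by name, which leaves a gap in which the path can be swapped for a symbolic link; the second asks the kernel to refuse symbolic links at the moment of opening, so there is no separate check to race against.

```python
import os
import tempfile


def is_inside(path, allowed_dir):
    """Return True if path currently resolves to a location under allowed_dir."""
    real = os.path.realpath(path)
    root = os.path.realpath(allowed_dir)
    return os.path.commonpath([real, root]) == root


def read_racy(path, allowed_dir, window=lambda: None):
    """Check the path, then open it by name: the classic TOCTOU shape."""
    if not is_inside(path, allowed_dir):   # time of check
        raise PermissionError(path)
    window()                               # hypothetical hook: the race window
    with open(path) as f:                  # time of use: the path is resolved again
        return f.read()


def read_nofollow(name, allowed_dir, window=lambda: None):
    """Open relative to the directory and refuse a symlink in one step."""
    if os.path.basename(name) != name:
        raise ValueError("expected a bare file name")
    window()                               # a swap before the open is rejected below
    dir_fd = os.open(allowed_dir, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # O_NOFOLLOW makes the open itself fail if the entry is a symlink.
        fd = os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
    finally:
        os.close(dir_fd)
    with os.fdopen(fd) as f:
        return f.read()


def demo():
    """Run both readers while an 'attacker' swaps the file for a symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        allowed = os.path.join(tmp, "allowed")
        os.mkdir(allowed)
        secret = os.path.join(tmp, "secret.txt")
        entry = os.path.join(allowed, "data.txt")
        with open(secret, "w") as f:
            f.write("host secret")

        def reset():
            # Remove the entry itself (never its target) and recreate it.
            if os.path.lexists(entry):
                os.remove(entry)
            with open(entry, "w") as f:
                f.write("harmless")

        def swap():
            # Replace the harmless file with a link that points outside.
            os.remove(entry)
            os.symlink(secret, entry)

        reset()
        leaked = read_racy(entry, allowed, swap)
        reset()
        try:
            read_nofollow("data.txt", allowed, swap)
            blocked = False
        except OSError:
            blocked = True
        return leaked, blocked


if __name__ == "__main__":
    print(demo())
```

In this sketch the racy reader returns the contents of the file outside the allowed directory, while the second reader fails with an `OSError` once the entry has been swapped. Real mitigations are usually more involved, since `O_NOFOLLOW` only guards the final path component.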
This issue affects all versions of NVIDIA Container Toolkit up to and including v1.16.1 and NVIDIA GPU Operator up to and including 24.6.1. However, it does not affect use cases where Container Device Interface (CDI) is used.
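The affected and fixed version numbers above can be turned into a simple check. The snippet below is a minimal sketch under stated assumptions: the component keys, `parse_version`, `is_affected`, and the `uses_cdi` flag are hypothetical names, not part of any NVIDIA tool, and the parser only looks at the leading `major.minor.patch` numbers, so pre-release suffixes are ignored.

```python
import re

# First fixed releases as reported in the article.
FIXED = {
    "container-toolkit": (1, 16, 2),
    "gpu-operator": (24, 6, 2),
}


def parse_version(text):
    """Parse 'v1.16.1' or '24.6.1' into a tuple of integers."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(component, version, uses_cdi=False):
    """Return True if the version predates the fix for CVE-2024-0132.

    uses_cdi reflects the article's statement that deployments using the
    Container Device Interface are not affected.
    """
    if uses_cdi:
        return False
    return parse_version(version) < FIXED[component]


if __name__ == "__main__":
    for component, version in [
        ("container-toolkit", "v1.16.1"),
        ("container-toolkit", "v1.16.2"),
        ("gpu-operator", "24.6.1"),
        ("gpu-operator", "24.6.2"),
    ]:
        print(component, version, is_affected(component, version))
```

How the installed version is obtained is left out on purpose, since that depends on the packaging and deployment method in use.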
Cloud security company Wiz, which discovered and reported the flaw to NVIDIA on September 1, 2024, said that an attacker who controls a container image run by the Toolkit could perform a container escape and gain full access to the underlying host.
In a hypothetical attack scenario, an attacker could weaponize this shortcoming by creating a malicious container image that, when run directly or indirectly on a target platform, grants them full access to the host file system.
This could take the form of a supply chain attack in which victims are tricked into running a malicious image, or it could occur through a service that offers shared GPU resources.
“This access allows the attacker to access the Container Runtime Unix socket (docker.sock/containerd.sock),” said security researchers Shir Tamari, Ronen Shustin, and Andres Riancho.
“These sockets can be used to execute arbitrary commands on the host system with root privileges, effectively giving you control of the machine.”
This issue poses a particularly serious risk in orchestrated multi-tenant environments, because an attacker could escape the container and access data and secrets belonging to other applications running on the same node or even the same cluster.
The technical details of the attack are being withheld at this stage to prevent exploitation. Users are strongly advised to apply the patches to protect against potential threats.
“While the hype around AI security risks tends to focus on futuristic AI-based attacks, ‘old school’ infrastructure vulnerabilities in the ever-growing AI technology stack remain a priority for security teams. This is an urgent risk that must be protected against,” the researchers said.