Julien DUMUR
Infrastructure in a Nutshell
Team Leader, Nutanix Technology Champion, Nutanix NTC Storyteller

Tags: GPU, Nutanix, AHV, Linux

Integrating graphics processing power within virtualized environments has become a must. Whether it’s to run Artificial Intelligence models, Machine Learning, or simply for intensive video processing, our virtual machines increasingly need muscle.

When I talk with clients, I often get questions about this: how do you assign a physical graphics card to a VM in a simple and performant way?

Today, I suggest we look together at how to deploy an NVIDIA Tesla P4 GPU on an Ubuntu Server 24.04 VM hosted on Nutanix AHV, using “Passthrough” mode.

1. Prerequisites

Before getting our hands dirty, let’s take a moment to check our equipment. Good preparation is half the work done! I myself have lost hours in the past due to a simple forgotten prerequisite.

To follow this tutorial, you will need:

  • A Nutanix node (physical cluster) equipped with at least one NVIDIA Tesla P4 card.
  • A virtual machine running Ubuntu Server 24.04.
  • Functional SSH access to this VM with sudo privileges.

Although Nutanix AHV handles this transparently for you, keep in mind that Passthrough mode relies on specific hardware instructions. It requires that I/O virtualization extensions (VT-d for Intel or AMD-Vi for AMD) are properly enabled in the BIOS of your physical node. If you ever build a “home-lab” cluster, this is the first thing to check!
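If you do want to verify this yourself on a home-lab node, a quick sanity check is possible from any Linux shell on the host. This sketch relies only on generic Linux sysfs paths (nothing Nutanix-specific), so treat it as an illustration rather than an official AHV procedure:

```shell
# On the host (or any Linux box), check whether the kernel exposes an IOMMU.
# /sys/class/iommu is populated only when VT-d / AMD-Vi is enabled and in use.
if ls /sys/class/iommu/ 2>/dev/null | grep -q .; then
    echo "IOMMU active: VT-d / AMD-Vi is enabled"
else
    echo "No IOMMU detected: check VT-d / AMD-Vi in the BIOS"
fi
```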

2. Nutanix Configuration: Passthrough Mode

Now that our foundation is solid, let’s move on to the administration interface. This is where the magic happens. Whether you use Prism Element or Prism Central, the logic remains the same. First, make sure your virtual machine is powered off.

Go to your VM’s settings, select “Update”, and scroll down to the “GPUs” section. Click on “Add GPU”.

In the window that opens, the choice is crucial: in the “GPU Type” drop-down menu, select Passthrough mode, then choose your Tesla P4 from the list.

Passthrough mode is a special feature: it allows you to “hand over the keys” of the physical graphics card directly to the virtual machine. The guest OS has the illusion (and the benefits) of physically owning the card.

You might be wondering why we prefer Passthrough over vGPU. It’s a matter of use case, but also of architecture.

vGPU allows you to virtually slice a card to share it among several VMs, which is great for VDI, but it requires the installation and maintenance of an NVIDIA license server (vGPU Software).

Passthrough, on the other hand, dedicates 100% of the Tesla P4’s power to our Ubuntu VM, without any additional license server. For a raw single-VM performance need, it’s clearly the best option.

3. Preparation and Ubuntu 24.04 Update

Once the GPU is attached, save the configuration, power on your VM, and connect via SSH.
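Before going any further, it can be reassuring to confirm that the VM actually sees the card. This is a standard PCI check with `lspci` (from the pciutils package, usually preinstalled on Ubuntu Server), not a Nutanix-specific step:

```shell
# Inside the VM, confirm the passed-through card is visible on the PCI bus
# before installing any driver.
lspci -nn 2>/dev/null | grep -i nvidia || echo "No NVIDIA device visible: check the GPU assignment"
```

If the Tesla P4 was attached correctly, you should see an NVIDIA 3D controller entry in the output.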

Before we rush into installing the NVIDIA drivers, there’s one step I absolutely never skip: updating the system.

Simply type this command:

sudo apt update && sudo apt upgrade -y

Proprietary NVIDIA drivers rely on the DKMS (Dynamic Kernel Module Support) system to compile kernel modules on the fly during installation.

If your kernel headers are not in sync with your running kernel version, the module build will fail. A freshly updated system is your best guarantee of a clean, hitch-free compilation!
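You can check this header/kernel match explicitly before installing anything. The paths below are standard Linux conventions (DKMS builds against `/lib/modules/$(uname -r)/build`); this is an optional sanity check, not a step from the official driver documentation:

```shell
# DKMS builds against the headers of the running kernel; verify they are
# present before installing the NVIDIA package.
KVER="$(uname -r)"
if [ -d "/lib/modules/$KVER/build" ]; then
    echo "Headers present for kernel $KVER"
else
    echo "Missing headers: run 'sudo apt install linux-headers-$KVER'"
fi
```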

4. NVIDIA Drivers Installation (Server Branch)

Now that our system is clean and updated, let’s move on to the main course. On Ubuntu, we often have the reflex to use the ubuntu-drivers autoinstall command or install the latest trendy “desktop” version. Let me stop you right there!

For a server, especially in production, stability is key. That’s why we are going to install the “Server” branch of the driver. Type the following command:

sudo apt install nvidia-driver-535-server -y

💡 The Expert’s Tip: Why specifically the “server” package? NVIDIA maintains specific driver branches for data centers. The “Server” branch (or Tesla driver) is designed for long lifecycles (LTS) and minimizes the risk of regression. Installing a “Desktop” driver on a hypervisor or an AI server means risking a minor update breaking your production environment on a Friday at 5 PM!

Let the installation finish (this may take a few minutes while DKMS compiles the module for your kernel). Once completed, a reboot is mandatory to properly load the driver:

sudo reboot
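If you want to double-check the compilation just before (or right after) the reboot, `dkms status` is the standard DKMS command for listing module states; this check is my own habit, not part of the NVIDIA installer's instructions:

```shell
# Confirm that DKMS registered and built the nvidia module for the running
# kernel: the state reported for it should read "installed".
dkms status 2>/dev/null | grep -i nvidia || echo "No nvidia DKMS module registered (is the driver installed?)"
```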

5. Validation

After the reboot, reconnect via SSH to your virtual machine. To verify that the OS, the driver, and the hardware are communicating perfectly, NVIDIA provides us with an essential command-line tool:

nvidia-smi

If all went well, you should see a beautiful ASCII dashboard appear.

💡 Explanation: At the top right, note the CUDA Version (here 12.2): this is the highest version of the CUDA API supported by this driver, crucial information for AI developers. Also look at the Perf column: it shows “P0”, the maximum performance state (P-states range from P0, full performance, down to P12, deepest power saving). If your card stays stuck in a low P-state while under load, you have a hardware or thermal issue! Finally, this output confirms that the OS is ready to host the NVIDIA Container Toolkit (for GPU-enabled Docker containers).
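Beyond the dashboard view, `nvidia-smi` also offers machine-readable queries that are very handy in monitoring scripts. The fields below are standard `--query-gpu` options of the tool; pick the ones relevant to your own monitoring:

```shell
# Query selected GPU properties as CSV instead of the ASCII dashboard.
nvidia-smi --query-gpu=name,driver_version,pstate,utilization.gpu --format=csv 2>/dev/null \
    || echo "nvidia-smi unavailable: is the driver loaded?"
```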

Conclusion

And there you have it! In a few simple steps, we successfully presented our physical NVIDIA Tesla P4 GPU to our Ubuntu 24.04 VM under Nutanix AHV. Passthrough mode gave us a high-performance, near-native configuration without the overhead of an external license server.

Our Tesla P4 is now properly installed! The next logical step? Deploying LLMs (Large Language Models) or compute-heavy applications in isolated containers. But that will be for a future article on the blog!
