NVidia GPU Passthrough to LXC on Proxmox
Configuring NVidia GPU passthrough to a Linux Container (LXC) on Proxmox can greatly enhance the performance of hardware-intensive workloads. With GPU passthrough, the LXC will be able to utilize the full power of the GPU, providing hardware acceleration for demanding tasks such as:
- Running Plex media server with hardware-based decoding and encoding support, resulting in smoother and higher-quality video playback.
- Machine learning with CUDA, allowing you to train and run deep learning models more efficiently.
- High-performance video editing, where the GPU can be used to perform demanding tasks such as video rendering and encoding.
- Scientific simulations, which can take advantage of the GPU's parallel processing capabilities to perform complex calculations and simulations.
It's important to note that GPU passthrough requires hardware that supports this technology, and that the host and LXC must be running the same version of the Nvidia driver.
Updates
- 2024-01-02: Updated lxc configuration to reflect changes to cgroups.
Driver Installation - Host
Download the latest driver from the Nvidia website (https://www.nvidia.com/en-us/drivers/unix/) and take note of the version number.
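The installer builds a kernel module against the running kernel, so the Proxmox kernel headers and a compiler need to be present before it runs. The package names and the direct-download URL below are assumptions based on a Proxmox 7 host and the 515.76 release used throughout this article; adjust them to your system and the version you noted on the drivers page.
# Headers and build tools for the currently running Proxmox kernel
apt install pve-headers-$(uname -r) build-essential
# Fetch the .run installer for the noted version (515.76 here)
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/515.76/NVIDIA-Linux-x86_64-515.76.run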
Install the driver
root@pve:~# chmod +x NVIDIA-Linux-x86_64-515.76.run
root@pve:~# ./NVIDIA-Linux-x86_64-515.76.run
Verify the installation with nvidia-smi; you should see your GPU listed. To see its full name, run nvidia-smi -L.
root@pve:~# nvidia-smi
Sun Feb 12 10:50:12 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.76       Driver Version: 515.76       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:08:00.0 Off |                  N/A |
|  0%   47C    P8    19W / 185W |      3MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
root@pve:~# nvidia-smi -L
GPU 0: NVIDIA GeForce GTX 980 (UUID: GPU-934aa0ef-64e0-4fb0-7910-46f1b818486b)
LXC Configuration - Host
Retrieve the major device numbers assigned to the Nvidia devices, as they are needed for the LXC's cgroup rules. In my case they were 195 and 511.
root@pve:~# ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Jan 10 16:11 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jan 10 16:11 /dev/nvidiactl
crw-rw-rw- 1 root root 511,   0 Jan 10 16:11 /dev/nvidia-uvm
crw-rw-rw- 1 root root 511,   1 Jan 10 16:11 /dev/nvidia-uvm-tools

/dev/nvidia-caps:
total 0
cr-------- 1 root root 236, 1 Jan 10 16:11 nvidia-cap1
cr--r--r-- 1 root root 236, 2 Jan 10 16:11 nvidia-cap2
root@pve:~#
Note down the major numbers from /dev/nvidia0 and /dev/nvidia-uvm.
Make sure that the LXC container is not running before making any configuration changes, as any changes made while the container is running will be lost once it is stopped.
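If the container is still running, it can be stopped from the host first; the container ID 101 below is just a placeholder for your own <lxcid>.
# Stop the container before editing its configuration
pct stop 101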
Open the LXC configuration file in your preferred text editor. The configuration file is located at /etc/pve/lxc/<lxcid>.conf.
Add the following two lines to the bottom of the LXC configuration file. This grants the LXC permission to access the devices with major numbers 195 and 511, which were assigned to your GPU (c means character device, * matches any minor number, and rwm allows read, write, and mknod).
Update 2024-01-02: changed lxc.cgroup to lxc.cgroup2 to reflect the move to cgroup v2.
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 511:* rwm
We now need to mount the devices into the LXC. Add the following five lines to the bottom of the configuration file to complete this step.
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
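After saving the configuration, a quick way to confirm the bind mounts work is to start the container and list the device nodes from the host with pct; the container ID 101 is again just a placeholder.
# Start the container and check the bind-mounted device nodes from the host
pct start 101
pct exec 101 -- ls -l /dev/nvidia0 /dev/nvidia-uvm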
Driver Installation - LXC
Make sure to download the correct version of the Nvidia driver for the LXC. It should match the version that is already installed on the host. Using different versions of the driver between the host and the LXC can cause compatibility issues and result in the GPU passthrough not functioning properly.
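If you are unsure which version the host is running, it can be read back on the host before downloading; either of the commands below works as a quick check.
# Query the driver version currently loaded on the host
cat /proc/driver/nvidia/version
nvidia-smi --query-gpu=driver_version --format=csv,noheader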
Install the driver
root@app-plex:~# chmod +x NVIDIA-Linux-x86_64-515.76.run
root@app-plex:~# ./NVIDIA-Linux-x86_64-515.76.run
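The container shares the host's kernel, and the kernel module is already loaded by the host, so only the user-space part of the driver is needed inside the LXC. If the installer complains about building a kernel module, a common approach (not part of the original walkthrough) is to skip that step explicitly:
# Install only the user-space libraries; the kernel module comes from the host
./NVIDIA-Linux-x86_64-515.76.run --no-kernel-module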
Verify the installation with nvidia-smi; you should see your GPU listed. To see its full name, run nvidia-smi -L.
root@app-plex:~# nvidia-smi
Sun Feb 12 10:50:12 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.76       Driver Version: 515.76       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:08:00.0 Off |                  N/A |
|  0%   47C    P8    19W / 185W |      3MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
root@app-plex:~# nvidia-smi -L
GPU 0: NVIDIA GeForce GTX 980 (UUID: GPU-934aa0ef-64e0-4fb0-7910-46f1b818486b)
With this setup, you should be able to use the GPU as if the LXC were a bare-metal machine, with some limitations to keep in mind. For example, the maximum number of concurrent decoding/encoding sessions is limited to 5 on consumer GeForce cards.
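A quick way to confirm that a workload inside the container is actually hitting the GPU is to watch the process table that nvidia-smi prints while, for example, Plex is transcoding:
# Refresh nvidia-smi every second; active transcodes and CUDA jobs appear under Processes
watch -n 1 nvidia-smi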
Troubleshooting
If the GPU is not visible after a host reboot, check whether the device major numbers have changed and, if they have, update the LXC configuration to match. This can happen because some of the Nvidia device nodes (notably /dev/nvidia-uvm) are given a dynamically allocated major number when the module loads, so it is not guaranteed to stay the same across reboots.
If the GPU is no longer visible in the LXC after an update on the host, it may be necessary to reinstall the Nvidia driver on the host; this is typically the case after a kernel upgrade, since the kernel module has to be rebuilt against the new kernel. Make sure the version installed on the host still matches the version installed in the LXC.