
nixos/nvidia-container-toolkit: CDI generator may run before NVIDIA driver is loaded #451912

@taha-yassine


Nixpkgs version

  • Stable (25.05)

Describe the bug

Setting hardware.nvidia-container-toolkit.enable = true can cause nvidia-container-toolkit-cdi-generator.service to fail at boot. The service sometimes starts before the NVIDIA kernel modules are loaded and then fails with: failed to generate CDI spec: failed to create device CDI specs: failed to initialize NVML: Driver Not Loaded.
As a consequence, /var/run/cdi/nvidia.yaml is not generated, and containers that request --device=nvidia.com/gpu=all fail with: docker: Error response from daemon: CDI device injection failed: unresolvable CDI devices nvidia.com/gpu=all.

Steps to reproduce

  1. Enable the NVIDIA container toolkit in NixOS:
    hardware.nvidia-container-toolkit.enable = true;
  2. Reboot the system.
  3. Check the logs for the CDI generator:
    journalctl -u nvidia-container-toolkit-cdi-generator.service -b
  4. Observe the error about "Driver Not Loaded".
  5. Verify that /var/run/cdi/nvidia.yaml does not exist until the service is manually restarted.
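
Steps 3 to 5 can be checked together with a small script. The following is a minimal sketch, not part of the NixOS module: the function names has_driver_not_loaded, cdi_spec_present and diagnose_cdi are hypothetical, while the unit name, spec path and error string are taken from this report.

```shell
# Diagnostic sketch for steps 3-5. All function names are hypothetical.

# Spec path and unit name as given in this report.
CDI_SPEC="${CDI_SPEC:-/var/run/cdi/nvidia.yaml}"
UNIT="nvidia-container-toolkit-cdi-generator.service"

# Succeeds if the text on stdin contains the NVML error from the report.
has_driver_not_loaded() {
  grep -q "failed to initialize NVML: Driver Not Loaded"
}

# Succeeds if the CDI spec file (first argument, or $CDI_SPEC) exists
# and is non-empty.
cdi_spec_present() {
  [ -s "${1:-$CDI_SPEC}" ]
}

# Prints a short diagnosis for the current boot.
diagnose_cdi() {
  if journalctl -u "$UNIT" -b --no-pager 2>/dev/null | has_driver_not_loaded; then
    echo "generator ran before the driver was loaded"
  fi
  if cdi_spec_present; then
    echo "CDI spec present: $CDI_SPEC"
  else
    echo "CDI spec missing: $CDI_SPEC"
  fi
}
```

After sourcing the file, calling diagnose_cdi on an affected boot should report both the early start and the missing spec, assuming the journal for the unit is readable by the calling user.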

Expected behaviour

The CDI generator service should reliably generate the NVIDIA CDI spec after the driver is loaded, without requiring manual intervention.
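
Until the ordering is fixed in the module, the manual restart can be scripted so that it only happens once the driver is visible. This is a rough sketch of that idea under stated assumptions: wait_until is a hypothetical helper, and the presence of /sys/module/nvidia is used as a stand-in for "driver loaded", which may not by itself guarantee that NVML can initialize.

```shell
# Workaround sketch: poll a probe command until it succeeds or a timeout
# (in seconds) elapses. wait_until is a hypothetical helper.
wait_until() {
  timeout_s="$1"; shift
  elapsed=0
  # Re-run the probe once per second; its output is discarded.
  until "$@" >/dev/null 2>&1; do
    [ "$elapsed" -ge "$timeout_s" ] && return 1
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example (run as root): wait up to 60 s for the nvidia kernel module to
# appear under /sys/module, then re-run the generator.
# wait_until 60 test -d /sys/module/nvidia &&
#   systemctl restart nvidia-container-toolkit-cdi-generator.service
```

The probe is passed as a command so that a stricter check can be substituted without changing the loop.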

Screenshots

No response

Relevant log output

Oct 14 07:22:19 framework systemd[1]: Starting Container Device Interface (CDI) for Nvidia generator...
Oct 14 07:22:21 framework nvidia-cdi-generator[1065]: time="2025-10-14T07:22:21+02:00" level=error msg="failed to generate CDI spec: failed to create device CDI specs: failed to initialize NVML: Driver Not Loaded"

Additional context

No response

System metadata

  • system: "x86_64-linux"
  • host os: Linux 6.17.0, NixOS, 25.05 (Warbler), 25.05.20251002.879bd46
  • multi-user?: yes
  • sandbox: yes
  • version: nix-env (Nix) 2.28.5
  • channels(root): "nixos-23.11, unstable"
  • nixpkgs: /nix/store/hhg7xrkgh6y3w89cx80qczcm9qm5xsv3-source

Notify maintainers

@ereslibre



Labels

  • 0.kind: bug (Something is broken)
  • 6.topic: nixos (Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS)
