Add non-Kairos node to existing cluster or get Nvidia-gpu operator to work in Kairos. #3409
Unanswered
HurrsonBurrson asked this question in Q&A
Replies: 0 comments
I'm having a heck of a time trying to get either one of these routes to work.
I have successfully passed through an NVIDIA GPU in k3s, so I'm curious whether I can add a non-Kairos node to my existing cluster.
Running:
curl -sfL https://get.k3s.io | K3S_URL=https://10.2.4.2:6443 K3S_TOKEN='XXXXXXXX::server:XXXXX' sh -
returns:
"Waiting to retrieve agent configuration; server is not ready: failed to retrieve configuration from server: not authorized"
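A "not authorized" reply from the server usually means it rejected the token it was sent, so one thing worth ruling out is a mismatch between the token passed as K3S_TOKEN and the one the server holds. The sketch below is a hypothetical helper, not part of k3s or Kairos: it prints a short fingerprint of a token file so the two copies can be compared without pasting the secret anywhere. It assumes sha256sum is available on both machines. On a stock k3s server the join token normally lives at /var/lib/rancher/k3s/server/node-token; whether a Kairos-managed server keeps it at the same path is an assumption to check.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of k3s or Kairos.

# Print a 16-character SHA-256 fingerprint of the token stored in file $1.
# Whitespace (including a trailing newline) is stripped before hashing.
token_fingerprint() {
  tr -d '[:space:]' < "$1" | sha256sum | cut -c1-16
}

# Succeed if the tokens in files $1 and $2 are identical apart from whitespace.
same_token() {
  [ "$(token_fingerprint "$1")" = "$(token_fingerprint "$2")" ]
}
```

For example, run token_fingerprint /var/lib/rancher/k3s/server/node-token on the server, save the value used for K3S_TOKEN to a file on the new node, run token_fingerprint on that file, and compare the two outputs. If they differ, the agent is not presenting the token the server expects.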
Regarding NVIDIA's gpu-operator, I've built a custom Kairos image with one or all of the following: drivers, container toolkit, nvidia-utils, and have had zero luck. The closest I've gotten is having the driver and nvidia-utils present, but 'nvidia-smi' returns:
"NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
lspci shows the card and its kernel modules, but lsmod doesn't show any of them, so I doubt the driver is actually being loaded.
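The lspci/lsmod comparison above can be turned into a repeatable check. This is a minimal sketch with a hypothetical function name: it reads lsmod-style output on stdin and succeeds only if some loaded module's name starts with "nvidia". That prefix is an assumption that fits the usual NVIDIA module names (nvidia, nvidia_uvm, nvidia_modeset, nvidia_drm).

```shell
#!/bin/sh
# Sketch only: the function name is hypothetical.

# Read lsmod-style output on stdin and exit 0 if any module name
# (first column, header line skipped) starts with "nvidia", else exit 1.
nvidia_module_listed() {
  awk 'NR > 1 && $1 ~ /^nvidia/ { found = 1 } END { exit (found ? 0 : 1) }'
}
```

On the node this would be used as lsmod | nvidia_module_listed. If it fails, running modprobe nvidia as root and then reading the last lines of dmesg typically shows why the module will not load, for example a module built for a different kernel than the one running. In lspci -k output, a "Kernel modules:" line without a matching "Kernel driver in use:" line points the same way: the module is installed but not bound to the card.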
Thanks for any input.