Describe the bug
When an OOMKill occurs on our cluster (based on Talos Linux), we are unable to get the full dmesg output. The command fails with the following error:
nsenter: failed to execute dmesg: No such file or directory
Expected behavior
We need the full dmesg output on that event
Additional context
We tried to debug the behavior using the toolbox image: robustadev/debug-toolbox.
It seems (at least on Talos) that the command used to get the dmesg output isn't working.
Hi @donch, we're actually planning to disable the dmesg enrichment by default in the upcoming Robusta release, as there are a few edge cases we're not happy with right now.
To help us prioritize fixing it, can you share more details on what you're looking for in the dmesg output and why you care about it? It does have helpful information on OOMKills, but given the challenges in making it work properly everywhere, we're not sure if it is worth the effort.
Hi @aantn, usually the dmesg output is more of a debug trace that we can pass on to developers to help them understand why their application is running out of memory.
I think it's nice-to-have information.
Got it, thanks. We're looking at ways to let HolmesGPT surface more of this data. If you're interested in discussing it, we would love to chat and understand whether it can work for your use case.
robusta/src/robusta/integrations/kubernetes/custom_models.py
Line 287 in 147538b
When using
nsenter -t 1 "{cmd}"
it works as expected (at least from a debug-toolbox pod).
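One way to handle this difference between hosts could be to try several invocations in order and keep the first one that succeeds. The following is only a minimal sketch of that idea, not Robusta's actual implementation: the helper name `first_working_output` and the `DMESG_CANDIDATES` list are hypothetical, and the `-m` variant is an assumed example of an invocation that enters the host's mount namespace (where a `dmesg` binary may be missing, as on Talos).

```python
import subprocess
from typing import List, Optional, Sequence


def first_working_output(candidates: Sequence[List[str]]) -> Optional[str]:
    """Run each candidate command in order; return stdout of the first that succeeds.

    Returns None if every candidate is missing or exits with a non-zero status.
    """
    for argv in candidates:
        try:
            result = subprocess.run(argv, capture_output=True, text=True, check=True)
        except (OSError, subprocess.CalledProcessError):
            # OSError: the executable itself could not be started (e.g. not found).
            # CalledProcessError: it ran but exited non-zero, e.g. nsenter
            # reporting "failed to execute dmesg: No such file or directory".
            continue
        return result.stdout
    return None


# Hypothetical candidate list, for illustration only.
DMESG_CANDIDATES = [
    # Assumed example: enter PID 1's mount namespace, so dmesg must exist on the host.
    ["nsenter", "-t", "1", "-m", "dmesg"],
    # The form reported above as working from a debug-toolbox pod.
    ["nsenter", "-t", "1", "dmesg"],
]
```

Calling `first_working_output(DMESG_CANDIDATES)` from a sufficiently privileged pod would return the kernel log text, or `None` if neither form works on that host.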