|
| 1 | += Why Linux Troubleshooting Advice Sucks |
| 2 | + |
| 3 | +A short post on how to create better troubleshooting documentation, prompted by me spending last evening trying to get the built-in display of my laptop working with Linux. |
| 4 | + |
| 5 | +What finally fixed the blank screen for me was this advice from the NixOS wiki: |
| 6 | + |
| 7 | +.12th Gen (Alder Lake) |
| 8 | +**** |
| 9 | +X Server may fail to start with the newer 12th generation, Alder Lake, iRISxe integrated graphics chips. |
| 10 | +If this is the case, you can give the kernel a hint as to what driver to use. |
| 11 | +First confirm the graphic chip's device ID by running in a terminal: |
| 12 | +
|
| 13 | +[source] |
| 14 | +---- |
| 15 | +$ nix-shell -p pciutils --run "lspci | grep VGA" |
| 16 | +00:02.0 VGA compatible controller: Intel Corporation Device 46a6 (rev 0c) |
| 17 | +---- |
| 18 | +
|
| 19 | +In this example, "46a6" is the device ID. You can then add this to your configuration and reboot: |
| 20 | +
|
| 21 | +[source] |
| 22 | +---- |
| 23 | +boot.kernelParams = [ "i915.force_probe=46a6" ]; |
| 24 | +---- |
| 25 | +**** |
| 26 | + |
| 27 | +This particular approach worked, unlike the dozen others I had tried before it, but I think it shares a flaw that is endemic to troubleshooting documentation. |
| 28 | +Can you spot it? |
| 29 | + |
| 30 | +The advice tells you the remedy ("`add this kernel parameter`"), but it doesn't explain how to verify that this is indeed the problem. |
| 31 | +That is, if the potential problem is a kernel driver that isn't loaded, it would really help me to know how to check which kernel driver is in use, so that I can do both: |
| 32 | + |
| 33 | +* _Before_ adding the parameter, check that device `46a6` doesn't have a driver. |
| 34 | +* _After_ the fix, verify that `i915` is indeed in use. |
| 35 | +
|
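A minimal sketch of what such a diagnostic could look like: on Linux, sysfs exposes a `driver` symlink for every PCI device that has a driver bound, so reading it answers both questions above. The `pci_driver` helper and its optional second argument (an alternative sysfs root) are hypothetical names used for illustration, and the `0000` PCI domain is an assumption that should hold on a typical laptop. `lspci -k` reports the same information in its `Kernel driver in use` line.

```shell
#!/bin/sh
# Print the kernel driver bound to a PCI device, or "none" if nothing is bound.
# $1: PCI address as printed by lspci, e.g. 00:02.0 (PCI domain 0000 is assumed)
# $2: sysfs root, defaults to /sys (hypothetical knob, handy for trying this on a fake tree)
pci_driver() {
    dev="${2:-/sys}/bus/pci/devices/0000:$1"
    if [ -L "$dev/driver" ]; then
        # The symlink points at .../bus/pci/drivers/<name>; keep only <name>.
        basename "$(readlink "$dev/driver")"
    else
        echo "none"
    fi
}

# 00:02.0 is the address of the VGA controller in the lspci output quoted earlier.
pci_driver 00:02.0
```

If this prints `none` before the change and `i915` after the reboot, that would suggest the parameter did what the wiki says it should.
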
| 36 | +If a "`fix`" doesn't come with a linked "`diagnostic`", a very common outcome is: |
| 37 | + |
| 38 | +. Apply some random fix from the Internet |
| 39 | +. Observe that the final problem (blank screen) isn't fixed |
| 40 | +. Wonder which of the two is the case: |
| 41 | + * the fix is not relevant for the problem, |
| 42 | + * the fix is relevant, but is applied wrong. |
| 43 | + |
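For the kernel parameter example, the second case in the list above ("relevant, but applied wrong") can be ruled out by checking whether the parameter actually reached the running kernel, which Linux exposes in `/proc/cmdline`. This is only a sketch: `has_kernel_param` and its optional file argument are hypothetical names, not part of any tool.

```shell
#!/bin/sh
# Succeed if the given parameter appears verbatim on the kernel command line.
# $1: parameter to look for, e.g. i915.force_probe=46a6
# $2: file to read, defaults to /proc/cmdline (hypothetical knob for experiments)
has_kernel_param() {
    file="${2:-/proc/cmdline}"
    [ -r "$file" ] || return 1
    # One parameter per line, then require an exact whole-line match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$1"
}

if has_kernel_param "i915.force_probe=46a6"; then
    echo "parameter is active"
else
    echo "parameter is missing from the running kernel"
fi
```

If the parameter is missing, the fix was never applied, so a blank screen says nothing about whether the fix is relevant.
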
| 44 | +So, call to action: if you are writing any kind of documentation, before explaining how to _fix_ the problem, teach the user how to _diagnose_ it. |
| 45 | + |
| 46 | +When helping with `git`, start with explaining `git log` and `git status`, not with `git reset` or `git reflog`. |
| 47 | + |
| 48 | +--- |
| 49 | + |
| 50 | +While the post might come across as just a tiny bit angry, I want to explicitly mention that I am eternally grateful to all the people who write _any_ kind of docs for using Linux on desktop. |
| 51 | +I've been running it for more than 10 years at this point, and I am still completely clueless as to how to debug issues from first principles. |
| 52 | +If not for all of the wikis, stackoverflows and random forum posts out there, I wouldn't be able to use the OS, so thank you all! |