| 1 | +.. SPDX-License-Identifier: GPL-2.0 |
| 2 | + |
| 3 | +======================================== |
| 4 | +Debugging advice for driver development |
| 5 | +======================================== |
| 6 | + |
| 7 | +This document serves as a general starting point and lookup for debugging |
| 8 | +device drivers. |
| 9 | +While this guide focuses on debugging that requires re-compiling the |
| 10 | +module/kernel, the :doc:`userspace debugging guide |
| 11 | +</process/debugging/userspace_debugging_guide>` walks you through |
| 12 | +dynamic debug, ftrace and other tools that are useful for debugging issues |
| 13 | +and behavior. |
| 14 | +For general debugging advice, see the :doc:`general advice document |
| 15 | +</process/debugging/index>`. |
| 16 | + |
| 17 | +.. contents:: |
| 18 | +   :depth: 3 |
| 19 | + |
| 20 | +The following sections show you the available tools. |
| 21 | + |
| 22 | +printk() & friends |
| 23 | +------------------ |
| 24 | + |
| 25 | +These are derivatives of printf() with varying destinations and support for |
| 26 | +being dynamically turned on or off, or lack thereof. |
| 27 | + |
| 28 | +Simple printk() |
| 29 | +~~~~~~~~~~~~~~~ |
| 30 | + |
| 31 | +The classic approach: it can be used to great effect for quick and dirty |
| 32 | +development of new modules or to extract arbitrary data for troubleshooting. |
| 33 | + |
| 34 | +Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default) |
| 35 | + |
| 36 | +**Pros**: |
| 37 | + |
| 38 | +- No need to learn anything, simple to use |
| 39 | +- Easy to modify exactly to your needs (formatting of the data (See: |
| 40 | + :doc:`/core-api/printk-formats`), visibility in the log) |
| 41 | +- Can cause delays in the execution of the code (beneficial to confirm whether |
| 42 | + timing is a factor) |
| 43 | + |
| 44 | +**Cons**: |
| 45 | + |
| 46 | +- Requires rebuilding the kernel/module |
| 47 | +- Can cause delays in the execution of the code (which can cause issues to be |
| 48 | + not reproducible) |
| 49 | + |
| 50 | +For the full documentation see :doc:`/core-api/printk-basics` |
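
As a minimal sketch of such a quick-and-dirty print, assuming a hypothetical
module called ``my_driver`` and a hypothetical ``my_status`` variable (neither
is part of the kernel), a debug message could be emitted like this:

```c
#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>

/* Hypothetical value that we want to inspect while debugging. */
static unsigned int my_status = 0x1f;

static int __init my_driver_init(void)
{
	/*
	 * KERN_DEBUG messages always land in the kernel log buffer (dmesg);
	 * whether they also reach the console depends on the console loglevel.
	 */
	printk(KERN_DEBUG "my_driver: %s: status=%#x\n", __func__, my_status);

	/* pr_info() is shorthand for printk(KERN_INFO ...). */
	pr_info("my_driver: loaded\n");
	return 0;
}

static void __exit my_driver_exit(void)
{
	pr_info("my_driver: unloaded\n");
}

module_init(my_driver_init);
module_exit(my_driver_exit);

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Hypothetical printk() example");
```

After loading the module, the messages should show up in the output of
``dmesg``.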
| 51 | + |
| 52 | +Trace_printk |
| 53 | +~~~~~~~~~~~~ |
| 54 | + |
| 55 | +Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>`` |
| 56 | + |
| 57 | +It is a tiny bit less comfortable to use than printk(), because you will have |
| 58 | +to read the messages from the trace file (See: :ref:`read_ftrace_log`) |
| 59 | +instead of from the kernel log, but it is very useful when printk() adds |
| 60 | +unwanted delays into the code execution, causing issues to be flaky or hidden. |
| 61 | + |
| 62 | +If the processing of this still causes timing issues then you can try |
| 63 | +trace_puts(). |
| 64 | + |
| 65 | +For the full documentation see trace_printk() |
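
A minimal sketch of both calls, assuming a hypothetical helper
``my_driver_handle_irq()`` that sits on a timing-sensitive path (the function
and its parameters are made up for illustration):

```c
#include <linux/ftrace.h>
#include <linux/kernel.h>
#include <linux/types.h>

/* Hypothetical hot-path helper, called from the driver's interrupt handler. */
static void my_driver_handle_irq(int irq, u32 status)
{
	/* Formatted message, written to the ftrace ring buffer, not to dmesg. */
	trace_printk("irq=%d status=%#x\n", irq, status);

	/* Constant string only: cheaper, because nothing has to be formatted. */
	trace_puts("irq handled\n");
}
```

Both messages end up in the trace file rather than in the kernel log.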
| 66 | + |
| 67 | +dev_dbg |
| 68 | +~~~~~~~ |
| 69 | + |
| 70 | +A print statement that contains additional information about the device used |
| 71 | +within the context and that can be targeted by |
| 72 | +:ref:`process/debugging/userspace_debugging_guide:dynamic debug`. |
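
As a minimal sketch, assuming a hypothetical platform driver whose probe
function is called ``my_driver_probe()``:

```c
#include <linux/device.h>
#include <linux/platform_device.h>

/* Hypothetical probe callback of a platform driver. */
static int my_driver_probe(struct platform_device *pdev)
{
	struct device *dev = &pdev->dev;

	/*
	 * The message is prefixed with the driver and device name, so several
	 * instances of the same device can be told apart in the log.
	 */
	dev_dbg(dev, "probing, id=%d\n", pdev->id);

	return 0;
}
```

With dynamic debug enabled in the kernel configuration, this message stays
silent until it is switched on at runtime.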
| 73 | + |
| 74 | +**When is it appropriate to leave a debug print in the code?** |
| 75 | + |
| 76 | +Permanent debug statements have to be useful for a developer to troubleshoot |
| 77 | +driver misbehavior. Judging that is a bit more of an art than a science, but |
| 78 | +some guidelines are in the :ref:`Coding style guidelines |
| 79 | +<process/coding-style:13) printing kernel messages>`. In almost all cases the |
| 80 | +debug statements shouldn't be upstreamed, as a working driver is supposed to be |
| 81 | +silent. |
| 82 | + |
| 83 | +Custom printk |
| 84 | +~~~~~~~~~~~~~ |
| 85 | + |
| 86 | +Example:: |
| 87 | + |
| 88 | +  #define core_dbg(fmt, arg...) do { \ |
| 89 | +          if (core_debug) \ |
| 90 | +                  printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \ |
| 91 | +          } while (0) |
| 92 | + |
| 93 | +**When should you do this?** |
| 94 | + |
| 95 | +It is better to just use a pr_debug(), which can later be turned on/off with |
| 96 | +dynamic debug. Additionally, a lot of drivers activate these prints via a |
| 97 | +variable like ``core_debug`` set by a module parameter. However, module |
| 98 | +parameters `are not recommended anymore |
| 99 | +<https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_. |
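
A minimal sketch of the pr_debug() alternative, assuming the same ``core: ``
prefix as in the example above and a hypothetical helper
``core_report_state()``:

```c
/*
 * pr_fmt() has to be defined before the first include, so that the pr_*()
 * helpers pick it up.
 */
#define pr_fmt(fmt) "core: " fmt

#include <linux/printk.h>

/* Hypothetical helper that reports a state change of the driver core. */
static void core_report_state(int state)
{
	/* No custom macro and no module parameter are needed. */
	pr_debug("state changed to %d\n", state);
}
```

The prefix is kept, while toggling the message is left to dynamic debug
instead of a driver-specific variable.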
| 100 | + |
| 101 | +Ftrace |
| 102 | +------ |
| 103 | + |
| 104 | +Creating a custom Ftrace tracepoint |
| 105 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 106 | + |
| 107 | +A tracepoint adds a hook into your code that will be called and logged when the |
| 108 | +tracepoint is enabled. This can be used, for example, to trace hitting a |
| 109 | +conditional branch or to dump the internal state at specific points of the code |
| 110 | +flow during a debugging session. |
| 111 | + |
| 112 | +Here is a basic description of :ref:`how to implement new tracepoints |
| 113 | +<trace/tracepoints:usage>`. |
| 114 | + |
| 115 | +For the full event tracing documentation see :doc:`/trace/events` |
| 116 | + |
| 117 | +For the full Ftrace documentation see :doc:`/trace/ftrace` |
| 118 | + |
| 119 | +DebugFS |
| 120 | +------- |
| 121 | + |
| 122 | +Prerequisite: ``CONFIG_DEBUG_FS`` & ``#include <linux/debugfs.h>`` |
| 123 | + |
| 124 | +DebugFS differs from the other debugging approaches, as it neither writes |
| 125 | +messages to the kernel log nor adds traces to the code. Instead, it allows the |
| 126 | +developer to manage a set of files. |
| 127 | +These files can hold the values of variables or register/memory dumps, and |
| 128 | +they can be made writable so that values/settings in the driver can be |
| 129 | +modified. |
| 130 | + |
| 131 | +Possible use-cases among others: |
| 132 | + |
| 133 | +- Store register values |
| 134 | +- Keep track of variables |
| 135 | +- Store errors |
| 136 | +- Store settings |
| 137 | +- Toggle a setting like debug on/off |
| 138 | +- Error injection |
| 139 | + |
| 140 | +This is especially useful when the size of a data dump would be hard to digest |
| 141 | +as part of the general kernel log (for example when dumping raw bitstream data) |
| 142 | +or when you are not interested in all the values all the time, but still want |
| 143 | +to be able to inspect them. |
| 144 | + |
| 145 | +The general idea is: |
| 146 | + |
| 147 | +- Create a directory during probe (``struct dentry *parent = |
| 148 | +  debugfs_create_dir("my_driver", NULL);``) |
| 149 | +- Create a file (``debugfs_create_u32("my_value", 0444, parent, &my_variable);``) |
| 150 | + |
| 151 | +  - In this example the file is found in |
| 152 | +    ``/sys/kernel/debug/my_driver/my_value`` (with read permissions for |
| 153 | +    user/group/all) |
| 154 | +  - Any read of the file will return the current contents of the variable |
| 155 | +    ``my_variable`` |
| 156 | + |
| 157 | +- Clean up the directory when removing the device |
| 158 | +  (``debugfs_remove_recursive(parent);``) |
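
The steps above can be sketched as a small module. This is only an
illustration: for brevity it creates the directory in the module init function
instead of a probe callback, and the names ``my_driver``, ``my_value`` and
``my_variable`` are placeholders. The file mode is written in octal (``0444``).

```c
#include <linux/debugfs.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/types.h>

static struct dentry *parent;
static u32 my_variable;

static int __init my_driver_init(void)
{
	/* Creates /sys/kernel/debug/my_driver (if debugfs is mounted there). */
	parent = debugfs_create_dir("my_driver", NULL);

	/* Read-only file; every read returns the current value of my_variable. */
	debugfs_create_u32("my_value", 0444, parent, &my_variable);

	my_variable = 42;
	return 0;
}

static void __exit my_driver_exit(void)
{
	/* Removes the directory and all files created below it. */
	debugfs_remove_recursive(parent);
}

module_init(my_driver_init);
module_exit(my_driver_exit);

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Hypothetical debugfs example");
```

While the module is loaded, reading
``/sys/kernel/debug/my_driver/my_value`` returns the current value of
``my_variable``.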
| 159 | + |
| 160 | +For the full documentation see :doc:`/filesystems/debugfs`. |
| 161 | + |
| 162 | +KASAN, UBSAN, lockdep and other error checkers |
| 163 | +---------------------------------------------- |
| 164 | + |
| 165 | +KASAN (Kernel Address Sanitizer) |
| 166 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 167 | + |
| 168 | +Prerequisite: ``CONFIG_KASAN`` |
| 169 | + |
| 170 | +KASAN is a dynamic memory error detector that helps to find use-after-free and |
| 171 | +out-of-bounds bugs. It uses compile-time instrumentation to check every memory |
| 172 | +access. |
| 173 | + |
| 174 | +For the full documentation see :doc:`/dev-tools/kasan`. |
| 175 | + |
| 176 | +UBSAN (Undefined Behavior Sanitizer) |
| 177 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 178 | + |
| 179 | +Prerequisite: ``CONFIG_UBSAN`` |
| 180 | + |
| 181 | +UBSAN relies on compiler instrumentation and runtime checks to detect undefined |
| 182 | +behavior. It is designed to find a variety of issues, including signed integer |
| 183 | +overflow, array index out of bounds, and more. |
| 184 | + |
| 185 | +For the full documentation see :doc:`/dev-tools/ubsan` |
| 186 | + |
| 187 | +lockdep (Lock Dependency Validator) |
| 188 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 189 | + |
| 190 | +Prerequisite: ``CONFIG_PROVE_LOCKING`` (selects ``CONFIG_LOCKDEP``; ``CONFIG_DEBUG_LOCKDEP`` additionally makes lockdep check itself) |
| 191 | + |
| 192 | +lockdep is a runtime lock dependency validator that detects potential deadlocks |
| 193 | +and other locking-related issues in the kernel. |
| 194 | +It tracks lock acquisitions and releases, building a dependency graph that is |
| 195 | +analyzed for potential deadlocks. |
| 196 | +lockdep is especially useful for validating the correctness of lock ordering in |
| 197 | +the kernel. |
| 198 | + |
| 199 | +PSI (Pressure stall information tracking) |
| 200 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 201 | + |
| 202 | +Prerequisite: ``CONFIG_PSI`` |
| 203 | + |
| 204 | +PSI is a measurement tool to identify excessive overcommits on hardware |
| 205 | +resources that can cause performance disruptions or even OOM kills. |
| 206 | + |
| 207 | +device coredump |
| 208 | +--------------- |
| 209 | + |
| 210 | +Prerequisite: ``#include <linux/devcoredump.h>`` |
| 211 | + |
| 212 | +Offers the infrastructure for a driver to provide arbitrary data to userland. |
| 213 | +It is most often used in conjunction with udev or a similar userland application |
| 214 | +that listens for kernel uevents, which indicate that the dump is ready. Udev has |
| 215 | +rules to copy that file somewhere for long-term storage and analysis, as by |
| 216 | +default, the data for the dump is automatically cleaned up after 5 minutes. |
| 217 | +That data is analyzed with driver-specific tools or GDB. |
| 218 | + |
| 219 | +You can find an example implementation at: |
| 220 | +`drivers/media/platform/qcom/venus/core.c |
| 221 | +<https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__ |
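
Independent of the linked driver, handing a dump to the framework could look
like the following sketch. The helper ``my_driver_dump_state()`` and its
parameters are hypothetical; only ``dev_coredumpv()`` is part of the
devcoredump API.

```c
#include <linux/devcoredump.h>
#include <linux/device.h>
#include <linux/gfp.h>
#include <linux/string.h>
#include <linux/vmalloc.h>

/* Hypothetical helper: hand a snapshot of the driver state to userland. */
static void my_driver_dump_state(struct device *dev, const void *state,
				 size_t len)
{
	void *dump;

	/* The buffer has to come from vmalloc(). */
	dump = vmalloc(len);
	if (!dump)
		return;

	memcpy(dump, state, len);

	/*
	 * devcoredump takes ownership of the buffer and frees it once the
	 * dump has been read or has timed out, so it must not be freed here.
	 */
	dev_coredumpv(dev, dump, len, GFP_KERNEL);
}
```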
| 222 | + |
| 223 | +**Copyright** ©2024 : Collabora |