@@ -17,7 +17,7 @@ practice having no ABI, but reality is in this text simplified for brevity and
17
17
sanity.
18
18
19
19
- 1.[Theory of debugging](#theory)
20
- - 2.[Practical methods with tradeoffs ](#practice)
20
+ - 2.[Practical methods with trade-offs ](#practice)
21
21
- 3.[Uniform execution representation](#uniform_execution_representation)
22
22
- 4.[Abstraction problems during problem isolation](#abstraction_problems)
23
23
- 5.[Possible implementations](#possible_implementations)
@@ -35,34 +35,41 @@ on a specific program run. If the execution witness shows a "bad state",
35
35
then there must be a bug.
36
36
Thus a **debugger** can be seen **as query engine over states and transitions of
37
37
a buggy execution witness.**
38
+ In simpler terms, **debugging is not making bugs or removing them**.
38
39
Frequent operations are bug source isolation to deterministic components,
39
40
where encapsulation of non-determinism usually simplifies the process.
40
41
In contrast to that, concurrent code is tricky to debug, because one
41
42
needs to trace multiple execution flows to estimate where the origin of the
42
43
incorrect state is.
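
The "query engine over states" view can be sketched in C as follows. This is a minimal illustration, not a real debugger: the execution witness is assumed to be a plain array of integer states, and `first_bad_state` and `non_negative` are hypothetical names introduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* An invariant is a predicate over one recorded state. */
typedef bool (*invariant_fn)(int state);

/* Query over the witness: index of the first state that violates the
   invariant, or -1 if every recorded state satisfies it. */
ptrdiff_t first_bad_state(const int *trace, size_t len, invariant_fn holds) {
    for (size_t i = 0; i < len; i++) {
        if (!holds(trace[i])) {
            return (ptrdiff_t)i;
        }
    }
    return -1;
}

/* Example invariant (assumption): a state must never be negative. */
bool non_negative(int state) { return state >= 0; }
```

The returned index marks the transition into the first "bad state", which is where further inspection would typically start.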
43
44
45
+ The process of debugging means to use static and dynamic program analysis
46
+ and its automation and adaptation to speed up bug (classes) elimination for the
47
+ (classes of) target systems.
48
+
44
49
One can generally categorize methods into the following list (**asoul**)
45
- **a**utomate, **s**implify, **o**bserve, understand, learn )
50
+ (**a**utomate, **s**implify, **o**bserve, **u**nderstand, **l**earn)
46
51
- **a**utomate the process to minimize errors/oversights during debugging,
47
52
against probabilistic errors, document the process etc
48
53
- **s**implify and isolate system components and changes over time
49
54
- **o**bserve the system while running it to *trace state or state changes*
50
55
- **u**nderstand the expected and actual code semantics to the degree necessary
51
56
- **l**earn, extend and ensure how and which system invariants are satisfied
52
57
as necessary *for the involved systems*,
53
- for example userspace processes, kernel, build system, compiler, source code, linker,
58
+ for example user-space processes, kernel, build system, compiler, source code, linker,
54
59
object code, assembly, hardware etc
55
60
56
61
with the fundamental constraints being (**feel**)
57
62
- **f**inding out correct system components semantics
58
63
- **ee**nsuring deterministic reproducibility of the problem
59
64
- **l**imited time and effort
60
65
61
- Common debugging methods to **feel a soul** with various tradeoffs from compile-time
62
- to runtime debugging and less to more run-time data collection are:
66
+ Common static and dynamic program analysis methods to
67
+ **run the system** to **feel a soul** for the purpose of eliminating the bug
68
+ (classes) are:
69
+ - **Specification** meaning to "compare/get/write the details".
63
70
- **Formal Verification** as ahead or compile-time invariant resolving.
64
- - **Validation** as runtime invariant checks.
65
- - **Testing** as sample based runtime invariant checks.
71
+ - **Validation** as runtime invariant checks. Sanitizers as compiler runtime checks are common tools.
72
+ - **Testing** as sample-based runtime invariant checks. Coverage-based fuzzers are common tools.
66
73
- **Stepping** via "classical debugger" to manipulate task execution
67
74
context, manipulate memory optionally via source code location translation
68
75
via REPL commands, graphically, scripting or (rarely) freely programmable.
@@ -73,12 +80,20 @@ to runtime debugging and less to more run-time data collection are:
73
80
- **Recording** as encoded dumping of the runtime to replay it with
74
81
previously specified time and state determinism.
75
82
76
- Simplification and isolation means to apply the meaning of both words on
77
- all potential sub-components including, but not limited to
78
- hardware, code versioning including dependencies, source system,
79
- compiler framework and target system. Typical methods are
80
- - **Bisection** via git or the actual binaries
81
- - **Reduction** via rmeoval of system parts or trying to reproduce with
83
+ The core ideas for **what software system to run** based on code with its
84
+ semantics are then typically a mix of
85
+ - **Machine code** execution on the actual hardware to get hardware and timing behavior.
86
+ - **Simulation** as **partial or full execution** on a simplified, imitative
87
+ representation of the target hardware to get information for the simplified model.
88
+ - **Virtualisation** as **isolation or simplification** of a hardware or software
89
+ subsystem to reduce system complexity.
90
+
91
+ Isolation and simplification are typically applied on all potential
92
+ sub-components including, but not limited to hardware, code versioning
93
+ including dependencies, source system, compiler framework and target system.
94
+ Typical methods are
95
+ - **Bisection** via git or the actual binaries.
96
+ - **Reduction** via removal of system parts or trying to reproduce with
82
97
(a minimal) example.
83
98
- **Statistical analysis** from collected data on how the problem
84
99
manifests on given environment(s) etc.
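
Bisection can be sketched in C as a binary search over revisions. This is a minimal sketch under stated assumptions: revisions are numbered linearly, the bug persists once introduced, and `bisect_first_bad` and `demo_is_bad` are hypothetical names standing in for "build and test this revision".

```c
#include <stdbool.h>
#include <stddef.h>

/* Predicate that reports whether a given revision shows the bug. */
typedef bool (*is_bad_fn)(size_t revision);

/* Precondition: `good` is known to pass, `bad` is known to fail,
   good < bad, and every revision after the first bad one is also bad.
   Returns the first bad revision. */
size_t bisect_first_bad(size_t good, size_t bad, is_bad_fn is_bad) {
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid)) {
            bad = mid;   /* bug already present: look earlier */
        } else {
            good = mid;  /* still fine: look later */
        }
    }
    return bad;
}

/* Stand-in for a real build-and-test step (assumption):
   the regression was introduced in revision 37. */
bool demo_is_bad(size_t revision) { return revision >= 37; }
```

Each step halves the candidate range, so about log2(n) builds are needed for n revisions, which is why deterministic reproducibility of the problem matters so much here.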
@@ -87,18 +102,21 @@ compiler framework and target system. Typical methods are
87
102
of **the to be debugged system to provide necessary debug functionality**.
88
103
For example, software based hardware debugging relies on interfaces to
89
104
the hardware like JTAG, Kernel debugging on Kernel compilation or
90
- configuration and elevated (user), userspace debugging on process and
105
+ configuration and elevated (user) permissions, user-space debugging on process and
91
106
user permissions, system configuration or a child process to be debugged
92
- on Posix systems via ptrace.
107
+ on POSIX systems via `ptrace`.
108
+
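
A child process debugged via `ptrace` can be sketched as follows. This is a minimal sketch assuming Linux with glibc; the function name `trace_child_until_exit` is hypothetical, and environments that restrict `ptrace` (for example some sandboxes) will make it return -1.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that asks to be traced, stops itself and later exits with
   `code`. The parent acts as a tiny debugger: it observes the stop, resumes
   the child and returns the exit code it observed, or -1 on error. */
int trace_child_until_exit(int code) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: become a tracee of the parent, then stop for inspection. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1) {
            _exit(127);
        }
        raise(SIGSTOP);
        _exit(code);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status)) {
        return -1;
    }
    /* A real debugger would read registers or memory of the tracee here. */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Calling `trace_child_until_exit(42)` is expected to return 42 when tracing is permitted; the stop between the two `waitpid` calls is the point where stepping, breakpoints or memory inspection would hook in.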
109
+ It depends on many factors, for example bug classes and target systems, to what degree the process of
110
+ debugging can and should be automated or optimized.
93
111
94
112
[]($section.id("practice"))
95
113
### Practical methods with tradeoffs
96
114
97
115
Usually semantics are not "set in stone" or do not offer
98
116
sufficient tradeoffs, so formal verification is rarely an option aside from
99
- usage of models as design and planning tool.
117
+ usage of models as a design and planning tool or for fail-safe program functionality.
100
118
Depending on the domain and environment, problematic behavior of hardware
101
- or software components must be to be more or less 1. avoided and 2. traceable
119
+ or software components must be more or less 1. avoided and 2. traceable
102
120
and there exist various (domain) metrics as decision helper.
103
121
Very well designed systems explain to users how to debug bugs regarding
104
122
**functional behavior**, **time behavior** with **internal and
@@ -107,12 +125,41 @@ task execution correctness is intended.
107
125
Access restrictions limit or rule out stepping, whereas storage limitations
108
126
limit or rule out logging, tracing and recording.
109
127
128
+ **Sanitizers** are the most efficient and simplest debugging tools for C and C++,
129
+ whereas Zig implements them, apart from the thread sanitizer, as allocator and safety mode.
130
+ Instrumented sanitizers have a 2x-4x slowdown vs dynamic ones with 20x-50x slowdown.
131
+
132
+ Nr | Clang usage | Zig usage | Memory | Runtime | Comments |
133
+ -- | ---------------------------- | ----------------- | ---------------- | -------- | ----------------------------------- |
134
+ 1 | -fsanitize=address | alloc + safety | 1x (3x stack) | 2x | Clang 16+ TB of virt mem |
135
+ 2 | -fsanitize=leak | allocator | 1x | 1x | on exit ?x? more mem+time |
136
+ 3 | -fsanitize=memory | unimplemented | 2-3x | 3x | |
137
+ 4 | -fsanitize=thread | -fsanitize=thread | 5-10x+1MB/thread | 5-15x | Clang ?x? ("lots of") virt mem |
138
+ 5 | -fsanitize=type | unimplemented | ? | ? | not enough data |
139
+ 6 | -fsanitize=undefined | safety mode | 1x | ~1x | |
140
+ 7 | -fsanitize=dataflow | unimplemented | 1-2x? | 1-4x? | wip, get variable dependencies |
141
+ 8 | -fsanitize=memtag | unimplemented | ~1.0Yx? | ~1.0Yx? | wip, address cheri-like ptr tagging |
142
+ 9 | -fsanitize=cfi | unimplemented | 1x | ~1x | forward edge ctrl flow protection |
143
+ 10 | -fsanitize=safe-stack | unimplemented | 1x | ~1x | backward edge ctrl flow protection |
144
+ 11 | -fsanitize=shadow-call-stack | unimplemented | 1x | ~1x | backward edge ctrl flow protection |
145
+
146
+ Sanitizers 1-6 are recommended for testing purposes and 7-11 for production by LLVM.
147
+ Memory and slowdown numbers are only reported for LLVM sanitizers. Zig does not
148
+ report its own numbers yet (2025-01-11). Slowdown for dynamic sanitizer versions
149
+ increases by a factor of 10x in contrast to the listed static usage costs.
150
+ The leak sanitizer only checks for memory leaks, not other system resources.
151
+ Besides various Kernel specific tools to track system resources,
152
+ Valgrind can be used on POSIX systems for non-memory resources and
153
+ Application Verifier for Windows.
154
+ Address and thread sanitizers cannot be combined in Clang and combined usage
155
+ of the Zig implementation is limited by virtual memory usage.
156
+ In Zig, aliasing cannot currently be sanitized against, whereas in Clang only
157
+ type-based aliasing can be sanitized without any numbers reported by LLVM yet.
158
+
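
What the address sanitizer (row 1) catches can be illustrated with a small C function. This is a sketch for illustration only; `sum_first` is a hypothetical name, and the out-of-bounds path is deliberately left in to show the bug class.

```c
#include <stdlib.h>

/* Sums the first `count` ints of a heap array holding `capacity` elements,
   each set to 1. For count > capacity the loop reads past the allocation:
   undefined behavior that a plain build may silently tolerate, but that
   AddressSanitizer reports as a heap-buffer-overflow. */
int sum_first(size_t capacity, size_t count) {
    int *values = malloc(capacity * sizeof *values);
    if (values == NULL) {
        return -1;
    }
    for (size_t i = 0; i < capacity; i++) {
        values[i] = 1;
    }
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += values[i]; /* out of bounds when count > capacity */
    }
    free(values);
    return sum;
}
```

Compiled with `clang -g -fsanitize=address`, a call such as `sum_first(4, 5)` is expected to abort with a report that points at the faulting read and the allocation site, while in-bounds calls such as `sum_first(4, 4)` behave as in a normal build.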
110
159
[TODO: requirements on system design for formal verification vs debugging.]::
111
160
[no surprise rule: core system enabling debugging (in any form) must be correct]::
112
161
[to the degree necessary.]::
113
-
114
162
[TODO: good argumentation on ignoring linker speak, language footguns etc.]::
115
-
116
163
[1.Bugs related to functional behavior.]::
117
164
[2.Bugs related to time behavior.]::
118
165
[3.Internal and external system resources.]::