@@ -104,6 +104,100 @@ platforms. Plugging these numbers into the equation for code divergence gives
not to (more on that later).


+ Running ``cbi-tree``
+ ####################
+
+ Running ``codebasin`` provides an overview of divergence and coverage, which
+ can be useful when we want to familiarize ourselves with a new code base,
+ compare the impact of different code structures upon certain metrics, or track
+ specialization metrics over time. However, it doesn't provide any *actionable*
+ insight into how to improve a code base.
+
+ To understand how much specialization exists in each source file, we can
+ replace ``codebasin`` with ``cbi-tree``::
+
+     $ cbi-tree analysis.toml
+
+ This command performs the same analysis as ``codebasin``, but produces a tree
+ annotated with information about which files contain specialization:
+
+ .. code-block:: text
+     :emphasize-lines: 8,9,11,16
+
+     Legend:
+     A: cpu
+     B: gpu
+
+     Columns:
+     [Platforms | SLOC | Coverage (%) | Avg. Coverage (%)]
+
+     [AB | 33 |  93.94 |  72.73] o /home/username/code-base-investigator/docs/sample-code-base/src/
+     [AB | 13 | 100.00 |  92.31] ├── main.cpp
+     [A- |  7 |  85.71 |  42.86] ├─o cpu/
+     [A- |  7 |  85.71 |  42.86] │   └── foo.cpp
+     [AB |  6 | 100.00 | 100.00] ├─o third-party/
+     [AB |  1 | 100.00 | 100.00] │   ├── library.h
+     [AB |  5 | 100.00 | 100.00] │   └── library.cpp
+     [-B |  7 |  85.71 |  42.86] └─o gpu/
+     [-B |  7 |  85.71 |  42.86]     └── foo.cpp
+
+ .. tip::
+
+     Running ``cbi-tree`` in a modern terminal environment produces colored
+     output to improve usability for large code bases.
+
+ Each node in the tree represents a source file or directory in the code
+ base and is annotated with four pieces of information:
+
+ 1. **Platforms**
+
+    The set of platforms that use the file or directory.
+
+ 2. **SLOC**
+
+    The number of source lines of code (SLOC) in the file or directory.
+
+ 3. **Coverage (%)**
+
+    The amount of code in the file or directory that is used by at least one
+    platform, as a percentage of SLOC.
+
+ 4. **Avg. Coverage (%)**
+
+    The amount of code in the file or directory that is used by each platform,
+    on average, as a percentage of SLOC.
+
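The two coverage definitions above can be made concrete with a short Python sketch. It is not how ``cbi-tree`` is implemented: it assumes a hypothetical representation in which each source line is paired with the set of platforms that use it, and the names ``coverage`` and ``avg_coverage`` are illustrative only.

```python
def coverage(lines):
    """Percentage of lines used by at least one platform."""
    used = sum(1 for platforms in lines if platforms)
    return 100.0 * used / len(lines)


def avg_coverage(lines, all_platforms):
    """Mean, over all platforms, of the percentage of lines each one uses."""
    per_platform = [
        sum(1 for platforms in lines if p in platforms) for p in all_platforms
    ]
    return 100.0 * sum(per_platform) / (len(all_platforms) * len(lines))


# Hypothetical 7-line file: 6 lines used only by "cpu", 1 line used by nobody.
example = [{"cpu"}] * 6 + [set()]

print(f"{coverage(example):.2f}")                      # 85.71
print(f"{avg_coverage(example, ['cpu', 'gpu']):.2f}")  # 42.86
```

Under these assumptions, the two values agree with the ``cpu/foo.cpp`` row of the tree.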
+
+ The root of the tree represents the entire code base, and so the values in
+ the annotations match the ``codebasin`` results: two platforms (``A`` and
+ ``B``) use the directory, there are 33 lines in total, 93.94% of those lines
+ (i.e., 31 lines) are used by at least one platform, and each platform uses
+ 72.73% of those lines (i.e., 24 lines) on average. By walking the tree, we can
+ break these numbers down across the individual files and directories in the
+ code base.
177
+
178
+ Starting with ``main.cpp ``, we can see that it is used by both platforms
179
+ (``A `` and ``B ``), and that 100% of the 13 lines in the file are used by at
180
+ least one platform. However, the average coverage is only 92.31%, reflecting
181
+ that each platform uses only 12 of those lines.
+
+ Turning our attention to ``cpu/foo.cpp`` and ``gpu/foo.cpp``, we can see
+ that they are each specialized for one platform (``A`` and ``B``,
+ respectively). The coverage for both files is only 85.71% (i.e., 6 of the 7
+ lines), which tells us that both files contain some unused code (i.e., 1 line).
+ The average coverage of 42.86% highlights the extent of the specialization:
+ one platform uses 6 lines and the other uses none, for an average of 3 of the
+ 7 lines.
188
+
189
+ .. tip ::
190
+
191
+ Looking at average coverage is the best way to identify highly specialized
192
+ regions of code. As the number of platforms targeted by a code base
193
+ increases, the average coverage for files used by only a small number of
194
+ platforms will approach zero.
195
+
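To see why average coverage shrinks as platforms are added, consider a file that is used by exactly one platform. The sketch below is plain arithmetic rather than ``cbi-tree`` output: it assumes, as in ``cpu/foo.cpp``, that 6 of the file's 7 lines are used, and the platform counts are hypothetical.

```python
SLOC = 7  # lines in the file (as in cpu/foo.cpp)
USED = 6  # lines used by the single platform that uses the file


def avg_coverage_single_user(num_platforms):
    """Average coverage when only one of num_platforms uses the file."""
    # One platform uses USED lines; every other platform uses zero.
    return 100.0 * USED / (SLOC * num_platforms)


for n in (2, 4, 8, 16):
    print(f"{n:2d} platforms: {avg_coverage_single_user(n):5.2f}%")
```

Each doubling of the platform count halves the figure: 42.86% with two platforms, 21.43% with four, and 10.71% with eight.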
196
+ The remaining files all have a coverage of 100.00% and an average coverage
197
+ of 100.00%. This is our ideal case: all of the code in the file is used by
198
+ at least one platform, and all of the platforms use all of the code.
199
+
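The file-level rows can also be combined to reproduce the root row. The sketch below assumes that directory values are obtained by summing line counts over the files beneath them; the per-platform counts are inferred from the percentages in the tree rather than reported directly by ``cbi-tree``.

```python
# (SLOC, lines used by at least one platform, lines used by [A, B])
files = {
    "main.cpp": (13, 13, [12, 12]),
    "cpu/foo.cpp": (7, 6, [6, 0]),
    "third-party/library.h": (1, 1, [1, 1]),
    "third-party/library.cpp": (5, 5, [5, 5]),
    "gpu/foo.cpp": (7, 6, [0, 6]),
}

# Sum each quantity over every file to get the values for the root directory.
sloc = sum(f[0] for f in files.values())
used = sum(f[1] for f in files.values())
per_platform = [sum(f[2][i] for f in files.values()) for i in range(2)]

coverage = 100.0 * used / sloc
avg_coverage = 100.0 * sum(per_platform) / (len(per_platform) * sloc)

print(sloc, used, per_platform)              # 33 31 [24, 24]
print(f"{coverage:.2f} {avg_coverage:.2f}")  # 93.94 72.73
```

Note that the percentages themselves are not averaged across files; the line counts are summed first and the percentages are computed from the totals.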
+
Filtering Platforms
###################

@@ -125,3 +219,9 @@ platform as follows:
.. code:: sh

    $ codebasin -p cpu analysis.toml
+
+ or
+
+ .. code:: sh
+
+     $ cbi-tree -p cpu analysis.toml