Commit a34b098

Add cbi-tree example to the analysis tutorial
Signed-off-by: John Pennycook <john.pennycook@intel.com>
1 parent af62c0a commit a34b098

1 file changed: +100 -0 lines changed

docs/source/analysis.rst

Lines changed: 100 additions & 0 deletions
@@ -104,6 +104,100 @@ platforms. Plugging these numbers into the equation for code divergence gives
not to (more on that later).


Running ``cbi-tree``
####################

Running ``codebasin`` provides an overview of divergence and coverage, which
can be useful when we want to familiarize ourselves with a new code base,
compare the impact of different code structures upon certain metrics, or track
specialization metrics over time. However, it doesn't provide any *actionable*
insight into how to improve a code base.

To understand how much specialization exists in each source file, we can
substitute ``cbi-tree`` for ``codebasin``::

    $ cbi-tree analysis.toml

This command performs the same analysis as ``codebasin``, but produces a tree
annotated with information about which files contain specialization:

.. code-block:: text
    :emphasize-lines: 8,9,11,16

    Legend:
    A: cpu
    B: gpu

    Columns:
    [Platforms | SLOC | Coverage (%) | Avg. Coverage (%)]

    [AB | 33 |  93.94 |  72.73] o /home/username/code-base-investigator/docs/sample-code-base/src/
    [AB | 13 | 100.00 |  92.31] ├── main.cpp
    [A- |  7 |  85.71 |  42.86] ├─o cpu/
    [A- |  7 |  85.71 |  42.86] │ └── foo.cpp
    [AB |  6 | 100.00 | 100.00] ├─o third-party/
    [AB |  1 | 100.00 | 100.00] │ ├── library.h
    [AB |  5 | 100.00 | 100.00] │ └── library.cpp
    [-B |  7 |  85.71 |  42.86] └─o gpu/
    [-B |  7 |  85.71 |  42.86] └── foo.cpp

.. tip::

    Running ``cbi-tree`` in a modern terminal environment produces colored
    output to improve usability for large code bases.

Each node in the tree represents a source file or directory in the code
base and is annotated with four pieces of information:

1. **Platforms**

   The set of platforms that use the file or directory.

2. **SLOC**

   The number of source lines of code (SLOC) in the file or directory.

3. **Coverage (%)**

   The amount of code in the file or directory that is used by at least one
   platform, as a percentage of SLOC.

4. **Avg. Coverage (%)**

   The amount of code in the file or directory that is used by each platform,
   on average, as a percentage of SLOC.

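
The two coverage columns can be reproduced by hand. The sketch below is not
how ``cbi-tree`` is implemented; it assumes a hypothetical representation in
which each platform maps to the set of line numbers it uses in a file. The
function name and the specific line numbers are invented for illustration and
chosen only so that the counts match the rows for ``main.cpp`` and
``cpu/foo.cpp`` in the sample output.

```python
def coverage_metrics(sloc, used_lines):
    """Return (coverage, avg_coverage) as percentages of SLOC.

    used_lines maps a platform name to the set of line numbers it uses.
    This representation is an assumption made for this example.
    """
    # Coverage counts lines used by at least one platform.
    used_by_any = set().union(*used_lines.values())
    coverage = 100.0 * len(used_by_any) / sloc

    # Average coverage counts the lines used per platform, on average.
    avg_used = sum(len(lines) for lines in used_lines.values()) / len(used_lines)
    avg_coverage = 100.0 * avg_used / sloc
    return coverage, avg_coverage


# Hypothetical line sets: each platform uses 12 of the 13 lines in main.cpp.
main_cpp = {
    "cpu": set(range(1, 13)),         # lines 1-12
    "gpu": set(range(1, 12)) | {13},  # lines 1-11 and 13
}
# Hypothetical line sets: only the cpu platform uses cpu/foo.cpp (6 of 7 lines).
cpu_foo = {
    "cpu": set(range(1, 7)),
    "gpu": set(),
}

main_cov, main_avg = coverage_metrics(13, main_cpp)
foo_cov, foo_avg = coverage_metrics(7, cpu_foo)
print(f"main.cpp:    {main_cov:.2f} {main_avg:.2f}")  # 100.00 92.31
print(f"cpu/foo.cpp: {foo_cov:.2f} {foo_avg:.2f}")    # 85.71 42.86
```

Using sets makes the difference between the two metrics explicit: coverage
depends on the union of the per-platform sets, whereas average coverage depends
on their individual sizes.
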
The root of the tree represents the entire code base, and so the values in
the annotations match the ``codebasin`` results: two platforms (``A`` and
``B``) use the directory, there are 33 lines in total, 93.94% of those lines
(i.e., 31 lines) are used by at least one platform, and each platform uses
72.73% of those lines (i.e., 24 lines) on average. By walking the tree, we can
break these numbers down across the individual files and directories in the
code base.

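
As a rough check of this breakdown, the sketch below rebuilds the root row from
the five leaf files. It assumes that directory values come from summed line
counts rather than from averaging the child percentages; this is an inference
from the sample output, not a statement about ``cbi-tree`` internals. The
per-file line counts are derived from the percentages shown in the tree, and
the ``aggregate`` helper is hypothetical.

```python
# (SLOC, lines used by at least one platform, lines used per platform on average)
# Counts are derived from the percentages in the sample tree.
leaves = {
    "main.cpp": (13, 13, 12),
    "cpu/foo.cpp": (7, 6, 3),
    "third-party/library.h": (1, 1, 1),
    "third-party/library.cpp": (5, 5, 5),
    "gpu/foo.cpp": (7, 6, 3),
}


def aggregate(rows):
    """Sum line counts first, then convert the totals to percentages."""
    rows = list(rows)
    sloc = sum(r[0] for r in rows)
    used = sum(r[1] for r in rows)
    avg_used = sum(r[2] for r in rows)
    return sloc, 100.0 * used / sloc, 100.0 * avg_used / sloc


sloc, coverage, avg_coverage = aggregate(leaves.values())
print(f"{sloc} | {coverage:.2f} | {avg_coverage:.2f}")  # 33 | 93.94 | 72.73
```

The totals (33 lines, 31 used, 24 used on average) agree with the root row of
the sample output.
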
Starting with ``main.cpp``, we can see that it is used by both platforms
(``A`` and ``B``), and that 100% of the 13 lines in the file are used by at
least one platform. However, the average coverage is only 92.31%, reflecting
that each platform uses only 12 of those lines.

Turning our attention to ``cpu/foo.cpp`` and ``gpu/foo.cpp``, we can see
that they are each specialized for one platform (``A`` and ``B``,
respectively). The coverage for both files is only 85.71% (i.e., 6 of the 7
lines), which tells us that both files contain some unused code (i.e., 1 line).
The average coverage of 42.86% highlights the extent of the specialization:
only one of the two platforms uses each file, so on average each platform uses
just 3 of the 7 lines.

.. tip::

    Looking at average coverage is the best way to identify highly specialized
    regions of code. As the number of platforms targeted by a code base
    increases, the average coverage for files used by only a small number of
    platforms will approach zero.

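
The trend described in the tip above can be illustrated with a little
arithmetic. The sketch below takes a file like ``cpu/foo.cpp`` (7 SLOC, 6 lines
used) and assumes, hypothetically, that it remains specific to a single
platform while the total number of platforms grows. The ``avg_coverage``
function and the platform counts are made up for this example.

```python
def avg_coverage(sloc, used_lines, using_platforms, total_platforms):
    """Average coverage (%) when each using platform uses `used_lines` lines
    and every other platform uses none of the file."""
    avg_used = used_lines * using_platforms / total_platforms
    return 100.0 * avg_used / sloc


# One platform uses 6 of 7 lines; vary the total number of platforms.
results = {n: round(avg_coverage(7, 6, 1, n), 2) for n in (2, 4, 8, 16)}
for n, pct in results.items():
    print(f"{n:2d} platforms: {pct:.2f}%")
```

With two platforms this reproduces the 42.86% shown in the sample output, and
the value halves each time the number of platforms doubles.
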
The remaining files all have a coverage of 100.00% and an average coverage
of 100.00%. This is our ideal case: all of the code in the file is used by
at least one platform, and all of the platforms use all of the code.


Filtering Platforms
###################

@@ -125,3 +219,9 @@ platform as follows:
.. code:: sh

    $ codebasin -p cpu analysis.toml

or

.. code:: sh

    $ cbi-tree -p cpu analysis.toml
