Commit c13618d

Update documentation to reflect CodeBase behavior
A side-effect of tracking all source files in the code base is that analysis can now pick up unexpected files that were previously never encountered in a compilation database. For example, any C++ files automatically generated by CMake will be identified as unused code if codebasin is invoked in a directory containing both build/ and src/ directories.

This commit updates the documentation to highlight the importance of running codebasin in the right directory (or otherwise separating build and src).

Signed-off-by: John Pennycook <john.pennycook@intel.com>
1 parent ab645f8 commit c13618d
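
The side-effect described in the commit message can be illustrated with a minimal Python sketch. This is not codebasin's implementation: the helper names (``files_in_database``, ``unlisted_sources``, ``demo``), the suffix list, and the throwaway directory layout are all hypothetical. It only assumes the standard ``compile_commands.json`` fields ``directory`` and ``file``, and shows why a generated file under a build directory surfaces when the search starts in the parent directory but not when it starts in ``src``.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical suffix list, chosen for illustration only.
SOURCE_SUFFIXES = {".c", ".cpp", ".h", ".hpp"}


def files_in_database(db_path: Path) -> set[Path]:
    """Return the resolved source paths listed in a compile_commands.json."""
    listed = set()
    for entry in json.loads(db_path.read_text()):
        # "file" may be relative to "directory"; joining handles both cases.
        listed.add((Path(entry["directory"]) / entry["file"]).resolve())
    return listed


def unlisted_sources(root: Path, db_path: Path) -> list[Path]:
    """Return source files under root that no compile command mentions."""
    listed = files_in_database(db_path)
    found = (
        p.resolve()
        for p in root.rglob("*")
        if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )
    return sorted(p for p in found if p not in listed)


def demo() -> dict[str, list[str]]:
    """Build a tiny src/ + build-cpu/ layout and search it from two roots."""
    with tempfile.TemporaryDirectory() as tmp:
        top = Path(tmp).resolve()
        src, build = top / "src", top / "build-cpu"
        src.mkdir()
        build.mkdir()
        (src / "main.cpp").write_text("int main() { return 0; }\n")
        # Stand-in for a file that the build system generated.
        (build / "generated.cpp").write_text("// generated\n")
        db = build / "compile_commands.json"
        db.write_text(json.dumps([{
            "directory": str(build),
            "file": "../src/main.cpp",
            "command": "c++ -c ../src/main.cpp",
        }]))

        def rel(paths: list[Path]) -> list[str]:
            return [p.relative_to(top).as_posix() for p in paths]

        return {
            "from_parent": rel(unlisted_sources(top, db)),
            "from_src": rel(unlisted_sources(src, db)),
        }
```

Calling ``demo()`` searches the same layout from the parent directory and from ``src``; only the former sees the generated file, which is the situation the updated documentation warns about.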

1 file changed (+20, -3 lines changed)

docs/source/analysis.rst

Lines changed: 20 additions & 3 deletions
@@ -36,17 +36,24 @@ The table's name is the name of the platform, and we can use any meaningful
 string. The ``commands`` key tells CBI where to find the compilation database
 for this platform.
 
+.. important::
+
+    By default, ``codebasin`` searches the current working directory for source
+    files to include in its analysis. Since we'll be running in the ``src``
+    directory, we need to specify the ``commands`` paths relative to the
+    ``src`` directory or as absolute paths.
+
 In our example, we have two platforms that we're calling "cpu" and "gpu",
 and our build directories are called ``build-cpu`` and ``build-gpu``, so
 our platform definitions should look like this:
 
 .. code-block:: toml
 
     [platform.cpu]
-    commands = "build-cpu/compile_commands.json"
+    commands = "../build-cpu/compile_commands.json"
 
     [platform.gpu]
-    commands = "build-gpu/compile_commands.json"
+    commands = "../build-gpu/compile_commands.json"
 
 .. warning::
     Platform names are case sensitive! The names "cpu" and "CPU" would refer to
@@ -56,7 +63,8 @@ our platform definitions should look like this:
 Running ``codebasin``
 #####################
 
-Running ``codebasin`` with this analysis file gives the following output:
+Running ``codebasin`` in the ``src`` directory with this analysis file gives
+the following output:
 
 .. code-block:: text
     :emphasize-lines: 4,5,6,7,9
@@ -86,6 +94,15 @@ used only by the GPU compilation, and 17 lines of code shared by both
 platforms. Plugging these numbers into the equation for code divergence gives
 0.45.
 
+.. caution::
+    If we had run ``codebasin`` in the parent directory, everything in the
+    ``src``, ``build-cpu`` and ``build-gpu`` directories would have been
+    included in the analysis. For our sample code base, this would have
+    resulted in over 2000 lines of code being identified as unused! Why so
+    many? CMake generates multiple ``*.cpp`` files, which it uses as part of
+    the build process. ``codebasin`` will analyze such files unless we tell it
+    not to (more on that later).
+
 
 Filtering Platforms
 ###################
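
The path change in the first hunk can be checked with a small Python sketch. It assumes, as the added ``important`` note describes, that relative ``commands`` paths are interpreted against the directory ``codebasin`` is run from. The function name ``resolve_commands``, the dictionary mirroring the TOML tables, and the ``/project`` location are hypothetical, and this is not how ``codebasin`` itself resolves paths.

```python
import posixpath

# Mirrors the [platform.cpu] and [platform.gpu] tables from the diff.
PLATFORMS = {
    "cpu": "../build-cpu/compile_commands.json",
    "gpu": "../build-gpu/compile_commands.json",
}


def resolve_commands(platforms: dict[str, str], run_dir: str) -> dict[str, str]:
    """Resolve each platform's commands path against the run directory."""
    resolved = {}
    for name, commands in platforms.items():
        # join() leaves absolute paths untouched; normpath() collapses "..".
        resolved[name] = posixpath.normpath(posixpath.join(run_dir, commands))
    return resolved


# Hypothetical project location, used only for illustration.
paths_from_src = resolve_commands(PLATFORMS, "/project/src")
```

With ``/project/src`` as the run directory, the ``../`` prefix makes each path point at a build directory that sits next to ``src`` rather than inside it, which matches the layout the documentation assumes.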
