
Profiling

ftilde edited this page Oct 29, 2024 · 1 revision

If you think your code is slow, it helps to find out where Voreen spends its time. So let's get profiling.

Linux

General

Since you most likely care about Release-mode performance, you should build Voreen in Release mode. To get easily interpretable results, the build should still include debug information. To achieve this, edit the CMake configuration before building:

CMAKE_CXX_FLAGS_RELEASE = -O3 -g -DNDEBUG
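
If you prefer to configure from the command line instead of editing the CMake cache, the same variable can be passed with -D. This is a minimal sketch, assuming it is run from the top of the Voreen source tree. The build directory name "build" is an assumption, not something this page prescribes.

```shell
# Release optimizations plus debug symbols (-g), as listed above.
RELEASE_FLAGS="-O3 -g -DNDEBUG"
# Hypothetical build directory name; adjust to your setup.
BUILD_DIR="build"

# Only configure when CMake is installed and a CMakeLists.txt is present.
if command -v cmake >/dev/null 2>&1 && [ -f CMakeLists.txt ]; then
  cmake -S . -B "$BUILD_DIR" \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_CXX_FLAGS_RELEASE="$RELEASE_FLAGS"
fi
```

The -S and -B options require CMake 3.13 or newer. With older versions, change into the build directory and pass the source path instead.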

perf

perf is a Linux profiling tool that samples the call stack during execution. Because it only samples, it does not heavily impact performance itself, but its results are less fine-grained. Still, since you usually want to improve the real-world performance of Voreen, this should probably be your first choice.

Record data:

perf record --call-graph dwarf -- bin/voreenve

Afterwards, you can review the results in the terminal by executing the following command in the same directory.

perf report -g graph --no-children

Beware: With the flags used above, perf records a lot of information per sample, so the file of recorded data may grow quickly!
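
One way to keep the recording smaller is to lower the sampling frequency with perf's -F option and check the file size afterwards. The following is only a sketch: 99 Hz is an illustrative value rather than a recommendation from this page, and perf.data is perf's default output name, spelled out here via -o.

```shell
# Illustrative sampling frequency in Hz (an assumption; tune as needed).
PERF_FREQ=99
# perf's default output file, named explicitly for clarity.
PERF_OUT="perf.data"

# Only record when perf is installed and the Voreen binary exists here.
if command -v perf >/dev/null 2>&1 && [ -x bin/voreenve ]; then
  perf record -F "$PERF_FREQ" --call-graph dwarf -o "$PERF_OUT" -- bin/voreenve
  # Show how large the recording became.
  du -h "$PERF_OUT"
fi
```

A lower frequency yields fewer samples, so very short hot spots may no longer show up. Keeping the profiled session short is another way to limit the file size.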

View perf data as a flamegraph:

perf script | stackcollapse-perf | flamegraph > graph.svg && chromium graph.svg; rm graph.svg

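Note: The FlameGraph tools have to be installed on the system. When taken directly from the repository, the scripts are named stackcollapse-perf.pl and flamegraph.pl.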
https://github.com/brendangregg/FlameGraph

The author of the tool provides much more information on his website.
http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
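
Because the FlameGraph scripts may be installed with or without the .pl suffix, the flamegraph pipeline can be made a little more robust by looking up whichever name is available. This is a sketch only: first_available is a hypothetical helper introduced here for illustration, and the block assumes perf.data is in the current directory.

```shell
# Print the first of the given command names that exists on PATH.
first_available() {
  for name in "$@"; do
    if command -v "$name" >/dev/null 2>&1; then
      echo "$name"
      return 0
    fi
  done
  return 1
}

# Try the packaged names first, then the names used in the repository.
collapse="$(first_available stackcollapse-perf stackcollapse-perf.pl)" || collapse=""
render="$(first_available flamegraph flamegraph.pl)" || render=""

# Only render when perf, both scripts and a recording are present.
if command -v perf >/dev/null 2>&1 && [ -n "$collapse" ] && [ -n "$render" ] && [ -f perf.data ]; then
  perf script | "$collapse" | "$render" > graph.svg
fi
```

The resulting graph.svg can be opened in any browser. Unlike the one-liner above, this variant keeps the file around.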

valgrind / callgrind

Alternatively, valgrind (or rather its tool callgrind) can be used to collect information while running the executable inside the valgrind virtual machine. Be aware that this makes Voreen run much more slowly!

Record data:

valgrind --tool=callgrind bin/voreenve

View results:

kcachegrind $NAME_OF_THE_GENERATED_DATA_FILE
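
By default, callgrind writes its results to a file named callgrind.out.<pid> in the working directory. The helper below is a small sketch that picks the most recently written file and opens it. latest_callgrind_out is a hypothetical name introduced here, not part of valgrind or Voreen.

```shell
# Print the most recently modified callgrind.out.* file in a directory
# (defaults to the current directory). Prints nothing if none exists.
latest_callgrind_out() {
  dir="${1:-.}"
  ls -t "$dir"/callgrind.out.* 2>/dev/null | head -n 1
}

out="$(latest_callgrind_out .)" || out=""

# Only open the viewer when a data file was found and kcachegrind exists.
if [ -n "$out" ] && command -v kcachegrind >/dev/null 2>&1; then
  kcachegrind "$out"
fi
```

This saves looking up the process ID by hand when several runs have left data files in the same directory.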
