Commit 9654745

efaulhaber and maleadt authored
Add troubleshooting section for NSight Compute (#2442)
[skip tests] Co-authored-by: Tim Besard <tim.besard@gmail.com>
1 parent f291ab4 commit 9654745

1 file changed: +68 -0 lines changed

docs/src/development/profiling.md

Lines changed: 68 additions & 0 deletions
@@ -277,6 +277,74 @@ the API calls that have been made:
!["NVIDIA Nsight Compute - API inspection"](nsight_compute-api.png)

#### Troubleshooting Nsight Compute

If you're running into issues, make sure you're using the same version of Nsight Compute on
the host and the device, and make sure it's the latest version available. You do not need
administrative permissions to install Nsight Compute; the `runfile` downloaded from the
NVIDIA home page can be executed as a regular user.

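One way to compare the two installations is to read the version reported by `ncu --version` on each machine. The sketch below assumes that `ncu` is on the `PATH` on both sides and that its output contains a line starting with `Version`; the helper names `ncu_version` and `compare_ncu_versions` are made up for illustration, and the remote address is whatever you normally pass to `ssh`.

```shell
# Hypothetical helper: extract the version number from `ncu --version` output
# read on stdin. Assumes a line of the form "Version 2023.1.0.0 (build ...)".
ncu_version() {
    awk '$1 == "Version" { print $2; exit }'
}

# Hypothetical helper: compare the local version with the one on a remote machine.
# $1 is the SSH destination, e.g. user@host.com.
compare_ncu_versions() {
    remote=$1
    local_version=$(ncu --version | ncu_version)
    remote_version=$(ssh "$remote" ncu --version | ncu_version)
    if [ "$local_version" = "$remote_version" ]; then
        echo "both sides run Nsight Compute $local_version"
    else
        echo "version mismatch: local $local_version, remote $remote_version"
    fi
}
```

Calling `compare_ncu_versions user@host.com` prints either a confirmation or both version strings, which makes a mismatch easy to spot before digging further.
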
##### `Could not load library "libpcre2-8`

This is caused by an incompatibility between Julia and Nsight Compute, and should be fixed
in the latest versions of Nsight Compute. If it's not possible to upgrade, the following
workaround may help:

```
LD_LIBRARY_PATH=$(/path/to/julia -e 'println(joinpath(Sys.BINDIR, Base.LIBDIR, "julia"))') ncu --mode=launch /path/to/julia
```

##### The Julia process is not listed in the "Attach" tab

Make sure that the port used by Nsight Compute (49152 by default) is accessible via SSH.
To verify this, you can also try forwarding the port manually:

```
ssh user@host.com -L 49152:localhost:49152
```

Then, in the "Connect to process" window of Nsight Compute, add a connection to `localhost`
instead of the remote host.

If SSH complains with `Address already in use`, the port is already in use on your local
machine. If you're using VS Code, try closing all instances, as VS Code might automatically
forward the port when launching Nsight Compute in a terminal within VS Code.

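Before setting up the forward, it can help to check whether something is already listening on the local port. The following is a minimal sketch that relies on bash's `/dev/tcp` redirection (so it needs bash, not plain `sh`); the helper names `port_open` and `check_forward_port` are hypothetical.

```shell
# Hypothetical helper: succeeds if something accepts TCP connections on host:port.
# Uses bash's /dev/tcp redirection, so run it with bash rather than plain sh.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Report on the local port (49152 unless another one is given) before forwarding it.
check_forward_port() {
    port=${1:-49152}
    if port_open 127.0.0.1 "$port"; then
        echo "port $port is already in use locally"
    else
        echo "port $port looks free"
    fi
}
```

If the port turns out to be taken, `ss -ltnp` on Linux lists listening sockets together with the owning process, which usually points at the culprit.
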
##### Julia in Nsight Compute only shows the Julia logo, not the REPL prompt

In some versions of Nsight Compute, you might have to start Julia without the `--project`
option and switch the environment from inside Julia.

##### "Disconnected from the application" once I click "Resume"

Make sure that everything is precompiled before starting Julia with Nsight Compute;
otherwise you end up profiling the precompilation process instead of your actual
application.

Alternatively, disable auto profiling, resume, wait until the precompilation is finished,
and then enable auto profiling again.

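Precompiling up front can be done in a plain Julia session before the profiler is involved. The sketch below wraps this in a hypothetical helper, `precompile_project`, that takes the path to the Julia binary and the project directory as arguments; both paths are placeholders, and it only covers package precompilation through `Pkg.precompile()`.

```shell
# Hypothetical helper: precompile a project's packages in a plain Julia session.
# $1 is the Julia binary, $2 the project directory.
precompile_project() {
    julia_bin=$1
    project=$2
    "$julia_bin" --project="$project" -e 'using Pkg; Pkg.precompile()'
}

# Usage (placeholder paths), followed by the usual launch under Nsight Compute:
#   precompile_project /path/to/julia /path/to/project
#   ncu --mode=launch /path/to/julia
```

Running the precompilation outside of `ncu` keeps that work out of the profiling session, so the first "Resume" reaches your own code sooner.
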
##### I only see the "API Stream" tab and no tab with details on my kernel on the right

Scroll down in the "API Stream" tab and look for errors in the "Details" column.
If it says "The user does not have permission to access NVIDIA GPU Performance Counters
on the target device", add this config:

```
# cat /etc/modprobe.d/nvprof.conf
options nvidia NVreg_RestrictProfilingToAdminUsers=0
```

The `nvidia.ko` kernel module needs to be reloaded after changing this configuration, and
your system may require regenerating the initramfs or even a reboot. Refer to your
distribution's documentation for details.

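To confirm that the option took effect after reloading the module, you can inspect the parameters the driver reports. This sketch assumes the driver exposes them in `/proc/driver/nvidia/params` with a line such as `RmProfilingAdminOnly: 1`; the helper name `profiling_restricted` is hypothetical, and the file location may differ on your system.

```shell
# Hypothetical helper: succeeds if the driver restricts profiling to admin users.
# $1 optionally overrides the params file (assumed default: /proc/driver/nvidia/params).
profiling_restricted() {
    params_file=${1:-/proc/driver/nvidia/params}
    # Assumes a line of the form "RmProfilingAdminOnly: 1" (1 = restricted).
    value=$(awk -F': *' '$1 == "RmProfilingAdminOnly" { print $2; exit }' "$params_file")
    [ "$value" = "1" ]
}

# Usage:
#   if profiling_restricted; then
#       echo "performance counters are still restricted to admin users"
#   fi
```

If the helper still reports a restriction after a module reload, the initramfs or reboot step mentioned above is the likely missing piece.
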
##### Nsight Compute breaks on various API calls

Make sure `Break On API Error` is disabled in the `Debug` menu, as CUDA.jl purposefully
triggers some API errors as part of its normal operation.


## Source-code annotations

If you want to put additional information in the profile, e.g. phases of your application,
