Skip to content

Nsight Profiling to Phoenix Benchmark Cases #929

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

Malmahrouqi3
Copy link
Collaborator

@Malmahrouqi3 Malmahrouqi3 commented Jul 8, 2025

User description

Description

Adding this feature to compare next to each other nsys reports of master vs. pr. I left it now for visual comparison.
To compare reports, use diff or csv-diff after exporting readable files with nsys analyze -f <format e.g. csv, txt, etc.> -o <output-file>.
Nsight Docs: https://docs.nvidia.com/nsight-systems/UserGuide/index.html#report-scripts

Variety of Reports to Display:
nvtx_sum, osrt_sum, cuda_api_sum, cuda_gpu_kern_sum, cuda_gpu_mem_time_sum, cuda_gpu_mem_size_sum, openmp_sum, opengl_khr_range_sum, opengl_khr_gpu_range_sum, vulkan_marker_sum, vulkan_gpu_marker_sum, dx11_pix_sum, dx12_gpu_marker_sum, dx12_pix_sum, wddm_queue_sum, um_sum, um_total_sum, um_cpu_page_faults_sum, openacc_sum


PR Type

Enhancement


Description

  • Add Nsight Systems profiling to Phoenix benchmark workflow

  • Generate profiling reports for master and PR branches

  • Display NVTX, CUDA API, and GPU kernel summaries

  • Archive profiling reports as workflow artifacts


Changes diagram

flowchart LR
  A["Benchmark Execution"] --> B["nsys profile"]
  B --> C["Generate .nsys-rep"]
  C --> D["Extract Stats"]
  D --> E["Display Reports"]
  E --> F["Archive Artifacts"]
Loading

Changes walkthrough 📝

Relevant files
Enhancement
bench.sh
Enable Nsight profiling for benchmark execution                   

.github/workflows/phoenix/bench.sh

  • Wrap benchmark execution with nsys profile -o report command
  • Apply profiling to both GPU and CPU benchmark runs
  • +2/-2     
    bench.yml
    Add profiling report processing and archival                         

    .github/workflows/bench.yml

  • Add new step to process Nsight profiling reports
  • Extract and display NVTX, CUDA API, and GPU kernel statistics
  • Include profiling reports in archived artifacts
  • Compare profiling data between master and PR branches
  • +26/-0   

    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • @Copilot Copilot AI review requested due to automatic review settings July 8, 2025 16:48
    @Malmahrouqi3 Malmahrouqi3 requested a review from sbryngelson as a code owner July 8, 2025 16:48
    Copy link

    qodo-merge-pro bot commented Jul 8, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Missing Error Handling

    The nsys profile command is added without error handling. If nsys is not available or fails, the benchmark execution will fail silently or with unclear error messages.

        nsys profile -o report ./mfc.sh bench --mem 12 -j $(nproc) -o "$job_slug.yaml" -- -c phoenix-bench $device_opts -n $n_ranks
    else
        nsys profile -o report ./mfc.sh bench --mem 1 -j $(nproc) -o "$job_slug.yaml" -- -c phoenix-bench $device_opts -n $n_ranks
    Incomplete Artifact Collection

    The workflow only archives the PR's nsys report but not the master's report, making comparison difficult. Also, the profiling step only checks for PR report existence but processes both master and PR reports.

    pr/report.nsys-rep
    master/bench-${{ matrix.device }}.*
    Inconsistent Logic

    The condition checks for pr/report.nsys-rep existence but then processes both master and pr reports without verifying master report exists, which could cause command failures.

    if [ -f "pr/report.nsys-rep" ]; then
      echo "=== Nsight Profiling Summary ==="
      echo "Master"
      (cd master && nsys stats --report nvtx_sum report.nsys-rep)
      echo "Pr"
      (cd pr && nsys stats --report nvtx_sum report.nsys-rep)
    
      echo "=== CUDA API CALLS ==="
      echo "Master"
      (cd master && nsys stats --report cuda_api_sum --format table report.nsys-rep | head -100)
      echo "Pr"
      (cd pr && nsys stats --report cuda_api_sum --format table report.nsys-rep | head -100)
    
      echo "=== GPU KERNELS ==="
      echo "Master"
      (cd master && nsys stats --report cuda_gpu_kern_sum --format table report.nsys-rep | head -100)
      echo "Pr"
      (cd pr && nsys stats --report cuda_gpu_kern_sum --format table report.nsys-rep | head -100)

    Copy link
    Contributor

    @Copilot Copilot AI left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Pull Request Overview

    Adds Nsight Systems profiling to the Phoenix benchmarking workflows to allow side-by-side comparison of master vs. PR performance.

    • Wraps both GPU and CPU benchmark invocations in nsys profile within bench.sh.
    • Processes and prints key profiling metrics (NVTX, CUDA API calls, GPU kernels) in the CI via bench.yml.
    • Uploads the generated report.nsys-rep as an artifact for later inspection.

    Reviewed Changes

    Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

    File Description
    .github/workflows/phoenix/bench.sh Prepend nsys profile to the existing benchmark commands
    .github/workflows/bench.yml Add a “Process Nsight Profiling Report” step and include the report.nsys-rep artifact
    Comments suppressed due to low confidence (4)

    .github/workflows/phoenix/bench.sh:19

    • Using a fixed output name report may cause collisions or overwrite data when running multiple jobs; consider parameterizing the output file (e.g. -o "$job_slug") to improve traceability.
        nsys profile -o report ./mfc.sh bench --mem 12 -j $(nproc) -o "$job_slug.yaml" -- -c phoenix-bench $device_opts -n $n_ranks
    

    .github/workflows/phoenix/bench.sh:21

    • Same as above: the static report filename will be reused for CPU runs—consider including $job_slug or device name in the output filename to avoid overwrites.
        nsys profile -o report ./mfc.sh bench --mem 1 -j $(nproc) -o "$job_slug.yaml" -- -c phoenix-bench $device_opts -n $n_ranks
    

    .github/workflows/bench.yml:97

    • The workflow checks for pr/report.nsys-rep, but bench.sh emits report.nsys-rep in the workspace root (or TMPDIR). The path should match where the file is actually written or copy the report into pr/ beforehand.
              if [ -f "pr/report.nsys-rep" ]; then
    

    .github/workflows/bench.yml:104

    • [nitpick] The profiling section hardcodes just three report types—consider looping over the full set of reports listed in the PR description to reduce duplication and ensure all metrics are covered.
                echo "=== CUDA API CALLS ==="
    

    Copy link

    qodo-merge-pro bot commented Jul 8, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Validate both report files exist

    Check for the existence of both master and PR report files before processing.
    The current code only checks for the PR report but attempts to process both,
    which could cause errors if the master report is missing.

    .github/workflows/bench.yml [97-102]

    -if [ -f "pr/report.nsys-rep" ]; then
    +if [ -f "pr/report.nsys-rep" ] && [ -f "master/report.nsys-rep" ]; then
       echo "=== Nsight Profiling Summary ==="
       echo "Master"
       (cd master && nsys stats --report nvtx_sum report.nsys-rep)
       echo "Pr"
       (cd pr && nsys stats --report nvtx_sum report.nsys-rep)
    Suggestion importance[1-10]: 8

    __

    Why: This suggestion correctly points out a logic flaw where only one of two required files is checked, preventing a potential workflow failure.

    Medium
    General
    Add nsys availability check

    Add error handling to check if nsys command is available before execution. This
    prevents the script from failing silently if Nsight Systems is not installed on
    the system.

    .github/workflows/phoenix/bench.sh [19]

    -nsys profile -o report ./mfc.sh bench --mem 12 -j $(nproc) -o "$job_slug.yaml" -- -c phoenix-bench $device_opts -n $n_ranks
    +if command -v nsys >/dev/null 2>&1; then
    +    nsys profile -o report ./mfc.sh bench --mem 12 -j $(nproc) -o "$job_slug.yaml" -- -c phoenix-bench $device_opts -n $n_ranks
    +else
    +    echo "Warning: nsys not found, running without profiling"
    +    ./mfc.sh bench --mem 12 -j $(nproc) -o "$job_slug.yaml" -- -c phoenix-bench $device_opts -n $n_ranks
    +fi
    Suggestion importance[1-10]: 7

    __

    Why: The suggestion correctly identifies a potential point of failure and proposes a robust fallback mechanism, improving the script's resilience.

    Medium
    • Update

    Copy link

    codecov bot commented Jul 8, 2025

    Codecov Report

    All modified and coverable lines are covered by tests ✅

    Project coverage is 43.71%. Comparing base (adcc0dd) to head (d5969b3).
    Report is 2 commits behind head on master.

    Additional details and impacted files
    @@           Coverage Diff           @@
    ##           master     #929   +/-   ##
    =======================================
      Coverage   43.71%   43.71%           
    =======================================
      Files          68       68           
      Lines       18360    18360           
      Branches     2292     2292           
    =======================================
      Hits         8026     8026           
      Misses       8945     8945           
      Partials     1389     1389           

    ☔ View full report in Codecov by Sentry.
    📢 Have feedback on the report? Share it here.

    🚀 New features to boost your workflow:
    • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

    @sbryngelson sbryngelson self-requested a review July 12, 2025 17:09
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Development

    Successfully merging this pull request may close these issues.

    2 participants