06. FAQ

This page addresses common questions and issues encountered when using TritonParse.

πŸš€ Getting Started

Q: What is TritonParse?

A: TritonParse is a comprehensive visualization and analysis tool for Triton IR files. It helps developers analyze, debug, and understand Triton kernel compilation processes by:

  • Capturing structured compilation logs
  • Providing interactive visualization of IR stages
  • Mapping transformations between compilation stages
  • Offering side-by-side IR comparison

Q: Do I need to install anything to use TritonParse?

A: It depends on your use case:

  • For analysis only: No installation needed! Just visit https://pytorch-labs.github.io/tritonparse/
  • For generating traces: You need to install the Python package (pip install -e .)
  • For development: You need the full development setup
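
For generating traces, a from-source install of TritonParse might look like the following (a minimal sketch; it assumes the repository lives at https://github.com/pytorch-labs/tritonparse, matching the project's GitHub Pages URL):

# Clone TritonParse and install it in editable mode
git clone https://github.com/pytorch-labs/tritonparse.git
cd tritonparse
pip install -e .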

Q: What are the system requirements?

A:

  • Python >= 3.10
  • Triton > 3.3.1 (must be compiled from source)
  • GPU Support (for GPU tracing):
    • NVIDIA GPUs: CUDA 11.8+ or 12.x
    • AMD GPUs: ROCm 5.0+ (MI100, MI200, MI300 series)
  • Modern browser (Chrome 90+, Firefox 88+, Safari 14+, Edge 90+)

πŸ”§ Installation and Setup

Q: How do I install Triton from source?

A: Triton must be compiled from source for TritonParse to work:

# First, uninstall any existing PyTorch-bundled Triton
pip uninstall -y pytorch-triton triton || true

# Install Triton from source
git clone https://github.com/triton-lang/triton.git
cd triton
pip install -e .

For detailed instructions, see our Installation Guide or the official Triton installation guide.

Q: I'm getting "No module named 'triton'" errors. What's wrong?

A: This usually means:

  1. Triton isn't installed - Install from source (see above)
  2. Wrong Python environment - Make sure you're in the right virtual environment
  3. Installation failed - Check for compilation errors during Triton installation

Q: Do I need a GPU to use TritonParse?

A: Only for generating traces, because Triton kernels can only run on a GPU:

  • For generating traces: GPU required (NVIDIA with CUDA or AMD with ROCm)
  • For the web interface only: no GPU needed; you can view existing trace files generated by others
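
Before tracing, you can sanity-check that PyTorch sees a GPU (a minimal sketch using standard PyTorch APIs; torch.cuda.is_available() also returns True on ROCm builds):

import torch

# Tracing needs a GPU: no visible device means no kernels, hence empty traces
assert torch.cuda.is_available(), "No GPU visible to PyTorch"
print(torch.cuda.get_device_name(0))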

πŸ“Š Generating Traces

Q: How do I generate trace files?

A: Add these lines to your Python code:

import tritonparse.structured_logging
import tritonparse.utils

# Initialize logging
tritonparse.structured_logging.init("./logs/")

# Your Triton/PyTorch code here
...

# Parse logs
tritonparse.utils.unified_parse(source="./logs/", out="./parsed_output")
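
Here is a complete, minimal sketch that ties these pieces together, using the standard Triton vector-add kernel (the kernel itself is ordinary tutorial code, not part of TritonParse):

import torch
import triton
import triton.language as tl

import tritonparse.structured_logging
import tritonparse.utils

# Initialize logging BEFORE any kernel compiles
tritonparse.structured_logging.init("./logs/")

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

n = 1024
x = torch.randn(n, device="cuda")
y = torch.randn(n, device="cuda")
out = torch.empty_like(x)
add_kernel[(triton.cdiv(n, 256),)](x, y, out, n, BLOCK_SIZE=256)

# Convert the raw logs into parsed traces with source mapping
tritonparse.utils.unified_parse(source="./logs/", out="./parsed_output")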

Q: My trace files are empty. What's wrong?

A: Common causes:

  1. Logging not initialized - Make sure you call tritonparse.structured_logging.init() before kernel execution
  2. No kernel execution - Ensure your code actually executes Triton kernels
  3. Cache issues - Set TORCHINDUCTOR_FX_GRAPH_CACHE=0 environment variable
  4. Permissions - Check that the log directory is writable

Q: What's the difference between .ndjson and .gz files?

A:

  • .ndjson: Raw trace logs, no source mapping, good for debugging
  • .gz: Compressed parsed traces with full source mapping, recommended for analysis

Always use .gz files for full functionality in the web interface.
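
Because .ndjson is newline-delimited JSON, the raw log can be inspected directly (a sketch; the file name is hypothetical and the exact event fields depend on your TritonParse version):

import json

with open("./logs/trace.ndjson") as f:  # hypothetical file name
    for line in f:
        event = json.loads(line)
        print(sorted(event.keys()))  # see which fields each event carries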

Q: How do I trace PyTorch 2.0 compiled functions?

A: Use the same setup, but make sure to set the environment variable:

export TORCHINDUCTOR_FX_GRAPH_CACHE=0

Then run your code:

import torch

compiled_fn = torch.compile(your_function)
result = compiled_fn(input_data)  # Triton kernels compiled here will be traced

🌐 Web Interface

Q: Can I use TritonParse without uploading my traces to the internet?

A: Yes! The web interface runs entirely in your browser: trace files you open are processed locally and never uploaded to a server.

Q: Why can't I see source mappings in the web interface?

A:

  1. Using .ndjson files - Switch to the .gz files in the parsed_output directory
  2. Parsing failed - Check that unified_parse() completed successfully
  3. Browser issues - Try refreshing the page or using a different browser

Q: The web interface is slow. How can I improve performance?

A:

  • Use smaller trace files - Trace specific kernels instead of entire programs
  • Enable hardware acceleration in your browser
  • Close other browser tabs to free up memory
  • Use a modern browser (Chrome recommended)

Q: How do I share my analysis with others?

A:

  1. Upload your .gz file to a file hosting service
  2. Share the URL with parameters (see the snippet after this list):
    https://pytorch-labs.github.io/tritonparse/?json_url=YOUR_FILE_URL
    
  3. Take screenshots of important findings
  4. Export specific views using browser bookmark features
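
To build the share link programmatically, URL-encode the hosted file's address before appending it as the json_url parameter (a minimal sketch; the hosting URL is hypothetical):

from urllib.parse import quote

file_url = "https://example.com/my_trace.gz"  # hypothetical hosting URL
share_link = f"https://pytorch-labs.github.io/tritonparse/?json_url={quote(file_url, safe='')}"
print(share_link)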

πŸ” Analysis and Debugging

Q: How do I understand the different IR stages?

A: Here's the compilation pipeline:

| Stage  | Description                                | When to Use                     |
|--------|--------------------------------------------|---------------------------------|
| TTIR   | Triton IR - high-level language constructs | Understanding kernel logic      |
| TTGIR  | Triton GPU IR - GPU-specific operations    | GPU-specific optimizations      |
| LLIR   | LLVM IR - low-level operations             | Compiler optimizations          |
| PTX    | NVIDIA assembly                            | Final code generation (NVIDIA)  |
| AMDGCN | AMD assembly                               | Final code generation (AMD)     |

Q: What should I look for when analyzing performance?

A: Key areas to examine:

  • Memory access patterns in PTX/AMDGCN
  • Register usage in kernel metadata
  • Vectorization in TTGIR β†’ PTX transformation
  • Memory coalescing in assembly code
  • Branch divergence in control flow

Q: How do I debug compilation failures?

A:

  1. Check the call stack for error location
  2. Start with TTIR to identify syntax issues
  3. Look at LLIR for type problems
  4. Check PTX generation for hardware compatibility
  5. Enable debug logging with TRITONPARSE_DEBUG=1 (example below)
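
For the last step, debug logging is controlled by an environment variable (the script name is a placeholder):

# Enable verbose TritonParse logging for one run
export TRITONPARSE_DEBUG=1
python your_script.py  # placeholder for your entry point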

Q: Can I compare different versions of the same kernel?

A: Yes! Generate separate traces for each version:

# Version 1
tritonparse.structured_logging.init("./logs_v1/")
# ... run kernel v1
tritonparse.utils.unified_parse(source="./logs_v1/", out="./output_v1")

# Version 2
tritonparse.structured_logging.init("./logs_v2/")
# ... run kernel v2
tritonparse.utils.unified_parse(source="./logs_v2/", out="./output_v2")

Then analyze both sets of traces in the web interface.

πŸ› Common Issues

Q: I'm getting "WARNING: SourceMapping: No frame_id found" messages

A: This is usually normal for:

  • PyTorch 2.0 compiled functions - The warning is expected
  • Complex call stacks - Some frame information might be missing
  • Cached kernels - Set TORCHINDUCTOR_FX_GRAPH_CACHE=0

The warning doesn't affect functionality.

Q: My kernels aren't showing up in the trace

A: Check these common issues:

  1. Kernel not actually executed - Ensure your code path runs
  2. Compilation cache - Set TORCHINDUCTOR_FX_GRAPH_CACHE=0
  3. Logging initialized too late - Call init() before kernel execution
  4. Multiple processes - Each process needs its own log directory

Q: The web interface shows "No kernels found"

A:

  1. Check file format - Use .gz files from parsed_output
  2. Verify parsing - Ensure unified_parse() completed successfully
  3. Check file size - Very large files might not load properly
  4. Try different browser - Some browsers have stricter limits

Q: How do I trace only specific kernels?

A: Use the kernel allowlist:

export TRITONPARSE_KERNEL_ALLOWLIST="my_kernel*,important_*"

This traces only kernels matching the specified patterns.
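
The allowlist can also be set from Python, as long as it happens before logging starts (a sketch; it assumes the variable is read when init() is called, which matches typical environment-variable handling):

import os

# Must be set before TritonParse reads its configuration
os.environ["TRITONPARSE_KERNEL_ALLOWLIST"] = "my_kernel*,important_*"

import tritonparse.structured_logging
tritonparse.structured_logging.init("./logs/")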

πŸš€ Advanced Usage

Q: Can I use TritonParse with multiple GPUs?

A: Yes! For multi-GPU setups:

# Parse all ranks
tritonparse.utils.unified_parse(
    source="./logs/",
    out="./parsed_output",
    all_ranks=True
)

# Or parse specific rank
tritonparse.utils.unified_parse(
    source="./logs/",
    out="./parsed_output",
    rank=1
)
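
A typical multi-GPU session pairs a distributed launcher with the parse step above (a sketch; it assumes each rank's events end up rank-tagged in the shared log directory, which is what the all_ranks option implies):

# Launch two ranks; both trace into ./logs/
torchrun --nproc_per_node=2 your_script.py  # placeholder script

# Parse every rank's trace in one pass
python -c "import tritonparse.utils; tritonparse.utils.unified_parse(source='./logs/', out='./parsed_output', all_ranks=True)"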

Q: How do I automate trace generation in CI/CD?

A: Example CI script:

#!/bin/bash
export TORCHINDUCTOR_FX_GRAPH_CACHE=0
export TRITONPARSE_DEBUG=1

# Run your tests; tracing must be initialized inside the test code itself
python -m pytest tests/

# Parse and archive traces
python -c "
import tritonparse.utils
tritonparse.utils.unified_parse('./logs/', './artifacts/')
"

# Archive results
tar -czf traces.tar.gz ./artifacts/

Q: Can I extend TritonParse with custom IR formats?

A: Yes! See the Developer Guide for details on:

  • Adding new IR format support
  • Extending metadata extraction
  • Contributing new features

Q: How do I handle very large trace files?

A: Strategies for large files:

  1. Filter kernels using TRITONPARSE_KERNEL_ALLOWLIST
  2. Split by rank using rank=N parameter
  3. Use compression with TRITON_TRACE_GZIP=1
  4. Process in chunks - analyze subsets of kernels
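
The filtering and compression options above combine naturally in one shell setup (both variables appear elsewhere in this FAQ; the script name is a placeholder):

# Keep traces small: filter kernels and compress the output
export TRITONPARSE_KERNEL_ALLOWLIST="my_kernel*"
export TRITON_TRACE_GZIP=1
python your_script.py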

🀝 Contributing and Community

Q: How can I contribute to TritonParse?

A:

  1. Check the Contributing Guide
  2. Look at GitHub Issues
  3. Join GitHub Discussions
  4. Submit bug reports with detailed reproduction steps

Q: Where can I get help?

A:

  • GitHub Discussions - Community Q&A
  • GitHub Issues - Bug reports and feature requests
  • Documentation - Comprehensive guides in this wiki
  • Discord/Slack - Real-time community chat (if available)

Q: How do I report bugs?

A: When reporting bugs, include:

  1. System information - Python version, OS, GPU
  2. TritonParse version - Check pip list
  3. Reproduction steps - Minimal code example
  4. Error messages - Complete error logs
  5. Expected vs actual behavior
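
A quick way to collect most of this information (standard commands; adjust for your environment):

python --version
pip list | grep -Ei "triton|torch"   # TritonParse, Triton, and PyTorch versions
nvidia-smi                           # or rocm-smi on AMD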

πŸ“š Learning Resources

Q: Where can I learn more about Triton?

A:

  • Official Triton documentation: https://triton-lang.org/
  • Triton GitHub repository: https://github.com/triton-lang/triton
  • The tutorials included in the official Triton documentation

Q: Are there example use cases?

A: Yes! Check out:

  • Basic Examples - Simple usage scenarios
  • Advanced Examples - Complex use cases
  • Test files in the tests/ directory
  • Community examples in GitHub Discussions

Q: How do I stay updated on new features?

A:

  • Watch the GitHub repository for releases
  • Follow the project on GitHub
  • Join GitHub Discussions for announcements
  • Check the Release Notes

πŸ”— Quick Links

  • Installation Guide
  • Developer Guide
  • Contributing Guide
  • GitHub Discussions

Can't find your question? Ask in GitHub Discussions or open a GitHub Issue with reproduction details.
