Add Comprehensive CI Scripts and Documentation #26

Closed
33 commits
5fc74ac
Add CI scripts for environment setup and testing
FindHao Jul 8, 2025
e7d51b3
Update GitHub Actions workflow to include new branch for CI testing
FindHao Jul 8, 2025
b9e8f21
Enhance CI scripts and GitHub Actions workflow
FindHao Jul 8, 2025
e5d2b26
Add cuDNN installation script and update README
FindHao Jul 8, 2025
482a7e2
Refactor GitHub Actions workflow to remove linting job
FindHao Jul 8, 2025
04b3fbf
Enhance cuDNN installation script for architecture detection
FindHao Jul 8, 2025
564ee23
fix chmod error when install cudnn
FindHao Jul 8, 2025
50b2d92
fix ls
FindHao Jul 8, 2025
3d295a8
let triton run use system's libstdc++
FindHao Jul 8, 2025
6607487
upgrade libstdc++ in conda
FindHao Jul 8, 2025
6c794ba
use clangd
FindHao Jul 8, 2025
301bbb4
Update CI setup script to install CUDA and streamline dependency inst…
FindHao Jul 8, 2025
9beb34b
use llvm 17
FindHao Jul 8, 2025
775a81c
fix gpg key issues
FindHao Jul 8, 2025
4af92e3
fix gpg key
FindHao Jul 8, 2025
38b0974
gpg key
FindHao Jul 8, 2025
cd5b169
add libstdc++-12-dev
FindHao Jul 8, 2025
bb678a7
add other libs
FindHao Jul 8, 2025
2a32055
update readme
FindHao Jul 9, 2025
d4cb5aa
update test command
FindHao Jul 9, 2025
c6a9711
sync with unit tests
FindHao Jul 9, 2025
a2426b3
fix lint
FindHao Jul 9, 2025
e519951
Enhance CUDA version detection in setup script and improve caching in…
FindHao Jul 10, 2025
05d008d
Refactor Triton installation script for improved caching and reproduc…
FindHao Jul 10, 2025
5e1c78d
Enhance Triton installation script with commit validation and improve…
FindHao Jul 10, 2025
60db4e0
Refactor Triton installation script to improve cache handling and log…
FindHao Jul 10, 2025
21403f4
Enhance Triton installation script with improved logging, pre-flight …
FindHao Jul 10, 2025
87e6c72
Update GitHub Actions workflow to include daily cache timestamp for p…
FindHao Jul 10, 2025
273e4a1
Remove redundant Triton uninstallation steps from the installation sc…
FindHao Jul 10, 2025
fca2d1e
Enhance setup script with smart APT update strategy, improved CUDA ve…
FindHao Jul 10, 2025
da8857b
Update GitHub Actions workflow to ignore specific paths during push a…
FindHao Jul 10, 2025
33d9e09
Remove cuDNN installation script and update setup process to use PyTo…
FindHao Jul 10, 2025
d8c2ecb
Update cuDNN installation command in setup script to use sudo for ele…
FindHao Jul 10, 2025
122 changes: 122 additions & 0 deletions .ci/README.md
@@ -0,0 +1,122 @@
# CI Scripts

This directory contains scripts for setting up and running the tritonparse CI environment.

## Scripts Overview

### `setup.sh`
Sets up the conda environment, installs dependencies, configures CUDA, and installs cuDNN.

**Environment Variables:**
- `CONDA_ENV`: Conda environment name (default: "tritonparse")
- `PYTHON_VERSION`: Python version (default: "3.11")
- `CUDA_VERSION`: CUDA version (default: "12.8")
- `CUDNN_VERSION`: cuDNN version (default: "9.10.2.21")

**Usage:**
```bash
CONDA_ENV=tritonparse PYTHON_VERSION=3.11 bash .ci/setup.sh
```
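
The defaults listed above can be reproduced with shell parameter expansion. The following is a minimal sketch, assuming `setup.sh` resolves its variables this way; `resolve_setup_defaults` is a hypothetical name used only for illustration and does not appear in the script.

```shell
#!/bin/bash
# Hypothetical sketch: resolve_setup_defaults is NOT part of setup.sh.
# It fills in the documented default for any variable left unset or empty.
resolve_setup_defaults() {
  CONDA_ENV="${CONDA_ENV:-tritonparse}"
  PYTHON_VERSION="${PYTHON_VERSION:-3.11}"
  CUDA_VERSION="${CUDA_VERSION:-12.8}"
  CUDNN_VERSION="${CUDNN_VERSION:-9.10.2.21}"
}

# Example: override only the Python version; the rest fall back to defaults.
unset CONDA_ENV CUDA_VERSION CUDNN_VERSION
PYTHON_VERSION=3.12
resolve_setup_defaults
echo "env=$CONDA_ENV python=$PYTHON_VERSION cuda=$CUDA_VERSION cudnn=$CUDNN_VERSION"
```

Variables passed on the command line, as in the usage example, take precedence because `${VAR:-default}` only substitutes when `VAR` is unset or empty.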

> **Note:**
> `setup.sh` will automatically download and execute the official PyTorch cuDNN installation script from:
> https://github.com/pytorch/pytorch/blob/main/.ci/docker/common/install_cudnn.sh
> There is no need to maintain a local cuDNN install script.

### `install-triton.sh`
Installs Triton from source by cloning the repository and building it.

**Environment Variables:**
- `CONDA_ENV`: Conda environment name (required)
- `TRITON_COMMIT`: Triton commit or branch to check out (default: "main")

**Usage:**
```bash
CONDA_ENV=tritonparse bash .ci/install-triton.sh
```
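
`install-triton.sh` also reads an optional `TRITON_COMMIT` variable (default `main`). Its cache check compares that value with the `git rev-parse HEAD` output saved by the previous successful build, so the cached source appears to be reused only when `TRITON_COMMIT` is a full commit hash. The sketch below shows one way to check that before invoking the script; `is_full_sha` is a hypothetical helper, not something the script provides.

```shell
#!/bin/bash
# Hypothetical helper (not in install-triton.sh): succeeds only for a
# 40-character lowercase hexadecimal string, i.e. a full SHA-1 commit hash.
is_full_sha() {
  [ "${#1}" -eq 40 ] || return 1
  case "$1" in
    *[!0-9a-f]*) return 1 ;;
  esac
  return 0
}

# Usage sketch (needs network access, so it is left commented out):
#   TRITON_COMMIT="$(git ls-remote https://github.com/triton-lang/triton.git refs/heads/main | cut -f1)"
#   is_full_sha "$TRITON_COMMIT" &&
#     CONDA_ENV=tritonparse TRITON_COMMIT="$TRITON_COMMIT" bash .ci/install-triton.sh
```

A branch name or an abbreviated hash would still build correctly, but it would not match the stored hash and would therefore trigger a full rebuild.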

### `install-project.sh`
Installs the tritonparse project in editable mode with test dependencies.

**Environment Variables:**
- `CONDA_ENV`: Conda environment name (required)

**Usage:**
```bash
CONDA_ENV=tritonparse bash .ci/install-project.sh
```

### `run-tests.sh`
Runs the test suite with proper environment setup.

**Environment Variables:**
- `CONDA_ENV`: Conda environment name (required)
- `TEST_TYPE`: Type of tests to run (default: "all")
  - `cpu`: CPU tests only
  - `cuda`: CUDA tests only
  - `all`: All tests
- `VERBOSE`: Enable verbose output (default: "true")
- `COVERAGE`: Enable coverage reporting (default: "false")

**Usage:**
```bash
# Run all tests
CONDA_ENV=tritonparse bash .ci/run-tests.sh

# Run CPU tests only with coverage
CONDA_ENV=tritonparse TEST_TYPE=cpu COVERAGE=true bash .ci/run-tests.sh

# Run CUDA tests only
CONDA_ENV=tritonparse TEST_TYPE=cuda bash .ci/run-tests.sh
```
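
The source of `run-tests.sh` is not shown in this diff, so the following is only a sketch of how the three variables might be turned into a test command. The function name `build_test_cmd`, the `tests/` path, and the `-k` selection expressions are all assumptions; the real script may select tests differently.

```shell
#!/bin/bash
# Hypothetical sketch: build_test_cmd is NOT taken from run-tests.sh.
# The tests/ path and the "-k" expressions are assumptions for illustration.
build_test_cmd() {
  local test_type="${1:-all}" verbose="${2:-true}" coverage="${3:-false}"
  local cmd="python -m pytest"
  if [ "$coverage" = "true" ]; then
    # Run pytest under the coverage package instead of plain python
    cmd="coverage run -m pytest"
  fi
  cmd="$cmd tests/"
  case "$test_type" in
    cpu)  cmd="$cmd -k not cuda" ;;
    cuda) cmd="$cmd -k cuda" ;;
    all)  ;;
    *)    echo "ERROR: unknown TEST_TYPE '$test_type'" >&2; return 1 ;;
  esac
  if [ "$verbose" = "true" ]; then
    cmd="$cmd -v"
  fi
  # Printed for inspection only; quoting is lost in this flat form
  echo "$cmd"
}

build_test_cmd "${TEST_TYPE:-all}" "${VERBOSE:-true}" "${COVERAGE:-false}"
```

Rejecting unknown `TEST_TYPE` values up front avoids silently running the wrong subset when the variable is mistyped.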

## Complete Workflow

For a complete setup and test run:

```bash
# 1. Setup environment (includes cuDNN installation)
CONDA_ENV=tritonparse PYTHON_VERSION=3.11 bash .ci/setup.sh

# 2. Install Triton
CONDA_ENV=tritonparse bash .ci/install-triton.sh

# 3. Install project
CONDA_ENV=tritonparse bash .ci/install-project.sh

# 4. Run tests
CONDA_ENV=tritonparse TEST_TYPE=all COVERAGE=true bash .ci/run-tests.sh
```
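
The four steps can also be chained so that a failure stops the run and names the step that failed. This is a minimal sketch, not part of the PR; `run_step` and `run_pipeline` are hypothetical names.

```shell
#!/bin/bash
# Hypothetical wrapper, NOT part of the PR: runs each step in order and
# stops at the first failure, reporting which step failed.
run_step() {
  local name="$1"
  shift
  echo "==> $name"
  if ! "$@"; then
    echo "ERROR: step '$name' failed" >&2
    return 1
  fi
}

run_pipeline() {
  run_step "setup"           bash .ci/setup.sh &&
  run_step "install triton"  bash .ci/install-triton.sh &&
  run_step "install project" bash .ci/install-project.sh &&
  run_step "run tests"       bash .ci/run-tests.sh
}

# Call with the variables exported first, for example:
#   export CONDA_ENV=tritonparse PYTHON_VERSION=3.11 TEST_TYPE=all COVERAGE=true
#   run_pipeline
```

Exporting the variables once avoids repeating `CONDA_ENV=...` on every line and keeps the four scripts working against the same environment.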

## Local Development

For local development, you can use these scripts to set up the same environment as CI:

```bash
# Setup local environment (includes cuDNN installation)
CONDA_ENV=tritonparse-local bash .ci/setup.sh

# Install Triton
CONDA_ENV=tritonparse-local bash .ci/install-triton.sh

# Install project
CONDA_ENV=tritonparse-local bash .ci/install-project.sh

# Run tests
CONDA_ENV=tritonparse-local bash .ci/run-tests.sh
```

## Script Features

- **Error handling**: All scripts use `set -e` to stop on errors
- **Environment validation**: Scripts check for required environment variables
- **Verbose output**: Detailed logging for debugging
- **Modular design**: Each script has a single responsibility
- **Reusable**: Scripts can be used in different contexts (CI, local development)
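
The environment validation in the install scripts is an inline `[ -z "$CONDA_ENV" ]` check. A reusable variant might look like the sketch below; `require_env` is a hypothetical helper that none of the scripts define, and it relies on bash indirect expansion.

```shell
#!/bin/bash
# Hypothetical helper: require_env is an illustrative name, NOT a function
# defined in the CI scripts. It fails if any named variable is unset or empty.
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name
    if [ -z "${!name:-}" ]; then
      echo "ERROR: $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage sketch at the top of a script:
#   require_env CONDA_ENV || exit 1
```

Checking every name before returning reports all missing variables in one run instead of one per attempt.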

## Dependencies

The scripts assume:
- Linux environment
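- Git available
- Conda available under `/opt/miniconda3` (the install scripts source `/opt/miniconda3/etc/profile.d/conda.sh`)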
- Internet access for downloading packages
- Sufficient disk space for conda and packages
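
These assumptions can be checked before starting a long build. The sketch below is hypothetical and not part of the PR; the function names are invented for illustration, and the 10 GB default mirrors the threshold that `install-triton.sh` uses for `/tmp`.

```shell
#!/bin/bash
# Hypothetical preflight sketch; none of these functions are part of the PR.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Free space in KB for the filesystem holding the given path.
# -P keeps each filesystem on one line; -k forces 1 KB blocks.
free_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

check_prereqs() {
  local min_kb="${1:-10485760}" status=0
  [ "$(uname -s)" = "Linux" ] || { echo "WARNING: not running on Linux" >&2; status=1; }
  have_cmd git || { echo "ERROR: git not found" >&2; status=1; }
  [ "$(free_kb /tmp)" -ge "$min_kb" ] || { echo "WARNING: low disk space in /tmp" >&2; status=1; }
  return "$status"
}

# Usage sketch:
#   check_prereqs || echo "Some prerequisites are missing; see messages above"
```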
33 changes: 33 additions & 0 deletions .ci/install-project.sh
@@ -0,0 +1,33 @@
#!/bin/bash

# Install tritonparse project dependencies
# This script installs the project in editable mode with test dependencies

set -e

echo "Installing tritonparse project dependencies..."

# Ensure CONDA_ENV is set before trying to activate it
if [ -z "$CONDA_ENV" ]; then
  echo "ERROR: CONDA_ENV is not set"
  exit 1
fi

# Activate conda environment
source /opt/miniconda3/etc/profile.d/conda.sh
conda activate "$CONDA_ENV"

# Upgrade pip
echo "Upgrading pip..."
python -m pip install --upgrade pip

# Install project in editable mode with test dependencies
echo "Installing tritonparse in editable mode..."
pip install -e ".[test]"

# Verify installation
echo "Verifying installation..."
python -c "import tritonparse; print('tritonparse installed successfully')"
python -c "import coverage; print(f'coverage version: {coverage.__version__}')"

echo "Project installation completed successfully!"
169 changes: 169 additions & 0 deletions .ci/install-triton.sh
@@ -0,0 +1,169 @@
#!/bin/bash

# Install Triton from source
# This script clones and installs Triton from the main repository

set -e

echo "🚀 Installing Triton from source..."
START_TIME=$(date +%s)

# Function to show elapsed time
show_elapsed() {
  CURRENT_TIME=$(date +%s)
  ELAPSED=$((CURRENT_TIME - START_TIME))
  echo "⏱️ Elapsed time: ${ELAPSED}s"
}

# Pre-flight checks
echo "🔍 Running pre-flight checks..."

# Set Triton version/commit for cache consistency
TRITON_COMMIT=${TRITON_COMMIT:-"main"}
echo "🎯 Target Triton commit/branch: $TRITON_COMMIT"
TRITON_CACHE_DIR="/tmp/triton-cache"
TRITON_SOURCE_DIR="/tmp/triton"

# Check disk space (need at least 10GB for Triton compilation)
# -P keeps each filesystem on one line; -k forces 1 KB blocks
AVAILABLE_SPACE=$(df -Pk /tmp | tail -1 | awk '{print $4}')
REQUIRED_SPACE=10485760 # 10GB in KB
if [ "$AVAILABLE_SPACE" -lt "$REQUIRED_SPACE" ]; then
  echo "⚠️ WARNING: Low disk space. Available: $((AVAILABLE_SPACE / 1024 / 1024))GB, Recommended: 10GB"
else
  echo "✅ Sufficient disk space available: $((AVAILABLE_SPACE / 1024 / 1024))GB"
fi

# Ensure CONDA_ENV is set before trying to activate it
if [ -z "$CONDA_ENV" ]; then
  echo "ERROR: CONDA_ENV is not set"
  exit 1
fi

# Activate conda environment
source /opt/miniconda3/etc/profile.d/conda.sh
conda activate "$CONDA_ENV"

# Create cache directory
mkdir -p "$TRITON_CACHE_DIR"

# Check if we have cached source with correct commit
if [ -f "$TRITON_CACHE_DIR/commit" ] && [ -d "$TRITON_SOURCE_DIR" ]; then
  CACHED_COMMIT=$(cat "$TRITON_CACHE_DIR/commit")
  if [ "$CACHED_COMMIT" = "$TRITON_COMMIT" ] && [ "$TRITON_COMMIT" != "main" ]; then
    echo "Found cached Triton source with correct commit ($CACHED_COMMIT)"
    echo "Will use cached source and re-install to new conda environment"
    USE_CACHED_SOURCE=true
  elif [ "$TRITON_COMMIT" = "main" ]; then
    echo "Target is 'main' branch (API fallback), will rebuild from scratch"
    echo "Cached commit: $CACHED_COMMIT"
    USE_CACHED_SOURCE=false
  else
    echo "Cached source commit mismatch: cached=$CACHED_COMMIT, target=$TRITON_COMMIT"
    echo "Will rebuild from scratch"
    USE_CACHED_SOURCE=false
  fi
else
  echo "No cached source found or no commit info, will build from scratch"
  USE_CACHED_SOURCE=false
fi

# Update libstdc++ to match system version
# Otherwise, we get errors like:
# ImportError: /opt/miniconda3/envs/tritonparse/bin/../lib/libstdc++.so.6:
# version `GLIBCXX_3.4.30' not found (required by /tmp/triton/python/triton/_C/libtriton.so)
echo "Updating libstdc++ to match system version..."
conda install -y -c conda-forge libstdcxx-ng=12.3.0
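# Check if the update was successful
# (CONDA_PREFIX is set by `conda activate` and points at the active environment,
# so this also works when CONDA_ENV is not "tritonparse")
strings "$CONDA_PREFIX/lib/libstdc++.so.6" | grep GLIBCXX | tail -5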

# Uninstall existing pytorch-triton
echo "Uninstalling existing pytorch-triton..."
pip uninstall -y pytorch-triton || true
pip uninstall -y triton || true

# Setup Triton repository based on cache status
if [ "$USE_CACHED_SOURCE" = "true" ]; then
  echo "Using cached Triton source..."
  cd "$TRITON_SOURCE_DIR"
  ACTUAL_COMMIT=$(git rev-parse HEAD)
  echo "Using cached Triton commit: $ACTUAL_COMMIT"
else
  echo "Setting up Triton repository from scratch..."
  if [ -d "$TRITON_SOURCE_DIR" ]; then
    echo "Removing existing source directory..."
    rm -rf "$TRITON_SOURCE_DIR"
  fi

  echo "Cloning Triton repository..."
  if ! git clone https://github.com/triton-lang/triton.git "$TRITON_SOURCE_DIR"; then
    echo "❌ ERROR: Failed to clone Triton repository"
    echo "This might be due to network issues or GitHub rate limiting"
    exit 1
  fi

  cd "$TRITON_SOURCE_DIR"

  # Checkout specific commit for reproducibility
  echo "Checking out commit: $TRITON_COMMIT"
  if ! git checkout "$TRITON_COMMIT"; then
    echo "❌ ERROR: Failed to checkout commit $TRITON_COMMIT"
    echo "This might be due to an invalid commit hash or network issues"
    exit 1
  fi

  ACTUAL_COMMIT=$(git rev-parse HEAD)
  echo "✅ Using Triton commit: $ACTUAL_COMMIT"
fi

# Install build dependencies
echo "Installing build dependencies..."
pip install ninja cmake wheel pybind11

# Install Triton requirements
echo "Installing Triton requirements..."
pip install -r python/requirements.txt

# Set environment to use clang compiler for faster compilation
echo "Setting up clang compiler for faster compilation..."
export CC=clang
export CXX=clang++
echo "Using CC: $CC"
echo "Using CXX: $CXX"

# Install Triton in editable mode with clang
if [ "$USE_CACHED_SOURCE" = "true" ]; then
  echo "Installing cached Triton to new conda environment..."
  echo "This should be fast since build artifacts are cached"
else
  echo "Compiling and installing Triton from scratch..."
  echo "This will take 30-50 minutes for compilation"
fi
pip install -e .
show_elapsed

# Verify Triton installation
echo "Verifying Triton installation..."
if python -c "import triton; print(f'Triton version: {triton.__version__}')" 2>/dev/null; then
  python -c "import triton; print(f'Triton path: {triton.__file__}')"
  echo "✅ Triton installation verified successfully"

  # Only save commit info after successful verification
  echo "$ACTUAL_COMMIT" >"$TRITON_CACHE_DIR/commit"
  echo "✅ Cache information saved"

  show_elapsed
  echo "🎉 Triton installation completed successfully!"
else
  echo "❌ ERROR: Failed to import triton"
  echo "This might be due to libstdc++ version issues"
  echo "Checking system libstdc++ version:"
  strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX | tail -5 || echo "Could not check system libstdc++"
  echo "Checking conda libstdc++ version:"
  # CONDA_PREFIX points at the active environment, whatever CONDA_ENV is called
  strings "$CONDA_PREFIX/lib/libstdc++.so.6" | grep GLIBCXX | tail -5 || echo "Could not check conda libstdc++"

  # Clean up cache on failure to prevent corruption
  echo "🧹 Cleaning up cache due to installation failure..."
  rm -f "$TRITON_CACHE_DIR/commit"

  exit 1
fi
fi
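
The cache decision near the top of `install-triton.sh` can be read as a function of two inputs: the commit recorded in `$TRITON_CACHE_DIR/commit` and the requested `TRITON_COMMIT`. The sketch below restates that logic in isolation so it can be exercised without a build; `should_use_cached_source` is a hypothetical name, and the function treats an empty first argument as "no cache record", which stands in for the script's file and directory checks.

```shell
#!/bin/bash
# Sketch of the cache decision as a pure function (hypothetical name, not in
# the script). Prints "true" when the cached source can be reused.
should_use_cached_source() {
  local cached_commit="$1" target_commit="$2"
  if [ -z "$cached_commit" ] || [ "$target_commit" = "main" ]; then
    echo false # no cache record, or a moving branch: rebuild
  elif [ "$cached_commit" = "$target_commit" ]; then
    echo true # same pinned commit: reuse the checkout
  else
    echo false # different commit: rebuild
  fi
}

should_use_cached_source "abc123" "abc123"
should_use_cached_source "abc123" "main"
```

The two sample calls use placeholder strings rather than real Triton commits; the first prints `true` and the second prints `false`.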