Cache

Triton Cache Wiki

Overview

Triton's cache system avoids repeated kernel compilation by storing compiled artifacts (PTX, CUBIN, HSACO, etc.) together with their metadata. This document explains the cache's structure, behavior, and customization options.

Cache Directory Structure

By default, Triton stores cached kernels in ~/.triton/cache/. This location can be changed with the TRITON_CACHE_DIR environment variable.
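
The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not Triton's own code: `resolve_cache_dir` is a hypothetical helper, and treating an empty `TRITON_CACHE_DIR` as unset is an assumption.

```python
import os
from pathlib import Path


def resolve_cache_dir(env=None):
    """Return the cache directory (illustrative helper, not a Triton API)."""
    env = os.environ if env is None else env
    # Assumption: an empty or whitespace-only value is treated like an unset variable.
    custom = env.get("TRITON_CACHE_DIR", "").strip()
    if custom:
        return Path(custom).expanduser()
    # Documented default location.
    return Path.home() / ".triton" / "cache"


print(resolve_cache_dir())
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real process environment.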

File Types

Extension	Description
`.json`	Metadata, compilation parameters
`.cubin` / `.hsaco`	Compiled binary kernel (CUDA, ROCm)
`.ptx` / `.amdgcn`	PTX or AMDGCN intermediate representation
`.llir`	LLVM IR
`.ttir`	Triton IR
`.ttgir`	Triton GPU IR

Example cache structure (NVIDIA):

$ tree ~/.triton/cache/QU0JRSfWJiAb9DadP-xn4vDFWO9yNo7Am32JeY1alLc
├── __grp__triton_poi_fused_threshold_backward_1.json
├── triton_poi_fused_threshold_backward_1.cubin
├── triton_poi_fused_threshold_backward_1.json
├── triton_poi_fused_threshold_backward_1.llir
├── triton_poi_fused_threshold_backward_1.ptx
├── triton_poi_fused_threshold_backward_1.ttgir
└── triton_poi_fused_threshold_backward_1.ttir
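
To inspect an entry like the one above programmatically, the files can be grouped by kernel name. The sketch below relies only on the naming pattern visible in the listing. `artifacts_by_kernel` is a hypothetical helper, and skipping the `__grp__` file assumes that it is an index rather than a kernel artifact.

```python
from collections import defaultdict
from pathlib import Path


def artifacts_by_kernel(entry_dir):
    """Map kernel name -> sorted list of artifact extensions in one cache entry."""
    grouped = defaultdict(list)
    for path in Path(entry_dir).iterdir():
        # Skip non-files and the "__grp__*.json" file (assumed to be an index).
        if not path.is_file() or path.name.startswith("__grp__"):
            continue
        # "kernel.cubin" -> stem "kernel", suffix ".cubin"
        grouped[path.stem].append(path.suffix)
    return {name: sorted(exts) for name, exts in grouped.items()}
```

For the entry shown above, the result would contain a single kernel name mapped to its `.cubin`, `.json`, `.llir`, `.ptx`, `.ttgir`, and `.ttir` files.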

Cache Key Generation

The cache key is generated from multiple inputs to ensure uniqueness:

Triton Environment
- Triton version (not stored in metadata but included in the hash)
Kernel Identity
- Function name
- Signature types (normalized, for example, pointer types become "ptr")
- Constant values
- Kernel attributes
Backend Configuration
- Backend (rocm/cuda)
- GPU architecture
- Warp size
- Compilation options (num_warps, num_stages, etc.)
Environment Factors
- TRITON_DEBUG variable state (0 or 1)

Key Generation Code:

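import base64
import hashlib

def _base32(key):
    # Helper used below; assumed to base32-encode the hex digest and strip "=" padding.
    return base64.b32encode(bytes.fromhex(key)).decode("utf-8").rstrip("=")

def make_so_cache_key(version_hash, signature, constants, ids, **kwargs):
    # Normalize pointer types: any type starting with "*" is hashed as "ptr".
    signature = {k: 'ptr' if v[0] == '*' else v for k, v in signature.items()}
    key = f"{version_hash}-{''.join(signature.values())}-{constants}-{ids}"
    # Append each extra compilation option (num_warps, num_stages, ...) in call order.
    for kw in kwargs:
        key = f"{key}-{kwargs.get(kw)}"
    # Hash the combined string and encode the digest as the final key.
    key = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return _base32(key)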

Note: The Triton version is included in the cache key but is not stored in the metadata files. If you need to check the version, you must track it externally.
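
The effect of each input on the key can be checked with a small experiment. The snippet restates the key function so that it runs standalone. The `_base32` helper is an assumption (base32 over the digest bytes with padding removed), and the version strings, signatures, and options are made-up values for illustration.

```python
import base64
import hashlib


def _base32(hex_digest):
    # Assumed helper: base32-encode the digest bytes and drop "=" padding.
    return base64.b32encode(bytes.fromhex(hex_digest)).decode("utf-8").rstrip("=")


def make_so_cache_key(version_hash, signature, constants, ids, **kwargs):
    signature = {k: 'ptr' if v[0] == '*' else v for k, v in signature.items()}
    key = f"{version_hash}-{''.join(signature.values())}-{constants}-{ids}"
    for kw in kwargs:
        key = f"{key}-{kwargs.get(kw)}"
    return _base32(hashlib.sha256(key.encode("utf-8")).hexdigest())


# Hypothetical inputs, chosen only to show which changes affect the key.
base = make_so_cache_key("v1", {"x": "*fp32", "n": "i32"}, {}, [], num_warps=4)
same_ptr = make_so_cache_key("v1", {"x": "*fp16", "n": "i32"}, {}, [], num_warps=4)
more_warps = make_so_cache_key("v1", {"x": "*fp32", "n": "i32"}, {}, [], num_warps=8)
upgraded = make_so_cache_key("v2", {"x": "*fp32", "n": "i32"}, {}, [], num_warps=4)

print(base == same_ptr)    # pointer element type is normalized away
print(base == more_warps)  # a different compilation option changes the key
print(base == upgraded)    # a different version hash changes the key
```

In this function, two pointer arguments with different element types hash identically because both are reduced to "ptr" before hashing, while any change to the version hash or to a keyword option yields a new key.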

Environment Variables

Variable	Default	Description
`TRITON_CACHE_DIR`	`~/.triton/cache`	Custom cache directory
`TRITON_ALWAYS_COMPILE`	`0`	Bypass cache (force recompilation) when set to `1`
`TRITON_KERNEL_OVERRIDE`	`0`	Enable manual kernel IR overrides
`TRITON_OVERRIDE_DIR`	`~/.triton/override/`	Directory for manually overridden kernels
`TRITON_KERNEL_DUMP`	`0`	Enable kernel IR dumping
`TRITON_DUMP_DIR`	`~/.triton/dump/`	Directory for dumped compilation artifacts
`TRITON_STORE_BINARY_ONLY`	`0`	Store only essential binaries (saves ~77% space)
`TRITON_DEBUG`	`0`	Include debug info in cache key (affects hashing)

Cache Management

Force Recompilation

To completely bypass the cache:

export TRITON_ALWAYS_COMPILE=1
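
When the bypass should apply only to part of a Python program, the variable can be set temporarily and restored afterwards. The sketch below is a hypothetical helper, not a Triton API, and it assumes that Triton reads `TRITON_ALWAYS_COMPILE` each time a kernel is compiled rather than once at import.

```python
import os
from contextlib import contextmanager


@contextmanager
def always_compile():
    """Temporarily set TRITON_ALWAYS_COMPILE=1 (illustrative helper)."""
    previous = os.environ.get("TRITON_ALWAYS_COMPILE")
    os.environ["TRITON_ALWAYS_COMPILE"] = "1"
    try:
        yield
    finally:
        # Restore whatever was there before, including "not set at all".
        if previous is None:
            os.environ.pop("TRITON_ALWAYS_COMPILE", None)
        else:
            os.environ["TRITON_ALWAYS_COMPILE"] = previous
```

Kernels launched inside a `with always_compile():` block would then be recompiled, under the assumption stated above, while the rest of the program keeps using the cache.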

Storage Optimization

Enable binary-only storage to save space:

export TRITON_STORE_BINARY_ONLY=1

This reduces stored files to:

.json (metadata)
.cubin/.hsaco (compiled binaries)
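
To estimate how much space binary-only storage would save for an existing cache, the sizes of the retained file types can be compared with the total. This is a rough sketch: `binary_only_savings` is a hypothetical helper, and it assumes that exactly the extensions listed above are kept.

```python
from pathlib import Path

# Extensions kept in binary-only mode, per the list above.
KEPT_SUFFIXES = {".json", ".cubin", ".hsaco"}


def binary_only_savings(cache_dir):
    """Return (total_bytes, kept_bytes, fraction_saved) for files under cache_dir."""
    total = kept = 0
    for path in Path(cache_dir).rglob("*"):
        if not path.is_file():
            continue
        size = path.stat().st_size
        total += size
        if path.suffix in KEPT_SUFFIXES:
            kept += size
    # Avoid dividing by zero for an empty or missing cache.
    saved = 0.0 if total == 0 else 1 - kept / total
    return total, kept, saved
```

The fraction depends on the kernels in the cache, so the result for a particular workload may differ from the figure quoted in the environment variable table.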

Remote Caching

Triton supports distributed caching via RemoteCacheManager. Example Redis setup:

import os

# Configure via environment variables
os.environ["TRITON_REMOTE_CACHE_BACKEND"] = "triton.backends.redis:RedisRemoteCacheBackend"
os.environ["TRITON_REDIS_HOST"] = "redis.example.com"
os.environ["TRITON_REDIS_PORT"] = "6379"

Troubleshooting

Issue	Solution
Stale Cache	Delete cache or use `TRITON_ALWAYS_COMPILE=1`
Version Mismatch	Triton upgrades and changes to cache-relevant environment variables alter the cache hash, so old cache entries are no longer reused
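
Deleting a stale cache can also be done from Python. The following is a minimal sketch, assuming the default location and the `TRITON_CACHE_DIR` override described earlier. `clear_cache` is a hypothetical helper, not part of Triton.

```python
import os
import shutil
from pathlib import Path


def clear_cache(cache_dir=None):
    """Delete every entry under the cache directory; return how many were removed."""
    if cache_dir is None:
        cache_dir = os.environ.get("TRITON_CACHE_DIR") or Path.home() / ".triton" / "cache"
    root = Path(cache_dir).expanduser()
    if not root.is_dir():
        return 0
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # one hashed directory per cached kernel
        else:
            entry.unlink()  # stray files and symlinks
        removed += 1
    return removed
```

The cache directory itself is left in place, so subsequent compilations can repopulate it without any further setup.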
