Add 7 new difftests and push constants support #321

Open · wants to merge 25 commits into main

292 changes: 253 additions & 39 deletions Cargo.lock

Large diffs are not rendered by default.

164 changes: 133 additions & 31 deletions tests/difftests/README.md
@@ -23,7 +23,9 @@ discrepancies across implementations.

3. **Output Comparison**
- The harness reads outputs as opaque bytes.
- If outputs differ, the test fails with detailed error reporting.
- Tests can specify metadata to enable smarter epsilon-based comparisons and
  human-readable display of data.

Because the difftest harness merely runs Rust binaries in a directory, it supports
testing various setups. For example, you can:
@@ -54,10 +56,13 @@ Each test binary must:
3. Load the config using `difftest::Config::from_path`.
4. Write its computed output to `output_path`.

The test binary can _optionally_ write test metadata to `metadata_path` for custom
comparison behavior.

For example:

```rust
use difftest::config::Config;
use difftest::config::{Config, TestMetadata, OutputType};
use std::{env, fs, io::Write};

fn main() {
@@ -70,6 +75,13 @@ fn main() {
let mut file = fs::File::create(&config.output_path)
.expect("Failed to create output file");
file.write_all(&output).expect("Failed to write output");

// Optional: Write metadata for floating-point comparison
let metadata = TestMetadata {
epsilon: Some(0.00001), // Allow differences up to 1e-5
output_type: OutputType::F32, // Interpret output as f32 array
};
config.write_metadata(&metadata).expect("Failed to write metadata");
}
```

@@ -79,39 +91,76 @@ Of course, many tests will have common host and GPU needs. Rather than require every
binary to reimplement functionality, we have created some common tests with reasonable
defaults in the `difftest` library.

The library provides helper types for common test patterns:

**Test types:**

- `WgpuComputeTest` - Single-buffer compute shader test
- `WgpuComputeTestMultiBuffer` - Multi-buffer compute shader test with input/output
  separation
- `WgpuComputeTestPushConstant` - Compute shader test with push constants support
- `Skip` - Marks a test variant as skipped, with a reason

**Shader source types:**

- `RustComputeShader` - Compiles the current crate as a Rust GPU shader
- `WgslComputeShader` - Loads a WGSL shader from a file (`shader.wgsl` or
  `compute.wgsl`)

**Backend types:**

- `WgpuBackend` - Default wgpu-based compute backend
- `VulkanoBackend` - Vulkano-based compute backend (useful for testing different GPU
  drivers)
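
Putting these together, a minimal test binary might look like the following sketch
(the import paths for the scaffold types are assumptions; check the `difftest` crate
for the actual module layout):

```rust
use difftest::config::Config;
// Assumed import path for the scaffold types; verify against the crate.
use difftest::scaffold::compute::{RustComputeShader, WgpuComputeTest};

fn main() {
    // Load the config from the harness.
    let config = Config::from_path(std::env::args().nth(1).unwrap()).unwrap();

    // Compile the current crate as a Rust compute shader, dispatch a single
    // workgroup, and read back a 1024-byte output buffer via wgpu.
    let test = WgpuComputeTest::new(RustComputeShader::default(), [1, 1, 1], 1024);

    // Run the test and write the output where the harness expects it.
    test.run_test(&config).unwrap();
}
```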
For examples, see:

- [`tests/lang/core/ops/math_ops/`](tests/lang/core/ops/math_ops/) - Multi-buffer test
with floating-point metadata
- [`tests/storage_class/push_constant/`](tests/storage_class/push_constant/) - Push
constants usage
- [`tests/arch/workgroup_memory/`](tests/arch/workgroup_memory/) - Workgroup memory
usage

### Test Metadata

Tests producing floating-point outputs can specify comparison metadata to handle
platform-specific precision differences. The metadata controls how the harness compares
outputs:

```rust
use difftest::config::{TestMetadata, OutputType};

// Write metadata before or after writing output
let metadata = TestMetadata {
epsilon: Some(0.00001), // Maximum allowed difference (default: None)
output_type: OutputType::F32, // How to interpret output data (default: Raw)
};
config.write_metadata(&metadata)?;

// Alternative: Use the helper method for common cases
let metadata = TestMetadata::with_epsilon(0.00001); // Sets epsilon, keeps default output_type
config.write_metadata(&metadata)?;
```

**Metadata fields:**

- `epsilon`: Optional maximum allowed absolute difference between values. When `None`
(default), exact byte-for-byte comparison is used. When `Some(value)`, floating-point
values are compared with the specified tolerance.
- `output_type`: Specifies how to interpret output data:
- `Raw`: Exact byte comparison (default)
- `F32`: Interpret as array of 32-bit floats, enables epsilon comparison
- `F64`: Interpret as array of 64-bit floats, enables epsilon comparison
- `U32`/`I32`: Interpret as 32-bit integers (epsilon ignored)

**Important notes:**

- If no metadata file is written or the file is empty, the harness uses exact byte
comparison.
- All test packages must have consistent metadata. If packages specify different
`output_type` values, the test will fail with an error.
- Invalid JSON in metadata files will cause the test to fail immediately.
- The `epsilon` field is only used when `output_type` is `F32` or `F64`.
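
To make the comparison rule concrete, the epsilon check amounts to the following
(a simplified sketch, not the harness's actual code; the real comparison also handles
F64 and produces the detailed report described under "Debugging Failing Tests"):

```rust
// Sketch of the F32 epsilon rule: with no epsilon, compare bytes exactly;
// with an epsilon, compare values pairwise within the tolerance.
fn outputs_match_f32(a: &[u8], b: &[u8], epsilon: Option<f32>) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // Reinterpret raw bytes as little-endian f32 values.
    let to_f32 = |bytes: &[u8]| -> Vec<f32> {
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect()
    };
    match epsilon {
        // No epsilon: exact byte-for-byte comparison.
        None => a == b,
        // Epsilon: every pair of values may differ by at most `eps`.
        Some(eps) => to_f32(a)
            .into_iter()
            .zip(to_f32(b))
            .all(|(x, y)| (x - y).abs() <= eps),
    }
}
```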

## Running Tests

### Run all difftests:
@@ -137,13 +186,66 @@ cargo difftest --nocapture

## Debugging Failing Tests

When outputs differ, the harness provides detailed error reporting:

### For raw byte differences

- Shows which packages produced different outputs
- Lists output file paths for manual inspection
- Groups packages by their output values

Inspect the output files with your preferred tools to determine the root cause.
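
The grouping idea can be illustrated with a small sketch (a hypothetical helper, not
the harness's actual code): packages with byte-identical outputs land in the same
group, which makes it easy to see which implementations agree.

```rust
use std::collections::HashMap;

// Group package names by their raw output bytes; each map entry collects the
// packages that produced exactly the same output.
fn group_by_output(outputs: &[(String, Vec<u8>)]) -> HashMap<Vec<u8>, Vec<String>> {
    let mut groups: HashMap<Vec<u8>, Vec<String>> = HashMap::new();
    for (package, bytes) in outputs {
        groups
            .entry(bytes.clone())
            .or_default()
            .push(package.clone());
    }
    groups
}
```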

### For floating-point differences (with `output_type: F32/F64`)

Reports all of the above, plus:

- The actual floating-point values in a comparison table
- The maximum difference found
- The epsilon threshold (if specified)
- The specific values that exceed the tolerance

### Additional output files

The harness automatically writes human-readable `.txt` files alongside binary outputs.
For floating-point data (F32/F64), these show the array values in decimal format. For
raw/integer data, these show the values as hex bytes or integers.

## Skipping Tests on Specific Platforms

Sometimes a test variant needs to be skipped on certain platforms (e.g., due to driver
issues or platform limitations). The difftest framework provides a clean way to handle
this using the `Skip` scaffolding type:

```rust
use difftest::config::Config;
use difftest::scaffold::Skip;

fn main() {
let config = Config::from_path(std::env::args().nth(1).unwrap()).unwrap();

// Skip on macOS due to platform-specific issues
#[cfg(target_os = "macos")]
{
let skip = Skip::new("This test is not supported on macOS");
skip.run_test(&config).unwrap();
return;
}

// Run the actual test on other platforms
#[cfg(not(target_os = "macos"))]
{
// ... normal test implementation ...
}
}
```

When a test is skipped:

- The skip reason is recorded in the test metadata
- The test runner logs the skip reason
- The test doesn't contribute to the output comparison
- If all variants are skipped, the test fails with an error

## Harness logs

If you suspect a bug in the test harness, you can view its detailed logs:

3 changes: 3 additions & 0 deletions tests/difftests/bin/Cargo.toml
@@ -23,6 +23,9 @@ serde_json = "1.0"
thiserror = "1.0"
toml = { version = "0.8.20", default-features = false, features = ["parse"] }
bytesize = "2.0.1"
bytemuck = "1.21.0"
difftest = { path = "../lib" }
tabled = { version = "0.15", default-features = false, features = ["std"] }

[lints]
workspace = true