
This project provides a minimal working example to benchmark the I/O performance of reading meteorological data from NetCDF files using domain decomposition.


slcs-jsc/netcdf_bench


NetCDF Domain Decomposition Benchmark

Overview

This project provides a minimal working example to benchmark the I/O performance of reading meteorological data from NetCDF files using domain decomposition. The code is designed to measure the scaling behavior of NetCDF I/O operations when splitting the data into subdomains and assigning each subdomain to a separate MPI process.

Approach

  1. Domain Decomposition:

    • The NetCDF file is divided into subdomains based on a user-defined process grid (nproc_x × nproc_y).
    • Each MPI process is responsible for reading its assigned subdomain.
  2. Halo Regions:

    • Optional halo regions (cells that overlap with neighboring subdomains) can be added to each subdomain. Simulations in MPTRAC need them whenever a process requires data from neighboring subdomains.
  3. Parallel I/O:

    • The code supports both independent and collective I/O modes for reading data.
    • Independent mode allows processes to read data without coordination.
    • Collective mode coordinates I/O operations among processes for potentially better performance.
  4. Performance Measurement:

    • The code measures the time taken to read data for each process and aggregates results to rank 0.
    • The scaling analysis assumes constant I/O bandwidth per node, so the aggregate read throughput should, in theory, grow linearly with the number of nodes.
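
The decomposition and halo steps above can be illustrated with a short Python sketch. It is not taken from the benchmark source: the function names, the x-fastest rank layout, the way leftover grid points are distributed, and the clamping of halos at the domain edge (no periodic wrap-around in longitude) are all assumptions made for illustration.

```python
def split_axis(n, parts, index, halo=0):
    """Return (start, count) of block `index` when n points are split into `parts` blocks."""
    base, extra = divmod(n, parts)
    # Assumed policy: the first `extra` blocks receive one additional point.
    start = index * base + min(index, extra)
    count = base + (1 if index < extra else 0)
    # Widen the block by the halo, clamped to the global index range.
    lo = max(start - halo, 0)
    hi = min(start + count + halo, n)
    return lo, hi - lo


def subdomain(rank, nx, ny, nproc_x, nproc_y, halo=0):
    """Map a rank to (x_start, x_count, y_start, y_count) on an nproc_x x nproc_y grid."""
    # Assumed rank layout: the x index varies fastest.
    px, py = rank % nproc_x, rank // nproc_x
    x_start, x_count = split_axis(nx, nproc_x, px, halo)
    y_start, y_count = split_axis(ny, nproc_y, py, halo)
    return x_start, x_count, y_start, y_count


if __name__ == "__main__":
    # Hypothetical 10 x 6 grid on a 2 x 2 process grid with a halo of 1.
    for rank in range(4):
        print(rank, subdomain(rank, 10, 6, 2, 2, halo=1))
```

The returned start and count pairs have the same shape as the start/count arguments that NetCDF hyperslab reads expect, which is why the sketch works in those terms rather than with end indices.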

Usage

Run the program with the following arguments:

mpirun -np <nprocs> ./netcdf_dd_read_bench <halo> <nproc_x> <nproc_y> <use_independent> <xdim_name> <ydim_name> <file1.nc> [file2.nc ...]
  • <halo>: Size of the halo region (0 for no halo).
  • <nproc_x>: Number of processes in the x-dimension.
  • <nproc_y>: Number of processes in the y-dimension.
  • <use_independent>: 1 for independent I/O, 0 for collective I/O.
  • <xdim_name>: Name of the longitude dimension in the NetCDF file.
  • <ydim_name>: Name of the latitude dimension in the NetCDF file.
  • <file1.nc> [file2.nc ...]: List of NetCDF files to process.
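
When scripting many runs, it can help to assemble the argument list programmatically. The following Python sketch is a hypothetical helper, not part of the repository. It assumes that the number of MPI processes should equal nproc_x * nproc_y (one process per subdomain, as described in the overview) and that the executable sits in the current directory.

```python
import shlex


def build_command(halo, nproc_x, nproc_y, use_independent, xdim, ydim, files,
                  exe="./netcdf_dd_read_bench"):
    """Return the mpirun argument list for one benchmark run (hypothetical helper)."""
    if halo < 0:
        raise ValueError("halo must be >= 0")
    if nproc_x < 1 or nproc_y < 1:
        raise ValueError("process grid dimensions must be >= 1")
    if not files:
        raise ValueError("at least one NetCDF file is required")
    # Assumption: one MPI process per subdomain.
    nprocs = nproc_x * nproc_y
    return [
        "mpirun", "-np", str(nprocs), exe,
        str(halo), str(nproc_x), str(nproc_y),
        "1" if use_independent else "0",  # 1 = independent, 0 = collective
        xdim, ydim, *files,
    ]


if __name__ == "__main__":
    cmd = build_command(1, 2, 2, False, "lon", "lat", ["data.nc"])
    print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems with unusual file names.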

Example

mpirun -np 4 ./netcdf_dd_read_bench 1 2 2 0 lon lat data.nc

This runs the program with:

  • 4 processes arranged in a 2x2 grid.
  • Halo size of 1.
  • Collective I/O mode.
  • Longitude and latitude dimensions named lon and lat.
  • Input file data.nc.

Output

  • The program prints timing results for each process and file.
  • Results include subdomain coordinates and I/O performance metrics.

Dependencies

  • MPI
  • NetCDF library with parallel I/O support

HPC Scripts and Log Analysis

This project includes several Bash scripts and a Python script for running benchmarks and analyzing results on HPC systems such as JUWELS Booster.

Bash Scripts

  1. compile.sh:

    • Compiles the netcdf_dd_read_bench program using MPI and NetCDF libraries.
    • Ensure the required modules are loaded before running this script.
  2. job.sh:

    • Submits a single benchmark job to the HPC scheduler.
    • Configured for a 3x3 process grid with a halo size of 0 and independent I/O.
  3. submit_benchmark_jobs.sh:

    • Generates and submits multiple benchmark jobs with varying configurations (e.g., grid size, halo size, I/O mode).
    • Staggers job submissions over a specified number of hours to optimize resource usage.
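
The idea of staggering submissions over several hours can be sketched in Python as follows. This is an illustration only and does not reproduce submit_benchmark_jobs.sh: it assumes a Slurm scheduler (using the `sbatch` options `--begin` and `--ntasks`), and the job script name `bench_job.sh` and its positional arguments are hypothetical. The actual script may use a different mechanism to spread out the jobs.

```python
from itertools import product


def stagger(grids, halos, io_modes, hours):
    """Spread all configurations evenly over `hours`; return (delay_minutes, config) pairs."""
    configs = list(product(grids, halos, io_modes))
    if not configs:
        return []
    step = hours * 60 / len(configs)  # minutes between consecutive submissions
    return [(int(i * step), cfg) for i, cfg in enumerate(configs)]


def sbatch_args(delay_minutes, config, script="bench_job.sh"):
    """Build a Slurm submission command (assumed scheduler, hypothetical job script)."""
    (px, py), halo, independent = config
    return [
        "sbatch",
        f"--begin=now+{delay_minutes}minutes",  # defer the job start
        f"--ntasks={px * py}",                  # one task per subdomain
        script, str(halo), str(px), str(py), str(int(independent)),
    ]


if __name__ == "__main__":
    plan = stagger(grids=[(2, 2), (3, 3)], halos=[0], io_modes=[True, False], hours=2)
    for delay, cfg in plan:
        print(" ".join(sbatch_args(delay, cfg)))
```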

Python Script

  1. parse_timings.py:
    • Parses the log files generated by the benchmark jobs.
    • Aggregates timing statistics across nodes and files.
    • Generates a plot (io_speed_over_time.svg) to visualize I/O performance over time for different configurations.
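
One plausible way to aggregate timings into a throughput figure is sketched below. It does not reproduce parse_timings.py, and it deliberately skips log parsing because the log format is not described here. The record layout (configuration label, file name, rank, seconds, bytes read) is hypothetical, and measuring each file read by its slowest rank is an assumption, chosen because a parallel read only completes once every rank has finished.

```python
from collections import defaultdict
from statistics import mean


def summarize(records):
    """Aggregate (config, filename, rank, seconds, nbytes) records into MB/s per config."""
    per_read = defaultdict(list)
    for config, filename, _rank, seconds, nbytes in records:
        per_read[(config, filename)].append((seconds, nbytes))

    per_config = defaultdict(list)
    for (config, _filename), samples in per_read.items():
        slowest = max(seconds for seconds, _ in samples)  # the read ends with the last rank
        total_bytes = sum(nbytes for _, nbytes in samples)
        per_config[config].append(total_bytes / slowest / 1e6)

    return {
        config: {"mean_MBps": mean(v), "min_MBps": min(v), "max_MBps": max(v)}
        for config, v in per_config.items()
    }


if __name__ == "__main__":
    # Made-up numbers, purely to show the call shape.
    demo = [
        ("2x2", "a.nc", 0, 2.0, 100e6),
        ("2x2", "a.nc", 1, 4.0, 100e6),
        ("2x2", "b.nc", 0, 1.0, 100e6),
        ("2x2", "b.nc", 1, 2.0, 100e6),
    ]
    print(summarize(demo))
```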

These scripts are designed to streamline the benchmarking process and provide insights into the I/O performance of NetCDF domain decomposition.

License

This project is licensed under the GPL-3.0 License.
