Commit 6321ede

Merge branch 'master' into remove-redundant-timestep-code
2 parents b1d7d04 + 3c80270 commit 6321ede

130 files changed, with 5563 additions and 4554 deletions.

.github/pull_request_template.md

Lines changed: 1 addition & 1 deletion
@@ -54,5 +54,5 @@ To make sure the code is performing as expected on GPU devices, I have:
 - [ ] Ran the code on MI200+ GPUs and ensure the new features performed as expected (the GPU results match the CPU results)
 - [ ] Enclosed the new feature via `nvtx` ranges so that they can be identified in profiles
 - [ ] Ran a Nsight Systems profile using `./mfc.sh run XXXX --gpu -t simulation --nsys`, and have attached the output file (`.nsys-rep`) and plain text results to this PR
-- [ ] Ran an Omniperf profile using `./mfc.sh run XXXX --gpu -t simulation --omniperf`, and have attached the output file and plain text results to this PR.
+- [ ] Ran a Rocprof Systems profile using `./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace`, and have attached the output file and plain text results to this PR.
 - [ ] Ran my code using various numbers of different GPUs (1, 2, and 8, for example) in parallel and made sure that the results scale similarly to what happens if you run without the new code/feature

.github/workflows/frontier/build.sh

Lines changed: 6 additions & 1 deletion
@@ -1,4 +1,9 @@
 #!/bin/bash

+build_opts=""
+if [ "$1" == "gpu" ]; then
+build_opts="--gpu"
+fi
+
 . ./mfc.sh load -c f -m g
-./mfc.sh test --dry-run -j 8 --gpu
+./mfc.sh test --dry-run -j 8 $build_opts

.github/workflows/frontier/submit.sh

Lines changed: 14 additions & 1 deletion
@@ -13,16 +13,29 @@ else
 exit 1
 fi

+if [ "$2" == "cpu" ]; then
+sbatch_device_opts="\
+#SBATCH -n 32 # Number of cores required"
+elif [ "$2" == "gpu" ]; then
+sbatch_device_opts="\
+#SBATCH -n 8 # Number of cores required"
+else
+usage
+exit 1
+fi
+
+
 job_slug="`basename "$1" | sed 's/\.sh$//' | sed 's/[^a-zA-Z0-9]/-/g'`-$2"

 sbatch <<EOT
 #!/bin/bash
 #SBATCH -JMFC-$job_slug # Job name
 #SBATCH -A CFD154 # charge account
 #SBATCH -N 1 # Number of nodes required
-#SBATCH -n 8 # Number of cores required
+$sbatch_device_opts
 #SBATCH -t 01:59:00 # Duration of the job (Ex: 15 mins)
 #SBATCH -o$job_slug.out # Combined output and error messages file
+#SBATCH -p extended # Extended partition for shorter queues
 #SBATCH -q debug # Use debug QOS - only one job per user allowed in queue!
 #SBATCH -W # Do not exit until the submitted job terminates.
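
Review note: the `job_slug` pipeline and the new device switch in this hunk can be easier to reason about when mirrored in a few lines of Python. The sketch below is illustrative only and is not part of the commit; the function names `job_slug` and `device_opts` are hypothetical, and the core counts are the ones shown in the hunk.

```python
import os
import re


def job_slug(script_path: str, device: str) -> str:
    # Mirrors: basename "$1" | sed 's/\.sh$//' | sed 's/[^a-zA-Z0-9]/-/g', then "-$2".
    name = os.path.basename(script_path)
    name = re.sub(r"\.sh$", "", name)          # drop a trailing ".sh"
    name = re.sub(r"[^a-zA-Z0-9]", "-", name)  # replace every other character with "-"
    return f"{name}-{device}"


def device_opts(device: str) -> str:
    # Mirrors the cpu/gpu branch: 32 cores for CPU jobs, 8 for GPU jobs.
    cores = {"cpu": 32, "gpu": 8}
    if device not in cores:
        raise ValueError(f"unknown device: {device!r}")  # the script calls usage and exits
    return f"#SBATCH -n {cores[device]}"


if __name__ == "__main__":
    print(job_slug(".github/workflows/frontier/test.sh", "gpu"))
    print(device_opts("cpu"))
```

Appending the device to the slug keeps the CPU and GPU jobs from writing to the same `.out` file when both run from the same script.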

.github/workflows/frontier/test.sh

Lines changed: 5 additions & 2 deletions
@@ -3,5 +3,8 @@
 gpus=`rocm-smi --showid | awk '{print $1}' | grep -Eo '[0-9]+' | uniq | tr '\n' ' '`
 ngpus=`echo "$gpus" | tr -d '[:space:]' | wc -c`

-./mfc.sh test --max-attempts 3 -j $ngpus -- -c frontier
-
+if [ "$job_device" == "gpu" ]; then
+./mfc.sh test --max-attempts 3 -j $ngpus -- -c frontier
+else
+./mfc.sh test --max-attempts 3 -j 32 -- -c frontier
+fi

.github/workflows/test.yml

Lines changed: 1 addition & 4 deletions
@@ -97,9 +97,6 @@ jobs:
 matrix:
 device: ['cpu', 'gpu']
 lbl: ['gt', 'frontier']
-exclude:
-- device: cpu
-lbl: frontier
 runs-on:
 group: phoenix
 labels: ${{ matrix.lbl }}
@@ -116,7 +113,7 @@ jobs:

 - name: Build
 if: matrix.lbl == 'frontier'
-run: bash .github/workflows/frontier/build.sh
+run: bash .github/workflows/frontier/build.sh ${{ matrix.device }}

 - name: Test
 if: matrix.lbl == 'frontier'
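
Review note: removing the `exclude` entry changes which jobs the matrix produces. The Python sketch below enumerates the combinations before and after the change. It is a simplified model in which an exclude entry must match a combination exactly, which is enough for this case; the helper name `expand` is hypothetical.

```python
from itertools import product

devices = ["cpu", "gpu"]
labels = ["gt", "frontier"]


def expand(exclude=()):
    # Cartesian product of the two matrix axes, minus any excluded combinations.
    combos = [{"device": d, "lbl": l} for d, l in product(devices, labels)]
    return [c for c in combos if c not in exclude]


# Before the commit: the cpu/frontier combination was excluded.
before = expand(exclude=[{"device": "cpu", "lbl": "frontier"}])
# After the commit: every device runs on every runner label.
after = expand()

if __name__ == "__main__":
    for combo in after:
        print(combo)
```

The one combination gained is `cpu` on `frontier`, which is why `build.sh` now receives `${{ matrix.device }}` as an argument instead of always passing `--gpu`.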

docs/documentation/case.md

Lines changed: 2 additions & 6 deletions
@@ -436,7 +436,7 @@ The effect and use of the source term are assessed by [Schmidmayer et al., 2019]
 - `time_stepper` specifies the order of the Runge-Kutta (RK) time integration scheme that is used for temporal integration in simulation, from the 1st to 5th order by corresponding integer.
 Note that `time_stepper = 3` specifies the total variation diminishing (TVD), third order RK scheme ([Gottlieb and Shu, 1998](references.md)).

-- `adap_dt` activates the Strang operator splitting scheme which splits flux and source terms in time marching, and an adaptive time stepping strategy is implemented for the source term. It requires ``bubbles = 'T'``, ``polytropic = 'T'``, ``adv_n = 'T'`` and `time_stepper = 3`.
+- `adap_dt` activates the Strang operator splitting scheme which splits flux and source terms in time marching, and an adaptive time stepping strategy is implemented for the source term. It requires ``bubbles_euler = 'T'``, ``polytropic = 'T'``, ``adv_n = 'T'`` and `time_stepper = 3`. Additionally, it can be used with ``bubbles_lagrange = 'T'`` and `time_stepper = 3`

 - `weno_order` specifies the order of WENO scheme that is used for spatial reconstruction of variables by an integer of 1, 3, 5, and 7, that correspond to the 1st, 3rd, 5th, and 7th order, respectively.

@@ -461,7 +461,7 @@ It is recommended to set `weno_eps` to $10^{-6}$ for WENO-JS, and to $10^{-40}$
 `riemann_solver = 1`, `2`, and `3` correspond to HLL, HLLC, and Exact Riemann solver, respectively ([Toro, 2013](references.md)).
 `riemann_solver = 4` is only for MHD simulations. It resolves 5 of the full seven-wave structure of the MHD equations ([Miyoshi and Kusano, 2005](references.md)).

-- `low_Mach` specifies the choice of the low Mach number correction scheme for the HLLC Riemann solver. `low_Mach = 0` is default value and does not apply any correction scheme. `low_Mach = 1` and `2` apply the anti-dissipation pressure correction method ([Chen et al., 2022](references.md)) and the improved velocity reconstruction method ([Thornber et al., 2008](references.md)). This feature requires `riemann_solver = 2` and `model_eqns = 2`.
+- `low_Mach` specifies the choice of the low Mach number correction scheme for the HLLC Riemann solver. `low_Mach = 0` is default value and does not apply any correction scheme. `low_Mach = 1` and `2` apply the anti-dissipation pressure correction method ([Chen et al., 2022](references.md)) and the improved velocity reconstruction method ([Thornber et al., 2008](references.md)). This feature requires `model_eqns = 2` or `3`. `low_Mach = 1` works for `riemann_solver = 1` and `2`, but `low_Mach = 2` only works for `riemann_solver = 2`.

 - `avg_state` specifies the choice of the method to compute averaged variables at the cell-boundaries from the left and the right states in the Riemann solver by an integer of 1 or 2.
 `avg_state = 1` and `2` correspond to Roe- and arithmetic averages, respectively.
@@ -790,8 +790,6 @@ When ``polytropic = 'F'``, the gas compression is modeled as non-polytropic due
 | `x0` | Real | Reference length |
 | `Thost` | Real | Temperature of the surrounding liquid (host) |
 | `diffcoefvap` | Real | Vapor diffusivity in the gas |
-| `rkck_adap_dt` | Logical | Activates the adaptive rkck time stepping algorithm |
-| `rkck_tolerance` | Real | Admissible error truncation tolerance in the rkck stepper |

 - `nBubs_glb` Total number of bubbles. Their initial conditions need to be specified in the ./input/lag_bubbles.dat file. See the example cases for additional information.

@@ -805,8 +803,6 @@ When ``polytropic = 'F'``, the gas compression is modeled as non-polytropic due

 - `massTransfer_model` Activates the mass transfer model at the bubble's interface based on ([Preston et al., 2007](references.md)).

-- `rkck_adap_dt` Activates the adaptive 4th/5th order Runge-Kutta-Cash-Karp (RKCK) time-stepping algorithm (requires `time_stepper ==4`). A maximum error between the 4th and 5th order Runge-Kutta-Cash-Karp solutions for the same time step size is calculated. If the error is smaller than a tolerance (`rkck_tolerance`), then the algorithm employs the 5th order solution, while if not, both eulerian/lagrangian variables are re-calculated with a smaller time step size.
-
 ### 10. Velocity Field Setup

 | Parameter | Type | Description |
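
Review note: the updated `adap_dt` and `low_Mach` rules in this file are easy to misread in prose. The sketch below restates them as a small Python check over a case dictionary. It is a hypothetical helper written for illustration from the documentation text above, not MFC's own validation code; the function name `check_case` and the error strings are assumptions.

```python
def check_case(case: dict) -> list:
    """Return a list of rule violations for the adap_dt and low_Mach settings."""
    errors = []

    def on(key):
        # Logical parameters are written as "T" or "F" in the case dictionary.
        return case.get(key, "F") == "T"

    if on("adap_dt"):
        if case.get("time_stepper") != 3:
            errors.append("adap_dt requires time_stepper = 3")
        euler_ok = on("bubbles_euler") and on("polytropic") and on("adv_n")
        lagrange_ok = on("bubbles_lagrange")
        if not (euler_ok or lagrange_ok):
            errors.append(
                "adap_dt requires bubbles_euler with polytropic and adv_n, "
                "or bubbles_lagrange"
            )

    low_mach = case.get("low_Mach", 0)
    if low_mach in (1, 2):
        if case.get("model_eqns") not in (2, 3):
            errors.append("low_Mach requires model_eqns = 2 or 3")
        # low_Mach = 1 works with HLL (1) and HLLC (2); low_Mach = 2 only with HLLC.
        allowed = (1, 2) if low_mach == 1 else (2,)
        if case.get("riemann_solver") not in allowed:
            errors.append(f"low_Mach = {low_mach} requires riemann_solver in {allowed}")

    return errors


if __name__ == "__main__":
    print(check_case({"adap_dt": "T", "bubbles_lagrange": "T", "time_stepper": 3}))
```

Under these rules, the updated `3D_lagrange_shbubcollapse` settings (`adap_dt = 'T'`, `bubbles_lagrange = 'T'`, `time_stepper = 3`) pass, whereas the previous `time_stepper = 4` would not.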

docs/documentation/running.md

Lines changed: 3 additions & 3 deletions
@@ -98,13 +98,13 @@ Learn more about NVIDIA Nsight Compute [here](https://docs.nvidia.com/nsight-com


 #### AMD GPUs
-- Rocprof (ROC): `./mfc.sh run ... -t simulation --roc --hip-trace [rocprof flags]` allows one to visualize MFC's system-wide performance with [Perfetto UI](https://ui.perfetto.dev/).
+- Rocprof Systems (RSYS): `./mfc.sh run ... -t simulation --rsys --hip-trace [rocprof flags]` allows one to visualize MFC's system-wide performance with [Perfetto UI](https://ui.perfetto.dev/).
 When used, `--roc` will run the simulation and generate files in the case directory for all targets.
 `results.json` can then be imported in [Perfetto's UI](https://ui.perfetto.dev/).
 Learn more about AMD Rocprof [here](https://rocm.docs.amd.com/projects/rocprofiler/en/docs-5.5.1/rocprof.html)
 It is best to run case files with few timesteps to keep the report file sizes manageable.
-- Omniperf (OMNI): `./mfc.sh run ... -t simulation --omni [omniperf flags]` allows one to conduct kernel-level profiling with [AMD's Omniperf](https://rocm.docs.amd.com/projects/omniperf/en/latest/index.html).
-When used, `--omni` will output profiling information for all subroutines, including rooflines, cache usage, register usage, and more, after the simulation is run.
+- Rocprof Compute (RCU): `./mfc.sh run ... -t simulation --rcu -n <name> [rocprof-compute flags]` allows one to conduct kernel-level profiling with [ROCm Compute Profiler](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/what-is-rocprof-compute.html).
+When used, `--rcu` will output profiling information for all subroutines, including rooflines, cache usage, register usage, and more, after the simulation is run.
 Adding this argument will moderately slow down the simulation and run the MFC executable several times.
 For this reason, it should only be used with case files with few timesteps.

Lines changed: 170 additions & 0 deletions
@@ -0,0 +1,170 @@
+#!/usr/bin/env python3
+import math
+import json
+
+# Bubble screen
+# Description: A planar acoustic wave interacts with a bubble cloud
+# in water. The background field is modeled using an Eulerian framework,
+# while the bubbles are tracked using a Lagrangian framework.
+
+# Reference values for nondimensionalization
+x0 = 1.0e-03 # length - m
+rho0 = 1.0e03 # density - kg/m3
+c0 = 1475.0 # speed of sound - m/s
+p0 = rho0 * c0 * c0 # pressure - Pa
+T0 = 298 # temperature - K
+
+# Host properties (water)
+gamma_host = 2.7466 # Specific heat ratio
+pi_inf_host = 792.02e06 # Stiffness - Pa
+mu_host = 1e-3 # Dynamic viscosity - Pa.s
+c_host = 1475.0 # speed of sound - m/s
+rho_host = 1000 # density kg/m3
+T_host = 298 # temperature K
+
+# Lagrangian bubbles' properties
+R_uni = 8314 # Universal gas constant - J/kmol/K
+MW_g = 28.0 # Molar weight of the gas - kg/kmol
+MW_v = 18.0 # Molar weight of the vapor - kg/kmol
+gamma_g = 1.4 # Specific heat ratio of the gas
+gamma_v = 1.333 # Specific heat ratio of the vapor
+pv = 2350 # Vapor pressure of the host - Pa
+cp_g = 1.0e3 # Specific heat of the gas - J/kg/K
+cp_v = 2.1e3 # Specific heat of the vapor - J/kg/K
+k_g = 0.025 # Thermal conductivity of the gas - W/m/K
+k_v = 0.02 # Thermal conductivity of the vapor - W/m/K
+diffVapor = 2.5e-5 # Diffusivity coefficient of the vapor - m2/s
+sigBubble = 0.069 # Surface tension of the bubble - N/m
+mu_g = 1.48e-5
+
+# Acoustic source properties
+patm = 101325.0 # Atmospheric pressure - Pa
+pamp = 1.0e5 # Amplitude of the acoustic source - Pa
+freq = 300e03 # Source frequency - Hz
+wlen = c_host / freq # Wavelength - m
+
+# Domain and time set up
+
+xb = -12.0e-3 # Domain boundaries - m (x direction)
+xe = 12.0e-3
+yb = -2.5e-3 # Domain boundaries - m (y direction)
+ye = 2.5e-3
+z_virtual = 5.0e-3 # Virtual depth (z direction)
+
+Nx = 240 # number of elements in the x direction
+Ny = 50 # number of elements in the y direction
+
+dt = 7.5e-9 # constant time-step - sec
+
+# Configuring case dictionary
+print(
+    json.dumps(
+        {
+            # Logistics
+            "run_time_info": "T",
+            # Computational Domain Parameters
+            "x_domain%beg": xb / x0,
+            "x_domain%end": xe / x0,
+            "y_domain%beg": yb / x0,
+            "y_domain%end": ye / x0,
+            "stretch_y": "F",
+            "stretch_x": "F",
+            "m": Nx,
+            "n": Ny,
+            "p": 0,
+            "dt": dt * (c0 / x0),
+            "t_step_start": 0,
+            "t_step_stop": 3000,
+            "t_step_save": 500,
+            # Simulation Algorithm Parameters
+            "model_eqns": 2,
+            "time_stepper": 3,
+            "num_fluids": 2,
+            "num_patches": 1,
+            "viscous": "T",
+            "mpp_lim": "F",
+            "weno_order": 5,
+            "weno_eps": 1.0e-16,
+            "mapped_weno": "T",
+            "riemann_solver": 2,
+            "wave_speeds": 1,
+            "avg_state": 2,
+            "bc_x%beg": -6,
+            "bc_x%end": -6,
+            "bc_y%beg": -1,
+            "bc_y%end": -1,
+            # Acoustic source
+            "acoustic_source": "T",
+            "num_source": 1,
+            "acoustic(1)%support": 2,
+            "acoustic(1)%pulse": 1,
+            "acoustic(1)%npulse": 1,
+            "acoustic(1)%mag": pamp / p0,
+            "acoustic(1)%wavelength": wlen / x0,
+            "acoustic(1)%length": 2 * (ye - yb) / x0,
+            "acoustic(1)%loc(1)": -7.0e-03 / x0,
+            "acoustic(1)%loc(2)": 0.0,
+            "acoustic(1)%dir": 0.0,
+            "acoustic(1)%delay": 0.0,
+            # Formatted Database Files Structure Parameters
+            "format": 1,
+            "precision": 2,
+            "prim_vars_wrt": "T",
+            "parallel_io": "T",
+            # Patch 1: Water (left)
+            "patch_icpp(1)%geometry": 3,
+            "patch_icpp(1)%x_centroid": 0.0,
+            "patch_icpp(1)%y_centroid": 0.0,
+            "patch_icpp(1)%length_x": 2 * (xe - xb) / x0,
+            "patch_icpp(1)%length_y": 2 * (ye - yb) / x0,
+            "patch_icpp(1)%vel(1)": 0.0,
+            "patch_icpp(1)%vel(2)": 0.0,
+            "patch_icpp(1)%pres": patm / p0,
+            "patch_icpp(1)%alpha_rho(1)": rho_host / rho0,
+            "patch_icpp(1)%alpha_rho(2)": 0.0,
+            "patch_icpp(1)%alpha(1)": 1.0,
+            "patch_icpp(1)%alpha(2)": 0.0,
+            # Lagrangian Bubbles
+            "bubbles_lagrange": "T",
+            "bubble_model": 2, # Keller-Miksis model
+            "lag_params%nBubs_glb": 1194, # Number of bubbles
+            "lag_params%solver_approach": 2,
+            "lag_params%cluster_type": 2,
+            "lag_params%pressure_corrector": "T",
+            "lag_params%smooth_type": 1,
+            "lag_params%heatTransfer_model": "T",
+            "lag_params%massTransfer_model": "T",
+            "lag_params%epsilonb": 1.0,
+            "lag_params%valmaxvoid": 0.9,
+            "lag_params%write_bubbles": "F",
+            "lag_params%write_bubbles_stats": "F",
+            "lag_params%c0": c0,
+            "lag_params%rho0": rho0,
+            "lag_params%T0": T0,
+            "lag_params%x0": x0,
+            "lag_params%diffcoefvap": diffVapor,
+            "lag_params%Thost": T_host,
+            "lag_params%charwidth": z_virtual / x0,
+            # Fluids Physical Parameters
+            # Host medium
+            "fluid_pp(1)%gamma": 1.0 / (gamma_host - 1.0),
+            "fluid_pp(1)%pi_inf": gamma_host * (pi_inf_host / p0) / (gamma_host - 1.0),
+            "fluid_pp(1)%Re(1)": 1.0 / (mu_host / (rho0 * c0 * x0)),
+            "fluid_pp(1)%mul0": mu_host,
+            "fluid_pp(1)%ss": sigBubble,
+            "fluid_pp(1)%pv": pv,
+            "fluid_pp(1)%gamma_v": gamma_v,
+            "fluid_pp(1)%M_v": MW_v,
+            "fluid_pp(1)%k_v": k_v,
+            "fluid_pp(1)%cp_v": cp_v,
+            # Bubble gas state
+            "fluid_pp(2)%gamma": 1.0 / (gamma_g - 1.0),
+            "fluid_pp(2)%pi_inf": 0.0e00,
+            "fluid_pp(2)%Re(1)": 1.0 / (mu_g / (rho0 * c0 * x0)),
+            "fluid_pp(2)%gamma_v": gamma_g,
+            "fluid_pp(2)%M_v": MW_g,
+            "fluid_pp(2)%k_v": k_g,
+            "fluid_pp(2)%cp_v": cp_g,
+        }
+    )
+)
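
Review note: a quick sanity check of the nondimensionalization in this case file can be done by recomputing a few derived quantities from the constants it defines. The sketch below copies those constants and is not part of the commit. It assumes `Nx` is the number of cells across the x extent, as the file's comment states; the names `dx`, `dt_nd`, `cells_per_wavelength` and `acoustic_cfl` are introduced here for illustration.

```python
# Constants copied from the case file above.
x0 = 1.0e-03     # reference length - m
rho0 = 1.0e03    # reference density - kg/m3
c0 = 1475.0      # reference speed of sound - m/s (equal to c_host in the file)
freq = 300e03    # source frequency - Hz
xb, xe = -12.0e-3, 12.0e-3  # x-domain boundaries - m
Nx = 240         # elements in x (assumed here to be the cell count)
dt = 7.5e-9      # time step - s

p0 = rho0 * c0 * c0              # reference pressure - Pa
wlen = c0 / freq                 # acoustic wavelength - m
dx = (xe - xb) / Nx              # cell size in x - m, under the assumption above
dt_nd = dt * (c0 / x0)           # nondimensional time step passed as "dt"
cells_per_wavelength = wlen / dx
acoustic_cfl = c0 * dt / dx      # CFL number based on the sound speed alone

if __name__ == "__main__":
    print(f"p0 = {p0:.6e} Pa")
    print(f"wavelength = {wlen:.6e} m, cells per wavelength = {cells_per_wavelength:.2f}")
    print(f"nondimensional dt = {dt_nd:.7f}, acoustic CFL = {acoustic_cfl:.6f}")
```

Dividing lengths by `x0`, times by `x0 / c0` and pressures by `rho0 * c0**2` is what the case dictionary does entry by entry, so these values should agree with the JSON the script prints.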
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+
+The user input file 'input/lag_bubbles.dat' contains the initial conditions of the Lagrangian bubbles.
+Each row gives the initial state of one bubble, with the following columns:
+
+xPosition/x0 yPosition/x0 zPosition/x0 xVel/c0 yVel/c0 zVel/c0 radius/x0 interfaceVelocity/c0
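
Review note: the eight-column layout described in this README could be generated with a few lines of Python. The sketch below is only an illustration. It assumes whitespace-separated values in scientific notation, which the README does not specify, and the helper names `format_bubble_row` and `write_bubbles` are hypothetical.

```python
import io


def format_bubble_row(pos, vel, radius, interface_vel, x0, c0):
    # Column order follows the README: position/x0 (3), velocity/c0 (3),
    # radius/x0, interfaceVelocity/c0.
    values = (
        [p / x0 for p in pos]
        + [v / c0 for v in vel]
        + [radius / x0, interface_vel / c0]
    )
    return " ".join(f"{v:.8e}" for v in values)


def write_bubbles(stream, bubbles, x0, c0):
    # Each bubble is a tuple (pos, vel, radius, interface_vel) in SI units.
    for pos, vel, radius, interface_vel in bubbles:
        stream.write(format_bubble_row(pos, vel, radius, interface_vel, x0, c0) + "\n")


if __name__ == "__main__":
    x0, c0 = 1.0e-3, 1475.0  # reference length and sound speed, as in the bubble-screen case
    bubbles = [
        ((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0e-5, 0.0),
        ((1.0e-3, 5.0e-4, 0.0), (0.0, 0.0, 0.0), 2.0e-5, 0.0),
    ]
    buffer = io.StringIO()  # stand-in for open("input/lag_bubbles.dat", "w")
    write_bubbles(buffer, bubbles, x0, c0)
    print(buffer.getvalue(), end="")
```

The number of rows written should match `lag_params%nBubs_glb` in the corresponding case file.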

examples/3D_lagrange_shbubcollapse/case.py

Lines changed: 2 additions & 3 deletions
@@ -80,6 +80,7 @@
             "n": Ny,
             "p": Nz,
             "dt": round(dt * c0 / x0, 6),
+            "adap_dt": "T",
             "n_start": 0,
             "t_save": saveTime * (c0 / x0),
             "t_stop": stopTime * (c0 / x0),
@@ -89,7 +90,7 @@
             "num_patches": 1,
             "mpp_lim": "F",
             "viscous": "T",
-            "time_stepper": 4, # 4th/5th RKCK
+            "time_stepper": 3,
             "weno_order": 5,
             "weno_eps": 1.0e-16,
             "mapped_weno": "T",
@@ -141,8 +142,6 @@
             # Lagrangian Bubbles
             "bubbles_lagrange": "T",
             "bubble_model": 2, # Keller-Miksis model
-            "rkck_adap_dt": "T", # Activate adaptive time stepper
-            "rkck_tolerance": 1.0e-05,
             "lag_params%nBubs_glb": 1,
             "lag_params%solver_approach": 2, # Two-way coupled
             "lag_params%cluster_type": 2,
