Skip to content

Strang Splitting for lagrange bubbles #827

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

dgvacarevelo
Copy link
Contributor

Description

This PR is an effort to enhance the Euler-Lagrange model for sub-grid bubble dynamics. The RKCK algorithm (time_stepper==4) was removed and the EL subroutines are now compatible with the Strang Splitting algorithm, which is highly important to simulate collapsing bubbles. Additionally, this PR will fix the bug #797.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Scope

  • This PR comprises a set of related changes with a common goal

If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.
Provide instructions so we can reproduce.
Please also list any relevant details for your test configuration

  • Test A : Bubble screen problem
    A cuboidal bubble cloud interacts with a sinusoidal planar wave. This case is included as an example case (examples/3D_lag_bubbles_bubblescreen). The pressure at the center of the cloud is measured and compared. Both simulations make use of the 3rd order RK-TV time-stepper, and the results with adap_dt=T agrees well with the outcome when adap_dt=F.

Comparison-PressureEvolution

  • Test B: : Single bubble collapse
    A single bubble of 8 micrometers encounters a sinusoidal planar wave. This case is also included as an example case (examples/3D_lag_bubbles_shbubcollapse). The radius evolution of the bubble is compared with the experiment reported by (Ohl et. al, 1999) (https://doi.org/10.1098/rsta.1999.0327). Similarly, both simulations make use of the 3rd order RK-TV time-stepper, and the results with adap_dt=T agrees well with the outcome when adap_dt=F. However, when bubble collapse occurs, enabling adaptive time stepping (adap_dt = T) is essential; otherwise, the simulation fails near the collapse point. As shown in the plotted results, the simulation with adap_dt = F terminates prematurely at approximately t ≈ 27 microseconds.

Validation-SingleBubCollapse8um

Test Configuration:

  • What computers and compilers did you use to test this:
  1. Delta: I tested both problems with CPUs (gcc/11.4.0, openmpi/4.1.6) and GPUs (nvhpc/24.1, openmpi/4.1.5+cuda, cuda/12.3.0, cmake/3.27.9). I used the NVIDIA A100 to run my GPU cases.
  2. Turing (WPI HPC): Here, I used the following CPU compilers: gcc/12.1.0, openmpi/4.1.3, python/3.11.6

Checklist

  • I have added comments for the new code
  • I added Doxygen docstrings to the new code
  • I have made corresponding changes to the documentation (docs/)
  • I have added regression tests to the test suite so that people can verify in the future that the feature is behaving as expected
  • I have added example cases in examples/ that demonstrate my new feature performing as expected.
    They run to completion and demonstrate "interesting physics"
  • I ran ./mfc.sh format before committing my code
  • New and existing tests pass locally with my changes, including with GPU capability enabled (both NVIDIA hardware with NVHPC compilers and AMD hardware with CRAY compilers) and disabled
  • This PR does not introduce any repeated code (it follows the DRY principle)
  • I cannot think of a way to condense this code and reduce any introduced additional line count

If your code changes any code source files (anything in src/simulation)

To make sure the code is performing as expected on GPU devices, I have:

  • Checked that the code compiles using NVHPC compilers
  • Checked that the code compiles using CRAY compilers
  • Ran the code on either V100, A100, or H100 GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
  • Ran the code on MI200+ GPUs and ensure the new features performed as expected (the GPU results match the CPU results)
  • Enclosed the new feature via nvtx ranges so that they can be identified in profiles
  • Ran a Nsight Systems profile using ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
  • Ran an Omniperf profile using ./mfc.sh run XXXX --gpu -t simulation --omniperf, and have attached the output file and plain text results to this PR.
  • Ran my code using various numbers of different GPUs (1, 2, and 8, for example) in parallel and made sure that the results scale similarly to what happens if you run without the new code/feature

@dgvacarevelo dgvacarevelo requested review from a team and sbryngelson as code owners April 30, 2025 23:11
@wilfonba
Copy link
Contributor

@dgvacarevelo Could you add a 2D example case that includes the use of charwidth for modeling 3D bubble clouds in a 2D host flow? It wasn't immediately apparent to me how this worked when I had to figure it out, and I feel an example would be useful. Just copying the 3D_lagrange_bubble_cloud case and removing the 3rd dimension from the host fluid would suffice. I'd do it myself but since this PR is already open it seems like a good place to add it.

@dgvacarevelo
Copy link
Contributor Author

@wilfonba I have added the suggested example. Here is how it behaves using the bubble screen problem.

Comparison-PressureEvolution

@wilfonba
Copy link
Contributor

wilfonba commented May 1, 2025

@dgvacarevelo Thanks for adding this!

@wilfonba
Copy link
Contributor

wilfonba commented May 1, 2025

Looks like there's a Frontier issue somewhere. I'll try to take a look this afternoon or tomorrow and narrow it down.

! Compute d_0 = ||y0|| and d_1 = ||f(x0,y0)||
d_0 = sqrt((myR_tmp(1)**2._wp + myV_tmp(1)**2._wp)/2._wp)
d_1 = sqrt((myV_tmp(1)**2._wp + myA_tmp(1)**2._wp)/2._wp)
if (d_0 < 1e-5_wp .or. d_1 < 1e-5_wp) then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more magic numbers. where do these all come from? Should they be user-adjustable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These parameters are used to provide an initial estimate for the size of the first sub-step. If this estimate turns out to be too large (or too small), the substepping algorithm will adjust it dynamically based on the specified tolerance value (adap_dt_tol). By allowing the tolerance to be user-adjustable, I believe the user is given sufficient control over the behavior and accuracy of the algorithm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make these not magic numbers in the code at least, and give them comments to explain them further

@sbryngelson sbryngelson requested a review from Copilot May 2, 2025 11:11
Copilot

This comment was marked as off-topic.

@@ -928,9 +892,27 @@ contains
idwint, &
gm_alpha_qp%vf)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the feeling that something in here could be where Frontier is going wrong.

@sbryngelson
Copy link
Member

trying some gpt4.5 code review and found some quite useful comments i like. see below

It looks like I encountered a technical issue while trying to open the PR files directly from GitHub. However, I can still provide specific suggestions on how Fortran .f90 and .fpp code related to Strang splitting and Lagrangian bubble dynamics can be improved, based on common issues seen in scientific computing projects like MFC.

Please consider the following targeted improvements, assuming you’re working with subroutines similar to s_advance_step, f_advance_substep, and other related modules shown earlier:


🔹 1. Refactor Long Argument Lists

Problem: Subroutines like f_advance_substep have more than 15 arguments, many of which belong to logical groupings (e.g., bubble state, EOS parameters).

Solution: Use derived types to group related quantities.

type :: BubbleState
  real(wp) :: R, V, Pb, Mv
end type

type :: EOSParams
  real(wp) :: ntait, Btait, alf, Cson
end type

Before:

call f_advance_substep(fRho, fP, fR, fV, fR0, fpb, ...)

After:

call f_advance_substep(state, eos, env_params, ...)

🔹 2. Introduce Constants Instead of Magic Numbers

Problem: Repeated usage of values like 1e-4_wp, 0.5_wp, 1e-12_wp throughout code.

Solution:
At module or subroutine top:

real(wp), parameter :: TOL = 1e-4_wp
real(wp), parameter :: SMALL = 1e-12_wp
real(wp), parameter :: RK3_COEFF = 1.0_wp / 6.0_wp

Use these in place of hardcoded values.


🔹 3. Improve Loop Control for Adaptive Time Stepping

Problem: Loops like do while (.true.) are dangerous if error conditions arise.

Solution: Add max iteration count and fail-safe exit.

integer :: iter_count
integer, parameter :: MAX_ADAPT_ITERS = 100

iter_count = 0
do while (t_new < 0.5_wp * dt .and. iter_count < MAX_ADAPT_ITERS)
    iter_count = iter_count + 1
    ...
end do

if (iter_count >= MAX_ADAPT_ITERS) then
    print *, "Warning: adaptive time stepping failed to converge"
end if

🔹 4. Encapsulate Bubble Physics Stages

Problem: RK stages for radius, velocity, pressure, and mass evolve together, but logic is spread across many lines.

Solution: Create helper functions for each RK stage.

call advance_bubble_stage(stage_idx, state_in, state_out, params)

This improves modularity and makes debugging easier.


🔹 5. Cleaner Strang Splitting Structure

Problem: No clear separation of pre-split and post-split components.

Solution: Use functions like:

call advance_physics_part1(state, dt_half)
call advance_physics_part2(state, dt_full)
call advance_physics_part1(state, dt_half)

Encapsulate each phase of Strang splitting.


🔹 6. Numerical Robustness

Problem: Direct comparison of floating point to zero.

Solution: Replace:

if (myA_tmp(1) == 0._wp) ...

With:

if (abs(myA_tmp(1)) < SMALL) ...

🔹 7. Clearer Variable Naming

Problem: Variables like myR_tmp1(4) are ambiguous without context.

Solution: Use more descriptive names:

real(wp), dimension(4) :: bubble_radius_stage

Or comments:

! stage 4 = final RK stage value
bubble_radius = myR_tmp(4)

🔹 8. Add Diagnostic Output (Optional)

When debugging adaptive solvers, a diagnostic printout can be helpful:

if (debug_mode) then
    print *, "Step accepted with h =", h, " err_R =", err_R, " err_V =", err_V
end if

🔹 9. Testing & Validation Support

Include optional test modes where:

  • exact/analytical solutions are compared
  • step convergence is printed
  • intermediate values are logged

@sbryngelson
Copy link
Member

if this adds substepping for EE bubbles can you add a proper test for it, as well as an example?

Copy link

codecov bot commented May 2, 2025

Codecov Report

Attention: Patch coverage is 55.43478% with 123 lines in your changes missing coverage. Please review.

Project coverage is 43.35%. Comparing base (8230af1) to head (1a4d597).

Files with missing lines Patch % Lines
src/simulation/m_bubbles_EL.fpp 35.05% 55 Missing and 8 partials ⚠️
src/simulation/m_bubbles.fpp 74.38% 13 Missing and 18 partials ⚠️
src/simulation/m_bubbles_EL_kernels.fpp 0.00% 6 Missing ⚠️
src/simulation/m_rhs.fpp 53.84% 4 Missing and 2 partials ⚠️
src/simulation/m_start_up.fpp 25.00% 4 Missing and 2 partials ⚠️
src/simulation/m_bubbles_EE.fpp 44.44% 4 Missing and 1 partial ⚠️
src/simulation/m_time_steppers.fpp 72.22% 3 Missing and 2 partials ⚠️
src/common/m_checker_common.fpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #827      +/-   ##
==========================================
+ Coverage   43.15%   43.35%   +0.19%     
==========================================
  Files          68       68              
  Lines       20262    20123     -139     
  Branches     2427     2400      -27     
==========================================
- Hits         8745     8724      -21     
+ Misses      10054     9933     -121     
- Partials     1463     1466       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sbryngelson
Copy link
Member

@dgvacarevelo there is a problem with Frontier runs. You can try to figure it out by asking questions on the Slack channel. The GT Phoenix runners are failing because the computer is under maintenance.

@sbryngelson
Copy link
Member

there are a few remaining things... like the old rkck is not fully removed from the code and some other minor things... check out the review below

Reviewing Fortran PR changes

The user asked for a code review focusing on .f90 and .fpp files in a GitHub PR. I need to look at the modifications made in patches 1 and 2, primarily changes involving the removal of RKCK, which impacts constants and functions. There's also a cleanup related to time step adaptability, and adjustments to the bubble Lagrange code for the GPU region. I'll go through the changes line by line, noting any potential issues, and provide code suggestions. Let me dive in!

Reviewing Fortran code changes

I need to carefully examine changes to style, correctness, bugs, and performance in the Fortran code. Key updates include the removal of dt_max and rkck_adap_dt, which might have leftover references in functions or modules. I'll check whether they impact any parts like range checks, constants, or simulation functions. Other factors to watch include if removed variables, such as rkck_time_tmp, still have references or if any redundant lines remain in the code, particularly in relation to mapCells_pinf.

Reviewing RKCK and time_stepper changes

The user’s changes involve removing RKCK and associated variables like mapCells_pinf. However, there’s a lingering risk that some time_stepper == 4 references and related code haven’t been fully cleaned up. I need to ensure all references are fully removed across modules to guarantee no unexpected behavior. I'll also check for any unnecessary stop commands inside GPU regions and make sure boundary checks and other features (like IEEE checks) are properly implemented. Additionally, documentation and old test cases referencing time_stepper=4 should be updated.

Thought for 1m 2s

Below is a focused review of every .f90 and .fpp change in PR #827, limited to technical issues and concrete improvements. I cite files and approximate line numbers as they appear in the diff headers so you can paste comments directly into the GitHub review UI.


1  Global removal of the RKCK stepper

You eliminated the Cash–Karp (time _stepper == 4) path and its adaptive‐timestep option rkck_adap_dt. That is a good simplification, but a few dangling references remain.

File ⟨line⟩ Comment / fix
src/simulation/m_acoustic_src.fpp ~158 The whole if (time_stepper == 4) sim_time = rkck_time_tmp line is commented in the diff, but rkck_time_tmp was deleted from m_global_parameters. Remove the comment or guard it with #ifdef OBSOLETE_RKCK so it cannot accidentally be re‑enabled.
toolchain/mfc/test/cases.py 815 ff The helper still appends “rkck stepper” branches, but the loop that drives them was commented out. Delete the dead code path entirely to avoid confusion.
Search‑wide   Grep still finds time_stepper == 4 in: m_io_binary.fpp (header stamping), m_variables.fpp (restart parser), and the CMake option help. These all need to be pruned or changed to “unsupported”.

2  m_constants.fpp

All RKCK coefficients were dropped. ✅
No action needed once the stragglers above are deleted.


3  Checker logic

src/common/m_checker_common.fpp

@:PROHIBIT(cfl_target < 0 .or. cfl_target > 1._wp)

Nice clean‑up. Nothing else references rkck_adap_dt in the common checker.

src/simulation/m_checker.fpp

@:PROHIBIT(time_stepper < 1 .or. time_stepper > 4)

After RKCK removal the legal range is 1–3.

- @:PROHIBIT(time_stepper < 1 .or. time_stepper > 4)
+ @:PROHIBIT(time_stepper < 1 .or. time_stepper > 3)

4  m_global_parameters.fpp (simulation variant)

  • ✅ All RKCK scalars removed.

  • Declaration list: the trailing comma after lag_params is now redundant – some compilers treat this as an error in free form.

logical :: bubbles_lagrange
type(bubbles_lagrange_parameters) :: lag_params

5  Bubble‑Lagrange modules

Allocation / de‑allocation blocks

You removed lag_RKCKcoef allocation and its final deallocation guarded by time_stepper == 4. Good.

Runtime STOP statements inside GPU regions

You replaced most device‑side stop calls with host‑side checks – great for portability. Two more remain inside kernels (they look commented in the diff but double‑check):

  • m_bubbles_EL_kernels.fpp: “Check cluterflag. Exiting.” (typo) – now removed, ✅.

  • Any assert‑style stop inside acc parallel blocks must go; the compiler will still reject them.

New host‑side NaN / negative‑radius guard

if (ieee_is_nan(intfc_rad(i,1)) .or. intfc_rad(i,1) <= 0._wp) then

Add the intrinsic module to the top of m_start_up.fpp so PGI/NVHPC recognises it:

use, intrinsic :: ieee_arithmetic

Boundary‑ghost cell checks

You rewrote the logic to silently mirror bubble positions instead of aborting. The revised cellaux arithmetic is correct, but the loop comment still says “must not be located”. Update the comment to describe the new mirroring behaviour.


6  m_time_steppers.fpp

All RKCK allocations, time‑loop and deallocation blocks were removed. Compile‑time follow‑ups:

  • At the top of the module the OpenACC declare create list still references rhs_ts_rkck – delete the symbol:

  • !$acc declare create(q_cons_ts, …, rhs_ts_rkck, q_prim_ts, rhs_mv, rhs_pb, max_dt)

  • !$acc declare create(q_cons_ts, …, q_prim_ts, rhs_mv, rhs_pb, max_dt)

  • The helper variable max_dt is still present but never set now that RKCK is gone. Either remove it or assign it in the remaining steppers.


7  I/O and MPI proxy blocks

rkck_tolerance and friends were stripped from the broadcast lists – 👍.
Double‑check that corresponding array indices in xfer_params tables were renumbered; an off‑by‑one here can produce silent data corruption at runtime.


8  Start‑up (m_start_up.*)

All conditionals of the form

if (cfl_const_dt .or. cfl_adap_dt) cfl_dt = .true.

are correct after RKCK removal. Make the comment above the block say “any CFL‑based stepper” rather than listing every flag.


9  Tests & tooling

You disabled every parametrisation that used RKCK but left the boolean adap_dt variant in place. Make sure the CI still covers one adaptive (CFL) run, else you may lose coverage for the adaptive‑dt path.


10  Minor typos / style

  • “cluterflag” → “cluster_type”.

  • “lag_largestep” -> maybe “lag_largeStep” for readability.

  • Messages beginning with lower‑case “the” or “if” – standardise on initial caps (you already fixed some).


11  Suggested follow‑up clean‑ups

  1. Purge unused variables – run -Werror -Wunused once; expect to see dt_max, rkck_time_tmp, max_dt, lag_largestep, etc. after code removal.

  2. Document the supported steppers in README and docs/user/time_steppers.md. Users upgrading from earlier MFC will need explicit guidance that RKCK is gone.

  3. MPI‑safe aborts – replace every remaining stop "message" in host code with call s_mpi_abort("message") so multi‑rank runs don’t hang.

  4. Unit tests – add one regression test that tries to set time_stepper=4 and expects the checker to reject it, proving the prohibition works.


Overall assessment

✔ The removal is almost complete and will shrink maintenance burden.

A small amount of “dangling dust” (checker range, obsolete comments, unused symbols) remains. Once those are swept up the PR should compile cleanly on all supported compilers and GPUs.


Let me know if you’d like the full git apply patch for the fixes above or a deeper look at any one module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants