Add a frontend for Quokka dataset #5168

chongchonghe · 2025-05-15T06:29:05Z

PR Summary

This PR add a frontend for loading Quokka dataset. This frontend fully support uniform-grid or AMR datasets in cartesian coordinates and particles. QUOKKA is based on the AMReX framework, so we define a new QuokkaDataset class that inherits from AMReXDataset.

This PR is fully tested on my computer. We have also added Answer Tests in frontends/amrex/test_outputs.py to validate new features with Quokka data that we provided.

We prepared a Jupyter Notebook with examples and test dataset: https://github.com/Rongjun-ANU/README-of-yt-frontend-for-QUOKKA/blob/main/README.ipynb

Authors: @Rongjun-ANU @chongchonghe

Quokka is a two-moment AMR radiation hydrodynamics (with self-gravity, particles, and chemistry) on CPUs/GPUs for astrophysics.

PR Checklist

New features are documented, with docstrings and narrative docs
Adds a test for any bugs fixed. Adds tests for new features.

…sing (#3) Updated the following classes to read header file and `metadata.yaml` in Quokka simulations. ``` class QuokkaHierarchy(BoxlibHierarchy) class QuokkaDataset(AMReXDataset) ``` --------- Co-authored-by: Rongjun-ANU <Rongjun.Huang@alumni.anu.edu.au>

…elds fields (#5) This pull request introduces support for the Quokka simulation framework by adding new dataset and hierarchy classes, and updating field information. The most important changes include the addition of `QuokkaDataset` and `QuokkaHierarchy` classes, the parsing of metadata and parameter files, and the setup of fluid and radiation fields. ### Support for Quokka Framework: * Added `QuokkaDataset` and `QuokkaHierarchy` classes to handle Quokka-specific data structures and hierarchy. (`yt/frontends/amrex/data_structures.py`, [yt/frontends/amrex/data_structures.pyR1332-R1571](diffhunk://#diff-7aa59ffaeb7262d9580ae294f19155633ce68e2dda607cc4e2ddbf1bc20b66b3R1332-R1571)) * Implemented `_parse_parameter_file` and `_parse_metadata_file` methods in `QuokkaDataset` to read and parse metadata from `metadata.yaml` and other parameter files. (`yt/frontends/amrex/data_structures.py`, [yt/frontends/amrex/data_structures.pyR1332-R1571](diffhunk://#diff-7aa59ffaeb7262d9580ae294f19155633ce68e2dda607cc4e2ddbf1bc20b66b3R1332-R1571)) ### Field Information Updates: * Added `QuokkaFieldInfo` class to define known fields and set up fluid and radiation fields dynamically. (`yt/frontends/amrex/fields.py`, [yt/frontends/amrex/fields.pyR506-R597](diffhunk://#diff-565dd12ba00274995d4c360bd7cb3f7d43068a0f2c7f93fe2d33978c1e90c98fR506-R597)) ### API Updates: * Included `QuokkaDataset`, `QuokkaHierarchy`, and `QuokkaFieldInfo` in the API imports to make them accessible. (`yt/frontends/amrex/api.py`, [[1]](diffhunk://#diff-92252639991f47b99474012776b030614e644789c56c7a9c39661ee5a301f87bR14-R15) [[2]](diffhunk://#diff-92252639991f47b99474012776b030614e644789c56c7a9c39661ee5a301f87bR24) ### Miscellaneous: * Added `yaml` import for parsing metadata files. (`yt/frontends/amrex/data_structures.py`, [yt/frontends/amrex/data_structures.pyR1](diffhunk://#diff-7aa59ffaeb7262d9580ae294f19155633ce68e2dda607cc4e2ddbf1bc20b66b3R1)) --------- Co-authored-by: Rongjun-ANU <Rongjun.Huang@alumni.anu.edu.au>

slight update on `class QuokkaHierarchy(BoxlibHierarchy)` to fix the bug, and also add support on multi particle types. --------- Co-authored-by: Rongjun-ANU <Rongjun.Huang@alumni.anu.edu.au>

add support for `Fields.json` in particle folders. also readout the unit of particles. --------- Co-authored-by: Rongjun-ANU <Rongjun.Huang@alumni.anu.edu.au>

update for Bfields --------- Co-authored-by: Rongjun-ANU <Rongjun.Huang@alumni.anu.edu.au>

rename Fields.json to Fields.yaml --------- Co-authored-by: Rongjun-ANU <Rongjun.Huang@alumni.anu.edu.au>

add quokka = ["PyYAML>=6.0.1"] (#13)

Co-authored-by: Rongjun-ANU <Rongjun.Huang@alumni.anu.edu.au>

Recently I have updated the format of metadata.yaml from Quokka simulations, and also include `quokka_version` variable in metadata.yaml. See quokka-astro/quokka#935 . This PR updates the Quokka frontend to support this new, nested yaml metadata. ## PR Checklist  - [ ] New features are documented, with docstrings and narrative docs - [ ] Adds a test for any bugs fixed. Adds tests for new features.

## PR Summary   ## PR Checklist  - [ ] New features are documented, with docstrings and narrative docs - [ ] Adds a test for any bugs fixed. Adds tests for new features.

update YT docs for Quokka frontend

update the `yt/frontends/amrex/tests/test_outputs.py` to run pytest for quokka dataset --------- Co-authored-by: Rongjun-ANU <Rongjun.Huang@alumni.anu.edu.au> Co-authored-by: ChongChong He <chongchong.he@anu.edu.au>

Add support for reading face-centered dataset. See changes in loading_data.rst for details. ## PR Summary   ## PR Checklist  - [x] New features are documented, with docstrings and narrative docs - [ ] Adds a test for any bugs fixed. Adds tests for new features.

chongchonghe · 2025-05-15T07:14:56Z

@BenWibking @Rongjun-ANU

chongchonghe · 2025-05-15T07:23:02Z

I couldn't understand the remaining pre-commit.ci failsures. I would appreciate some explanation or help from the reviewers.

chongchonghe · 2025-05-15T07:27:42Z

pre-commit.ci autofix

doc/source/examining/loading_data.rst

chongchonghe · 2025-05-16T02:40:28Z

pre-commit.ci autofix

chrishavlin · 2025-05-16T18:59:33Z

@chongchonghe I should be able to take a detailed look late next week.

one quick question: do you have a sample dataset you can share and upload? I think it'd be particularly helpful to have one that has the face-centered data as well. The way you've set up this frontend to have a nested dataset if the face-centered data exists is intriguing. But it may introduce some confusing bugs -- the yt dataset-related classes rely on many class attributes and loading multiple datasets of the same class at once can lead to errors (e.g., like the one that PR #5159 fixed). So it'd be good if we can thoroughly test that case.

Usually, we try to get sample datasets uploaded to the yt website so that new frontend tests can utilize the data (usually done via a pull request to https://github.com/yt-project/website like this one or this one, but let me check back with you before you do so. The server that the full test suite runs on is having some issues, so I'm not sure we'll follow the usual protocol right now... at the very least it'd be useful if you could share some datasets so that I can download and run the new test(s) locally.

chongchonghe · 2025-05-17T02:50:01Z

@chrishavlin

This repo contains face-centered Quokka dataset (sample/HydroWave) and a script (test.py) to test that:
https://github.com/Rongjun-ANU/README-of-yt-frontend-for-QUOKKA . Running test.py using the YT from this PR is successful on my computer.

The dataset HydroWave and RadiatingParticles in the repo above is used in the Answer Test for this frontend. However, I don't know how to upload these data to the server for CI tests.

chongchonghe · 2025-05-18T02:13:39Z

Unfortunately, I couldn't get pytest to work on my computer. I always get this error:

E   ModuleNotFoundError: No module named 'imp'

It seems like the answer_testing in yt is using nose, which relies on imp, which is deprecated. However, I run the new tests I added in this PR on my computer as a python script and it worked, so I'm sure the unit tests will pass once we upload the dataset to the server.

chrishavlin · 2025-05-20T19:18:55Z

thanks for the dataset! hopefully will have time to try things out later this week.

chongchonghe · 2025-05-27T04:09:05Z

Hi @chrishavlin , did you have time to try out this PR?

chrishavlin · 2025-05-27T15:26:07Z

Hi @chongchonghe thanks for the ping, but unfortunately not yet. Had to bump it to this week's to-do list .

chrishavlin

So I'm still working through the code, but wanted to bring up a high-level question.

Normally, when frontends introduce dependencies, those dependencies should be added only to the frontend that requires them. But right now you're making PyYaml a dependency of all the amrex subclass frontends. It's a pretty small dependency, so I'm inclined to let it go in this instance, especially since it's already a dependency for a full yt install (via pyaml)... @neutrinoceros do you have an opinion on this?

For reference, here's a diff that would make PyYaml a dependency only for Quokka (adjusting pyproject.toml, moving the imports to the functions where they're used):

Diff is here

diff --git a/pyproject.toml b/pyproject.toml
index 03300c836..16a5f9b41 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -89,7 +89,7 @@ Fortran = ["f90nml>=1.1"]
 # We also normalize all target names to lower case for consistency.
 adaptahop = []
 ahf = []
-amrex = ["PyYAML>=6.0.1"]
+amrex = []
 amrvac = ["yt[Fortran]"]
 art = []
 arepo = ["yt[HDF5]"]
@@ -121,7 +121,7 @@ open-pmd = ["yt[HDF5]"]
 owls = ["yt[HDF5]"]
 owls-subfind = ["yt[HDF5]"]
 parthenon = ["yt[HDF5]"]
-quokka = ["PyYAML>=6.0.1"]
+quokka = ["yt[amrex]", "PyYAML>=6.0.1"]
 ramses = ["yt[Fortran]", "scipy"]
 rockstar = []
 sdf = ["requests>=2.20.0"]
diff --git a/yt/frontends/amrex/data_structures.py b/yt/frontends/amrex/data_structures.py
index 7a27f4efb..4e584de6b 100644
--- a/yt/frontends/amrex/data_structures.py
+++ b/yt/frontends/amrex/data_structures.py
@@ -6,7 +6,6 @@ from functools import cached_property
 from stat import ST_CTIME
 
 import numpy as np
-import yaml
 
 from yt.data_objects.index_subobjects.grid_patch import AMRGridPatch
 from yt.data_objects.static_output import Dataset
@@ -1373,6 +1372,7 @@ class QuokkaDataset(AMReXDataset):
     _field_info_class = QuokkaFieldInfo
     _subtype_keyword = ""
     _default_cparam_filename = "metadata.yaml"
+    _load_requirements = ["yaml"]
 
     def __init__(
         self,
@@ -1465,6 +1465,8 @@ class QuokkaDataset(AMReXDataset):
                 mylog.debug(f"No face-centered {direction} dataset found at {fc_dir}")
 
     def _parse_parameter_file(self):
+        import yaml
+
         # Call parent method to initialize core setup by yt
         super()._parse_parameter_file()
 
@@ -1686,6 +1688,8 @@ class QuokkaDataset(AMReXDataset):
         mylog.debug("Parsed header parameters: %s", self.parameters)
 
     def _parse_metadata_file(self):
+        import yaml
+
         # Construct the full path to the metadata file
         metadata_filename = os.path.join(self.output_dir, self.cparam_filename)
         try:

doc/source/examining/loading_data.rst

pyproject.toml

neutrinoceros · 2025-05-29T16:53:39Z

@neutrinoceros do you have an opinion on this?

Well, I think I'd have been more lenient with letting this one through... if you hadn't already done the work to separate dependencies more clearly. Now I'm in favor of committing your patch :)

chrishavlin · 2025-05-29T18:45:53Z

Now I'm in favor of committing your patch :)

Hah, ok, then let's go with that. @chongchonghe , could you update the code so that pyyaml is only needed for quokka (check the diff in my previous comment)?

chongchonghe · 2025-06-10T14:48:49Z

@chrishavlin I've implemented the changes as you suggested. Thanks a lot!

I have also been trying to optimize how we deal with face-centering variables. I came to the conclusion that the current approach is the optimal approach. QUOKKA does not currently support writing FC variables as 'raw' data as WarpX does. So, I did a hack by loading the face-centered quantities as if they are cell-centered. Raw and "nodal" fields will be supported in the future.

Sorry for my delayed response. I've been overwhelmed with a proposal recently. Please let me know if any revisions are required, or if you need any additional information from me before we can proceed with merging this PR.

chrishavlin · 2025-06-11T15:11:27Z

Great, thanks! I'll do another read through and will likely have some code suggestions, but probably won't be anything major.

neutrinoceros · 2025-06-22T14:26:03Z

I am not currently available to review a large PR as this one, I apologize for the inconvenience.

chongchonghe · 2025-10-10T00:33:44Z

@chrishavlin Just want to pin you for a review on this PR.

chrishavlin

Thanks for your patience @chongchonghe ! I have some minor suggestions here to clean up some of the new code. I'll also see if anyone else is available for a quick review since we like to have at least 2 reviews for new frontends: but I recognize this has been around for quite a while and it'd be nice to get it merged ASAP.

And I did download the test data and confirmed that the new tests run successfully. I'll make a companion PR to the yt website repo with the data in a convenient form so it ends up where it needs to eventually, but note that the tests won't actually run for now while yt's continuous integration server is down.

chrishavlin · 2025-10-15T19:54:05Z

yt/frontends/amrex/tests/test_outputs.py

    assert type(ds.parameters["s0_interp_type"]) is int  # noqa: E721


+quokka = "quokka_RadiatingParticles_plt026"


Given how you have https://github.com/Rongjun-ANU/README-of-yt-frontend-for-QUOKKA/tree/main set up, it will be easier to just use paths here:

Suggested change

quokka = "quokka_RadiatingParticles_plt026"

quokka = "quokka/RadiatingParticles/plt026"

Then the sample folder from https://github.com/Rongjun-ANU/README-of-yt-frontend-for-QUOKKA/tree/main can be copied to a quokka directory within yt's test data directory (rather than renaming, e.g., sample/RadiatingParticles/plt026 to sample_RadiatingParticles_plt026).

chrishavlin · 2025-10-15T19:54:18Z

yt/frontends/amrex/tests/test_outputs.py

+    assert particle_info["units"]["end_time"] == "T^1"
+
+
+quokka_face_centered = "quokka_HydroWave_plt00004"


see previous comment.

Suggested change

quokka_face_centered = "quokka_HydroWave_plt00004"

quokka_face_centered = "quokka/HydroWave/plt00004"

chrishavlin · 2025-10-15T20:10:38Z