Electrode workflow and documentation improvements (#1055)

esoteric-ephemera · web-flow · commit 0b66db372e94 · 2025-04-24T08:47:03.000-07:00
* Ensure electrode workflow doesn't store charge density in JobStore by default / better naming of tasks in wf

* Documentation improvements

* Remove strict-tblite dependence and patch openmm tests to work with newer dependency stack
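
The task-naming change in the first bullet can be summarized with a small standalone sketch. This is not the atomate2 implementation: `step_label` and `should_stop` are hypothetical helpers that mirror the `n_inserted/n_steps` labels (with `inf` for an unbounded run) and the stopping rule visible in the diff below, under the assumption that an unset `n_steps` means "no step limit".

```python
from __future__ import annotations


def step_label(n_inserted: int, n_steps: int | None) -> str:
    """Return a label such as '1/3', or '1/inf' when no step limit is set."""
    shown_steps = str(n_steps) if n_steps else "inf"
    return f"{n_inserted}/{shown_steps}"


def should_stop(structure: object | None, n_inserted: int, n_steps: int | None) -> bool:
    """Hypothetical stop rule: no structure, non-positive limit, or limit exceeded."""
    if structure is None:
        return True
    if n_steps is None:  # assumption: None means insert until nothing stable remains
        return False
    return n_steps <= 0 or n_inserted > n_steps


# Labels for a three-step insertion run, then for an unbounded run.
print([step_label(i, 3) for i in range(1, 4)])
print(step_label(1, None))
```

Guarding the `n_inserted > n_steps` comparison behind an explicit `None` check keeps the unbounded case from comparing an integer with `None`.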
diff --git a/.github/workflows/testing.yml b/.github/workflows/testing.yml
@@ -199,6 +199,7 @@ jobs:
           micromamba activate a2
           python -m pip install --upgrade pip
           uv pip install .[strict,tests]
+          uv pip install "tblite>=0.4.0"
 
       - name: Install pymatgen from master if triggered by pymatgen repo dispatch
         if: github.event_name == 'repository_dispatch' && github.event.action == 'pymatgen-ci-trigger'
diff --git a/docs/user/codes/vasp.md b/docs/user/codes/vasp.md
@@ -97,7 +97,25 @@ functional. Full structural relaxation is performed.
 ### Double Relax
 
 Perform two back-to-back relaxations. This can often help avoid errors arising from
-Pulay stress.
+[Pulay stress](https://www.vasp.at/wiki/index.php/Pulay_stress).
+
+In short: while the cell size, shape, and symmetry can change during a relaxation, the *k*-point grid does not change with it.
+Additionally, the number of plane waves is held constant during a relaxation.
+Both features lead to artificial (numerical) stress due to under-convergence of a relaxation with respect to the basis set.
+To avoid this, we perform a single relaxation, and input its final structure to another relaxation calculation.
+At the start of the second relaxation, the *k*-point mesh and plane waves are adjusted to reflect the new symmetry of the cell.
+
+### Materials Project structure optimization
+
+The Materials Project hosts a large database of optimized structures and, among other physical properties, their associated total energies, formation enthalpies, and basic electronic structure properties.
+To generate this data, the Materials Project uses a simple double relaxation followed by a final static calculation.
+In principle, if the second relaxation is converged, a final static calculation would not be needed.
+However, the second relaxation may have residual Pulay stress, and VASP averages some electronic structure data ([like the density of states](https://www.vasp.at/wiki/index.php/DOSCAR)) during a relaxation.
+Thus, we perform a final single-point (static) calculation, usually using the corrected tetrahedron method (`ISMEAR=-5`), to ensure accurate electronic structure properties.
+
+The workflows used to produce PBE GGA or GGA+*U* and r<sup>2</sup>SCAN thermodynamic data are, respectively, `MPGGADoubleRelaxStaticMaker` and `MPMetaGGADoubleRelaxStaticMaker` in `atomate2.vasp.flows.mp`.
+Moving forward, the Materials Project prefers r<sup>2</sup>SCAN calculations, but maintains its older set of GGA-level data, which currently has wider coverage.
+For documentation about the calculation parameters used, see the [Materials Project documentation](https://docs.materialsproject.org/methodology/materials-methodology/calculation-details).
 
 ### Band Structure
 
@@ -616,6 +634,33 @@ written:
 static_job.maker.input_set_generator.user_incar_settings["LOPTICS"] = True
 ```
 
+To update *k*-points, use the `user_kpoints_settings` keyword argument of an input set generator.
+You can supply either a `pymatgen.io.vasp.inputs.Kpoints` object, or a `dict` containing certain [keys](https://github.com/materialsproject/pymatgen/blob/b54ac3e65e46b876de40402e8da59f551fb7d005/src/pymatgen/io/vasp/sets.py#L812).
+We generally recommend the former approach unless the user is familiar with the specific style of *k*-point updates used by `pymatgen`.
+For example, to use just the $\Gamma$ point:
+
+```py
+from pymatgen.io.vasp.inputs import Kpoints
+from atomate2.vasp.sets.core import StaticSetGenerator
+from atomate2.vasp.jobs.core import StaticMaker
+
+custom_gamma_only_set = StaticSetGenerator(user_kpoints_settings=Kpoints())
+gamma_only_static_maker = StaticMaker(input_set_generator=custom_gamma_only_set)
+```
+
+If you are more familiar with manual *k*-point generation, you can also use a VASP-style KPOINTS file or string to set the *k*-points:
+
+```py
+kpoints = Kpoints.from_str(
+    """Uniform density Monkhorst-Pack mesh
+0
+Monkhorst-pack
+5 5 5
+"""
+)
+custom_static_set = StaticSetGenerator(user_kpoints_settings=kpoints)
+```
+
 Finally, sometimes you have a workflow containing many VASP jobs. In this case it can be
 tedious to update the input sets for each job individually. Atomate2 provides helper
 functions called "powerups" that can apply settings updates to all VASP jobs in a flow.
@@ -663,8 +708,7 @@ modification of several additional VASP settings, such as the k-points
 
 If a greater degree of flexibility is needed, the user can define a default set of input
 arguments (`config_dict`) that can be provided to the {obj}`.VaspInputGenerator`.
-By default, the {obj}`.VaspInputGenerator` uses a base set of VASP input parameters
-from {obj}`.BaseVaspSet.yaml`, which each `Maker` is built upon. If desired, the user can
+By default, the {obj}`.VaspInputGenerator` uses a base set of VASP input parameters (`atomate2.vasp.sets.base._BASE_VASP_SET`), which each `Maker` is built upon. If desired, the user can
 define a custom `.yaml` file that contains a different base set of VASP settings to use.
 An example of how this can be done is shown below for a representative static
 calculation.
diff --git a/pyproject.toml b/pyproject.toml
@@ -63,10 +63,9 @@ forcefields = [
     "torchdata<=0.7.1",                            # TODO: remove when issue fixed
 ]
 ase = ["ase>=3.23.0"]
-# tblite py3.12 support tracked in https://github.com/tblite/tblite/issues/198
-ase-ext = ["tblite>=0.3.0; python_version < '3.12'"]
+ase-ext = ["tblite>=0.3.0; platform_system=='Linux'"]
 openmm = [
-    "mdanalysis>=2.7.0",
+    "mdanalysis>=2.8.0",
     "openmm-mdanalysis-reporter>=0.1.0",
     "openmm>=8.1.0",
 ]
@@ -115,16 +114,14 @@ strict = [
     "pymongo==4.10.1",
     "python-ulid==3.0.0",
     "seekpath==2.1.0",
-    "tblite==0.3.0; python_version < '3.12'",
     "typing-extensions==4.13.2",
 ]
 strict-openff = [
     "mdanalysis==2.9.0",
     "monty==2025.3.3",
     "openmm-mdanalysis-reporter==0.1.0",
     "openmm==8.1.1",
-    "pymatgen==2025.4.20", # TODO: open ff is extremely sensitive to pymatgen version
-    "mdanalysis==2.9.0"
+    "pymatgen==2024.11.13", # TODO: open ff is extremely sensitive to pymatgen version
 ]
 strict-forcefields = [
     "calorine==3.0",
@@ -185,6 +182,7 @@ exclude_lines = [
     '^\s*@overload( |$)',
     '^\s*assert False(,|$)',
     'if typing.TYPE_CHECKING:',
+    'if TYPE_CHECKING:',
 ]
 
 [tool.ruff]
diff --git a/src/atomate2/common/flows/electrode.py b/src/atomate2/common/flows/electrode.py
@@ -120,6 +120,10 @@ def make(
             relax = self.bulk_relax_maker.make(structure)
         else:
             relax = self.relax_maker.make(structure)
+
+        _shown_steps = str(n_steps) if n_steps else "inf"
+        relax.append_name(f" 0/{_shown_steps}")
+
         # add ignored_species to the structure matcher
         sm = _add_ignored_species(self.structure_matcher, inserted_element)
         # Get the inserted structure
@@ -132,6 +136,7 @@ def make(
             get_charge_density=self.get_charge_density,
             n_steps=n_steps,
             insertions_per_step=insertions_per_step,
+            n_inserted=1,
         )
         relaxed_summary = RelaxJobSummary(
             structure=relax.output.structure,
diff --git a/src/atomate2/common/jobs/electrode.py b/src/atomate2/common/jobs/electrode.py
@@ -21,7 +21,7 @@
     from pymatgen.analysis.structure_matcher import StructureMatcher
     from pymatgen.core import Structure
     from pymatgen.entries.computed_entries import ComputedEntry
-    from pymatgen.io.vasp.outputs import VolumetricData
+    from pymatgen.io.common import VolumetricData
 
 
 logger = logging.getLogger(__name__)
@@ -84,17 +84,21 @@ def get_stable_inserted_results(
         The number of ions inserted so far, used to help assign a unique name to the
         different jobs.
     """
-    if structure is None:
-        return []
-    if n_steps is not None and n_steps <= 0:
+    if (
+        (structure is None)
+        or (n_steps is not None and n_steps <= 0)
+        or (n_steps is not None and n_inserted > n_steps)
+    ):
         return []
     # append job name
-    add_name = f"{n_inserted}"
+    _shown_steps = str(n_steps) if n_steps else "inf"
+    add_name = f"{n_inserted}/{_shown_steps}"
 
     static_job = static_maker.make(structure=structure)
-    chg_job = get_charge_density_job(static_job.output.dir_name, get_charge_density)
+    static_job.append_name(f" {n_inserted - 1}/{_shown_steps}")
     insertion_job = get_inserted_structures(
-        chg_job.output,
+        static_job.output.dir_name,
+        get_charge_density,
         inserted_species=inserted_element,
         insertions_per_step=insertions_per_step,
     )
@@ -107,7 +111,6 @@ def get_stable_inserted_results(
         ref_structure=structure,
         structure_matcher=structure_matcher,
     )
-    nn_step = n_steps - 1 if n_steps is not None else None
     next_step = get_stable_inserted_results(
         structure=min_en_job.output[0],
         inserted_element=inserted_element,
@@ -116,17 +119,14 @@ def get_stable_inserted_results(
         relax_maker=relax_maker,
         get_charge_density=get_charge_density,
         insertions_per_step=insertions_per_step,
-        n_steps=nn_step,
+        n_steps=n_steps,
         n_inserted=n_inserted + 1,
     )
 
-    for job_ in [static_job, chg_job, insertion_job, min_en_job, relax_jobs, next_step]:
-        job_.append_name(f" {add_name}")
     combine_job = get_computed_entries(next_step.output, min_en_job.output)
     replace_flow = Flow(
         jobs=[
             static_job,
-            chg_job,
             insertion_job,
             relax_jobs,
             min_en_job,
@@ -204,7 +204,8 @@ def get_insertion_electrode_doc(
 
 @job
 def get_inserted_structures(
-    chg: VolumetricData,
+    prev_dir: Path | str,
+    get_charge_density: Callable[[str | Path], VolumetricData],
     inserted_species: ElementLike,
     insertions_per_step: int = 4,
     charge_insertion_generator: ChargeInterstitialGenerator | None = None,
@@ -213,7 +214,8 @@ def get_inserted_structures(
 
     Parameters
     ----------
-    chg: The charge density.
+    prev_dir: The previous directory where the static calculation was performed.
+    get_charge_density: A function to get the charge density from a run directory.
     inserted_species: The species to insert.
     insertions_per_step: The maximum number of ion insertion sites to attempt.
     charge_insertion_generator: The charge insertion generator to use,
@@ -226,6 +228,7 @@ def get_inserted_structures(
     """
     if charge_insertion_generator is None:
         charge_insertion_generator = ChargeInterstitialGenerator()
+    chg = get_charge_density(prev_dir)
     gen = charge_insertion_generator.generate(chg, insert_species=[inserted_species])
     inserted_structures = [defect.defect_structure for defect in gen]
     return inserted_structures[:insertions_per_step]
@@ -297,22 +300,3 @@ def get_min_energy_summary(
         return None
 
     return min(topotactic_summaries, key=lambda x: x.entry.energy_per_atom)
-
-
-@job
-def get_charge_density_job(
-    prev_dir: Path | str,
-    get_charge_density: Callable,
-) -> VolumetricData:
-    """Get the charge density from a task document.
-
-    Parameters
-    ----------
-    prev_dir: The previous directory where the static calculation was performed.
-    get_charge_density: A function to get the charge density from a task document.
-
-    Returns
-    -------
-        The charge density.
-    """
-    return get_charge_density(prev_dir)
diff --git a/src/atomate2/openmm/jobs/base.py b/src/atomate2/openmm/jobs/base.py
@@ -317,12 +317,6 @@ def _add_reporters(
             if traj_file_type in ("h5md", "nc", "ncdf", "json"):
                 writer_kwargs["velocities"] = report_velocities
                 writer_kwargs["forces"] = False
-            elif report_velocities and traj_file_type != "trr":
-                raise ValueError(
-                    f"File type {traj_file_type} does not support velocities as"
-                    f"of MDAnalysis 2.7.0. Select another file type"
-                    f"or do not attempt to report velocities."
-                )
 
             traj_file = dir_name / f"{traj_file_name}.{traj_file_type}"
 
@@ -341,15 +335,23 @@ def _add_reporters(
             else:
                 if report_velocities:
                     # assert package version
-
-                    kwargs["writer_kwargs"] = writer_kwargs
                     warnings.warn(
                         "Reporting velocities is only supported with the"
                         "development version of MDAnalysis, >= 2.8.0, "
                         "proceed with caution.",
                         stacklevel=1,
                     )
-                traj_reporter = MDAReporter(**kwargs)
+
+                try:
+                    traj_reporter = MDAReporter(**kwargs, writer_kwargs=writer_kwargs)
+                except TypeError:
+                    warnings.warn(
+                        "The current version of `openmm-mdanalysis-reporter` "
+                        "does not support `writer_kwargs`. To use these features, "
+                        "pip install this package from the github source.",
+                        stacklevel=2,
+                    )
+                    traj_reporter = MDAReporter(**kwargs)
 
             sim.reporters.append(traj_reporter)
 
diff --git a/tests/abinit/conftest.py b/tests/abinit/conftest.py
@@ -130,9 +130,9 @@ def check_run_abi(ref_path: str | Path):
 
     user = AbinitInputFile.from_file("run.abi")
     assert user.ndtset == 1, f"'run.abi' has multiple datasets (ndtset={user.ndtset})."
-    with zopen(ref_path / "inputs" / "run.abi.gz") as file:
+    with zopen(ref_path / "inputs" / "run.abi.gz", "rt", encoding="utf-8") as file:
         ref_str = file.read()
-    ref = AbinitInputFile.from_string(ref_str.decode("utf-8"))
+    ref = AbinitInputFile.from_string(ref_str)
     # Ignore the pseudos as the directory depends on the pseudo root directory
     # diffs = user.get_differences(ref, ignore_vars=["pseudos"])
     diffs = _get_differences_tol(user, ref, ignore_vars=["pseudos"])
diff --git a/tests/openmm_md/flows/test_core.py b/tests/openmm_md/flows/test_core.py
@@ -3,6 +3,7 @@
 import io
 from pathlib import Path
 
+import pytest
 from emmet.core.openmm import OpenMMInterchange, OpenMMTaskDocument
 from jobflow import Flow
 from MDAnalysis import Universe
@@ -11,6 +12,11 @@
 from atomate2.openmm.flows.core import OpenMMFlowMaker
 from atomate2.openmm.jobs import EnergyMinimizationMaker, NPTMaker, NVTMaker
 
+try:
+    import h5py
+except ImportError:
+    h5py = None
+
 
 def test_anneal_maker(interchange, run_job):
     # Create an instance of AnnealMaker with custom parameters
@@ -54,6 +60,9 @@ def test_anneal_maker(interchange, run_job):
 
 
 # @pytest.mark.skip("Reporting to HDF5 is broken in MDA upstream.")
+@pytest.mark.skipif(
+    condition=h5py is None, reason="h5py is required for HDF5 features."
+)
 def test_hdf5_writing(interchange, run_job):
     # Create an instance of AnnealMaker with custom parameters
     import MDAnalysis
diff --git a/tests/openmm_md/jobs/test_base.py b/tests/openmm_md/jobs/test_base.py
@@ -33,7 +33,13 @@ def test_add_reporters(interchange, tmp_path):
     assert next_dcd[5] is True  # enforce periodic boundaries
     assert isinstance(sim.reporters[1], StateDataReporter)
     next_state = sim.reporters[1].describeNextReport(sim)
-    assert next_state[0] == 50  # steps until next report
+
+    # steps until next report
+    # TODO: make test more robust
+    if isinstance(next_state, dict):
+        assert next_state["steps"] == 50
+    else:
+        assert next_state[0] == 50
 
 
 def test_resolve_attr():
@@ -180,14 +186,12 @@ def do_nothing(self, sim, dir_name):
         report_velocities=True,
     )
 
-    with pytest.raises(RuntimeError):
-        run_job(maker1.make(interchange))
-        # run_job(base_job)
-
     import MDAnalysis
     from packaging.version import Version
 
     if Version(MDAnalysis.__version__) < Version("2.8.0"):
+        with pytest.raises(RuntimeError):
+            run_job(maker1.make(interchange))
         return
 
     maker2 = BaseOpenMMMaker(
diff --git a/tests/openmm_md/test_utils.py b/tests/openmm_md/test_utils.py
@@ -12,6 +12,10 @@
     increment_name,
 )
 
+"""
+TODO: Needs revision
+"""
+
 
 @pytest.mark.skip("annoying test")
 def test_download_xml(tmp_path: Path) -> None:
diff --git a/tests/vasp/flows/test_electrode.py b/tests/vasp/flows/test_electrode.py
