Releases · NVIDIA/physicsnemo

25 Aug 18:31

ktangsali

v1.2.0

17efe3e

v1.2.0 Latest

PhysicsNeMo General Release v1.2.0

Added

Diffusion Transformer (DiT) model. The DiT model can be accessed in
physicsnemo.experimental.models.dit.DiT. ⚠️ Warning: experimental feature
subject to future API changes.
Improved documentation for diffusion models and diffusion utils.
Safe API to override __init__'s arguments saved in checkpoint file with
Module.from_checkpoint("chkpt.mdlus", override_args=set(...)).
PyTorch Geometric MeshGraphNet backend.
Functionality in DoMINO to take an arbitrary number of scalar or vector
global parameters and encode them using the class ParameterModel.
TopoDiff model and example.
Added ability for DoMINO model to return volume neighbors.
Added functionality in DoMINO recipe to introduce physics residual losses.
Diffusion models, metrics, and utils: implementation of Student-t
distribution for EDM-based diffusion models (t-EDM). This feature is adapted
from the paper Heavy-Tailed Diffusion Models, Pandey et al.
This includes a new EDM preconditioner (tEDMPrecondSuperRes), a loss
function (tEDMResidualLoss), and a new option in corrdiff diffusion_step.
⚠️ This is an experimental feature that can be accessed through the
physicsnemo.experimental module; it may also be subject to API changes
without notice.
Bumped Ruff version from 0.0.290 to 0.12.5. Replaced Black with ruff-format.
DoMINO improvements with UNet attention module and user configs.
Hybrid MeshGraphNet for modeling structural deformation
Enabled TransformerEngine backend in the transolver model.
Inference code for x-meshgraphnet example for external aerodynamics.
Added a new example for external_aerodynamics: training transolver on
irregular mesh data for DrivAerML surface data.
Added a new example for external aerodynamics for finetuning pretrained models.
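
The t-EDM entry above replaces Gaussian noise with heavy-tailed Student-t noise. As a rough illustration of what that changes (this is not the physicsnemo implementation), the standard-library sketch below draws Student-t samples using the textbook construction of a standard normal divided by the square root of a scaled chi-square variable, then compares tail mass with a Gaussian. The names student_t_sample and tail_fraction, the sample count, and the choice nu = 3 are assumptions made for the example.

```python
import math
import random


def student_t_sample(nu: float, rng: random.Random) -> float:
    """Draw one Student-t sample with `nu` degrees of freedom.

    Uses the textbook construction t = z / sqrt(v / nu), where z is a
    standard normal and v is chi-square with nu degrees of freedom.
    """
    if nu <= 0:
        raise ValueError("nu must be positive")
    z = rng.gauss(0.0, 1.0)
    # chi-square(nu) is a Gamma distribution with shape nu/2 and scale 2.
    v = rng.gammavariate(nu / 2.0, 2.0)
    return z / math.sqrt(v / nu)


def tail_fraction(samples, threshold: float) -> float:
    """Fraction of samples whose magnitude exceeds `threshold`."""
    return sum(abs(s) > threshold for s in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
n = 20000
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
heavy = [student_t_sample(3.0, rng) for _ in range(n)]  # nu = 3 is illustrative

# Heavy tails show up as a larger share of samples far from zero.
print("P(|x| > 4), Gaussian :", tail_fraction(gaussian, 4.0))
print("P(|x| > 4), Student-t:", tail_fraction(heavy, 4.0))
```

Smaller values of nu give heavier tails, and the distribution approaches a Gaussian as nu grows.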

Changed

Diffusion utils: physicsnemo.utils.generative renamed into physicsnemo.utils.diffusion
Diffusion models: in CorrDiff model wrappers (EDMPrecondSuperResolution and
UNet), the arguments profile_mode and amp_mode cannot be overridden by
from_checkpoint. They are now properties that can be dynamically changed
after the model instantiation with, for example, model.amp_mode = True
and model.profile_mode = False.
Updated healpix data module to use correct DistributedSampler target for
test data loader
Existing DGL-based vortex shedding example has been renamed to vortex_shedding_mgn_dgl.
Added new vortex_shedding_mgn example that uses PyTorch Geometric instead.
HEALPixLayer can now use earth2grid HEALPix padding ops, if desired
Migrated Vortex Shedding Reduced Mesh example to PyTorch Geometric.
CorrDiff example: fixed bugs when training regression UNet.
Diffusion models: fixed bugs related to gradient checkpointing on non-square
images.
Diffusion models: created a separate class Attention for clarity and
modularity. Updated UNetBlock accordingly to use the Attention class
instead of custom attention logic. This will update the model architecture
for SongUNet-based diffusion models. Changes are not BC-breaking and are
transparent to the user.
⚠️ BC-breaking: refactored the automatic mixed precision
(AMP) API in layers and models defined in physicsnemo/models/diffusion/ for
improved usability. Note: it is now not only possible but required to
explicitly set model.amp_mode = True in order to use the model in a
torch.autocast clause. This applies to all SongUNet-based models.
Diffusion models: fixed and improved API to enable fp16 forward pass in
UNet and EDMPrecondSuperResolution model wrappers; fp16 forward pass can
now be toggled/untoggled by setting model.use_fp16 = True.
Diffusion models: improved API for Apex group norm. SongUNet-based models
will automatically perform conversion of the input tensors to
torch.channels_last memory format when model.use_apex_gn is True. New
warnings are raised when attempting to use Apex group norm on CPU.
Diffusion utils: systematic compilation of patching operations in stochastic_sampler
for improved performance.
CorrDiff example: added option for Student-t EDM (t-EDM) in train.py and
generate.py. When training a CorrDiff diffusion model, this feature can be
enabled with the hydra overrides ++training.hp.distribution=student_t and
++training.hp.nu_student_t=<nu_value>. For generation, this feature can be
enabled with similar overrides: ++generation.distribution=student_t and
++generation.nu_student_t=<nu_value>.
CorrDiff example: the parameters P_mean and P_std (used to compute the
noise level sigma) are now configurable. They can be set with the hydra
overrides ++training.hp.P_mean=<P_mean_value> and
++training.hp.P_std=<P_std_value> for training (and similar ones with
training.hp replaced by generation for generation).
Diffusion utils: patch-based inference and lead time support with
deterministic sampler.
Existing DGL-based XAeroNet example has been renamed to xaeronet_dgl.
Added new xaeronet example that uses PyTorch Geometric instead.
Updated the deforming plate example to use the Hybrid MeshGraphNet model.
⚠️ BC-breaking: Refactored the transolver model to improve
readability and performance, and extend to more use cases.
Diffusion models: improved lead time support for SongUNetPosLtEmbd and
EDMLoss. Lead-time embeddings can now be used with/without positional
embeddings.
Diffusion models: consolidate ApexGroupNorm and GroupNorm in
models/diffusion/layers.py with a factory get_group_norm that can
be used to instantiate either one of them. get_group_norm is now the
recommended way to instantiate a GroupNorm layer in SongUNet-based and
other diffusion models.
Physicsnemo models: improved checkpoint loading API in
Module.from_checkpoint that now exposes a strict parameter to raise error
on missing/unexpected keys, similar to that used in
torch.nn.Module.load_state_dict.
Migrated Hybrid MGN and deforming plate example to PyTorch Geometric.
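
The Hydra overrides quoted above for t-EDM and for P_mean and P_std can be assembled programmatically. The helper below is a hypothetical convenience function, not part of physicsnemo: it only builds the override strings named in these notes (training.hp.* for training, generation.* for generation). The function name, its arguments, and the numeric values are assumptions.

```python
from typing import List, Optional


def corrdiff_overrides(
    stage: str,
    nu_student_t: Optional[float] = None,
    p_mean: Optional[float] = None,
    p_std: Optional[float] = None,
) -> List[str]:
    """Build Hydra `++key=value` overrides for train.py or generate.py.

    Hypothetical helper: the key names are copied from the release notes;
    the function name and arguments are assumptions for illustration.
    """
    prefixes = {"training": "training.hp", "generation": "generation"}
    if stage not in prefixes:
        raise ValueError(f"stage must be one of {sorted(prefixes)}")
    prefix = prefixes[stage]

    overrides = []
    if nu_student_t is not None:
        # Enabling t-EDM sets both the distribution and its nu parameter.
        overrides.append(f"++{prefix}.distribution=student_t")
        overrides.append(f"++{prefix}.nu_student_t={nu_student_t}")
    if p_mean is not None:
        overrides.append(f"++{prefix}.P_mean={p_mean}")
    if p_std is not None:
        overrides.append(f"++{prefix}.P_std={p_std}")
    return overrides


# Placeholder values, chosen only to show the resulting command line.
train_args = corrdiff_overrides("training", nu_student_t=5, p_mean=0.0, p_std=1.2)
print("python train.py " + " ".join(train_args))
```

A real invocation would also carry whatever config selection the CorrDiff example expects; that part is omitted here.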

Fixed

Bug fixes in DoMINO model in sphere sampling and tensor reshaping
Bug fixes in DoMINO utils random sampling and test.py
Optimized DoMINO config params based on DrivAerML

16 Jun 19:59

ktangsali

v1.1.1

ccb2b89

v1.1.1

PhysicsNeMo (Core) General Release v1.1.1 (patch to v1.1.0)

Fixed

Fixed an inadvertent change to the deterministic sampler 2nd order correction

10 Jun 20:36

ktangsali

v1.1.0

7a798a3

v1.1.0

PhysicsNeMo (Core) General Release v1.1.0

Added

Added ReGen score-based data assimilation example
General purpose patching API for patch-based diffusion
New positional embedding selection strategy for CorrDiff SongUNet models
Added Multi-Storage Client to allow checkpointing to/from Object Storage

Changed

Simplified CorrDiff config files, updated default values
Refactored CorrDiff losses and samplers to use the patching API
Support for non-square images and patches in patch-based diffusion
ERA5 download example updated to use the current file format convention and
to restrict global statistics computation to the training set
Support for training custom StormCast models and various other improvements for StormCast
Updated CorrDiff training code to support multiple patch iterations to amortize
regression cost and usage of torch.compile
Refactored physicsnemo/models/diffusion/layers.py to optimize data type
casting workflow, avoiding unnecessary casting under autocast mode
Refactored Conv2d to enable fusion of conv2d with bias addition
Refactored GroupNorm, UNetBlock, SongUNet, SongUNetPosEmbd to support usage of
Apex GroupNorm, fusion of activation with GroupNorm, and AMP workflow.
Updated SongUNetPosEmbd to avoid unnecessary HtoD Memcpy of pos_embd
Updated from_checkpoint to accommodate conversion between Apex optimized ckp
and non-optimized ckp
Refactored CorrDiff NVTX annotation workflow to be configurable
Refactored ResidualLoss to support patch-accumulating training for
amortizing regression costs
Explicit handling of Warp device for ball query and sdf
Merged SongUNetPosLtEmb with SongUNetPosEmb, add support for batch>1
Add lead time embedding support for positional_embedding_selector. Enable
arbitrary positioning of probabilistic variables
Enable lead time aware regression without CE loss
Bumped minimum PyTorch version from 2.0.0 to 2.4.0, to minimize
support surface for physicsnemo.distributed functionality.
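
Several entries above concern patch-based diffusion on non-square images and patches. As a generic illustration of the bookkeeping involved (this is not the physicsnemo patching API), the sketch below computes the top-left corners of fixed-size patches that cover an image of arbitrary height and width, placing a final patch flush with the far edge when the patch size does not divide the axis evenly. All function names are hypothetical.

```python
from typing import List, Tuple


def patch_starts(size: int, patch: int) -> List[int]:
    """Start offsets of patches of length `patch` covering `size` pixels."""
    if patch <= 0 or patch > size:
        raise ValueError("patch must satisfy 0 < patch <= size")
    starts = list(range(0, size - patch + 1, patch))
    # If the patches do not divide the axis evenly, add one last patch
    # flush with the far edge; it overlaps its neighbor.
    if starts[-1] + patch < size:
        starts.append(size - patch)
    return starts


def patch_corners(
    image_hw: Tuple[int, int], patch_hw: Tuple[int, int]
) -> List[Tuple[int, int]]:
    """Top-left (row, col) corners of patches covering a possibly non-square image."""
    rows = patch_starts(image_hw[0], patch_hw[0])
    cols = patch_starts(image_hw[1], patch_hw[1])
    return [(r, c) for r in rows for c in cols]


# A 5 x 8 image covered by 2 x 3 patches: neither axis divides evenly.
corners = patch_corners(image_hw=(5, 8), patch_hw=(2, 3))
print(corners)
```

Height and width are handled independently, which is what lets the same routine serve square and non-square inputs.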

Dependencies

Made nvidia.dali an optional dependency

25 Mar 23:57

ktangsali

v1.0.1

51c931f

v1.0.1

PhysicsNeMo (Core) General Release v1.0.1

Added

Added version checks to ensure compatibility with older PyTorch for distributed utilities and ShardTensor
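
A version gate of the kind described above can be sketched with the standard library alone. This is an illustrative pattern, not the check physicsnemo ships: the function names are hypothetical and the 2.4.0 threshold is a placeholder. Only the leading numeric release is compared, so local build suffixes and pre-release tags are ignored.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '2.4.0+cu121' -> (2, 4, 0)."""
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """True when `installed` is at least `minimum` (numeric parts only)."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that '2.4' and '2.4.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# In real code the installed string would come from torch.__version__.
for installed in ("2.3.1", "2.4.0+cu121", "2.10.0"):
    print(installed, meets_minimum(installed, "2.4.0"))
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "2.10.0" sorts before "2.4.0".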

Fixed

EntryPoint error that occurred during physicsnemo checkpoint loading

18 Mar 20:24

ktangsali

v1.0.0

7e3d2a5

v1.0.0

PhysicsNeMo (Core) General Release v1.0.0

Added

DoMINO model architecture, datapipe and training recipe
Added matrix decomposition scheme to improve graph partitioning
DrivAerML dataset support in FIGConvNet example.
Retraining recipe for DoMINO from a pretrained model checkpoint
Prototype support for domain parallelism using ShardTensor (new).
Enable DeviceMesh initialization via DistributedManager.
Added Datacenter CFD use case.
Add leave-in profiling utilities to physicsnemo, to easily enable torch/python/nsight
profiling in all aspects of the codebase.

Changed

Refactored StormCast training example
Enhancements and bug fixes to DoMINO model and training example
Enhancement to parameterize DoMINO model with inlet velocity
Moved non-dimensionalization out of domino datapipe to datapipe in domino example
Updated utils in physicsnemo.launch.logging to avoid unnecessary wandb and mlflow
imports
Moved to experiment-based Hydra config in Lagrangian-MGN example
Make data caching optional in MeshDatapipe
The use of older importlib_metadata library is removed

Deprecated

ProcessGroupConfig is tagged for future deprecation in favor of DeviceMesh.

Fixed

Update pytests to skip when the required dependencies are not present
Bug in data processing script in domino training example
Fixed NCCL_ASYNC_ERROR_HANDLING deprecation warning

Dependencies

Remove the numpy dependency upper bound
Moved pytz and nvtx to optional
Update the base image for the Dockerfile
Introduce Multi-Storage Client (MSC) as an optional dependency.
Introduce wrapt as an optional dependency, needed when using
ShardTensor's automatic domain parallelism
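
Optional dependencies such as those listed above are typically guarded at the point of use. The sketch below shows one common standard-library pattern; the helper name and error message are hypothetical and do not reflect how physicsnemo performs the check.

```python
import importlib
import importlib.util


def require_optional(module_name: str, feature: str):
    """Import `module_name`, or raise an ImportError naming the feature that needs it."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for {feature}; "
            f"install it to enable this feature."
        )
    return importlib.import_module(module_name)


# A stdlib module stands in for an installed optional package.
json_mod = require_optional("json", "JSON handling")
print(json_mod.dumps({"ok": True}))

try:
    # Hypothetical package name that is not expected to exist.
    require_optional("not_a_real_package_xyz", "the demo feature")
except ImportError as err:
    print(err)
```

Deferring the import this way keeps the base install light while still giving a clear message when a feature is used without its extra package.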

27 Nov 19:04

ktangsali

v0.9.0

5bc7702

v0.9.0

Modulus (core) general release v0.9.0

Added

FIGConvUNet model and example.
The Transolver model.
The XAeroNet model.
Incorporated CorrDiff-GEFS-HRRR model into CorrDiff, with lead-time aware SongUNet and
cross entropy loss.

Changed

Refactored EDMPrecondSRV2 preconditioner and fixed the bug related to the metadata
Extended the checkpointing utility to store metadata.
Corrected missing export of logging function used by transolver model

24 Sep 17:10

NickGeneva

v0.8.0

e5844cc

v0.8.0

Modulus (core) general release v0.8.0

Added

Graph Transformer processor for GraphCast/GenCast.
Utility to generate STL from Signed Distance Field.
Metrics for CAE and CFD domain such as integrals, drag, and turbulence invariances and
spectrum.
Added gradient clipping to StaticCapture utilities.
Bistride Multiscale MeshGraphNet example.

Changed

Refactored CorrDiff training recipe for improved usability
Fixed timezone calculation in datapipe cosine zenith utility.

23 Jul 23:25

NickGeneva

v0.7.0

336cc94

v0.7.0

Modulus (core) general release v0.7.0

Added

Code logging for CorrDiff via Wandb.
Augmentation pipeline for CorrDiff.
Regression output as additional conditioning for CorrDiff.
Learnable positional embedding for CorrDiff.
Support for patch-based CorrDiff training and generation (stochastic sampling only)
Enable CorrDiff multi-gpu generation
Diffusion model for fluid data super-resolution (CMU contribution).
The Virtual Foundry GraphNet.
A synthetic dataloader for global weather prediction models, demonstrated on GraphCast.
Sorted Empirical CDF CRPS algorithm
Support for history, cos zenith, and downscaling/upscaling in the ERA5 HDF5 dataloader.
An example showing how to train a "tensor-parallel" version of GraphCast on a
Shallow-Water-Equation example.
3D UNet
AeroGraphNet example of training of MeshGraphNet on Ahmed body and DrivAerNet datasets.
Warp SDF routine
DLWP HEALPix model
Pangu Weather model
Fengwu model
SwinRNN model
Modulated AFNO model
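
The sorted empirical CDF CRPS entry above refers to scoring an ensemble against an observation with the continuous ranked probability score. One well-known formulation that exploits sorting is sketched below and checked against the direct pairwise definition. It is a generic illustration and may differ from the algorithm implemented in the library; the function names are hypothetical.

```python
from typing import Sequence


def crps_pairwise(ensemble: Sequence[float], obs: float) -> float:
    """CRPS from the definition E|X - y| - 0.5 * E|X - X'| (O(m^2))."""
    m = len(ensemble)
    term1 = sum(abs(x - obs) for x in ensemble) / m
    term2 = sum(abs(a - b) for a in ensemble for b in ensemble) / (2 * m * m)
    return term1 - term2


def crps_sorted(ensemble: Sequence[float], obs: float) -> float:
    """Same quantity in O(m log m) using the sorted members.

    For sorted x_1 <= ... <= x_m, sum_{i,j} |x_i - x_j| equals
    2 * sum_k (2k - m - 1) * x_k, with k counted from 1.
    """
    m = len(ensemble)
    xs = sorted(ensemble)
    term1 = sum(abs(x - obs) for x in xs) / m
    spread = sum((2 * k - m - 1) * x for k, x in enumerate(xs, start=1))
    return term1 - spread / (m * m)


members = [2.5, 0.1, 1.7, 3.2, 0.9]  # illustrative ensemble values
observation = 1.5
print(crps_pairwise(members, observation))
print(crps_sorted(members, observation))
```

Both functions return the same value up to floating-point rounding; the sorted form avoids the quadratic pairwise sum.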

Changed

Raise ModulusUndefinedGroupError when querying undefined process groups
Changed Indexing error in examples/cfd/swe_nonlinear_pino for modulus loss function
Safeguarding against uninitialized usage of DistributedManager

Removed

Remove mlflow from deployment image

Fixed

Fixed bug in the partitioning logic for distributing graph structures
intended for distributed message-passing.
Fixed bugs for corrdiff diffusion training of EDMv1 and EDMv2

Dependencies

Update DALI to CUDA 12 compatible version.
Update minimum python version to 3.10

17 Apr 22:45

NickGeneva

v0.6.0

eff54e6

v0.6.0

Modulus (core) general release v0.6.0

Added

Added citation file
Link to the CWA dataset
ClimateDatapipe: an improved datapipe for HDF5/NetCDF4 formatted climate data
Performance optimizations to CorrDiff
Physics-Informed Nonlinear Shallow Water Equations example
Warp neighbor search routine with a minimal example
Strict option for loading Modulus checkpoints
Regression only or diffusion only inference for CorrDiff
Support for organization level model files on NGC file system
Physics-Informed Magnetohydrodynamics example

Changed

Updated Ahmed Body and Vortex Shedding examples to use Hydra config
Added more config options to FCN AFNO example
Moved positional embedding in CorrDiff from the dataloader to network architecture

Deprecated

modulus.models.diffusion.preconditioning.EDMPrecondSR. Use EDMPrecondSRV2 instead

Removed

Pickle dependency for CorrDiff

Fixed

Consistent handling of single GPU runs in DistributedManager
Output location of objects downloaded with NGC file system
Bug in scaling the conditional input in CorrDiff deterministic sampler

Dependencies

Updated DGL build in Dockerfile
Updated default base image
Moved Onnx from optional to required dependencies
Optional Makani dependency required for SFNO model

26 Jan 01:13

NickGeneva

v0.5.0

24bee5c

v0.5.0

Modulus (core) general release v0.5.0

Added

Distributed process group configuration mechanism.
DistributedManager utility to instantiate process groups based on a process group config.
Helper functions to facilitate distributed training with shared parameters.
Brain anomaly detection example.
Updated Frechet Inception Distance to use Wasserstein 2-norm with improved stability.
Molecular Dynamics example.
Improved usage of GraphPartition, added more flexible ways of defining a partitioned graph.
Physics-Informed Stokes Flow example.

Changed

MLflow logging such that only proc 0 logs to MLflow.
FNO given separate methods for constructing lift and spectral encoder layers.

Removed

The experimental SFNO

Dependencies

Removed experimental SFNO dependencies
Added CorrDiff dependencies (cftime, einops, pyspng)
Made tqdm a required dependency
