To make Grad DFT more useful to more people, there are several features we should add to the code:

(1) Nuclear gradients: Since we don't calculate all of the Fock matrix terms from scratch in JAX, nuclear gradients such as forces and stresses are not available via autodiff. We can either implement these in a non-autodiff way or implement the calculation of all Fock matrix terms in JAX (see the sketch after this list). Nuclear gradients could prove very useful in training functionals.

(2) Greatly expand the number of XC energy densities available in Grad DFT: JaxXC could be the answer here. If not, to make functionals useful for solids, we should at least support neural PBEsol- and SCAN-type functionals.

(3) Exporting models: Grad DFT is for training neural DFT functionals; it is not designed for large-scale simulations. As such, we should be able to export trained models for use in popular high-performance DFT codes. This may end up being a standalone package separate from Grad DFT itself, but it is a necessary step for neural functionals to be used in production science.
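A minimal JAX sketch of the autodiff route for point (1), assuming the total energy could be computed end-to-end in JAX as a function of nuclear positions; `total_energy` below is a hypothetical placeholder, not Grad DFT's API:

```python
import jax
import jax.numpy as jnp

def total_energy(positions: jnp.ndarray) -> jnp.ndarray:
    # Hypothetical stand-in: a real implementation would rebuild every
    # position-dependent Fock matrix term (overlap, kinetic, ERIs, ...)
    # in JAX and run the SCF loop, so the whole energy is traceable.
    return jnp.sum(positions ** 2)  # dummy placeholder energy

# Toy geometry: two nuclei, Cartesian coordinates in bohr.
positions = jnp.array([[0.0, 0.0, 0.0],
                       [0.0, 0.0, 1.4]])

# Once the energy is differentiable w.r.t. positions, forces come for
# free as the negative gradient; stresses would follow similarly from
# differentiating w.r.t. a cell strain.
forces = -jax.grad(total_energy)(positions)
print(forces.shape)  # (2, 3): one force vector per nucleus
```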
To fully realize the capability of Grad DFT to push the frontiers of accuracy for XC functionals, we must address some performance bottlenecks that prevent us from training models at scale on modern HPC platforms. Our goal in this milestone is to have Grad DFT efficiently compute a batched loss function with a reasonable batch size (16-64 structures) in parallel on a supercomputer. To do this, we will need:

(1) Single-program, multiple-data (SPMD) parallelism: This is implemented in JAX with sharding (see the sketch after this list). Depending on HPC system specs, we should aim to run DFT calculations with 1-4 structures per node and scale to 10-100 nodes.

(2) Better handling of ERIs, or bypassing the need for them completely: Holding on to ERIs loaded in from PySCF uses a large amount of memory. For periodic systems, we can use FFTs to obtain the Hartree potential efficiently, so we will no longer need ERIs.

(3) Using symmetry: Many computations, for molecules and solids alike, can be sped up using point-group and space-group symmetries. We should implement this where convenient.
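As a rough illustration of point (1), here is a minimal JAX sharding sketch of a data-parallel batched loss; `per_structure_loss` and the array shapes are hypothetical stand-ins, not Grad DFT's actual loss or data layout:

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

def per_structure_loss(params, structure):
    # Hypothetical stand-in: in practice this would run a DFT calculation
    # for one structure and compare against reference data.
    return (structure @ params) ** 2

def batched_loss(params, batch):
    # vmap over the leading (batch) axis; under jit with a sharded input,
    # each device computes only the structures in its own shard.
    losses = jax.vmap(per_structure_loss, in_axes=(None, 0))(params, batch)
    return jnp.mean(losses)

# One mesh axis over all available devices; the batch axis is sharded
# across it while the functional parameters are replicated everywhere.
mesh = Mesh(np.asarray(jax.devices()), axis_names=("batch",))
batch_sharding = NamedSharding(mesh, P("batch"))
replicated = NamedSharding(mesh, P())

# Batch size (16 here) should be divisible by the number of devices.
batch = jax.device_put(jnp.ones((16, 128)), batch_sharding)
params = jax.device_put(jnp.ones((128,)), replicated)

print(jax.jit(batched_loss)(params, batch))
```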
Presently, Grad DFT can be run in SCF mode using either linear mixing or DIIS. The former works for Molecules and Solids, while the latter works only for Molecules at present. There are limitations, however:

(1) SCF runs for a fixed number of iterations, which may or may not converge the SCF cycle.

(2) In the differentiable SCF loops, occupations are only calculated in a "down and up" procedure. This is fine for insulators, but we need Gaussian, Fermi-Dirac, and perhaps even Marzari "cold" smearing (see the sketch after this list). These occupation-setting methods should work for all types of SCF procedure.

(3) We should reconsider the structure of how SCF calculations are run so that things work in a more modular way. Presently, SCF is exposed as free-floating functions at module-level scope.
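To illustrate point (2), a minimal sketch of differentiable Fermi-Dirac occupations, assuming Kohn-Sham eigenvalues and a known chemical potential; the function name and signature are assumptions, not Grad DFT's API:

```python
import jax
import jax.numpy as jnp

def fermi_dirac_occupations(eigenvalues, mu, kT):
    # f(e) = 1 / (1 + exp((e - mu) / kT)), written via sigmoid for
    # numerical stability; smooth in [0, 1], hence autodiff-friendly.
    return jax.nn.sigmoid(-(eigenvalues - mu) / kT)

eigs = jnp.array([-0.5, -0.3, -0.1, 0.2])   # toy orbital energies (Ha)
occs = fermi_dirac_occupations(eigs, mu=-0.2, kT=0.01)
print(occs)  # ~[1, 1, 0, 0]: occupations smeared around mu

# In practice mu must be solved for (e.g. by bisection) so that the
# occupations sum to the electron count; Gaussian and Marzari "cold"
# smearing follow the same pattern with different smearing functions.
```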