Feature: Spin-polarized calculations for EXX PW (#6260)

Flying-dragon-boxing · zhangzh-pku · WHUweiqingzhou · web-flow · commit 80ada9e406b9 · 2025-06-02T22:28:59.000+08:00
* feat pexsi * fix : diag not completed * feat * feat: pexsi hsolver * CMake building implemented * Works * adapt to the new container * Turn off USE_PEXSI * Update LibRI to 553c91c * modify include files * namespace-ize * new inputs added * Configure Makefile Compiling, fix typos * Fix Makefile Intel toolchains compile errors * Fix even more PEXSI related Makefile compiling issues * Modify inputs and update to latest version (#2) * run INPUT.Default() in every process in InputParaTest (#3490) Co-authored-by: kirk0830 <67682086+kirk0830@users.noreply.github.com> * add blas support for FindLAPACK.cmake (#3497) * more unittest of QO: towards orbital selection (#3499) * Fix: fix bug in mulliken charge calculation (#3503) * fix phase * fix case test * Refactor: namespace Conv_Coulomb_Pot_K (#3446) * Refactor: namespace Conv_Coulomb_Pot_K * Refactor: namespace Conv_Coulomb_Pot_K --------- Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com> * enable the computation of all zeros in one function call (#3449) Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com> * replace ios.eof() by ios.good() to avoid meeting badbit and failbit in reading STRU (#3506) * Build: add ccache to accelerate the testing process (#3509) * Build: add ccache to accelerate the testing process * Update test.yml * Update test.yml * Update test.yml * Docs: to avoid the misunderstanding in docs (#3518) * to avoid the misunderstanding in docs * Update docs/quick_start/hands_on.md Co-authored-by: Chun Cai <amoycaic@gmail.com> --------- Co-authored-by: Chun Cai <amoycaic@gmail.com> * Docs: fix a missing depencency in conda build env (#3508) * Feature: Add ENABLE_RAPIDJSON option to control the output of abacus.json (#3519) Add ENABLE_RAPIDJSON option to control the output of abacus.json * Feature: add python wrapper for math sphbes (#3475) * recommit for review * add python wrapper * remove timer since performace tests add * Feature: support segment split in kline 
mode in KPT file and `out_band` band output precision control, `8` as default (#3493) * add precision control * correct serial version of nscf_band function * fix issue 3482 * update unit and integrated test * update document * correct unittest and make compatible with false and true * fix: bug in Autotest.sh when result.ref has no totaltimeref (#3523) * Fix : unit test of module_xc (#3524) * Fix: omit small magnetic moments to avoid numerical instability (#3530) * update deltalambda * avoid numerical error in orbMulP * add constrain on Mi * change case reference value * Fix: fix multiple compiler warnings (#3515) * Fix: add noreturn attribute to warning_quit * Add type conversion * fix string literal * fix small number trunctuation * Fix system call returned value not checked * fix missing braket * Refactor parameter_pool.cpp and parameter_pool.h * remove duplicated return statements * Change WARNING_QUIT occurances in tests * Add warning message to help debug UT * output the default precision flag (#3496) Co-authored-by: kirk0830 <67682086+kirk0830@users.noreply.github.com> * Build: Improving CMake performance for finding LibXC and ELPA (#3478) * Fix for finding LibXC and ELPA * For compatibility to previous routines * syntax fix for FindELPA.cmake * Update cmake/FindELPA.cmake Co-authored-by: Chun Cai <amoycaic@gmail.com> * Using CMake interface as default for finding LibXC * update docs * fix for FindLibxc: changing imcompatible if statement * fix for FindLibxc: changing imcompatible if statement * fix for FindLibxc: changing imcompatible if statement * update docs for installing pkg-config * Update FindLibxc.cmake * Update FindLibxc.cmake * remove previous LibXC routine in CMakeLists.txt Co-authored-by: Chun Cai <amoycaic@gmail.com> * Update easy_install.md with Makefile-built LibXC supported * Update easy_install.md to include different behavior in different version on finding ELPA --------- Co-authored-by: Chun Cai <amoycaic@gmail.com> * Docs: correct some 
docs about mp2 smearing method (#3533) * correct some docs about mp2 smearing method * add docs about mv method * Feature : printing band density (#3501) Co-authored-by: wenfei-li <liwenfei@gmail.com> Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com> * add some docs for PR#3501 (#3537) * Feature: enable restart charge density mixing during SCF (#3542) * add a new parameter mixing_restart * do not update rho if iter==mixing_restart * do not update rho if iter==mixing_restart-1 * reset mix and rho_mdata if iter==mixing_restart * fix SCF exit directly since drho=0 if iter=GlobalV::MIXING_RESTART * re-set_mixing in eachiterinit for PW and LCAO * enable SCF restarts in esolver_ks::RUN * add some UnitTests * add some Docs * new inputs added * Update input-main.md (#3551) Solve the format problem mentioned in issue 3543 * Build: fix compatibility issue against toolchain install (#3540) * Fix for finding LibXC and ELPA * For compatibility to previous routines * syntax fix for FindELPA.cmake * Update cmake/FindELPA.cmake Co-authored-by: Chun Cai <amoycaic@gmail.com> * Using CMake interface as default for finding LibXC * update docs * fix for FindLibxc: changing imcompatible if statement * fix for FindLibxc: changing imcompatible if statement * fix for FindLibxc: changing imcompatible if statement * update docs for installing pkg-config * Update FindLibxc.cmake * Update FindLibxc.cmake * remove previous LibXC routine in CMakeLists.txt Co-authored-by: Chun Cai <amoycaic@gmail.com> * Update easy_install.md with Makefile-built LibXC supported * Update easy_install.md to include different behavior in different version on finding ELPA * fix compatibility issue against toolchain * Change default ELPA install routine to old one --------- Co-authored-by: Chun Cai <amoycaic@gmail.com> * Test: Configure performance tests for math libraries (#3511) * add performace test of sphbes functions. 
* fix benchmark cmake errors * add dependencies for docker * update docs * add performance tests for sphbes * add google benchmark * rewrite benchmark tests in fixtures * disable internal testing in benchmark * merge benchmark into integration test --------- Co-authored-by: StarGrys <771582678@qq.com> * Configure Makefile Compiling, fix typos * Fix Makefile Intel toolchains compile errors * Fix even more PEXSI related Makefile compiling issues * Update hsolver_pw.cpp (#3556) when use_uspp==false, overlap matrix should be E. * Fix: cuda build target (#3276) * Fix: cuda buid target * Update CMakeLists.txt --------- Co-authored-by: Denghui Lu <denghuilu@pku.edu.cn> --------- Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com> Co-authored-by: kirk0830 <67682086+kirk0830@users.noreply.github.com> Co-authored-by: Haozhi Han <haozhi.han@outlook.com> Co-authored-by: Zhao Tianqi <hongriTianqi@users.noreply.github.com> Co-authored-by: PeizeLin <78645006+PeizeLin@users.noreply.github.com> Co-authored-by: jinzx10 <jzx016@hotmail.com> Co-authored-by: Chun Cai <amoycaic@gmail.com> Co-authored-by: Peng Xingliang <91927439+pxlxingliang@users.noreply.github.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Wenfei Li <38569667+wenfei-li@users.noreply.github.com> Co-authored-by: Denghui Lu <denghuilu@pku.edu.cn> Co-authored-by: YI Zeping <18586016708@163.com> Co-authored-by: wenfei-li <liwenfei@gmail.com> Co-authored-by: jingan-181 <78459531+jingan-181@users.noreply.github.com> Co-authored-by: StarGrys <771582678@qq.com> Co-authored-by: Haozhi Han <haozhi.han@stu.pku.edu.cn> * Revert "Modify inputs and update to latest version" * Update FindPEXSI.cmake to fix Comments * Fix CI errors * Fix CI Errors and Merge with Upstream * Resolve Pull Request Reviews * Fix parallel communication related issue * Fix vars in Makefile.vars, add input tests and comments for pexsi vars * Fix nspin > 1 cases * Improvement: take calculated 
mu as new initial guess, may slightly improve performance * Fix mistakes in the last commit * Fix: params and features - set default pexsi_temp - fix md in pexsi * fix empty lines * Fix: move params to pexsi_solver, rename USE_PEXSI to ENABLE_PEXSI * Tests: Modify Dockerfile and GitHub Workflows * Fix: wrong abacus link for dockerfile * Docs: added docs for pexsi inputs * Tests: three tests added for pexsi * Fix unit test issues in input_conv * Very good unit test, making my laptop fan spin * Change default pexsi_npole from 80 to 40 * Place pexsi_EDM in DensityMatrix, set size of pexsi_dm = 1 when GlobalV::NSPIN==4, and add comments for dmToRho * An unit test added for DiagoPexsi * modify for changed gint interface * correct nspin related behaviors * add efermi passthrough * Revert "add efermi passthrough" This reverts commit d7b402d. * commits to resolve conversations related to codes * DM and EDM pointers in pexsi now handled by diagopexsi, and copying h s matrices no longer needed * add pexsi examples * fix pexsi unit test (original version shouldn't run) * add building docs for pexsi * set cxx standard to c++14, which is required in make_unique * Fix: Fix typo related to pexsi * update to PPEXSIDFTDriver2 * default npoints to 1, so single core pexsi will work * Feature: exx operator for pw basis, single kpt * apply pexsi changes(?) * q-e style exx_div * Correct exxdiv * Fix Compile errors * refactor to abandon `pdiagh` * Fix mu_buffer and nspin * HSE examples * Feature: Multi-K exx * Feature: Multi-K exx * Updates with latest * Remove redundant global vars * Update to v3.9.0 * Update to v3.9.0, now code works * Remove Redundant cal_exx_energy in esolver_ks_pw.cpp * Some mess * Minor Fixes * Fix separate loop and screening * Add EXX stress * EXX Energy??? * Multi-K is broken??? * Fix: Multi-K and stress * Feature: ACE for single-K * Feature: ACE should work for multi-K, but not for sure * Feature: ACE works. Next step is ACE energy. 
* Fix: adapt to the latest instruction for variable `conv_esolver` * Reconstruct: move exx_helper to hamilt_pwdft * Fix: Now EXX PW doesn't depend on LibRI * Fix: Add input constraints for EXX PW * Fix: Remove redundant mpi barrier * Fix: Clean irrelevant files * Fix: Clean irrelevant files * Feature: add ace flag, exit on using gpu * Refactor: Phase 1 for refactoring exx energy * Feature: now ace calculates energy * Feature: enable exx energy * Fix: fix makefile compilation error * Fix: One minor fix for a segmentation fault * Tests: one integrate test for exx pw, only for verifying whether exx pw works * Revert "Tests: one integrate test for exx pw, only for verifying whether exx pw works" This reverts commit e7b606f. * Fix: EXX PW ACE open only when separate_loop is on * add timer * Feature: Double Grid method of EXX PW * Feature: Double Grid method of EXX PW Stress * Fix: Double Grid method of EXX PW Stress * Feature: add double grid variable * Feature: add double grid variable * Fis: HSE stress * Fix: HSE Stress * Fix: Timer * Fix: Timer * For non mp sampling, disable extrapolation * Modify test * Modify mp * Format * Format * Feature: nspin == 2 scf * Fix: nspin == 2 scf * Docs: EXX PW Docs * Feature: EXX PW for nspin=2 --------- Co-authored-by: zhangzhihao <1900017707@pku.edu.cn> Co-authored-by: zhangzh-pku <64026312+zhangzh-pku@users.noreply.github.com> Co-authored-by: wqzhou <33364058+WHUweiqingzhou@users.noreply.github.com> Co-authored-by: kirk0830 <67682086+kirk0830@users.noreply.github.com> Co-authored-by: Haozhi Han <haozhi.han@outlook.com> Co-authored-by: Zhao Tianqi <hongriTianqi@users.noreply.github.com> Co-authored-by: PeizeLin <78645006+PeizeLin@users.noreply.github.com> Co-authored-by: jinzx10 <jzx016@hotmail.com> Co-authored-by: Chun Cai <amoycaic@gmail.com> Co-authored-by: Peng Xingliang <91927439+pxlxingliang@users.noreply.github.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Wenfei Li 
<38569667+wenfei-li@users.noreply.github.com> Co-authored-by: Denghui Lu <denghuilu@pku.edu.cn> Co-authored-by: YI Zeping <18586016708@163.com> Co-authored-by: wenfei-li <liwenfei@gmail.com> Co-authored-by: jingan-181 <78459531+jingan-181@users.noreply.github.com> Co-authored-by: StarGrys <771582678@qq.com> Co-authored-by: Haozhi Han <haozhi.han@stu.pku.edu.cn> Co-authored-by: Mohan Chen <mohan.chen.chen.mohan@gmail.com>
diff --git a/source/module_hamilt_pw/hamilt_pwdft/operator_pw/op_exx_pw.cpp b/source/module_hamilt_pw/hamilt_pwdft/operator_pw/op_exx_pw.cpp
@@ -80,8 +80,9 @@ OperatorEXXPW<T, Device>::OperatorEXXPW(const int* isk_in,
     // allocate h_psi recip space memory
     resmem_complex_op()(h_psi_recip, wfcpw->npwk_max);
     // resmem_complex_op()(this->ctx, psi_all_real, wfcpw->nrxx * GlobalV::NBANDS);
+
     int nks = wfcpw->nks;
-//    std::cout << "nks: " << nks << std::endl;
+    int nk_fac = PARAM.inp.nspin == 2 ? 2 : 1;
     resmem_real_op()(pot, rhopw->npw * nks * nks);
 
     tpiba = ucell->tpiba;
@@ -168,6 +169,12 @@ void OperatorEXXPW<T, Device>::act_op(const int nbands,
                                    const int ngk_ik,
                                    const bool is_first_node) const
 {
+//    std::cout << "nbands: " << nbands
+//              << " nbasis: " << nbasis
+//              << " npol: " << npol
+//              << " ngk_ik: " << ngk_ik
+//              << " is_first_node: " << is_first_node
+//              << std::endl;
     if (!potential_got)
     {
         get_potential();
@@ -199,6 +206,7 @@ void OperatorEXXPW<T, Device>::act_op(const int nbands,
         Real nqs = q_points.size();
         for (int iq: q_points)
         {
+//            std::cout << "ik" << this->ik << " iq" << iq << std::endl;
             for (int m_iband = 0; m_iband < psi.get_nbands(); m_iband++)
             {
                 // double wg_mqb_real = GlobalC::exx_helper.wg(iq, m_iband);
@@ -518,7 +526,23 @@ std::vector<int> OperatorEXXPW<T, Device>::get_q_points(const int ik) const
     {
         for (int iq = 0; iq < wfcpw->nks; iq++)
         {
-            q_points_ik.push_back(iq);
+            if (PARAM.inp.nspin ==1 )
+            {
+                q_points_ik.push_back(iq);
+            }
+            else if (PARAM.inp.nspin == 2)
+            {
+                int nk_fac = 2;
+                int nk = wfcpw->nks / nk_fac;
+                if (iq / nk == ik / nk)
+                {
+                    q_points_ik.push_back(iq);
+                }
+            }
+            else
+            {
+                ModuleBase::WARNING_QUIT("OperatorEXXPW", "nspin == 4 not supported");
+            }
         }
     }
     // else
@@ -539,6 +563,8 @@ void OperatorEXXPW<T, Device>::multiply_potential(T *density_recip, int ik, int
     ModuleBase::timer::tick("OperatorEXXPW", "multiply_potential");
     int npw = rhopw->npw;
     int nks = wfcpw->nks;
+    int nk_fac = PARAM.inp.nspin == 2 ? 2 : 1;
+    int nk = nks / nk_fac;
 
     #ifdef _OPENMP
     #pragma omp parallel for schedule(static)
@@ -635,6 +661,8 @@ void OperatorEXXPW<T, Device>::get_potential() const
                     }
                 }
 
+                const int nk_fac = PARAM.inp.nspin == 2 ? 2 : 1;
+                const int nk = nks / nk_fac;
                 const int ig_kq = ik * nks * npw + iq * npw + ig;
 
                 Real gg = (k_c - q_c + rhopw->gcar[ig]).norm2() * tpiba2;
@@ -689,6 +717,8 @@ void OperatorEXXPW<T, Device>::exx_divergence()
     Real nqs_half2 = 0.5 * kv->nmp[1];
     Real nqs_half3 = 0.5 * kv->nmp[2];
 
+    int nk_fac = PARAM.inp.nspin == 2 ? 2 : 1;
+
     // here we follow the exx_divergence subroutine in q-e (PW/src/exx_base.f90)
     double alpha = 10.0 / wfcpw->gk_ecut;
     double tpiba2 = tpiba * tpiba;
@@ -766,6 +796,7 @@ void OperatorEXXPW<T, Device>::exx_divergence()
     }
 
     div *= ModuleBase::e2 * ModuleBase::FOUR_PI / tpiba2 / wfcpw->nks;
+//    std::cout << "div: " << div << std::endl;
 
     // numerically value the mean value of F(q) in the reciprocal space
     // This means we need to calculate the average of F(q) in the first brillouin zone
@@ -793,8 +824,9 @@ void OperatorEXXPW<T, Device>::exx_divergence()
     //    printf("ucell: %p\n", ucell);
     double omega = ucell->omega;
     div -= ModuleBase::e2 * omega * aa;
-    exx_div = div * wfcpw->nks;
-    // std::cout << "EXX divergence: " << exx_div << std::endl;
+    exx_div = div * wfcpw->nks / nk_fac;
+//    exx_div = 0;
+//    std::cout << "EXX divergence: " << exx_div << std::endl;
 
     return;
 }
@@ -862,8 +894,7 @@ double OperatorEXXPW<T, Device>::cal_exx_energy_op(psi::Psi<T, Device> *ppsi_) c
     T* density_recip = new T[rhopw->npw];
 
     if (wg == nullptr) return 0.0;
-    // evaluate the Eexx
-    // T Eexx_ik = 0.0;
+    const int nk_fac = PARAM.inp.nspin == 2 ? 2 : 1;
     double Eexx_ik_real = 0.0;
     for (int ik = 0; ik < wfcpw->nks; ik++)
     {
@@ -884,8 +915,6 @@ double OperatorEXXPW<T, Device>::cal_exx_energy_op(psi::Psi<T, Device> *ppsi_) c
                 continue;
             }
 
-            //            std::cout << "ik = " << ik << " nb = " << n_iband << " wg_ikb = " << wg_ikb_real << std::endl;
-
             // const T *psi_nk = get_pw(n_iband, ik);
             psi.fix_kb(ik, n_iband);
             const T* psi_nk = psi.get_pointer();
@@ -901,7 +930,6 @@ double OperatorEXXPW<T, Device>::cal_exx_energy_op(psi::Psi<T, Device> *ppsi_) c
             }
             double nqs = q_points.size();
 
-            //            std::cout << "ik = " << ik << " ib = " << n_iband << " wg_kb = " << wg_ikb_real << " wk_ik = " << kv->wk[ik] << std::endl;
             for (int iq: q_points)
             {
                 for (int m_iband = 0; m_iband < psi.get_nbands(); m_iband++)
@@ -914,8 +942,6 @@ double OperatorEXXPW<T, Device>::cal_exx_energy_op(psi::Psi<T, Device> *ppsi_) c
                         continue;
                     }
 
-                    //                    std::cout << "iq = " << iq << " mb = " << m_iband << " wg_iqb = " << wg_iqb_real << std::endl;
-
                     psi_.fix_kb(iq, m_iband);
                     const T* psi_mq = psi_.get_pointer();
                     // const T* psi_mq = get_pw(m_iband, iq);
@@ -945,6 +971,7 @@ double OperatorEXXPW<T, Device>::cal_exx_energy_op(psi::Psi<T, Device> *ppsi_) c
                     {
                         int nks = wfcpw->nks;
                         int npw = rhopw->npw;
+                        int nk = nks / nk_fac;
                         Real Fac = pot[ik * nks * npw + iq * npw + ig];
                         Eexx_ik_real += Fac * (density_recip[ig] * std::conj(density_recip[ig])).real()
                                         * wg_iqb_real / nqs * wg_ikb_real / kv->wk[ik];
diff --git a/source/module_io/input_conv.cpp b/source/module_io/input_conv.cpp
@@ -419,9 +419,9 @@ void Input_Conv::Convert()
             ModuleSymmetry::Symmetry::symm_flag = -1;
         }
 
-        if (PARAM.inp.nspin != 1)
+        if (PARAM.inp.nspin != 1 && PARAM.inp.nspin != 2)
         {
-            ModuleBase::WARNING_QUIT("Input_Conv", "EXX PW works only with nspin=1");
+            ModuleBase::WARNING_QUIT("Input_Conv", "EXX PW works only with nspin=1 and 2");
         }
 
         if (PARAM.inp.device != "cpu")
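
Note on the `get_q_points` hunk: the test `iq / nk == ik / nk` only pairs a k-point with q-points from the same half of the k-point list. The sketch below reproduces that selection in isolation. It assumes that for `nspin == 2` the list stores the two spin channels as two contiguous halves of equal length, which is what the integer division implies but which the diff does not state explicitly. The function name `same_spin_q_points` and its plain `int` arguments are hypothetical and not part of ABACUS.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-alone helper (not an ABACUS function) mirroring the
// q-point selection added to OperatorEXXPW::get_q_points.
//   ik    : index of the current k-point, 0 <= ik < nks
//   nks   : total number of k-point entries (both spin channels if nspin == 2)
//   nspin : 1 or 2; anything else is rejected, as in the diff
// Assumption: for nspin == 2 the list is two contiguous halves, one per spin.
std::vector<int> same_spin_q_points(int ik, int nks, int nspin)
{
    if (nspin != 1 && nspin != 2)
    {
        throw std::invalid_argument("only nspin = 1 or 2 is handled");
    }
    const int nk_fac = (nspin == 2) ? 2 : 1;
    if (nks <= 0 || nks % nk_fac != 0 || ik < 0 || ik >= nks)
    {
        throw std::invalid_argument("inconsistent ik / nks / nspin");
    }
    const int nk = nks / nk_fac; // k-points per spin channel
    assert(nk > 0);

    std::vector<int> q_points;
    for (int iq = 0; iq < nks; ++iq)
    {
        // Same block index means same spin channel; for nspin == 1 there is
        // a single block, so every q-point is kept.
        if (iq / nk == ik / nk)
        {
            q_points.push_back(iq);
        }
    }
    return q_points;
}
```

With `nks = 8` and `nspin = 2`, `ik = 5` falls in the second half, so only q indices 4 through 7 are selected. With `nspin = 1` every index from 0 to `nks - 1` is returned, matching the pre-change behaviour.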
