Data-driven learning of the generalized Langevin equation with state-dependent memory

Abstract

We present a data-driven method to learn stochastic reduced models of complex systems that retain a state-dependent memory beyond the standard generalized Langevin equation (GLE) with a homogeneous kernel. The constructed model naturally encodes the heterogeneous energy dissipation by jointly learning a set of state features and the non-Markovian coupling among the features. Numerical results demonstrate the limitation of the standard GLE and the essential role of the broadly overlooked state dependence of the memory in predicting molecular kinetics related to conformation relaxation and transition.

Numerical Example 1

System

Consider a polymer molecule consisting of 16 atoms. The resolved variable is defined as the end-to-end distance (see paper).

Codes

The example is given in the folder 'case_unimodal', and 'main.m' shows how to run the scripts in order.

Compute the probability distribution function by 'step1_PDF.m', which provides free energy and the conservative force ('data/PDF.mat').
Compute the single-feature (1D) $h(q)$ by 'step2_hx.m'. This step considers the state dependence only at $t=0$ ('data/PDF.mat').

$$m\dot{v}_t=F(q_t)-h(q_t) \int_0^t \theta(t-\tau)h(q_\tau) v_\tau d\tau+h(q_t) R_t$$

$$h(q)= \frac{\langle \dot{v}_0-f(q_0),\dot{v}_0 |q_0=q \rangle}{\langle v_0,v_0 \rangle}.$$
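The conditional average in the expression for $h(q)$ can be estimated from trajectory samples by binning on $q_0$. The following is a minimal sketch, not the contents of 'step2_hx.m'. It assumes that $\langle A,B\,|\,q_0=q\rangle$ denotes the conditional average of the product $AB$, which is an interpretation of the notation rather than something stated here. The function name `estimate_h` and the array names `q0`, `v0`, `a0` (samples of $\dot{v}_0$) and `f0` (samples of $f(q_0)$) are hypothetical.

```python
import numpy as np

def estimate_h(q0, v0, a0, f0, edges):
    """Binned estimate of <(a0 - f0) * a0 | q0 = q> / <v0 * v0>.

    Assumption: <A, B | q0 = q> is read as the conditional mean of A * B.
    """
    q0, v0, a0, f0 = (np.asarray(x, dtype=float) for x in (q0, v0, a0, f0))
    w = (a0 - f0) * a0                                   # product inside the average
    sums, _ = np.histogram(q0, bins=edges, weights=w)    # per-bin sum of the product
    counts, _ = np.histogram(q0, bins=edges)             # per-bin sample count
    with np.errstate(invalid="ignore", divide="ignore"):
        cond_mean = np.where(counts > 0, sums / counts, np.nan)  # NaN for empty bins
    return cond_mean / np.mean(v0 * v0)

# Tiny synthetic check (made-up numbers, not data from the repository).
q0 = np.array([0.1, 0.2, 1.1, 1.2])
v0 = np.array([1.0, -1.0, 1.0, -1.0])
a0 = np.array([1.0, 1.0, 2.0, 2.0])
f0 = np.zeros(4)
h_demo = estimate_h(q0, v0, a0, f0, edges=np.array([0.0, 1.0, 2.0]))
```

Empty bins are returned as NaN so that poorly sampled regions of $q$ are visible rather than silently set to zero.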

Compute the two-point correlation functions by 'step3_corr.m' and 'step4_hx_corr.m' to construct the 1D kernel ('data/corr.mat' and 'data/hx_corr.mat').

$$\langle \frac{m\dot{v}_t-F(q_t)}{h(q_t)},v_0 \rangle= \int_0^t \theta(t-\tau) \langle h(q_\tau) v_\tau, v_0 \rangle d\tau$$
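Once both correlation functions are tabulated on a uniform time grid, the relation above is a Volterra equation of the first kind for $\theta$. The sketch below shows one simple way to invert it, assuming a left-rectangle quadrature and forward substitution; the scripts in the repository may use a different scheme. Here `g` stands for the left-hand side and `c` for $\langle h(q_t) v_t, v_0 \rangle$; these names and `solve_kernel` are hypothetical.

```python
import numpy as np

def solve_kernel(g, c, dt):
    """Recover theta from g[k] = dt * sum_{j=0}^{k-1} theta[j] * c[k-j], k >= 1.

    Returns theta[0 .. n-2]. Requires c[1] != 0.
    """
    g = np.asarray(g, dtype=float)
    c = np.asarray(c, dtype=float)
    n = len(g)
    theta = np.zeros(n - 1)
    for k in range(1, n):
        # Contribution of the already-known theta[0 .. k-2], paired with c[k .. 2].
        known = np.dot(theta[:k - 1], c[k:1:-1])
        theta[k - 1] = (g[k] / dt - known) / c[1]
    return theta

# Self-consistency check on synthetic data (not data from the repository).
dt = 0.1
t = dt * np.arange(50)
theta_true = np.exp(-2.0 * t)
c = np.exp(-t)
g = np.zeros_like(t)
for k in range(1, len(t)):
    g[k] = dt * np.dot(theta_true[:k], c[k:0:-1])
theta_rec = solve_kernel(g, c, dt)
```

First-kind Volterra inversions amplify statistical noise in the correlation data, so in practice some smoothing or a parametrized kernel is often preferable to this direct substitution.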

Compute the three-point correlation functions for the N-feature (ND) state-dependent kernel by 'step3_training_set.m' and 'step4_collect_training_set.m' ('data/dx_10_w_501.mat').
Train the model with 'train.py' ('MD_ND_2.mat').
Simulate the standard GLE model and state-dependent GLE model by 'step5_std_GLE.m', 'step5_hx_GLE_1D.m' and 'step5_hx_GLE_2D.m' (mat files in 'GLE_data').
Compute correlation functions of all the reduced models by 'step6_GLE_corr.m' ('corr_GLE.mat', 'corr_hx_GLE_1D.mat', 'corr_hx_GLE_2D.mat').
The visualization code is at the end of 'main.m'.

Results

The two figures show the probability distribution and free energy without an energy barrier.
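The free energy and the conservative force follow from the probability distribution by Boltzmann inversion, $U(q) = -k_B T \ln \rho(q)$ and $F(q) = -dU/dq$. The sketch below illustrates this relation on a grid; it is not the contents of 'step1_PDF.m', and the function name `free_energy_and_force` and the choice `kT=1.0` are assumptions made for illustration.

```python
import numpy as np

def free_energy_and_force(q_grid, rho, kT=1.0):
    """Boltzmann inversion of a strictly positive density on a grid."""
    q_grid = np.asarray(q_grid, dtype=float)
    rho = np.asarray(rho, dtype=float)
    U = -kT * np.log(rho)        # free energy up to an additive constant
    U = U - U.min()              # shift so that the minimum is zero
    F = -np.gradient(U, q_grid)  # conservative force by finite differences
    return U, F

# Synthetic check: a Gaussian density gives a harmonic free energy.
q_grid = np.linspace(-2.0, 2.0, 81)
rho = np.exp(-0.5 * q_grid**2)
rho /= np.sum(rho) * (q_grid[1] - q_grid[0])   # normalise on the grid
U, F = free_energy_and_force(q_grid, rho)
```

Bins with zero counts must be excluded or regularised before taking the logarithm, since the inversion is only defined where $\rho(q) > 0$.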

The following two figures show the velocity correlation $\langle v(t),v(0) \rangle$ and the state-dependent velocity correlation $\langle v(t),v(0) |q(0)=q^* \rangle$. 'MD' represents the full model, 'GLE' represents the standard GLE, 'SD-GLE-1D' represents our model with the 1D $h(q)$ formulation described above, and 'SD-GLE-2D' represents our model with the 2D $h(q)$ formulation in the paper.
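Both correlation functions can be estimated from a single stored trajectory by averaging over time origins. The sketch below is an illustration under stated assumptions, not the contents of 'step6_GLE_corr.m': the condition $q(0)=q^*$ is approximated by a window of half-width `half_width` around `q_star`, and all function and variable names are hypothetical.

```python
import numpy as np

def vacf(v, max_lag):
    """<v(t) v(0)> averaged over all time origins, for lags 0..max_lag."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    return np.array([np.mean(v[lag:] * v[:n - lag]) for lag in range(max_lag + 1)])

def conditional_vacf(q, v, q_star, half_width, max_lag):
    """<v(t) v(0) | q(0) near q_star>, using only origins inside the window."""
    q = np.asarray(q, dtype=float)
    v = np.asarray(v, dtype=float)
    n = len(v)
    out = np.full(max_lag + 1, np.nan)
    for lag in range(max_lag + 1):
        origins = np.flatnonzero(np.abs(q[:n - lag] - q_star) < half_width)
        if origins.size:
            out[lag] = np.mean(v[origins + lag] * v[origins])
    return out

# Tiny synthetic check (made-up numbers).
v_demo = np.array([1.0, 2.0, 3.0, 4.0])
q_demo = np.array([0.0, 1.0, 0.0, 1.0])
c_all = vacf(v_demo, max_lag=1)
c_cond = conditional_vacf(q_demo, v_demo, q_star=0.0, half_width=0.5, max_lag=1)
```

The window width trades bias against variance: a narrow window follows the condition more closely but leaves fewer time origins to average over.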

The following figure shows the distribution of the length of time for which the molecule stays in a certain conformation state ($q>15$).
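Such durations can be extracted from a time series of $q$ by locating the consecutive runs of samples that satisfy the condition. The following is a minimal sketch with hypothetical names (`residence_times`, `threshold`, `dt`); runs that touch either end of the trajectory are kept here, although they are truncated and could reasonably be discarded.

```python
import numpy as np

def residence_times(q, threshold, dt=1.0):
    """Durations of consecutive runs with q > threshold."""
    inside = (np.asarray(q) > threshold).astype(int)
    padded = np.concatenate(([0], inside, [0]))   # pad so every run has both edges
    jumps = np.diff(padded)
    starts = np.flatnonzero(jumps == 1)           # index where a run begins
    ends = np.flatnonzero(jumps == -1)            # index one past where it ends
    return (ends - starts) * dt

# Tiny synthetic check (made-up numbers).
q_demo_rt = np.array([10, 16, 17, 12, 16, 14, 18, 18, 18])
durations = residence_times(q_demo_rt, threshold=15)
```

A histogram of the returned durations, for example with `np.histogram`, then gives the distribution.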

Numerical Example 2 (case_bimodal)

System

Consider the molecule benzyl bromide in an aqueous environment. The full system consists of one benzyl bromide molecule and 2400 water molecules, with periodic boundary conditions imposed along each direction. The resolved variable is defined as the distance between the bromine atom and the ipso-carbon atom (see arxiv).

Codes

The example is given in the folder 'case_bimodal', and 'main.m' shows how to run the scripts in order. The parameters here are smaller than the ones used in the paper: the number of basis functions for $h(q)$ is 66 in the paper but 8 here, and the number of three-point correlation functions is 65 in the paper but 26 here.

Compute the probability distribution function by 'step1_PDF.m', which provides free energy and the conservative force ('data/PDF.mat').
Compute the two-point correlation functions by 'step2_std_corr.m' to construct the 1D kernel ('data/corr.mat').
Compute the three-point correlation functions for the ND state-dependent kernel by 'step3_training_set.m' and 'step4_collect_training_set.m' ('data/dx_0.2_w_301.mat').
Train the model with 'train.py' ('MD_ND_4.mat' and 'MD_ND_4_std.mat' are the models used in the paper; 'MD_ND_4_lite.mat' and 'MD_ND_4_std_lite.mat' are the corresponding lite versions, provided because of the size limitation).
Simulate the standard GLE model and the state-dependent GLE model by 'step5_std_GLE.m' and 'step5_hx_GLE.m'. 'step5_hx_GLE_fast_conv.m' does the same thing as 'step5_hx_GLE.m' but evaluates the convolution with a fast convolution algorithm.
Compute correlation functions of all the reduced models by 'step6_GLE_corr.m' ('corr_GLE.mat', 'corr_ML_4D.mat').
The visualization code is at the end of 'main.m'.
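The memory term of the state-dependent GLE is a convolution of the kernel with the history of $h(q_\tau) v_\tau$, which costs $O(N^2)$ operations when summed directly over $N$ time steps. This README does not say which fast algorithm 'step5_hx_GLE_fast_conv.m' uses; the sketch below shows FFT-based convolution, one common choice, applied to a fully stored history for illustration only. The names `memory_direct`, `memory_fft` and `u` (samples of $h(q_\tau) v_\tau$) are hypothetical, and a rectangle rule that includes both end points is assumed.

```python
import numpy as np

def memory_direct(theta, u, dt):
    """out[i] = dt * sum_{k=0}^{i} theta[i-k] * u[k], by direct summation."""
    n = len(u)
    out = np.zeros(n)
    for i in range(n):
        out[i] = dt * np.dot(theta[:i + 1][::-1], u[:i + 1])
    return out

def memory_fft(theta, u, dt):
    """Same sums via FFT; zero-padding makes the circular convolution linear."""
    n = len(u)
    m = 2 * n - 1
    full = np.fft.irfft(np.fft.rfft(theta[:n], m) * np.fft.rfft(u, m), m)
    return dt * full[:n]

# Consistency check on synthetic data (not data from the repository).
rng = np.random.default_rng(0)
dt_conv = 0.01
theta_demo = np.exp(-dt_conv * np.arange(256))
u_demo = rng.standard_normal(256)
slow = memory_direct(theta_demo, u_demo, dt_conv)
fast = memory_fft(theta_demo, u_demo, dt_conv)
```

In a running simulation the history grows step by step, so a batch transform like this cannot be applied as is; it only illustrates why a fast convolution reduces the cost of evaluating the memory integral.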

Results

The two figures show the probability distribution and free energy with two local minima.

The following two figures show the velocity correlation $\langle v(t),v(0) \rangle$ and the state-dependent velocity correlation $\langle v(t),v(0) |q(0)=q^* \rangle$. 'MD' represents the full model, 'GLE' represents the standard GLE, and 'SD-GLE' represents our model.

Full Data

Due to the storage limitation of GitHub, we only upload part of the data. The full data, including MD trajectories (example 1), position and velocity of the resolved variables, and simulation data (example 1), can be accessed from Globus at this link (https://app.globus.org/file-manager?origin_id=ec51ed95-bc26-44a4-a8a0-65b74d694c33&origin_path=%2F).

Software and Library

Python environment is given in the file 'conda-environment.txt'.

MATLAB version is 2022a.

The training is performed on v100s.

