This GitHub repo has folders required for setting up the molecular dynamics simulations, metadynamics simulations, and post-processing codes for studies on MEK kinase and mutations
README: There are three folders F_unbiased that includes the files to setup and perform Unbiased Molecular Dynamics, F_metad contains files to setup and perform metadynamics, F_post_processing_codes contains files to perform post processing
F_unbiased: This folder has the input gromacs .mdp scripts required to perform Unbiased Molecular Dynamics and the active and the inactive structures of ALK
- The topology files .top are generated using Gromacs function : "pdb2gmx" and we implement charmm27 forcefield
- The outputs (.gro, .cpt, .top, .tpr) of the unbiased MD simualtion are inputs to Metadynamics. The active and inactive MEK structures at 101 ns of MD simulation were chosen as reference in the Metadynamics run. These structures are included in F_metad as active_mek_wt_protein.pdb and inactive_mek_wt_protein.pdb
F_metad: This folder has the input scripts required to perform Metadynamics and also the output free energy file that is obtained by summing over HILLS files
- plumed.dat - a plumed script that needs to be run for performing Metadynamcs
- mek_active_CA.pdb - the reference active structure, one of the two collective variables is RMSD from this structure
- mek_inactive_CA.pdb - the reference inactive structure, one of the two collective variables is RMSD from this structure
PS note: Additionally, you would need the usual gromacs file: .mdp file, .top file, .cpt file, .tpr file which are the outputs of unbiased MD
- fes_8us_wt_mintozero.dat - The free energy file which has free energy at every coordinate on CV space obtained by summing over HILLS files.
F_post_processing_codes: This folder contain the python scripts used for post-processing trajectories data to calculate boltzmann weighted correlations matrix, sasa, hydrogen bond occupancies and plot free energy landscapes, extract structures from zones and check convergence of metadynamics free energy zones
Our MD and metadynamics simulations use charmm27.ff from GROMACS 5.0.7
The MD simulations are performed in GROMACS. The paper mentions Biophyscode which actually is a wrapper on Gromacs. Please feel free to try out Biophyscode, a creation from Radhakrishnan lab. https://biophyscode.github.io. Please note: Following is the walk-through on how to run a simulation using Biophyscode:
"https://biophyscode.github.io/molecular_dynamics_lab/".
All mutations are introduced using BioPhysCode Automacs routine based on MODELLER. Please see MODELLER here:
All the requisite files needed to setup and run the simulations are included in the F_unbiased. We followed Bevan Lab Tutorials: "Lysozyme in water" example for equilibration and production
http://www.mdtutorials.com/gmx/lysozyme/index.html
We patch GROMACS 5.0.7 with PLUMED 2.3.5 and use multiple walker (num_of_walkers=10) metadynamics. For installation procedure of PLUMED and the subsequent patching with GROMACS, see here: https://www.plumed.org/doc-v2.6/user-doc/html/_installation.html
In the folder F_metad we have the scripts required to run the metadynamics. Please note: Metadynamics has to be run using the output .gro and .cpt files of the unbiased MD simulations. The folder contains the .mdp file, plumed.dat and the reference active and inactive MEK alpha Carbons structure files
plumed.dat metadynamics script is run in GROMACS 5.0.7 patched with PLUMED 2.3.5 using multiple walker (num_of_walkers=10) metadynamics. We set in the PLUMED script, the energy in kcal/mol and length in Å. The parameters used in this study to perform Well-Tempered multiple walker metadynamics (WTMD) are:bias factor γ = T +∆T/T = 20, height = 0.6 and pace = 500.
Aggregate of 8us metadynamics run in GROMACS 5.0.7 patched with PLUMED 2.3.5 using multiple walker (num_of_walkers=10) metadynamics until the convergence criterion is met for the zones of interest (See section xxx)
The output of the metadynamics run are the HILLS files. We will have ten of them since we used ten walkers. To get the free energy file from the HILLS use plumed sum_hills action of PLUMED (https://www.plumed.org/doc-v2.5/user-doc/html/sum_hills.html)
Analysis of the trajectories was done using python and mostly MDanalysis package in python: https://www.mdanalysis.org
The folder F_post_processing contains eleven codes:
Aloop_helicity.ipynb
is the code to compute hbonds that contribute to making the partial helix in the activation loop of kinaseAnalysis_weighted_correlation.ipynb
is the code that computes boltzmann weighted SASA, Hbond and their log ratio plotBoltz_corr_plotter.ipynb
is the code to plot the boltzmann weighted correlation matricesDihedral_compute.ipynb
is the code to compute the distribution of the dihedral angle representing the DFG flipFELandscape_plotter.ipynb
is the code to plot the free energy landscapeFELandscape_to_zone_structure_extractor.ipynb
is the code to extract structures from chosen zones on the free energy landscapeHierarchical_cluster.ipynb
is the code to plot the hierarchical cluster differentiating dynamics of mutated MEK systemsKE_salt_bridge.ipynb
is the code to compute the distribution of the KE salt bridge distanceboltz_corr1.py
is the parallelized code to compute boltzmann weighted correlationsservice.sh
is bash script to extract protein trajectory from system trajectoryzone_highlighter.ipynb
is the code to identify specific zones on the free energy landscape
Docking simulations were performed following the Induced Fit protocol using Schrodinger’s (v. 2019.4) Glide software. The mek_docking_files folder contains the input and output files from our simulations. Each system will have the following files:
ATP-out.maegz
is the input ligand file.structure.pdb
is the input receptor structure file.system_ligand.prj
contains the receptor structure along with water molecules, Mg2+ ions, and ATP-GS aligned to the binding site.system_docking-out.maegz
contains the output poses from the docking simulation.
If you have any suggestions or queries please feel free to reach out at : patilk@seas.upenn.edu
If you found the above scripts and/or codes helpful in your work, please cite:
- Jordan, E. Joseph, et al. "Computational algorithms for in silico profiling of activating mutations in cancer." Cellular and Molecular Life Sciences 76.14 (2019): 2663-2679.
- Patil, Keshav, et al. "Computational studies of anaplastic lymphoma kinase mutations reveal common mechanisms of oncogenic activation." Proceedings of the National Academy of Sciences 118.10 (2021).