This repository contains a Google Colab notebook for calculating the net charge and isoelectric point (pI) of proteins from user-provided PDB codes.
Author: Marilina Cathcarth
- Email: mcathcarth@gmail.com
- GitHub: mcathcarth
Version: 1.0
Date: Jun 10, 2024
Description: This script calculates the net charge and isoelectric point (pI) of proteins from sequences extracted from PDB files. The net charge of each protein is computed over a specified pH range, and the results are saved and plotted.
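
To make the calculation concrete, here is a minimal sketch of how a net charge and pI can be obtained from a one-letter sequence with the Henderson-Hasselbalch equation. It is not the notebook's code: the pKa table and the function names `net_charge` and `isoelectric_point` are assumptions chosen for illustration, and the notebook may use a different pKa set or rely on Biopython instead.

```python
# Illustrative pKa values (an assumption; not taken from the notebook).
PKA_POSITIVE = {"Nterm": 9.0, "K": 10.0, "R": 12.0, "H": 5.98}
PKA_NEGATIVE = {"Cterm": 2.0, "D": 4.05, "E": 4.45, "C": 9.0, "Y": 10.0}


def net_charge(sequence: str, pH: float) -> float:
    """Estimate the net charge of a single chain at the given pH."""
    seq = sequence.upper()
    # Count ionizable side chains and add one N-terminus and one C-terminus.
    pos_counts = {aa: seq.count(aa) for aa in "KRH"}
    pos_counts["Nterm"] = 1
    neg_counts = {aa: seq.count(aa) for aa in "DECY"}
    neg_counts["Cterm"] = 1

    charge = 0.0
    # Basic groups carry +1 when protonated.
    for group, n in pos_counts.items():
        charge += n / (1.0 + 10 ** (pH - PKA_POSITIVE[group]))
    # Acidic groups carry -1 when deprotonated.
    for group, n in neg_counts.items():
        charge -= n / (1.0 + 10 ** (PKA_NEGATIVE[group] - pH))
    return charge


def isoelectric_point(sequence: str, lo: float = 0.0, hi: float = 14.0,
                      tol: float = 1e-4) -> float:
    """Find the pH where the net charge crosses zero by bisection.

    The net charge decreases monotonically with pH, so bisection applies.
    If the charge never changes sign in [lo, hi], the result sits at a bound.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Bisection is used here because it needs no derivative and cannot overshoot. Treating a multi-chain protein would require one pair of termini per chain, which this single-chain sketch does not handle.
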
If you use this tool in your research, please cite this repository using the following DOI:
https://doi.org/10.5281/zenodo.11553881
For citation in various formats, please visit the DOI link.
The script uses the following Python libraries:
- Biopython (Bio) - Licensed under the Biopython License Agreement (BPLA)
- NumPy - Licensed under the BSD License
- Matplotlib - Licensed under the Matplotlib License (PSF-based, BSD-compatible)
- requests - Licensed under the Apache License 2.0
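
For readers who want to see what "extracting a sequence from a PDB file" involves, the sketch below reads the SEQRES records of PDB-format text and returns one sequence per chain. It uses only the standard library so it runs anywhere; the notebook itself relies on Biopython and requests, so this is an illustration of the idea rather than its actual code. The function name `sequences_from_seqres` is hypothetical, and downloading the file is left out.

```python
from typing import Dict, List

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequences_from_seqres(pdb_text: str) -> Dict[str, str]:
    """Return {chain_id: one-letter sequence} from SEQRES records."""
    chains: Dict[str, List[str]] = {}
    for line in pdb_text.splitlines():
        if not line.startswith("SEQRES"):
            continue
        # Fixed-column PDB format: chain ID in column 12,
        # residue names from column 20 to column 70.
        chain_id = line[11]
        residues = line[19:70].split()
        # Non-standard residues are mapped to "X" (a choice made here).
        chains.setdefault(chain_id, []).extend(
            THREE_TO_ONE.get(res, "X") for res in residues
        )
    return {chain: "".join(letters) for chain, letters in chains.items()}
```

Chains are returned in the order they first appear in the file, which makes it straightforward to keep only the first few chains when a chain count is supplied.
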
- Open the Colab Notebook: Click on the "Open In Colab" badge above to open the notebook directly in Google Colab.
- Modify Input Parameters:
  - Update the pdb_codes list with the PDB codes of the proteins you want to analyze.
  - Optionally, provide user-defined protein names (usr_names) and the number of chains (usr_nc) if needed.
  - Set pH_start, pH_end, pH_step, and target_pH as per your requirements.
- Run Each Cell Sequentially: Run the cells in order, as some cells depend on the outputs or definitions from previous cells.
- View Results: After running the final cell, the output lists the isoelectric point (pI) and the net charge of each protein at the target pH. A plot of net charge vs. pH is also generated and saved.
- Save Your Work: To save your changes, you can either:
  - Download the notebook to your local machine: File -> Download .ipynb
  - Save a copy to your Google Drive: File -> Save a copy in Drive
- Note: Any changes you make in Google Colab will not affect the original notebook in this GitHub repository. Each user works on their own copy of the notebook.
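
As an illustration of the "Modify Input Parameters" step, the sketch below shows what an input cell might look like, together with a small sanity check on the lists. The helper `check_inputs` is hypothetical and not part of the notebook, and the values given for usr_names and usr_nc are placeholders; the PDB codes and pH values are the examples used in this README.

```python
from typing import List, Optional, Tuple

# Example inputs (usr_names and usr_nc values are placeholders).
pdb_codes = ["4F5S", "1CF3", "1OVA"]
usr_names = ["sample_1", "sample_2", "sample_3"]
usr_nc = [1, 1, 1]
pH_start, pH_end, pH_step = 3.0, 10.0, 0.1
target_pH = 7.4


def check_inputs(
    pdb_codes: List[str],
    usr_names: Optional[List[str]] = None,
    usr_nc: Optional[List[int]] = None,
) -> Tuple[List[str], List[str], Optional[List[int]]]:
    """Validate the input lists and return normalized copies."""
    codes = [code.strip().upper() for code in pdb_codes]
    for code in codes:
        # Classic PDB codes are four alphanumeric characters.
        if len(code) != 4 or not code.isalnum():
            raise ValueError(f"Not a 4-character PDB code: {code!r}")

    # Fall back to the PDB codes when no names are given.
    names = list(usr_names) if usr_names else list(codes)
    if len(names) != len(codes):
        raise ValueError("usr_names must have one entry per PDB code")

    if usr_nc:
        if len(usr_nc) != len(codes):
            raise ValueError("usr_nc must have one entry per PDB code")
        if any(not isinstance(n, int) or n < 1 for n in usr_nc):
            raise ValueError("usr_nc entries must be positive integers")

    return codes, names, list(usr_nc) if usr_nc else None


codes, names, chain_counts = check_inputs(pdb_codes, usr_names, usr_nc)
```

Checking the lists before any download or calculation means a mismatch surfaces in the first cell rather than partway through a run.
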
- pdb_codes: A list of PDB codes for the proteins to analyze (e.g., ["4F5S", "1CF3", "1OVA"]).
- usr_names: Optional list of user-provided names for the proteins.
- usr_nc: Optional list of the number of chains to consider for each protein.
- pH_start: Starting pH value for the range (e.g., 3.0).
- pH_end: Ending pH value for the range (e.g., 10.0).
- pH_step: Step size for the pH range (e.g., 0.1).
- target_pH: Specific pH value at which to calculate the net charge (e.g., 7.4).
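
The three range parameters together define the grid of pH values at which the net charge is evaluated. The sketch below shows one way such a grid could be built in plain Python so that the end point is included and floating-point drift is avoided. The function name `ph_grid` is hypothetical, and the notebook may construct its range differently (for example with NumPy).

```python
import math
from typing import List


def ph_grid(pH_start: float, pH_end: float, pH_step: float) -> List[float]:
    """Return evenly spaced pH values from pH_start up to pH_end."""
    if pH_step <= 0:
        raise ValueError("pH_step must be positive")
    if pH_end < pH_start:
        raise ValueError("pH_end must not be smaller than pH_start")
    # Number of whole steps that fit in the range; the small epsilon
    # guards against results such as 69.99999999 for (10 - 3) / 0.1.
    n_steps = int(math.floor((pH_end - pH_start) / pH_step + 1e-9))
    # Computing each point from the start value (instead of adding the
    # step repeatedly) keeps rounding errors from accumulating.
    return [round(pH_start + i * pH_step, 10) for i in range(n_steps + 1)]


# Using the example values from the parameter list above.
grid = ph_grid(3.0, 10.0, 0.1)
```

The value of target_pH does not have to lie on this grid, since the net charge at a single pH can be computed directly.
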