nicwulab/SARS-CoV-2_NTD_DMS

Studying the impact of NTD mutations on SARS-CoV-2 spike expression using deep mutational scanning

Dependencies

Input files

Primer design for DMS library construction

  1. Generating forward primers (NNK + internal barcode) and reverse primers (constant)
    python3 script/lib_primer_design.py

  2. Generating barcode file
    python3 script/check_barcode.py
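
For readers unfamiliar with NNK degenerate codons, here is a minimal sketch that enumerates what an NNK position can encode under the standard genetic code. It is not taken from `script/lib_primer_design.py`, and the function names are hypothetical. N stands for any of A/C/G/T and K for G/T, so each NNK position covers 32 codons, which encode all 20 amino acids and a single stop codon (TAG).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """Return all codons matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


def summarize_nnk():
    """Return (number of codons, sorted amino acids, sorted stop codons)."""
    codons = nnk_codons()
    encoded = {CODON_TABLE[c] for c in codons}
    amino_acids = sorted(encoded - {"*"})
    stop_codons = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), amino_acids, stop_codons


if __name__ == "__main__":
    n, aas, stops = summarize_nnk()
    print(n, "codons;", len(aas), "amino acids; stop codons:", stops)
```

Restricting the third base to G/T removes two of the three stop codons (TAA and TGA) while still covering every amino acid, which is the usual reason for choosing NNK over NNN.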

Calculating expression score from DMS data

  1. Merge overlapping paired-end reads using PEAR
    pear -f [FASTQ FILE FOR FORWARD READ] -r [FASTQ FILE FOR REVERSE READ] -o [OUTPUT FASTQ FILE]

  2. Counting variants based on nucleotide sequences
    python3 script/NTD_fastq2count.py

    • Input files:
      • Merged read files in fastq_merged/ folder
    • Output files:
      • result/NTD_DMS_count_nuc_A.tsv
      • result/NTD_DMS_count_nuc_B.tsv
  3. Convert nucleotide sequences to amino acid mutations
    python3 script/NTD_count_nuc2aa.py

    • Input files:
    • Output files:
      • result/NTD_DMS_count_aa_A.tsv
      • result/NTD_DMS_count_aa_B.tsv
  4. Compute expression score
    python3 script/NTD_count2score.py

  5. Plot correlation between replicates as quality control
    Rscript script/plot_QC.R

  6. Plot heatmap for input frequencies of individual mutations
    Rscript script/plot_input_freq_heatmap.R

  7. Plot heatmap for the expression scores of individual mutations
    Rscript script/plot_score_heatmap.R

  8. Plot mutational tolerability in loops vs others
    Rscript script/NTD_loop_vs_other_residues.R

  9. Plot the mutational tolerability in selected hotspot regions
    Rscript script/Hot_spots_vs_other_residues.R
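
Step 3 above converts nucleotide variants into amino acid mutations. As a rough illustration of that idea only (it is not the code in `script/NTD_count_nuc2aa.py`), the sketch below translates a variant and a reference sequence and reports the positions where they differ. The toy reference sequence, the `offset` parameter, and the `A2S`-style mutation labels are assumptions made for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nuc_seq):
    """Translate an in-frame nucleotide sequence into a protein string."""
    if len(nuc_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(
        CODON_TABLE[nuc_seq[i:i + 3]] for i in range(0, len(nuc_seq), 3)
    )


def call_aa_mutations(ref_nuc, var_nuc, offset=1):
    """List amino acid differences between a variant and the reference.

    offset is the residue number of the first codon (hypothetical parameter).
    An empty list means the variant is wild type or carries only silent changes.
    """
    if len(ref_nuc) != len(var_nuc):
        raise ValueError("reference and variant must have the same length")
    ref_aa = translate(ref_nuc.upper())
    var_aa = translate(var_nuc.upper())
    return [
        f"{r}{i + offset}{v}"
        for i, (r, v) in enumerate(zip(ref_aa, var_aa))
        if r != v
    ]


if __name__ == "__main__":
    reference = "ATGGCTAAA"  # toy sequence encoding M-A-K
    variant = "ATGTCTAAA"    # second codon GCT -> TCT
    print(call_aa_mutations(reference, variant))  # prints ['A2S']
```

Comparing at the protein level means synonymous nucleotide changes collapse onto the wild-type amino acid sequence, so counts for different codons encoding the same mutation can then be summed.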

Structural analysis

  1. Computing relative solvent accessibility (RSA) for individual residues
    python3 script/RSA_analysis.py

  2. Compute expression score for RBD DMS data
    Rscript script/compute_exp_score_RBD.R

  3. Compute RSA for RBD residues
    python3 script/RBD_analysis.py

  4. Computing the distance of individual NTD residues to RBD or S2
    python3 script/Dist_analysis.py

  5. Replace the B-factor by expression score in the PDB file
    python3 script/Bfactor_to_score.py

  6. Plot mean expression score vs RSA for individual residues
    Rscript script/plot_RSA.R

  7. Plot mutational tolerability vs RSA for RBD DMS data
    Rscript script/Mean_expression_score_in_mammalian_system.R

  8. Plot mutational tolerability vs distance to RBD/S2 for individual residues and categorized by antibody epitopes
    Rscript script/Dist_to_RBD_S2_exp_by_Ab.R

  9. Plot mutational tolerability vs sequence conservation for individual residues
    Rscript script/align_freq_vs_score.R

  10. Plot RSA vs sequence conservation for individual residues
    Rscript script/RSA_vs_alignment_frequency.R

  11. Visualizing the mutational tolerability on the S protein structure
    pymol script/plot_Bfactor_as_exp.pml

  12. Analysis of the circulating NTD mutations/indels among 17 major variants
    Rscript script/NTD_circulating_mutation_vs_other_residues.R

  13. Mutational tolerability of selected regions compared to other residues
    Rscript script/NTD_loop_vs_other_residues.R
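
Step 5 above writes expression scores into the B-factor column so that the structure can be colored by score in PyMOL. A minimal sketch of that general technique follows; it is not the code in `script/Bfactor_to_score.py`. It relies on the fixed-width PDB format, where the residue number occupies columns 23 to 26 and the B-factor occupies columns 61 to 66. The function name, the `scores` dictionary, and the default value for residues without a score are assumptions.

```python
def set_bfactor(pdb_lines, scores, default=0.0):
    """Return PDB lines with the B-factor replaced by a per-residue score.

    scores maps residue number (int) to score. Residues missing from the
    mapping receive `default` (an assumption of this sketch).
    """
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            resnum = int(line[22:26])  # columns 23-26: residue number
            value = scores.get(resnum, default)
            # columns 61-66: B-factor, written as a 6-character field
            line = line[:60] + f"{value:6.2f}" + line[66:]
        out.append(line)
    return out


if __name__ == "__main__":
    # One ATOM record assembled field by field to keep the columns explicit.
    atom = (
        "ATOM  " "    1" " " " CA " " " "ALA" " " "A" "  27" " " "   "
        "  11.104" "   6.134" "  -6.504" "  1.00" " 35.88"
    )
    scores = {27: 0.85}  # hypothetical expression score for residue 27
    print(set_bfactor([atom], scores)[0])
```

Because only the B-factor field is rewritten, coordinates and all other columns are left unchanged, so the output remains a valid PDB file for any viewer that colors by B-factor.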