v0.1.0

@mirimia mirimia released this 03 Mar 18:48
· 190 commits to master since this release
b2bac89

New features

  • compare_exon_sets module is now available.
  • get_cluster_stats.pl script is now available.
  • params.config (main.nf): addition of an "orthogroupnum" parameter specifying the number of orthogroups to be jointly evaluated in a single instance of the cluster_EXs process, reducing the number of required jobs.
  • Process check_input (main.nf): addition of a check for geneIDs in input files (i.e. a warning is raised for non-coding genes included in the gene orthogroups).
  • GetLiftOverFile.pl: addition of an option to filter by SS dinucleotide.
  • Addition of a test set for a limited number of selected gene orthogroups in 3 mammalian species (human, mouse and cow).
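
The effect of the "orthogroupnum" parameter on the job count can be illustrated with a minimal Python sketch. This is not the pipeline's actual Nextflow code: the function name batch_orthogroups and the orthogroup IDs are hypothetical, and the sketch only assumes that orthogroups are split into consecutive batches of at most "orthogroupnum" each, with one cluster_EXs instance per batch.

```python
from typing import List


def batch_orthogroups(orthogroup_ids: List[str], orthogroupnum: int) -> List[List[str]]:
    """Split orthogroup IDs into consecutive batches of at most `orthogroupnum`.

    Hypothetical helper: each returned batch stands for the orthogroups
    handled by a single clustering job.
    """
    if orthogroupnum < 1:
        raise ValueError("orthogroupnum must be a positive integer")
    return [
        orthogroup_ids[start:start + orthogroupnum]
        for start in range(0, len(orthogroup_ids), orthogroupnum)
    ]


# Made-up orthogroup IDs, for illustration only.
ids = [f"GF{n:04d}" for n in range(1, 8)]
batches = batch_orthogroups(ids, 3)

# 7 orthogroups with orthogroupnum=3 give 3 jobs instead of 7.
print(len(ids), "orthogroups ->", len(batches), "jobs")
```

Under this assumption, a larger "orthogroupnum" means fewer but longer-running jobs, so the value is a trade-off between scheduler overhead and per-job run time.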

Changes to the main.nf algorithm

  • Process parse_IPA_prot_aln: addition of an initial filter to avoid comparing identical protein isoform pairs.
  • Process parse_IPA_prot_aln: modification of the sliding window to search for intron conservation depending on the number of gaps in the region surrounding the intron.
  • Process parse_IPA_prot_aln: addition of gap length correction on left and right side of the alignment surrounding an intron when evaluating intron conservation.
  • Process parse_IPA_prot_aln: addition of a filter to ensure that I1 and I2 (the two introns surrounding the evaluated exon) are indeed consecutive.
  • Process parse_IPA_prot_aln: addition of a requirement for valid single exon matches to have >50% non-gapped alignments.
  • Process parse_IPA_prot_aln: correction of cases in which aligned exons with 0% similarity were considered not aligned.
  • Process parse_IPA_prot_aln: hits of internal (query) exons against the first or last exon of the target isoform are no longer considered valid.
  • Process score_EX_matches: adjustment of the scoring when evaluating homology of N- and C-terminal exons.
  • Process score_EX_matches: the score of the evaluated exon is now set to -1 when no exon alignment is detected in the corresponding target isoform.
  • Process filter_and_select_best_EX_matches_by_targetgene: exon pairs that are microexons (<=3 amino acids) in both species automatically pass the sequence similarity filter.
  • Process filter_and_select_best_EX_matches_by_targetgene: inversion of the logic used in the selection of the best target-gene hit. Hits are now first filtered on the scores of the individual features, and the best hit per gene is then selected, giving priority to hits that passed the filters.
  • Process cluster_EXs: addition of an extra output including the exons excluded from the clustering algorithm.
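
The reordered selection in filter_and_select_best_EX_matches_by_targetgene (filter first, then pick the best hit per gene) can be sketched as follows. This is a rough illustration rather than the pipeline's implementation: the ExonHit record, its field names and the single combined score are assumptions, and the per-feature cutoffs are collapsed into one passes_filters flag.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ExonHit:
    # Hypothetical record; field names are not taken from the pipeline.
    query_exon: str
    target_gene: str
    target_exon: str
    score: float          # assumed overall match score
    passes_filters: bool  # True if every single-feature score met its cutoff


def select_best_hit_per_gene(hits: Iterable[ExonHit]) -> Dict[Tuple[str, str], ExonHit]:
    """Keep one hit per (query exon, target gene) pair.

    Hits that passed the filters outrank hits that did not;
    among equals, the higher score wins.
    """
    best: Dict[Tuple[str, str], ExonHit] = {}
    for hit in hits:
        key = (hit.query_exon, hit.target_gene)
        current = best.get(key)
        # Tuples compare element-wise: filter status first, then score.
        if current is None or (hit.passes_filters, hit.score) > (
            current.passes_filters,
            current.score,
        ):
            best[key] = hit
    return best


# Made-up hits: the top-scoring hit fails the filters, so the filtered one is kept.
hits = [
    ExonHit("ex1", "geneT", "t1", score=0.9, passes_filters=False),
    ExonHit("ex1", "geneT", "t2", score=0.7, passes_filters=True),
    ExonHit("ex1", "geneT", "t3", score=0.6, passes_filters=True),
]
best = select_best_hit_per_gene(hits)
print(best[("ex1", "geneT")].target_exon)
```

With the earlier ordering (best hit first, filter afterwards), the exon pair in this example would presumably have been lost, because the top-scoring hit fails the filters.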

Others

  • Updated documentation.
  • params.config (main.nf): renamed the "liftover" variable to "bonafide_pairs".
  • params.config (exint_plotter): the "isoformID" parameter now takes a protein ID instead of a transcript ID.
  • Various bug fixes.