pUniFind: Unified large pretrained deep learning model pushing the limit of mass spectra interpretation
This is the official repository for pUniFind, a zero-shot open peptide-spectrum scoring model that surpasses other state-of-the-art (SOTA) search engines, and the first zero-shot open de novo sequencing deep learning model supporting over 1300 modifications. pUniFind is developed by the pFind group and DP Technology. We will release our arXiv preprint very soon.
Powerful open scoring performance. Surpasses all previous SOTA search engines, including open-pFind and MSFragger with MSBooster, while supporting over 1300 modifications.
High accuracy. Comprehensive experiments show that the model does not significantly overfit to either the target or the decoy peptides in the training data, and that it maintains high accuracy across different evaluation scenarios. More detailed evaluations are presented in our preprint.
Zero-shot open de novo. The first open de novo sequencing deep learning method that requires no fine-tuning, supporting over 1300 modifications.
Reliable de novo result filtering and a user-friendly result file. Using various deep learning features, our model can effectively filter out unreliable results, which is extremely useful in real-world use. The result file also reports the end-to-end score, cosine similarity, mass difference, and missing fragment ion sites, which help users evaluate the reliability of each result. The result file also supports visualization.
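For readers unfamiliar with two of the result-file columns mentioned above, here is a minimal Python sketch of what cosine similarity and mass difference generally mean in spectrum matching. It is an illustration under stated assumptions, not pUniFind's implementation: the function names, the choice of ppm as the mass unit, and the example values are all hypothetical, and how pUniFind computes or formats these columns is not specified here.

```python
import math


def cosine_similarity(predicted, observed):
    """Cosine similarity between two equal-length fragment intensity vectors."""
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0 or norm_o == 0:
        # No signal on one side: define the similarity as 0.
        return 0.0
    return dot / (norm_p * norm_o)


def mass_error_ppm(observed_mass, theoretical_mass):
    """Relative mass difference in parts per million (unit is an assumption)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


# Hypothetical intensities for four matched fragment ions.
predicted = [0.0, 0.8, 0.3, 1.0]
observed = [0.1, 0.7, 0.0, 0.9]
print(cosine_similarity(predicted, observed))

# Hypothetical precursor masses in Da.
print(mass_error_ppm(1000.001, 1000.0))
```

A cosine similarity close to 1 means the predicted and observed fragment intensities agree well, and a mass error close to 0 means the candidate peptide mass matches the measured precursor.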
- 2025/5/25: pUniFind repository initial release.
Please see our user guide.
If you encounter technical issues, observe suboptimal performance, find inconsistencies between pUniFind results and our evaluation metrics, or have suggestions about our software, please do not hesitate to contact us. We welcome your feedback and are looking for bad cases to further refine our model. We are actively updating and refining our software, since the main author is far from graduation :(.
We provide priority support for user-reported issues through the following channels:
For technical inquiries:
- GitHub Issues: open a new issue with:
  - Data description.
  - Error logs and environment.
  - Uploaded folder description.
- pFind Studio user support WeChat group:
  - Please add my WeChat, JL_Zhao2000, and I will invite you to our user support group (WeChat group invitations expire after one week).
For collaboration requests:
Contact info: Jiale Zhao. Email: zhaojiale22z@ict.ac.cn or marshmallowzjl@gmail.com.
Starring and watching our repo will notify you of our updates. We will keep optimizing our model.
Milestone | Status |
---|---|
Post arXiv preprint | very soon |
TIMS / Astral support | very soon |
NCE option (currently the default of 25 is used as input) | very soon |
Integrating pUniFind into open-pFind | Preparing |
User-defined new PTM tuning | Planning |
Improving the performance and speed of scoring and de novo sequencing | Long-term |
If you find our software useful for your research, please cite us through:
Waiting for bioRxiv
Every citation will motivate the main author to make pUniFind more user-friendly and powerful. The main author needs your valuable citations and stars to find a job after graduation.