EasyTutorial
[TOC]
- Explanation of `MS Data Format` options
  - `RAW (Orbitrap)`:
    - Earlier Thermo Orbitrap instruments
    - Does not support FAIMS
  - `RAW (FAIMS)(Orbitrap New Instrument, Ascend)`:
    - FAIMS-enabled data from both earlier and newer Thermo Orbitrap instruments
    - Newer Thermo Orbitrap instruments (e.g., Ascend), with or without FAIMS
  - `RAW (Astral)(FAIMS)`:
    - Thermo Astral instruments, with or without FAIMS
  - `d (timsTOF)`:
    - Bruker timsTOF instruments
  - `MGF (Not Recommended)`:
    - MGF: a simple text-based spectral peak file
    - Why MGF is not recommended
  - `PF2`:
    - Internal binary spectral file used by the pFindStudio software series
  - If you encounter any issues, please refer to: Spectrum Extraction Fails
- Directly loading `RAW` files is recommended
  - pLink3 uses its internal plugin, `pParse`, to extract spectra and convert them to binary `.pf2` files
  - `pParse` provides precursor mass correction and monoisotopic peak detection
  - `pParse` also supports exporting mixed spectra, i.e., spectra in which multiple precursors are co-fragmented in a single MS2 scan
  - The binary `.pf2` format allows relatively faster read/write operations
  - Likewise, pLink2 does not recommend `MGF` input and suggests loading `RAW` files directly
- `MGF` is not recommended
  - Although `MGF` is widely used, it is not a standardized format
    - The spectrum header information required in `MGF` files varies across software; beyond the minimal definitions, there is no universally accepted standard
  - If the `MGF` file lacks complete information, unexpected bugs may occur during pLink searches
  - Note: Under certain settings, `pParse` can also export `.mgf` files, but these differ from those extracted by other tools. `.mgf` files extracted by third-party tools may lack the additional `pParse` features described above and therefore may not achieve optimal pLink identification performance.
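Because header requirements vary between tools, it can help to inspect which header keys each spectrum in a third-party `.mgf` file carries before starting a search. The sketch below is only an illustration and is not part of pLink: the helper name `missing_mgf_headers` is hypothetical, and the choice of `TITLE`, `PEPMASS`, and `CHARGE` as the keys to check is an assumption, since this tutorial does not state which fields pLink actually requires.

```python
from typing import Dict, Iterable, List, Tuple

# Keys checked by default. This set is an assumption for illustration;
# the fields pLink actually needs are not specified in this tutorial.
DEFAULT_KEYS = ("TITLE", "PEPMASS", "CHARGE")


def missing_mgf_headers(lines: Iterable[str],
                        required: Tuple[str, ...] = DEFAULT_KEYS) -> Dict[int, List[str]]:
    """Return {spectrum index: [missing keys]} for spectra lacking required headers."""
    problems: Dict[int, List[str]] = {}
    seen = None   # header keys of the spectrum being read, or None outside a block
    index = -1    # zero-based index of the current spectrum
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            seen = set()
            index += 1
        elif line == "END IONS":
            if seen is not None:
                missing = [key for key in required if key not in seen]
                if missing:
                    problems[index] = missing
            seen = None
        elif seen is not None and "=" in line:
            # Header lines look like KEY=value; peak lines contain no "=".
            seen.add(line.split("=", 1)[0].upper())
    return problems


example = """BEGIN IONS
TITLE=scan_1
PEPMASS=512.77
CHARGE=3+
100.1 200
END IONS
BEGIN IONS
TITLE=scan_2
PEPMASS=640.31
150.2 300
END IONS
""".splitlines()

print(missing_mgf_headers(example))
```

In this made-up example the second spectrum has no `CHARGE` line, so it is the only one reported. For a real file, pass an open file handle instead of `example`.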
- If the instrument type lacks native pLink support
  - For instruments not natively supported by pLink, you may use a third-party tool such as MSConvert to export `MGF` files as input. (Functional but suboptimal; better than nothing.)
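As a rough sketch of the MSConvert route, the snippet below only assembles a command line for ProteoWizard's `msconvert` using its `--mgf` and `-o` options; it does not run anything. The helper name `build_msconvert_command`, the input file `sample.wiff`, and the output folder `mgf_out` are hypothetical placeholders, and any vendor-specific filters you may need are not covered here.

```python
from pathlib import Path
from typing import List


def build_msconvert_command(input_file: str, output_dir: str) -> List[str]:
    """Build (but do not run) an msconvert call that writes an MGF file."""
    return ["msconvert", str(Path(input_file)), "--mgf", "-o", str(Path(output_dir))]


# Hypothetical input file and output folder, for illustration only.
cmd = build_msconvert_command("sample.wiff", "mgf_out")
print(" ".join(cmd))

# To execute it, pass the list to subprocess.run(cmd, check=True)
# on a machine where ProteoWizard's msconvert is installed and on PATH.
```

The resulting `.mgf` file can then be selected with the `MGF (Not Recommended)` option, keeping in mind the limitations described above.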
- For versions later than `pLink3.0.16`, a new multi-process mode (`Multi-process`) has been added.
- Multi-thread mode (`Multi-thread`)
  - The entire search workflow is handled by a single process, within which multiple threads are allocated.
  - `RAW` files are searched sequentially: the next `RAW` file is searched only after the current one has finished.
- Multi-process mode (`Multi-process`)
  - Some parts of pLink support only single-threaded operation, such as serial file reading and writing. During these parts, CPU resources cannot be fully utilized, which slows the overall search.
  - The motivation behind `Multi-process` is to make the fullest possible use of the CPU resources configured in the parameters.
  - In `Multi-process` mode, the number of processes and the number of threads per process are allocated automatically by the program, but the total CPU usage will not exceed the configured `CPU Number`.
  - `Multi-process` searches multiple `RAW` files in parallel, merges the results at the end, and performs `FDR` quality control on them together.
- Choosing `Multi-process` or `Multi-thread` has almost no effect on the identification results; it only affects how CPU resources are allocated.
- We strongly recommend `Multi-process` when there are many `RAW` files, as it will be faster.
- When there is only one `RAW` file, there is no difference between `Multi-process` and `Multi-thread`.
- If `Multi-process` gets stuck at the following log line and you never see spectra being loaded, try switching to `Multi-thread` to see whether it runs normally: `[MultiProcess] Start...`
- For troubleshooting, please refer to: Self-Troubleshooting for pLink3 Crashes or Prolonged Runtime
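To make the resource constraint concrete, here is a purely hypothetical allocation rule. The tutorial only states that pLink allocates processes and threads automatically and keeps the total within `CPU Number`; the actual rule is not documented here, so the function `plan_allocation` and its logic are assumptions chosen to satisfy that one constraint.

```python
from typing import Tuple


def plan_allocation(num_raw_files: int, cpu_number: int) -> Tuple[int, int]:
    """Return (processes, threads_per_process) with processes * threads <= cpu_number.

    Hypothetical rule, not pLink's real one: one process per RAW file up to
    the CPU budget, then split the budget evenly into threads per process.
    """
    if num_raw_files < 1 or cpu_number < 1:
        raise ValueError("need at least one RAW file and one CPU")
    processes = min(num_raw_files, cpu_number)
    threads_per_process = cpu_number // processes
    return processes, threads_per_process


for files in (1, 3, 20):
    procs, threads = plan_allocation(files, cpu_number=8)
    print(f"{files} RAW file(s): {procs} process(es) x {threads} thread(s)")
```

Under this assumed rule a single `RAW` file gets one process with all threads, which matches the statement that the two modes do not differ in that case, while many files are spread over several processes.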
- Supported crosslinker types
  - Chemical crosslinking, non-cleavable in the mass spectrometer (MS-non-Cleavable), e.g., DSS
  - Chemical crosslinking, cleavable in the mass spectrometer (MS-Cleavable), e.g., DSSO
  - Endogenous crosslinking, e.g., disulfide bonds (SS) and zero-length crosslinks between amino acids (e.g., Tyr-Tyr)
- Method: open `pConfig.exe` and select the `Linkers` tab
- To configure a new crosslinker, model it on an existing one
  - For example, the screenshot above shows the parameters for the MS-Cleavable `DSSO` crosslinker
- Parameter definitions
  - `Name`: Crosslinker name
    - Note: The name must not contain spaces or special characters
    - Special characters to avoid include Chinese characters, `=`, non-English letters, etc.
    - Recommendation: use only digits, English letters, `-`, `_`, and similar characters
  - `AlphaSites` / `BetaSites`: Reactive sites of the crosslinker
    - `[` denotes the protein N-terminus; `]` denotes the protein C-terminus
    - For asymmetric crosslinkers, `AlphaSites` and `BetaSites` can differ
    - For multiple reactive sites, list the amino acid letters together (e.g., `DE`)
  - `LinkerMass`: The mass added after the crosslinker has reacted
  - `MonoMass`: The mass added when the crosslinker reacts at only one end
    - If multiple `monolink` forms exist, enter only the most significant one here; the others can be set as a `variable modification` in the search parameters
  - `LinkerComposition`: The chemical composition added after the crosslinker has reacted
  - `MonoComposition`: The chemical composition added when the crosslinker reacts at only one end
  - `MSCleavable`: Whether the crosslinker arm is cleavable during MS fragmentation
  - `LongMass` / `ShortMass`: The residual modification masses after the crosslinker arm is cleaved
    - `Long` is the larger residual mass; `Short` is the smaller one
    - If more than two cleavage types exist, enter only the two most significant ones
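The naming recommendation for `Name` can be checked before typing a new entry into `pConfig.exe`. The sketch below is a conservative check, not pLink's own validation: the function name `is_safe_linker_name` is hypothetical, and it accepts only digits, English letters, `-`, and `_`, which is stricter than the "etc." in the recommendation.

```python
import re

# Digits, ASCII letters, "-" and "_" only, following the recommendation for Name.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_linker_name(name: str) -> bool:
    """Return True if the whole name consists of recommended characters only."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ("DSSO", "DSS-D12", "my linker", "DSS=heavy", "交联剂"):
    print(candidate, is_safe_linker_name(candidate))
```

Names with a space, an `=` sign, or non-English characters are rejected, while `DSSO` and `DSS-D12` pass.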
- Important: `LinkerComposition` and `MonoComposition` must be filled in correctly, because the software uses them to recalculate the crosslinker-related masses.
- If you are unsure how to set the chemical composition or mass of a crosslinker, consult the official documentation or the literature
  - For example, for the DSSO crosslinker:
  - For example, for the DSS/BS3 crosslinker:
  - Reference websites:
- Isotope-labeled crosslinkers
  - `H` or `1H` denotes hydrogen; `2H` denotes deuterium.
  - `N` or `14N` denotes nitrogen; `15N` denotes its heavy isotope.
  - `C` denotes carbon; `13C` denotes its heavy isotope.
  - For example: configuration for DSS-D12
    - DSS-D12 means that 12 hydrogen atoms in DSS are replaced by 12 deuterium atoms
    - Reference documentation: DSS-H12/D12
    - `LinkerMass`: 150.1434042
    - `MonoMass`: 168.1539675
    - `LinkerComposition`: C(8)1H(-2)2H(12)O(2)
    - `MonoComposition`: C(8)2H(12)O(3)
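Since the software recalculates masses from the compositions, it is worth confirming that a composition string and the mass you enter agree. The sketch below recomputes the DSS-D12 values from their compositions. It assumes the composition syntax is exactly what the example shows (an optional isotope number, an element symbol, and a signed count in parentheses); the parser, the function name `composition_mass`, and the small mass table are illustrative and are not pLink code.

```python
import re

# Monoisotopic masses in Da (standard reference values, rounded).
# Only the elements used in this example are listed.
MONO_MASS = {
    "H": 1.00782503, "1H": 1.00782503, "2H": 2.01410178,
    "C": 12.0, "13C": 13.00335484,
    "N": 14.00307401, "14N": 14.00307401, "15N": 15.00010890,
    "O": 15.99491462,
}

# One term looks like C(8), 1H(-2) or 2H(12): optional isotope number,
# element symbol, then a signed integer count in parentheses.
TERM = re.compile(r"(\d*[A-Z][a-z]?)\((-?\d+)\)")


def composition_mass(composition: str) -> float:
    """Sum the monoisotopic mass of every term in a composition string."""
    total = 0.0
    consumed = 0
    for match in TERM.finditer(composition):
        symbol, count = match.group(1), int(match.group(2))
        total += MONO_MASS[symbol] * count
        consumed += len(match.group(0))
    if consumed != len(composition):
        # Some characters did not belong to any recognised term.
        raise ValueError(f"unparsed text in composition: {composition!r}")
    return total


linker = composition_mass("C(8)1H(-2)2H(12)O(2)")
mono = composition_mass("C(8)2H(12)O(3)")
print(f"LinkerMass ~ {linker:.4f}")
print(f"MonoMass   ~ {mono:.4f}")
```

Both results agree with the `LinkerMass` and `MonoMass` listed for DSS-D12 to within rounding, and their difference corresponds to one water molecule (H2O).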