FragPipe workflows

FragPipe can be downloaded here. Follow the instructions on that same Releases page to launch the program.

Listed below are the analysis workflows provided with FragPipe. Any of these workflows can be customized and saved for later use. Each customized workflow should be saved with a unique name.

Closed (standard) database searches
Non-specific digestion
Isobaric label-based quantification
MS1 label-based quantification
Open (mass-tolerant) and mass offset
Labile PTM Searches
Glyco
DIA and spectral library building


Basic search (Default)

Simple closed search, no quantification. MSFragger search with ‘stricttrypsin’ (Trypsin/P) enzyme, fully tryptic peptides only, up to 2 missed cleavages. Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering).
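To make the digestion settings concrete, here is a minimal Python sketch of fully tryptic in-silico digestion with Trypsin/P rules (cleave after K or R, with no proline exception) and up to 2 missed cleavages. The function name and the `min_len`/`max_len` defaults are illustrative assumptions, not values taken from the workflow file, and this is not MSFragger's own implementation.

```python
def digest_trypsin_p(protein, missed_cleavages=2, min_len=7, max_len=50):
    """Return fully tryptic peptides, cleaving after every K or R.

    min_len and max_len are hypothetical defaults chosen for illustration.
    """
    # Cut the sequence after each K or R (Trypsin/P ignores a following proline).
    pieces, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])

    # Join up to (missed_cleavages + 1) consecutive pieces into one peptide.
    peptides = set()
    for first in range(len(pieces)):
        last_allowed = min(first + missed_cleavages, len(pieces) - 1)
        for last in range(first, last_allowed + 1):
            peptide = "".join(pieces[first:last + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return sorted(peptides)


if __name__ == "__main__":
    for pep in digest_trypsin_p("AAKGGRCCKDD", min_len=1):
        print(pep)
```

Peptides spanning more than two internal cleavage sites are never generated, which is what keeps the closed-search space small compared with the nonspecific workflows.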

Basic search & quantification with match-between-runs (LFQ-MBR)

Performs a closed search, followed by label-free quantification and match-between-runs with IonQuant. If using mzML files, choose the correct MS data type (Regular MS vs IM-MS). Runs must be assigned to experiments.

HLA peptide search (Nonspecific-HLA)

Nonspecific search, with recommended settings for HLA peptides. Peptide length 7-25. MSFragger search assumes cysteines were not alkylated (i.e. samples were not treated with iodoacetamide). Cysteinylation (C+119) is specified as a variable modification. Protein FDR filter is not applied, so each output file (PSM, ion, peptide) is filtered to 1% FDR at that level. If needed, extend the workflow to add label-free quantification (using IonQuant) or spectral library building with EasyPQP.
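A nonspecific search considers every subsequence of a protein within the length window, not only those ending at enzyme cleavage sites. The sketch below enumerates those candidates for the 7-25 window and counts them; the function names are hypothetical and the code only illustrates the size of the search space, not how MSFragger indexes peptides.

```python
def nonspecific_peptides(protein, min_len=7, max_len=25):
    """Yield every substring of the protein whose length is in the window."""
    n = len(protein)
    for start in range(n):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > n:
                break  # longer peptides from this start would run off the end
            yield protein[start:end]


def count_candidates(protein_length, min_len=7, max_len=25):
    """Closed-form count of candidate peptides for a protein of given length."""
    return sum(
        max(0, protein_length - length + 1)
        for length in range(min_len, max_len + 1)
    )


if __name__ == "__main__":
    print(count_candidates(500))
```

The count grows roughly with protein length times the width of the length window, which is why nonspecific searches are considerably more expensive than fully tryptic ones.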

Peptidome search (Nonspecific-peptidome)

Nonspecific search, with recommended settings for peptidome data (plasma, CSF, etc.). Peptide length 7-65. MSFragger search assumes cysteines were alkylated. Met oxidation, C-term amidation, and Pyro-Glu are specified as variable modifications. Protein FDR filter is not applied, so each output file (PSM, ion, peptide) is filtered to 1% FDR at that level.
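Filtering a list "to 1% FDR at that level" can be illustrated with a generic target-decoy estimate: rank entries by score and keep the longest prefix in which decoys divided by targets stays at or below the threshold. This is a simplified sketch under that assumption. It is not Philosopher's implementation, and the input format (score, decoy flag) is hypothetical.

```python
def filter_to_fdr(matches, max_fdr=0.01):
    """Keep target matches within the deepest score cutoff meeting max_fdr.

    matches: iterable of (score, is_decoy) tuples, higher score is better.
    FDR is estimated as decoys / targets among entries above the cutoff.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of ranked entries retained
    for rank, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            keep = rank
    return [m for m in ranked[:keep] if not m[1]]
```

Applying the same routine separately to PSM, ion, and peptide lists mirrors the idea that each output file is filtered at its own level when no protein-level filter is used.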

Phospho TMT-6 quantification (TMT6-phospho)

TMT 6-plex workflow with quantification from MS3. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization to all levels. If a reference/bridge sample is available, specify the corresponding channel/sample name tag in the annotation file(s) and in TMT-Integrator tab.

TMT-10 quantification (TMT10)

Basic TMT 10-plex workflow, with identification and quantification from high mass accuracy MS2. Met oxidation, protein N-term Acetyl, n-term TMT, and TMT on S (“overlabeling”) are specified as variable modifications. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization at the gene level. If a reference/bridge sample is available, specify the corresponding channel/sample name tag in the annotation file(s) and in TMT-Integrator tab.

TMT-10 MS3 quantification (TMT10-MS3)

TMT 10-plex workflow, with quantification from MS3 and identification from low mass accuracy MS2. Met oxidation, protein N-term Acetyl, and n-term TMT are specified as variable modifications. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization at the gene level. If a reference/bridge sample is available, specify the corresponding channel/sample name tag in the annotation file(s) and in TMT-Integrator tab.

Phospho TMT-10 MS3 quantification (TMT10-MS3-phospho)

TMT 10-plex workflow for phosphopeptide enriched data, with quantification from MS3 and identification from low resolution MS2. PTMProphet for site localization. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization at the gene/protein/peptide/site levels. If a reference/bridge sample is available, specify the corresponding channel/sample name tag in the annotation file(s) and in TMT-Integrator tab.

TMT-10 quantification with bridge/pooled sample (TMT10-bridge)

TMT 10-plex, quantification and identification from high mass accuracy MS2. Met oxidation, protein N-term Acetyl, n-term TMT, and TMT on S (“overlabeling”) are specified as variable modifications. TMT-Integrator with Bridge channel (labeled as ‘pool’ in the annotation files), data summarization at the gene level. Results are printed with three normalization options (None; MD: median centering; GN: median centering with MAD variance scaling).
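The three normalization options can be sketched on per-channel log2 ratios as follows. This is an illustrative approximation: the function names are hypothetical, and scaling every channel to the average MAD is an assumption made here, so TMT-Integrator's exact procedure may differ.

```python
from statistics import median


def median_center(values):
    """Shift values so that their median becomes zero."""
    m = median(values)
    return [v - m for v in values]


def mad(values):
    """Median absolute deviation around the median."""
    m = median(values)
    return median([abs(v - m) for v in values])


def normalize(channels, mode="MD"):
    """channels: dict mapping channel name -> list of log2 ratios."""
    if mode == "None":
        return {name: list(vals) for name, vals in channels.items()}
    centered = {name: median_center(vals) for name, vals in channels.items()}
    if mode == "MD":
        return centered
    if mode == "GN":
        mads = {name: mad(vals) for name, vals in centered.items()}
        # Assumption: rescale every channel to the average MAD across channels.
        target = sum(mads.values()) / len(mads)
        return {
            name: [v * target / mads[name] if mads[name] else v for v in vals]
            for name, vals in centered.items()
        }
    raise ValueError(f"unknown normalization mode: {mode}")
```

Median centering removes channel-wide loading differences, while the additional MAD scaling also equalizes the spread of ratios between channels.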

Phospho TMT-10 quantification (TMT10-phospho)

TMT 10-plex workflow for phosphopeptide enriched data, with quantification from MS2. PTMProphet for site localization. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization at the gene/protein/peptide/site levels. If a reference/bridge sample is available, specify the corresponding channel/sample name tag in the annotation file(s) and in TMT-Integrator tab.

Phospho TMT-10 quantification with bridge/pooled sample (TMT10-phospho-bridge)

TMT 10-plex workflow for phosphopeptide enriched data, with quantification from MS2. PTMProphet for site localization. TMT-Integrator with Bridge channel (labeled as ‘pool’ in the annotation files), median-centering normalization, data summarization at the gene/protein/peptide/site levels.

Ubiquitin TMT-10 quantification (TMT10-ubiquitin)

TMT 10-plex workflow for ubiquitin enriched data, with quantification from MS2. Site localization based on the MSFragger search engine assignment. TMT is specified as a variable modification on both peptide n-term and K. Oxidation, N-term Acetyl, and K Ubiq are also specified as variable modifications. Up to 3 missed cleavages and 4 variable modifications total. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization at the gene/protein/peptide/site levels. If a reference/bridge sample is available, specify the corresponding channel/sample name tag in the annotation file(s) and in TMT-Integrator tab.

TMT-16 quantification (TMT16)

Basic TMT 16-plex workflow, with quantification and identification from MS2. Met oxidation, protein N-term Acetyl, n-term TMT are specified as variable modifications. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization at the gene level. If a reference/bridge sample is available, specify the corresponding channel/sample name tag in the annotation file(s) and in TMT-Integrator tab.

TMT-16 MS3 quantification (TMT16-MS3)

Basic TMT 16-plex workflow, with quantification from MS3 and identification from low mass accuracy MS2 (ion trap). Met oxidation, protein N-term Acetyl, and n-term TMT are specified as variable modifications. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization at the gene level. If a reference/bridge sample is available, specify the corresponding channel/sample name tag in the annotation file(s) and in TMT-Integrator tab.

iTRAQ search and quantification (iTRAQ4)

Closed search and basic iTRAQ 4-plex workflow, with quantification from MS2. TMT-Integrator with virtual reference approach, median-centering normalization, data summarization at the gene level.

SILAC3

Triple-SILAC quantification workflow. Closed search with MSFragger, SILAC quantification with IonQuant.

SILAC3-phospho

Triple-SILAC, phosphopeptide-enriched workflow. Closed search with MSFragger, site localization with PTMProphet, SILAC quantification with IonQuant. PTMProphet (optional) requires mzML files as input.

Basic open search (Open)

Open search workflow for PTM analysis. MSFragger localization-aware open search (LOS) algorithm, with deisotoping, mass calibration, parameter optimization, and monoisotope correction enabled. Mass range -150 to 500 Da, with Met oxidation and protein N-term Acetyl included as variable modifications. PeptideProphet with extended mass model. Crystal-C for artifact removal. PTM-Shepherd for mass shift summarization.
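In an open search, the quantity of interest is the difference between the observed precursor neutral mass and the theoretical mass of the unmodified candidate peptide, accepted when it falls inside the search window. A minimal sketch of that check for the -150 to 500 Da window is shown below; the function names are hypothetical and the code does not reproduce MSFragger's scoring or localization.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Convert an observed precursor m/z and charge to a neutral mass."""
    return (mz - PROTON_MASS) * charge


def open_search_shift(observed_mz, charge, theoretical_mass,
                      lower=-150.0, upper=500.0):
    """Return the mass shift if it lies inside the open-search window, else None."""
    delta = neutral_mass(observed_mz, charge) - theoretical_mass
    return delta if lower <= delta <= upper else None


if __name__ == "__main__":
    # A precursor roughly 80 Da heavier than the unmodified peptide.
    print(open_search_shift(501.007276, 2, 920.033669))
```

The retained mass shifts are what a downstream summarization step, such as PTM-Shepherd in this workflow, groups into recurring modification masses.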

Mass shift search (Mass-Offset-CommonPTMs)

Mass Offset (also known as Multinotch) search workflow for a fast search for most common modifications (list of mass shifts specified in MSFragger ‘Mass Offset’ field). The mass/modification list used can be found here. MSFragger localization-aware open search (LOS) algorithm, filtered to report PSMs with specified mass shifts only (with isotope errors allowed). No variable modifications are specified. Mass calibration, parameter optimization, and precursor monoisotope error correction are enabled. PeptideProphet with extended mass model. PTM-Shepherd for mass shift summarization.
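The idea of reporting only PSMs whose mass shift matches a listed offset, with isotope errors allowed, can be sketched as below. The offsets used in the example (0, oxidation, phosphorylation), the Da tolerance, and the function name are illustrative assumptions; the workflow's actual list is the one configured in the MSFragger ‘Mass Offset’ field.

```python
C13_SHIFT = 1.003355  # Da, mass difference between 13C and 12C


def match_offset(delta, offsets, isotope_errors=(0, 1, 2), tol_da=0.01):
    """Return (offset, isotope_error) best explaining delta, or None.

    delta: observed minus theoretical precursor mass, in Da.
    tol_da: hypothetical absolute tolerance used only for this sketch.
    """
    best = None
    for offset in offsets:
        for iso in isotope_errors:
            error = abs(delta - (offset + iso * C13_SHIFT))
            if error <= tol_da and (best is None or error < best[2]):
                best = (offset, iso, error)
    return None if best is None else best[:2]


if __name__ == "__main__":
    example_offsets = [0.0, 15.994915, 79.966331]  # illustrative list only
    print(match_offset(79.9665, example_offsets))
```

Restricting matches to a short list of offsets is what makes this mode faster than a full open search over a continuous mass range.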

Labile phosphopeptide search (Labile_phospho)

For CID/HCD search of phosphopeptides. Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Phosphorylation specified as both a variable modification and a mass offset to consider labile and nonlabile cases. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with accurate mass model, PTMProphet localization, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering).

Labile mode ADP-ribosylation search (Labile_ADP-ribosylation)

For CID/HCD search of ADP-ribosylated peptides. Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. ADP-ribosylation specified as both a variable modification and a mass offset to consider labile and nonlabile cases. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, PTMProphet localization, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering).

N-glycopeptide search (glyco-N-HCD)

For CID/HCD search of enriched N-glycopeptides. Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering). PTM-Shepherd used for summarization.

N-glycopeptide search, hybrid activation (glyco-N-Hybrid)

For hybrid activation (EThcD, etc) search of enriched N-glycopeptides. Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering). PTM-Shepherd used for summarization.

N-glycopeptide search with quantification (glyco-N-LFQ)

For search and label-free quantitation of enriched N-glycopeptides fragmented with CID/HCD. Method can be adapted for other fragmentation methods by comparing MSFragger settings to the glyco-N-hybrid workflow.

N-glycopeptide open search (glyco-N-open-HCD)

For CID/HCD open search of enriched N-glycopeptides. Mass range -200 to +4,000 Da, Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering). PTM-Shepherd used for summarization.

N-glycopeptide open search, hybrid activation (glyco-N-open-Hybrid)

For hybrid activation (EThcD, etc) open search of enriched N-glycopeptides. Mass range -200 to +4,000 Da, Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering). PTM-Shepherd used for summarization.

N-glycopeptide search with TMT (glyco-N-TMT)

For search and TMT quantitation of enriched N-glycopeptides fragmented with CID/HCD. Settings are provided for TMT-10 with virtual reference channel; the method can be adapted for other TMT settings by adjusting TMT-Integrator parameters. See other TMT workflows for examples. Method can be adapted for other fragmentation methods by comparing MSFragger settings to the glyco-N-hybrid workflow.

O-glycopeptide search (glyco-O-HCD)

For CID/HCD search of enriched O-glycopeptides. Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering). PTM-Shepherd used for summarization.

O-glycopeptide search, hybrid activation (glyco-O-Hybrid)

For hybrid activation (EThcD, etc) search of enriched O-glycopeptides. Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering). PTM-Shepherd used for summarization.

O-glycopeptide open search (glyco-O-open-HCD)

For CID/HCD open search of enriched O-glycopeptides. Mass range -200 to +4,000 Da, Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering). PTM-Shepherd used for summarization.

O-glycopeptide open search, hybrid activation (glyco-O-open-Hybrid)

For hybrid activation (EThcD, etc) open search of enriched O-glycopeptides. Mass range -200 to +4,000 Da, Met oxidation and protein N-term Acetyl specified as variable modifications, and C+57 as fixed modification. Deisotoping, mass calibration, and parameter optimization are enabled. Post-processing with Philosopher (PeptideProphet with extended mass model, ProteinProphet), with 1% FDR filtering at the PSM and protein levels (sequential filtering). PTM-Shepherd used for summarization.

DIA-Umpire signal extraction (DIA-Umpire)

DIA-Umpire SE module to extract pseudo-MS/MS spectra. Supports RAW, mzML, and mzXML for Thermo data. Requires mzML files for AB Sciex or timsTOF PASEF data. Use this workflow if you would like to run DIA-Umpire on DIA data with the aim of processing the DIA-Umpire-extracted mzML files at a later stage, for example to search them together with DDA data to build a combined (hybrid) DIA+DDA spectral library using the SpecLib workflow.

DIA-Umpire spectral library building (DIA-Umpire_Speclib)

DIA-Umpire based workflow for direct DIA analysis and building spectral libraries. Supports RAW, mzML, and mzXML for Thermo data. Requires mzML files for AB Sciex and timsTOF PASEF data. Pseudo-MS/MS spectra are extracted with DIA-Umpire SE module, and searched using MSFragger. 1% FDR filtering at all levels (protein, peptide) using 2D, picked FDR strategy. EasyPQP for generating a spectral library compatible with Spectronaut and DIA-NN for subsequent quantification using those tools.

MSFragger DIA narrow window SpecLib

MSFragger-DIA mode for direct identification of peptides from DIA data. Recommended options for narrow-window DIA data such as GPF DIA data for building a spectral library. Reporting 3 highest scoring hits for each MS/MS spectrum. FDR filtering to 1% at all levels (protein, peptide, PSM) using 2D, picked FDR filtering. Spectral library building with EasyPQP to generate a Spectronaut and DIA-NN compatible spectral library for subsequent quantification using those tools. Default is RT alignment using commonly observed peptides (ciRT). Alternatively (recommended for non-human data), when building a library from narrow-window DIA runs, include one (or more) wide-window single-shot DIA runs, and choose “Automatic selection of a run as Reference” in EasyPQP. All runs will then be aligned to the reference wide-window DIA run.

MSFragger DIA wide window SpecLib

MSFragger-DIA based workflow for direct identification of peptides from wide-window DIA data (or combined narrow- and wide-window DIA data). This is an alternative workflow to using DIA-Umpire as peptides are identified by MSFragger directly from raw DIA data. Reporting 3 highest scoring hits for each MS/MS spectrum, followed by PeptideProphet, ProteinProphet, and FDR filtering to 1% at all levels (protein, peptide, PSM) with 2D, picked FDR approach. Spectral library building with EasyPQP to generate a Spectronaut and DIA-NN compatible spectral library for subsequent quantification using those tools. RT alignment in EasyPQP using “Automatic selection of a run as RT Reference”.
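Because a DIA MS/MS spectrum is typically chimeric, the DIA workflows keep several candidate peptides per spectrum instead of only the best one. A minimal sketch of retaining the 3 highest scoring hits per spectrum follows; the tuple layout and function name are hypothetical and are not MSFragger's output format.

```python
import heapq
from collections import defaultdict


def top_hits_per_spectrum(hits, n=3):
    """Group hits by spectrum and keep the n best-scoring ones for each.

    hits: iterable of (spectrum_id, peptide, score) tuples.
    Returns a dict: spectrum_id -> list of (score, peptide), best first.
    """
    by_spectrum = defaultdict(list)
    for spectrum_id, peptide, score in hits:
        by_spectrum[spectrum_id].append((score, peptide))
    return {
        spectrum_id: heapq.nlargest(n, candidates)
        for spectrum_id, candidates in by_spectrum.items()
    }
```

All retained hits then go through the same validation and FDR filtering steps, so lower-ranked hits are only reported if they pass on their own.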

Spectral library building from DDA (SpecLib)

Workflow for building spectral libraries using DDA data, or using pseudo-MS/MS spectra extracted with DIA-Umpire (or using both data types if building a combined DIA+DDA library). Closed search with MSFragger, peptide/protein validation and protein inference with PeptideProphet/ProteinProphet via Philosopher. Building a consensus spectral library with EasyPQP. The library is filtered to 1% FDR at the protein and peptide levels. If using fractionated DDA data, in EasyPQP choose RT Calibration option: “ciRT” (choose iRT if using organisms other than yeast or human). If using DDA data together with DIA-Umpire extracted mzML files, choose RT Calibration option: “Automatic selection of a run as reference RT”. Supports DDA RAW/.d files, mzML, and MGF files. Generated library.tsv file is directly compatible with DIA-NN and Spectronaut for targeted extraction of quantitative information.

Next: see the FragPipe usage tutorial.