Guide to FragPipe results

Output files will depend on the workflow used and how experiments/groups are set on the ‘Workflow’ tab. This page lists the different output files generated by FragPipe. Main report files are either comma-separated (csv) or tab-separated (tsv), and their column contents are described here. Outputs that take the name of the input LC-MS file are shown with the generic ‘filename’ placeholder here.

Log files are saved automatically (with a timestamp) if an analysis finishes successfully, but they can also be exported (with the ‘Export Log’ button on the ‘Run’ tab) to help with troubleshooting if an analysis fails.

FragPipe also uses various configuration and intermediate files listed here.

Also see our guide to using FragPipe.

Main report files

psm.tsv (from Philosopher, updated by PTM-Shepherd and IonQuant)
ion.tsv (from Philosopher, overwritten by IonQuant)
peptide.tsv (from Philosopher, overwritten by IonQuant)
protein.tsv (from Philosopher, overwritten by IonQuant)
combined_ion.tsv (from Philosopher, overwritten by IonQuant)
combined_modified_peptide.tsv (from IonQuant)
combined_peptide.tsv (from Philosopher, overwritten by IonQuant)
combined_protein.tsv (from Philosopher, overwritten by IonQuant)
diann-output files (see DIA-NN documentation)

PTM-Shepherd reports

PTM-Shepherd reports on modification profiles and diagnostic ions (if enabled) are found in the ‘ptm-shepherd-output’ folder. If glycan assignment is used, PTM-Shepherd will write assigned glycan information directly to the psm.tsv table.

TMT/iTRAQ reports

Isobaric labeling reports are found in the ‘tmt-report’ folder, generated by TMT-Integrator. Two sets of isobaric labeling quantification reports are generated, one set with abundances and one with ratios. Files containing ‘ratio’ in the file name report ratio to the reference/bridge channel in each plex if specified, or a ratio to the average abundance within each plex (virtual reference approach). Files containing ‘abundance’ are generated by converting ratio tables back to the intensity (ion abundance) scale.

SILAC/dimethyl reports

Quantification results from MS1-based isotopic labeling experiments are generated by IonQuant and reported at the peptide ion, peptide, and protein levels.

Other reports

protein.fas - FASTA file containing the FDR-filtered protein sequences identified in the analysis, generated by Philosopher
MSstats.csv - protein abundance report formatted for use with MSstats (see the tutorial), generated by IonQuant
reprint.int.tsv - input file for the Resource for Evaluation of Protein Interaction Networks (REPRINT) containing protein intensities, generated by Philosopher
reprint.spc.tsv - input file for the Resource for Evaluation of Protein Interaction Networks (REPRINT) containing protein spectral counts, generated by Philosopher
library.tsv - spectral library, generated by either EasyPQP (default) or SpectraST

Log files

filter.log - FDR filtering-specific portion of the log generated by Philosopher, shows the number of PSMs, ions, peptides, and proteins passing the cutoff
log_[timestamp].txt - complete log of the FragPipe analysis

Configuration files

fragger.params - configuration file for MSFragger search parameters
fragpipe_[timestamp].config - configuration file for the entire FragPipe analysis
shepherd.config - parameter file for PTM-Shepherd
annotation.txt - annotation file containing labels for TMT/iTRAQ channels
tmt-integrator-conf.yml - configuration file for TMT-Integrator

Intermediate files

[filename].pepXML - peptide-spectrum matches from the database search with MSFragger
[filename]_c.pepXML - pepXML file after curation by Crystal-C
[filename].pin - peptide-spectrum matches from the database search with MSFragger, formatted for validation with Percolator
interact-[filename].pep.xml - peptide-spectrum matches with validation information generated by PeptideProphet via Philosopher or by Percolator
interact-[filename].mod.pep.xml - peptide-spectrum matches with validation information generated by PeptideProphet (or Percolator) and localization information generated by PTMProphet via Philosopher
combined.prot.xml - protein identifications with validation information generated by ProteinProphet via Philosopher
filelist_proteinprophet.txt - list of interact.pep.xml files to be passed to ProteinProphet, gets around Windows command length limitations for very large experiments
filelist_ionquant.txt - similar list of files to be passed to IonQuant

psm.tsv

`psm.tsv` files contain FDR-filtered search results, where each row contains a peptide-spectrum match (PSM). A separate psm.tsv file will be generated for each experiment specified on the ‘Workflow’ tab. Contents of each column are listed below.

Spectrum MS/MS spectrum identifier, follows the format (file name).(scan #).(scan #).(charge)

Spectrum File name of originating identification file

Peptide peptide amino acid sequence, any modifications not included (‘stripped’ peptide sequence)

Modified Peptide peptide sequence including modifications, modified residues are followed by brackets containing the integer mass (in Da) of the residue plus the modification; blank if peptide is unmodified

Prev AA residue preceding the identified peptide within the mapped protein sequence; - if none

Next AA residue following the identified peptide within the mapped protein sequence; - if none

Peptide Length number of residues in the peptide sequence

Charge charge state of the identified peptide

Retention MS2 scan’s precursor retention time (in seconds)

Observed Mass mass of the identified peptide (in Da)

Calibrated Observed Mass mass of the identified peptide after m/z calibration (in Da)

Observed M/Z mass-to-charge ratio of the peptide ion

Calibrated Observed M/Z mass-to-charge ratio of the peptide ion after m/z calibration

Calculated Peptide Mass theoretical peptide mass based on identified sequence and modifications

Calculated M/Z theoretical peptide mass-to-charge ratio based on identified sequence and modifications

Delta Mass difference between calibrated observed peptide mass and calculated peptide mass (in Da)

Expectation expectation value from statistical modeling with PeptideProphet, lower values indicate higher likelihood

Hyperscore similarity score between observed and theoretical spectra, higher values indicate greater similarity

Nextscore similarity score (hyperscore) of second-highest scoring match for the spectrum

PeptideProphet Probability confidence score determined by PeptideProphet, higher values indicate greater confidence

Number of Enzymatic Termini 2 = fully-enzymatic, 1 = semi-enzymatic, 0 = non-enzymatic

Number of Missed Cleavages number of potential enzymatic cleavage sites within the identified sequence

Protein Start starting position of the identified peptide within the protein sequence

Protein End ending position of the identified peptide within the protein sequence

Intensity precursor abundance (area under the curve) for each PSM if IonQuant is used; or maximum MS1 peak intensity within the retention time tolerance if Philosopher freequant is used (not recommended)

Ion Mobility TIMS transit time of the precursor ion (1/K₀)

Assigned Modifications variable modifications (listed by mass in Da) with modified residue and location within the peptide

Observed Modifications modifications from Delta Mass values as mapped to Unimod entries. Assigned glycan composition will be placed here if glycan composition assignment is performed (PTM-Shepherd).

MSFragger Localization MSFragger-determined localization for open/offset searches, if using localize_delta_mass. Lower case letter(s) indicate localized site(s). More than one lower case letter indicates ambiguous localization. If all letters are upper case, the unlocalized candidate got a higher score and no localization information is known.

Best Score with Delta Mass highest observed hyperscore when the Delta Mass is placed on the theoretical spectrum (from open/offset search)

Best Score without Delta Mass highest hyperscore observed without placing the Delta Mass on the theoretical spectrum (from open/offset search)

[modified residue]:[modification mass] localization probabilities for each residue/mass pair provided to PTMProphet, where localization probability of each site (closer to 1 = more confident) is denoted in parentheses following the site (e.g., GS(0.101)DRT(0.899)PER in the column STY:79.9663, where phosphorylation probability is higher at T5 than S2); probabilities will add up to the number of modified sites

Glycan Score (only present if glycan composition assignment was performed in PTM-Shepherd). Score assigned to the glycan composition. Higher is better.

Glycan q-value (only present if glycan composition assignment was performed in PTM-Shepherd). Q-value for the glycan composition assignment from the glycan FDR calculation. NOTE: all PSMs that pass peptide FDR are reported, even if the glycan FDR is not passed. Filter this column to q-value less than 0.01 for a 1% glycan FDR (for example).

O-Pair Score (only present if O-Pair was run). Score from O-Pair localization. Higher is better.

Number of Glycans (only present if O-Pair was run). Plausible number of total glycans assigned by O-Pair.

Total Glycan Composition (only present if O-Pair was run). Plausible total (summed) glycan composition for all glycans assigned by O-Pair. N=HexNAc, H=Hex, A=NeuAc, G=NeuGc, F=Fuc

Glycan Site Composition(s) (only present if O-Pair was run). Glycan compositions at each site assigned by O-Pair (in order from N-terminal to C-terminal).

Confidence Level (only present if O-Pair was run). O-Pair localization confidence level. 1 = all glycans localized with spectral evidence; 1b = all glycans localized, but by process of elimination (not all with evidence); 2 = some glycans localized but not all; 3 = no glycans localized

Site Probabilities (only present if O-Pair was run). O-Pair localization probabilities for each localized site. Format is “[Residue number, glycan composition, probability]”

138/144 Ratio (only present if O-Pair was run). Ratio of oxonium ions detected at m/z 138 to 144 for distinguishing GlcNAc and GalNAc glycans

Has N-Glyc Sequon (only present if O-Pair was run). Whether the N-X-S/T sequon was detected in the peptide sequence.

Paired Scan Num (only present if O-Pair was run). The scan number of the paired scan used for O-glycan localization in O-Pair. The scan number reported in the Spectrum column of the table is for the collisional activation scan (and all information in the row is for the collisional scan, except the O-glycan localization from O-Pair, which comes from the paired scan listed here).

[modified residue]:[modification mass] Best Localization highest observed localization probability (from PTMProphet) for this modification within the peptide

Purity proportion of total ion abundance in the inclusion window from the precursor (including precursor isotopic peaks), from Philosopher freequant

Is Unique whether the identified sequence maps to a single identified protein (FALSE if shared between multiple proteins identified in the experiment)

Protein protein sequence header corresponding to the identified peptide sequence; this will be the selected razor protein if the peptide maps to multiple proteins (in this case, other mapped proteins are listed in the ‘Mapped Proteins’ column)

Protein ID protein identifier (primary accession number) for the selected protein

Entry Name entry name for the selected protein

Gene gene name for the selected protein

Protein Description name of the selected protein

Mapped Genes additional genes the identified peptide may originate from

Mapped Proteins additional proteins the identified peptide maps to (including any arising from I/L substitutions)

(additional columns for TMT/iTRAQ channels if used, where each contains the relative reporter ion abundances for that PSM)
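As an illustration of the Spectrum and PTMProphet localization formats described above, here is a minimal Python sketch. The function names and example values are hypothetical and not part of FragPipe. It assumes the Spectrum value always ends in the three dot-separated numeric fields (scan, scan, charge) and that localization strings look like the GS(0.101)DRT(0.899)PER example.

```python
import re

def parse_spectrum_id(spectrum):
    """Split '(file name).(scan #).(scan #).(charge)' into its parts."""
    # Split from the right so that dots inside the file name are preserved.
    file_name, scan_start, scan_end, charge = spectrum.rsplit(".", 3)
    return file_name, int(scan_start), int(scan_end), int(charge)

# A residue letter followed by a probability in parentheses, e.g. 'S(0.101)'.
_SITE = re.compile(r"([A-Za-z])\((\d*\.?\d+)\)")

def parse_site_probabilities(annotated):
    """Return (1-based position, residue, probability) for each scored site."""
    sites = []
    position = 0
    i = 0
    while i < len(annotated):
        match = _SITE.match(annotated, i)
        if match:
            position += 1
            sites.append((position, match.group(1), float(match.group(2))))
            i = match.end()
        else:
            # Unscored residues still advance the position counter.
            if annotated[i].isalpha():
                position += 1
            i += 1
    return sites

# Hypothetical example values.
print(parse_spectrum_id("sample_01.01234.01234.2"))
print(parse_site_probabilities("GS(0.101)DRT(0.899)PER"))
```

For the example string, the parsed sites are S at position 2 and T at position 5, matching the S2/T5 reading given in the column description.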

ion.tsv

`ion.tsv` files contain FDR-filtered search results, where each row contains a peptide sequence with a certain charge and modification state. PSMs are collapsed into a single ion. A separate ion.tsv file will be generated for each experiment specified on the ‘Workflow’ tab. Contents of each column are listed below.

Peptide Sequence peptide amino acid sequence, any modifications not included (‘stripped’ peptide sequence)

Modified Sequence peptide sequence including modifications, modified residues are followed by brackets containing the integer mass (in Da) of the residue plus the modification; blank if peptide is unmodified

Prev AA residue preceding the identified peptide within the mapped protein sequence; - if none

Next AA residue following the identified peptide within the mapped protein sequence; - if none

Peptide Length number of residues in the peptide sequence

M/Z calculated (theoretical) peptide mass-to-charge ratio based on identified sequence and modifications

Charge peptide ion charge state

Observed Mass calculated mass of the identified peptide (in Da)

Probability confidence score determined by PeptideProphet, higher values indicate greater confidence

Expectation expectation value from statistical modeling with PeptideProphet, lower values indicate higher likelihood

Spectral Count number of corresponding PSMs

Intensity maximum intensity from all observed PSMs for the ion

Assigned Modifications variable modifications (listed by modification mass in Da) with modified residue and location within the peptide

Observed Modifications for peptides identified with non-zero delta masses (from open or mass offset searches), modifications mapping to a Unimod entry of the corresponding delta mass are listed here

Protein ID UniProt protein identifier (primary accession number)

Entry Name entry name for the selected protein

Gene gene name for the selected protein

Protein Description name of the selected protein

Mapped Genes additional genes the identified peptide may originate from (including any arising from I/L substitutions)

Mapped Proteins additional proteins the identified peptide maps to (including any arising from I/L substitutions)

(additional columns for TMT/iTRAQ channels if applicable, each contains relative reporter ion abundances)
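The way PSMs collapse into ions can be sketched in a few lines of Python. This is an illustration only, not the code that Philosopher or IonQuant actually runs. It assumes PSM rows have been read as dictionaries keyed by the psm.tsv column names, groups them by sequence, modification state, and charge, and takes the spectral count and the maximum intensity as described for ion.tsv. The function name and example rows are hypothetical.

```python
from collections import defaultdict

def collapse_psms_to_ions(psm_rows):
    """Group PSM rows into ions keyed by (peptide, modified peptide, charge)."""
    ions = defaultdict(lambda: {"Spectral Count": 0, "Intensity": 0.0})
    for row in psm_rows:
        # 'Modified Peptide' is blank for unmodified peptides, so the
        # stripped sequence is kept in the key as well.
        key = (row["Peptide"], row["Modified Peptide"], int(row["Charge"]))
        ion = ions[key]
        ion["Spectral Count"] += 1
        ion["Intensity"] = max(ion["Intensity"], float(row["Intensity"]))
    return dict(ions)

# Hypothetical rows: two PSMs of the same 2+ ion and one PSM of the 3+ ion.
example_psms = [
    {"Peptide": "PEPTIDEK", "Modified Peptide": "", "Charge": "2", "Intensity": "100.0"},
    {"Peptide": "PEPTIDEK", "Modified Peptide": "", "Charge": "2", "Intensity": "300.0"},
    {"Peptide": "PEPTIDEK", "Modified Peptide": "", "Charge": "3", "Intensity": "50.0"},
]
for key, values in collapse_psms_to_ions(example_psms).items():
    print(key, values)
```

When IonQuant is used it overwrites ion.tsv, so the intensities in the real file may not match what this simple grouping produces.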

peptide.tsv

`peptide.tsv` files contain FDR-filtered search results, where each row is an identified peptide sequence. Ions are collapsed into a single peptide. A separate peptide.tsv file will be generated for each experiment specified on the ‘Workflow’ tab. Contents of each column are listed below.

Peptide peptide amino acid sequence, no modifications included (‘stripped’ peptide sequence)

Prev AA residue preceding the identified peptide within the mapped protein sequence; - if none

Next AA residue following the identified peptide within the mapped protein sequence; - if none

Peptide Length number of residues in the peptide sequence

Charges peptide ion charge state(s)

Probability confidence score determined by PeptideProphet, higher values indicate greater confidence

Spectral Count number of corresponding PSMs

Intensity summed intensity of the top 3 most abundant ions for the peptide

Assigned Modifications variable modifications (listed by mass in Da) with modified residue and location within the peptide

Protein ID protein identifier (primary accession number) for the selected protein

Entry Name entry name for the selected protein

Gene gene name for the selected protein

Protein Description name of the selected protein

Mapped Genes additional genes the identified peptide may originate from (including any arising from I/L substitutions)

Mapped Proteins additional proteins the identified peptide maps to (including any arising from I/L substitutions)

(additional columns for TMT/iTRAQ channels if applicable, each contains relative reporter ion abundances)
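The peptide-level Intensity described above (sum of the top 3 most abundant ions) can be illustrated with a short Python sketch. The function name and the example values are hypothetical, and the sketch only mirrors the column description rather than the tool's actual implementation.

```python
def peptide_intensity(ion_intensities, top_n=3):
    """Sum the top_n largest ion intensities belonging to one peptide."""
    return sum(sorted(ion_intensities, reverse=True)[:top_n])

# Hypothetical peptide observed as four ions (charge/modification states).
print(peptide_intensity([10.0, 40.0, 5.0, 20.0]))  # 40 + 20 + 10
```

Peptides with fewer than three ions simply sum whatever ions are available.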

protein.tsv

`protein.tsv` files contain FDR-filtered protein results, where each row is an identified protein group. A separate protein.tsv file will be generated for each experiment specified on the ‘Workflow’ tab. Contents of each column are listed below.

Group protein group number

SubGroup protein subgroup identifier

Protein protein sequence header

Protein ID UniProt protein identifier (primary accession number)

Entry Name protein entry name

Gene gene name

Length number of residues in protein sequence

Percent Coverage percent of protein sequence observed from the identified peptides

Organism species of identified protein

Protein Description protein name

Protein Existence type of evidence that supports the existence of the protein

Protein Probability confidence score determined by ProteinProphet

Top Peptide Probability best peptide probability of supporting peptides

Total Peptides number of peptides (stripped sequences) that can be mapped to the protein. These can be

peptides that map only to this protein
peptides that map to multiple proteins and are assigned to this protein by the protein inference algorithm
peptides that map to multiple proteins and are assigned to another protein by the protein inference algorithm

Unique Peptides number of peptides (stripped sequences) that only mapped to the protein

Razor Peptides number of peptides (stripped sequences) in support of the protein identification. These can be

peptides that map only to this protein
peptides that map to multiple proteins and are assigned to this protein by the protein inference algorithm

Total Spectral Count number of PSMs corresponding to the total peptides

Unique Spectral Count number of PSMs corresponding to the unique peptides

Razor Spectral Count number of PSMs corresponding to the razor peptides

Total Intensity protein intensity calculated using the total peptides (from the top-N algorithm)

Unique Intensity protein intensity calculated using the unique peptides (from the top-N algorithm)

Razor Intensity protein intensity calculated using the razor peptides (from the top-N algorithm)

Razor Assigned Modifications modifications from the razor peptides

Razor Observed Modifications Delta Mass values from the razor peptides

Indistinguishable Proteins proteins that are equally supported by the evidence and cannot be distinguished from the identification in the ‘Protein’ column

(additional columns for TMT/iTRAQ channels if applicable, each contains relative reporter ion abundances)
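The difference between total, unique, and razor peptides can be made concrete with a small Python sketch. It assumes two hypothetical inputs that are not FragPipe files: a mapping from each stripped peptide to the set of proteins it maps to, and the protein chosen for each peptide by protein inference. The protein and peptide names are made up.

```python
def count_peptides(protein, peptide_to_proteins, razor_assignment):
    """Return (total, unique, razor) peptide counts for one protein.

    peptide_to_proteins: stripped sequence -> set of proteins it maps to
    razor_assignment:    stripped sequence -> protein chosen by inference
    """
    total = unique = razor = 0
    for peptide, proteins in peptide_to_proteins.items():
        if protein not in proteins:
            continue
        total += 1  # any peptide that can be mapped to the protein
        if len(proteins) == 1:
            unique += 1  # maps only to this protein
        if razor_assignment[peptide] == protein:
            razor += 1  # unique, or shared but assigned to this protein
    return total, unique, razor

# Hypothetical example with two proteins sharing two peptides.
mapping = {
    "AAAK": {"P1"},
    "CCCK": {"P1", "P2"},
    "DDDK": {"P1", "P2"},
    "EEEK": {"P2"},
}
razor = {"AAAK": "P1", "CCCK": "P1", "DDDK": "P2", "EEEK": "P2"}
print(count_peptides("P1", mapping, razor))
```

In this example P1 has three total peptides, one unique peptide, and two razor peptides, because one of the shared peptides was assigned to P2.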

combined_ion.tsv

`combined_ion.tsv` files contain FDR-filtered ions from all experimental groups, where each row contains a peptide sequence with a certain charge and modification state. Individual PSMs are collapsed. Contents of each column are listed below.

Peptide Sequence peptide amino acid sequence, any modifications not included (‘stripped’ peptide sequence)

Modified Sequence peptide amino acid sequence including modifications

Prev AA residue preceding the identified peptide within the mapped protein sequence; - if none

Next AA residue following the identified peptide within the mapped protein sequence; - if none

Start starting position of the peptide within the mapped protein sequence

End ending position of the peptide within the mapped protein sequence

Peptide Length number of residues in the peptide sequence

M/Z theoretical peptide ion mass-to-charge ratio based on identified sequence, charge, and modifications

Charge charge state of the identified peptide ion

Assigned Modifications variable modifications of the peptide found in the experiment, each listed by mass in Da with modified residue and location within the peptide sequence

Protein ID protein identifier (primary accession number) for the selected protein

Entry Name protein entry name

Gene gene name of the protein corresponding to the identified peptide sequence; this will be from the selected razor protein if the peptide maps to multiple proteins

Protein Description name of the selected protein

[experiment] Spectral Count count of peptide-spectrum matches (PSMs) in the sample that support the peptide identification

[experiment] Intensity normalized ion intensities

combined_modified_peptide.tsv

`combined_modified_peptide.tsv` files contain FDR-filtered peptides from all experimental groups, where each row is a peptide sequence including modifications. Individual ions are collapsed. Contents of each column are listed below.

Peptide Sequence peptide amino acid sequence, modifications not included (‘stripped’ peptide sequence)

Modified Sequence peptide amino acid sequence plus modifications

Prev AA amino acid residue preceding the peptide sequence

Next AA amino acid residue following the peptide sequence

Start starting position of the peptide within the mapped protein sequence

End ending position of the peptide within the mapped protein sequence

Peptide Length number of amino acid residues in the identified peptide

Charges all observed charge states for the modified peptide in the experiment

Assigned Modifications all variable modifications of the peptide found in the experiment, each listed by mass in Da with modified residue and location within the peptide sequence

Protein ID protein identifier (primary accession number) for the selected protein

Entry Name protein entry name corresponding to the identified peptide sequence

Gene gene name of the protein corresponding to the identified peptide sequence; this will be from the selected razor protein if the peptide maps to multiple proteins

Protein Description name of the selected protein

Mapped Genes additional genes the identified peptide may originate from (including any arising from I/L substitutions)

Mapped Proteins additional proteins the identified peptide maps to (including any arising from I/L substitutions)

[experiment] Spectral Count count of peptide-spectrum matches (PSMs) in the sample that support the peptide identification

[experiment] Intensity normalized peptide intensities

[experiment] MaxLFQ Intensity normalized peptide intensities calculated with the MaxLFQ method (this column is only present if ‘MaxLFQ’ is selected)

combined_peptide.tsv

`combined_peptide.tsv` files contain FDR-filtered peptides from all experimental groups, where each row is a (stripped) peptide sequence. Modified versions of peptides are collapsed. Contents of each column are listed below.

Peptide Sequence peptide amino acid sequence, any modifications not included (‘stripped’ peptide sequence)

Prev AA amino acid residue preceding the peptide sequence

Next AA amino acid residue following the peptide sequence

Start starting position of the peptide within the mapped protein sequence

End ending position of the peptide within the mapped protein sequence

Peptide Length number of amino acid residues in the identified peptide

Charges all observed charge states for the peptide in the experiment

Protein ID protein identifier (primary accession number) for the selected protein

Entry Name protein entry name corresponding to the identified peptide sequence

Gene gene name of the protein corresponding to the identified peptide sequence; this will be from the selected razor protein if the peptide maps to multiple proteins

Protein Description name of the selected protein

Mapped Genes additional genes the identified peptide may originate from (including any arising from I/L substitutions)

Mapped Proteins additional proteins the identified peptide maps to (including any arising from I/L substitutions)

[experiment] Spectral Count count of peptide-spectrum matches (PSMs) in the sample that support the peptide identification

[experiment] Intensity normalized peptide sequence intensities

[experiment] MaxLFQ Intensity normalized peptide sequence intensities calculated with the MaxLFQ method (this column is only present if ‘MaxLFQ’ is selected)

combined_protein.tsv

`combined_protein.tsv` files contain FDR-filtered proteins from all experimental groups, where each row is a protein group. Contents of each column are listed below.

Protein protein sequence header corresponding to the identified peptide sequence inferred from combined evidence; this will be the selected razor protein if the peptide maps to multiple proteins

Protein ID protein identifier (primary accession number) for the selected protein

Entry Name entry name for the selected protein

Gene gene name for the selected protein

Protein Length number of amino acid residues in the selected protein

Coverage percent of total protein length represented by the identified peptides

Organism species corresponding to the protein identification

Protein Existence type of evidence for the existence of the protein

Description name of the selected protein

Protein Probability confidence score determined by ProteinProphet from combined evidence, higher values indicate greater confidence

Top Peptide Probability highest PeptideProphet confidence score from all peptides that map to the protein

Combined Total Peptides number of peptides (stripped sequences) that can be mapped to the protein. These can be

peptides that map only to this protein
peptides that map to multiple proteins and are assigned to this protein by the protein inference algorithm
peptides that map to multiple proteins and are assigned to another protein by the protein inference algorithm

Combined Spectral Count number of PSMs corresponding to the razor peptides (see the Razor Peptides description under protein.tsv)

Combined Unique Spectral Count number of PSMs corresponding to the unique peptides (see the Unique Peptides description under protein.tsv)

Combined Total Spectral Count number of PSMs corresponding to the total peptides (see the Combined Total Peptides description above)

[experiment] Spectral Count number of PSMs in the sample corresponding to the razor peptides

[experiment] Unique Spectral Count number of PSMs in the sample corresponding to the unique peptides

[experiment] Total Spectral Count number of PSMs in the sample corresponding to the total peptides

[experiment] Intensity normalized (by default) protein intensity using the razor peptides (from the top-N algorithm)

[experiment] Unique Intensity normalized (by default) protein intensity using the unique peptides (from the top-N algorithm)

[experiment] Total Intensity normalized (by default) protein intensity using the total peptides (from the top-N algorithm)

[experiment] MaxLFQ Intensity normalized (by default) protein intensity using the razor peptides (from the MaxLFQ method)

[experiment] MaxLFQ Unique Intensity normalized (by default) protein intensity using the unique peptides (from the MaxLFQ method)

[experiment] MaxLFQ Total Intensity normalized (by default) protein intensity using the total peptides (from the MaxLFQ method)

Indistinguishable Proteins proteins that cannot be distinguished from the selected protein given all sequences/evidence identified in the experiment
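Because the per-experiment columns share the experiment name as a prefix, a script that reads combined_protein.tsv usually needs to split each header into an experiment and a measure. The Python sketch below shows one way to do that. The function name and the example experiment name are hypothetical, and it assumes experiment names do not begin with "Combined" or end in words such as "Unique", "Total", or "MaxLFQ", which would make the split ambiguous.

```python
# Per-experiment suffixes listed above, longest first so that
# 'MaxLFQ Unique Intensity' is not mistaken for 'Unique Intensity' or 'Intensity'.
SUFFIXES = sorted(
    [
        "Spectral Count", "Unique Spectral Count", "Total Spectral Count",
        "Intensity", "Unique Intensity", "Total Intensity",
        "MaxLFQ Intensity", "MaxLFQ Unique Intensity", "MaxLFQ Total Intensity",
    ],
    key=len,
    reverse=True,
)

def split_experiment_column(column):
    """Return (experiment, measure), or None for non-experiment columns."""
    if column.startswith("Combined "):
        return None  # summary columns across all experiments
    for suffix in SUFFIXES:
        if column.endswith(" " + suffix):
            return column[: -len(suffix) - 1], suffix
    return None

# Hypothetical header names.
for name in ["Protein ID", "Combined Spectral Count",
             "control_1 Intensity", "control_1 MaxLFQ Unique Intensity"]:
    print(name, "->", split_experiment_column(name))
```

Checking the longest suffix first is the key design choice here, since several measure names are suffixes of one another.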

global.profile.tsv

`global.profile.tsv` reports the most prominent features from PTM-Shepherd analysis of mass shifts observed from FDR-filtered open search results. Each row corresponds to a different detected mass shift, thus not all PSMs will be represented in this table. Please note that mass shifts are annotated based on Unimod mapping, thus they are not definitive chemical identities and should be used as a starting point along with localization and amino acid enrichment information. Unless otherwise indicated, values are summed from all datasets in the analysis. Column contents are listed below.

peak_apex apex of the detected delta mass peak (in Da)

peak_lower lower bound of the detected peak (Da), determined by precursor tolerance or the detection of an adjacent peak

peak_upper upper bound of the detected peak (Da), determined by precursor tolerance or the detection of an adjacent peak

PSMs the number of PSMs contained within the peak boundary (bin), reported for each dataset if multiple datasets are used as input

peak_signal relative measure of peak prominence/quality. In noisy regions of the delta mass histogram, values are penalized

percent_also_in_unmodified the percentage of PSMs in this mass bin with a corresponding PSM in the unmodified bin

mapped_mass_1 primary modification annotation derived from Unimod, all isobaric modifications listed and separated by “/”

mapped_mass_2 if the delta mass peak is a combination of two masses, a second modification annotation is listed here. As with mapped_mass_1, all isobaric modifications are listed and separated by “/”

similarity MS/MS spectral similarity of modified peptides compared to their unmodified counterparts. When multiple modified-unmodified comparisons are done for a single peptide, these cosine similarity scores are averaged for the peptide. The peptide scores are then averaged across all peptides in the mass shift bin. These comparisons are only done for peptides of the same charge state.

rt_shift retention time shift comparing modified peptides to their unmodified counterparts. When multiple modified-unmodified comparisons are done for a single peptide, the retention time shifts are averaged for the peptide. The peptide shifts are then averaged across all peptides in the mass shift bin. Individual comparisons are only done for peptides in the same LC-MS run. Units are usually seconds but can vary by instrument type

int_log2fc log2 fold-change of average intensity for matched shifted/unshifted peptides, computed as described above. Peptides affected by sample preparation artifacts tend to be lower abundance than their unshifted counterparts, thus this value will be low in these cases

localized_PSMs number of PSMs for this delta mass that showed at least one additional matched ion when the mass shift is placed on a residue

n-term_localization_rate percentage of PSMs with an uninterrupted string of localized residues from the N-terminus. This is calculated differently from other enrichment scores due to the difference in assumptions underlying N-terminal and residue-specific localization, so these values cannot be directly compared to the amino acid enrichment scores.

AA1 amino acid/residue most enriched (most likely to harbor the mass shift) compared to other residues

AA1_enrichment_score equivalent to the odds the delta mass is localized to AA1 compared to other residues

AA1_psm_count weighted number of PSMs where the mass shift localized to AA1. Shifts localizing to multiple residues are divided by the number of localized residues in the spectra, so this is an estimated number of PSMs localized to a particular residue

(same enrichment_score and psm_count columns for AA2 and AA3 if multiple amino acids are likely to harbor the mass shift)

[experiment]_PSMs number of PSMs with a mass shift in this bin

[experiment]_percent_PSMs number of PSMs from the previous column as a percentage of total PSMs

[experiment]_peptides number of unique peptide sequences with a mass shift in this bin

[experiment]_percent_also_in_unmodified percentage of peptide sequences with a mass shift in this bin that are also found in the zero mass shift bin
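The weighting behind the AA psm_count columns (a PSM whose shift localizes to several residues is split evenly among them) can be sketched as follows. This is only an illustration of the description above, not PTM-Shepherd's code. The function name and the input format, one list of localized residues per PSM, are assumptions made for the example.

```python
from collections import Counter

def weighted_residue_counts(localized_residues_per_psm):
    """Each PSM contributes 1/n to each of its n localized residues."""
    counts = Counter()
    for residues in localized_residues_per_psm:
        if not residues:
            continue  # unlocalized PSMs contribute nothing
        weight = 1.0 / len(residues)
        for residue in residues:
            counts[residue] += weight
    return dict(counts)

# Hypothetical localizations for four PSMs in one mass shift bin.
psms = [["S"], ["S", "T"], ["M"], ["S", "T", "Y", "K"]]
print(weighted_residue_counts(psms))
```

Here S receives 1 + 0.5 + 0.25 = 1.75 PSMs, which is why the reported counts are estimates and need not be whole numbers.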

global.diagmine.tsv

`global.diagmine.tsv` is a mass shift-centric table that contains the diagnostic features identified for every mass shift. Please note that only mass shifts with diagnostic features detected are reported in the table. Contents of each column are listed below.

peak_apex This field contains the apex of the detected MS1 peak (Da) present in the global.profile.tsv file from PTM-Shepherd.

mod_annotation This field contains the mass shift annotations present in the global.profile.tsv file from PTM-Shepherd. When a mass shift is found to be the combination of two mass shifts, the “Potential Modification 1” and “Potential Modification 2” columns are merged with a semicolon.

type This field can take one of several values. “diagnostic” refers to diagnostic ions, the ions that can be located directly in the spectrum. “peptide” refers to peptide remainder masses, mass shifts that indicate an ion’s presence at a particular distance from an unshifted peptide. Six other values are possible based on parameter setting, each corresponding to one of the major ion series.

mass This field contains the mass of the diagnostic feature. Peptide and fragment remainder masses will have the mass shift away from the theoretical ion. Diagnostic ions will have the m/z of the observed ion, so a non-neutral mass.

delta_mod_mass This field contains the mass that was lost from the original mass shift to arrive at the remainder mass. (Note: only present for peptide and fragment remainder masses.)

remainder_propensity This field contains the average percentage of ions from a particular series that are shifted. For example, a peptide capable of producing 10 b-ions, with 2 identified ions shifted by the remainder mass and 2 identified ions unshifted, would have a propensity of 50%. The propensity score for every representative PSM within a mass shift bin is averaged. (Note: only present for fragment remainder masses.)

percent_mod This field contains the percentage of representative mass shifted PSMs that contain the ion at any intensity.

percent_unmod This field contains the percentage of representative unshifted PSMs that contain the ion at any intensity.

avg_intensity_mod This field contains the average intensity of the ion among representative mass shifted PSMs where the ion is present. To calculate the average across all representative mass shifted spectra, calculate (avg_intensity_mod * percent_mod / 100). Because multiple ions can be matched for fragment remainder ions, this contains the average of the summed intensity of matched ions for each representative PSM.

avg_intensity_unmod This field contains the average intensity of the ion among representative unshifted PSMs where the ion is present. To calculate the average across all representative unshifted spectra, calculate (avg_intensity_unmod * percent_unmod / 100). Because multiple ions can be matched for fragment remainder ions, this contains the average of the summed intensity of matched ions for each representative PSM.

intensity_fold_change This field contains the fold change in intensity when comparing the modified to unmodified peptides. This uses intensity across all spectra and can be calculated via (avg_intensity_mod * percent_mod) / (avg_intensity_unmod * percent_unmod).

avg_charge This field contains the average charge of peptides from the mass shift. This enables researchers to use diagnostic ion information intelligently in designing targeted MS routines or rescoring.

auc This column contains the AUC-ROC statistic for the intensity-based classification of this ion. It is calculated from the U statistic from the Mann-Whitney U Test. This statistic adjusts the two groups such that they are assumed to be of equal size.
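Several of the columns above are related by simple arithmetic. The sketch below restates those relationships using the formulas given in the column descriptions, plus the standard identity AUC = U / (n1 * n2) for the Mann-Whitney U statistic. All function names are hypothetical, and returning None for an undefined fold change is a choice made for this sketch, not documented PTM-Shepherd behavior.

```python
def remainder_propensity(n_shifted, n_unshifted):
    """Percentage of identified ions in a series that carry the remainder mass."""
    return 100.0 * n_shifted / (n_shifted + n_unshifted)


def intensity_across_all(avg_intensity, percent):
    """Average intensity over all representative PSMs, not only those with the ion."""
    return avg_intensity * percent / 100.0


def intensity_fold_change(avg_mod, pct_mod, avg_unmod, pct_unmod):
    """(avg_intensity_mod * percent_mod) / (avg_intensity_unmod * percent_unmod)."""
    denominator = avg_unmod * pct_unmod
    if denominator == 0:
        return None  # sketch-only choice for an undefined ratio
    return (avg_mod * pct_mod) / denominator


def auc_from_u(u_statistic, n_mod, n_unmod):
    """AUC-ROC from the Mann-Whitney U statistic and the two group sizes."""
    return u_statistic / (n_mod * n_unmod)


print(remainder_propensity(2, 2))
print(intensity_across_all(40.0, 50.0))
print(intensity_fold_change(40.0, 50.0, 10.0, 25.0))
print(auc_from_u(15, 4, 5))
```

Dividing U by the product of the group sizes is what makes the AUC independent of how many shifted and unshifted PSMs were sampled.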

global.modsummary.tsv

`global.modsummary.tsv` is a modification-centric table generated from PTM-Shepherd summarization of mass shifts observed in open search workflows. Please note that mass shifts are annotated based on Unimod mapping, so they are not definitive chemical identities and should be used as a starting point along with localization and amino acid enrichment information. Contents of each column are listed below.

Modification Name/annotation of the modification (as found in the global.profile.tsv file)

Theoretical Mass Shift The theoretical mass (in Da) of the modification from Unimod if annotated, or the peak apex of an unannotated modification

[experiment]_PSMs Number of PSMs with the modification, including any row from the global.profile.tsv file where the modification appears (e.g., a ‘Methylation’ entry will include PSMs corresponding to both ‘Methylation’ and ‘Methylation + First isotopic peak’)

[experiment]_percent_PSMs The number of PSMs from the previous column as a percentage of the total PSMs
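As a rough illustration of how the per-modification PSM counts relate to rows of global.profile.tsv, the sketch below sums PSMs over every row in which a modification appears. The dictionary keys `mod1`, `mod2`, and `psms` are hypothetical stand-ins for the two potential modification columns and an experiment's PSM column, and the numbers are made up.

```python
from collections import Counter


def summarize_modifications(profile_rows):
    """Sum PSMs per modification over every profile row that names it."""
    totals = Counter()
    for row in profile_rows:
        # Assumption: a row naming the same modification twice counts once.
        mods = {row.get("mod1"), row.get("mod2")} - {None, ""}
        for mod in mods:
            totals[mod] += row["psms"]
    return dict(totals)


# Hypothetical profile rows.
rows = [
    {"mod1": "Methylation", "mod2": "", "psms": 120},
    {"mod1": "Methylation", "mod2": "First isotopic peak", "psms": 30},
    {"mod1": "Oxidation", "mod2": "", "psms": 200},
]
summary = summarize_modifications(rows)
print(summary)
```

Since a combined row counts toward both of its modifications, the per-modification totals can add up to more than the total number of PSMs.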

gene

`[abundance/ratio]_gene_[normalization].tsv` contains isobaric quantification information summarized from the psm.tsv tables by TMT-Integrator to the gene level. If ‘Group by’ is set to ‘Gene level’ (default for non-modification centric quantification workflows) in the ‘Quant (Isobaric)’ tab of FragPipe, only gene-level reports will be generated. Set ‘Group by’ to ‘All’ to also generate protein and peptide-level reports. (Ratios are channel abundance / reference channel abundance, so [channel] - ReferenceIntensity in the tables since values are log2-transformed.)

Index gene name (works best if the analyses were run with properly-formatted FASTA sequence databases, see this page for more information)

NumberPSM total peptide-spectrum matches mapping to the gene that are used in quantification

ProteinID protein identifier mapping to the gene

MaxPepProb highest PeptideProphet probability of the PSMs mapping to the gene

ReferenceIntensity real reference channel abundance if one has been provided; otherwise these values are virtual reference abundances computed from the average abundance across the channels in a plex (more usage information here). If the experiment contains multiple plexes, the average reference intensity across all plexes is used. Values are log2 scaled, with the global minimum reference intensity used to impute missing values.

[sample/channel name] normalized and log2 transformed abundance/ratio for the given reporter ion channel from summarization to the gene level
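Because the values are log2-transformed, a ratio to the reference is a subtraction. The sketch below reads a small, made-up gene-level abundance table and computes [channel] - ReferenceIntensity for each channel. The sample names, the values, and the treatment of `NA` as a missing-value marker are assumptions for illustration; values recomputed this way are not guaranteed to match the ratio reports exactly, since TMT-Integrator also applies normalization.

```python
import csv
import io

# Non-channel columns of the gene-level table described above.
META_COLUMNS = {"Index", "NumberPSM", "ProteinID", "MaxPepProb", "ReferenceIntensity"}


def log2_ratios_to_reference(tsv_text):
    """Return {Index: {channel: log2 abundance - log2 reference}}."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    result = {}
    for row in reader:
        reference = float(row["ReferenceIntensity"])
        ratios = {}
        for name, value in row.items():
            if name in META_COLUMNS or value in ("", "NA"):
                continue  # skip annotation columns and missing values
            ratios[name] = float(value) - reference
        result[row["Index"]] = ratios
    return result


# Hypothetical table content.
example = (
    "Index\tNumberPSM\tProteinID\tMaxPepProb\tReferenceIntensity\tsample_A\tsample_B\n"
    "GENE1\t12\tP00001\t0.999\t20.0\t21.0\t19.5\n"
)
ratios = log2_ratios_to_reference(example)
print(ratios)
```

A log2 ratio of 1.0 corresponds to a two-fold higher abundance than the reference, and -0.5 to roughly 0.71-fold.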

protein

`[abundance/ratio]_protein_[normalization].tsv` contains isobaric quantification information summarized from the psm.tsv tables by TMT-Integrator to the protein level. If ‘Group by’ is set to ‘Protein’ in the ‘Quant (Isobaric)’ tab of FragPipe, only protein-level reports will be generated. Set ‘Group by’ to ‘All’ to also generate gene and peptide-level reports. (Ratios are channel abundance / reference channel abundance, so [channel] - ReferenceIntensity in the tables since values are log2-transformed.)

Index protein name (FASTA sequence header)

NumberPSM total peptide-spectrum matches mapping to the protein that are used in quantification

Gene originating gene name for the protein

MaxPepProb highest PeptideProphet probability of the PSMs mapping to the protein that are used in quantification

[sample/channel name] normalized and log2 transformed abundance/ratio for the given reporter ion channel from summarization to the protein level

peptide

`[abundance/ratio]_peptide_[normalization].tsv` contains isobaric quantification information summarized from the psm.tsv tables by TMT-Integrator to the peptide level. If ‘Group by’ is set to ‘Peptide sequence’ in the ‘Quant (Isobaric)’ tab of FragPipe, only peptide-level reports will be generated. Set ‘Group by’ to ‘All’ to also generate gene and protein-level reports. (Ratios are channel abundance / reference channel abundance, so [channel] - ReferenceIntensity in the tables since values are log2-transformed.)

Index protein name (FASTA sequence header) with the start and end positions of the peptide within the protein sequence

Gene originating gene name for the peptide

ProteinID protein identifier

Peptide stripped peptide sequence

MaxPepProb highest PeptideProphet probability of the PSMs with the sequence that are used in quantification

[sample/channel name] normalized and log2 transformed abundance/ratio for the given reporter ion channel from summarization to the stripped peptide sequence level

multi-site

`[abundance/ratio]_multi-site_[normalization].tsv` contains isobaric quantification information summarized from the psm.tsv tables by TMT-Integrator based on modification sites that have been observed and quantified together. If ‘Group by’ is set to ‘Multiple PTM sites’ in the ‘Quant (Isobaric)’ tab of FragPipe, only multi-site reports will be generated. Set ‘Group by’ to ‘All’ to generate reports at all levels. (Ratios are channel abundance / reference channel abundance, so [channel] - ReferenceIntensity in the tables since values are log2-transformed.)

Index protein identifier with the start and end positions of the potential modification sites within the protein sequence, the count of the possible sites in that sequence window, the count of localized modifications, and the list of modified sites

Gene originating gene name

ProteinID protein identifier

Peptide stripped peptide sequence containing the modification sites, residues with localized modifications are shown in lower case

MaxPepProb highest PeptideProphet probability of the PSMs with the sequence that are used in quantification

[sample/channel name] normalized and log2 transformed abundance/ratio for the given reporter ion channel from summarization to the multiple-mod site level

single-site

`[abundance/ratio]_single-site_[normalization].tsv` contains isobaric quantification information summarized from the psm.tsv tables by TMT-Integrator to the level of single post-translationally modified sites. If ‘Group by’ is set to ‘Single PTM site’ in the ‘Quant (Isobaric)’ tab of FragPipe, only single-site reports will be generated. Set ‘Group by’ to ‘All’ to generate reports at all levels. (Ratios are channel abundance / reference channel abundance, so [channel] - ReferenceIntensity in the tables since values are log2-transformed.)

Index protein name (FASTA sequence header) with the modified site location within the protein sequence

Gene originating gene name

Peptide stripped peptide sequence containing the modification sites, residues with localized modifications are shown in lower case

SequenceWindow peptide sequence around the localized modified site.

MaxPepProb highest PeptideProphet probability of the PSMs with the sequence that are used in quantification

[sample/channel name] normalized and log2 transformed abundance/ratio for the given reporter ion channel from summarization to the single modification site level (sites are quantified from PSMs with only the site of interest if available, otherwise the median of all localized sites is used if the site is only found with additional sites)
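In the site-level reports, the Peptide column marks localized modifications with lower-case residues. A minimal sketch for pulling those positions out of a sequence follows; the function name and the example peptide are hypothetical, and the positions are relative to the peptide, not the protein.

```python
def localized_sites(peptide):
    """Return (1-based position, residue) for each lower-case residue."""
    return [
        (position, residue.upper())
        for position, residue in enumerate(peptide, start=1)
        if residue.islower()
    ]


# Hypothetical peptide with two localized sites.
sites = localized_sites("AAsPEPtIDEK")
print(sites)
```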

ion_label_quant.tsv

`ion_label_quant.tsv` contains MS1-based isotopic quantification results from IonQuant at the ion (peptide + modification state + charge) level. See the SILAC tutorial for more information. If only Light and Heavy labels are used, columns with ‘Medium’/‘M’ will be missing.

Peptide Sequence stripped peptide sequence of the ion

Modified Peptide peptide sequence plus variable modifications denoted in brackets following modified residues

Peptide Length number of amino acid residues in the peptide ion

Charge precursor charge state of the ion

Label Count number of potentially labeled sites within the peptide

[Light/Medium/Heavy] Modified Peptide peptide sequence showing variable modifications plus the positions of labels; blank if not found with the corresponding label

[Light/Medium/Heavy] Intensity maximum observed abundance of the precursor ion with the corresponding labels

Log2 Ratio ML median-centered log2 ratio of Medium to Light intensities, from the maximum observed precursor abundance for each labeled state

Log2 Ratio HL median-centered log2 ratio of Heavy to Light intensities, from the maximum observed precursor abundance for each labeled state

Log2 Ratio HM median-centered log2 ratio of Heavy to Medium intensities, from the maximum observed precursor abundance for each labeled state

Pearson Correlation LM measure of similarity in chromatographic profiles between Light- and Medium-labeled ions

Pearson Correlation LH measure of similarity in chromatographic profiles between Light- and Heavy-labeled ions

Pearson Correlation MH measure of similarity in chromatographic profiles between Medium- and Heavy-labeled ions

[Light/Medium/Heavy] Traced Scans number of MS1 scans quantified for the corresponding label

[Light/Medium/Heavy] Isotopes number of isotopic peaks found for the precursor ion

[Light/Medium/Heavy] Apex Retention Time retention time of the precursor’s apex intensity (usually in seconds but units may vary by instrument type)

[Light/Medium/Heavy] Log10 KL log10 Kullback-Leibler divergence between the observed and theoretical isotope intensity distributions for the ion

[Light/Medium/Heavy] PeptideProphet Probability maximum PeptideProphet probability from all supporting PSMs for the corresponding label

Protein protein sequence header corresponding to the identified peptide ion; this will be the selected razor protein if the peptide maps to multiple proteins (in this case, other mapped proteins are listed in the ‘Mapped Proteins’ column)

Protein ID protein identifier (primary accession number) for the mapped protein

Entry Name entry name for the mapped protein

Gene gene name for the mapped protein

Protein Description name of the mapped protein

Mapped Genes additional genes the identified peptide may originate from (including any arising from I/L substitutions)

Mapped Proteins additional proteins the identified peptide maps to (including any arising from I/L substitutions)
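The median-centered log2 ratio columns can be approximated from the intensity columns. The sketch below takes (numerator, denominator) intensity pairs, such as Heavy and Light intensities for each ion, computes log2 ratios, and subtracts their median. It is an illustration under simplifying assumptions: which ions IonQuant includes when computing the median is not described here, and the function name and data are hypothetical.

```python
import math
from statistics import median


def median_centered_log2_ratios(pairs):
    """pairs: list of (numerator_intensity, denominator_intensity)."""
    raw = [
        math.log2(num / den) if num > 0 and den > 0 else None
        for num, den in pairs
    ]
    observed = [value for value in raw if value is not None]
    if not observed:
        return raw  # nothing to center
    center = median(observed)
    return [None if value is None else value - center for value in raw]


# Hypothetical (Heavy, Light) intensities; the last ion has no Heavy signal.
pairs = [(400.0, 100.0), (200.0, 100.0), (100.0, 100.0), (0.0, 50.0)]
centered = median_centered_log2_ratios(pairs)
print(centered)
```

After centering, the median ratio across ions is 0, which corrects for an overall imbalance in the amounts of labeled material mixed.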

peptide_label_quant.tsv

`peptide_label_quant.tsv` contains MS1-based isotopic quantification results from IonQuant summarized to the peptide (stripped sequence) level. See the SILAC tutorial for more information. If only Light and Heavy labels are used, columns with ‘Medium’/‘M’ will be missing.

Peptide Sequence stripped peptide sequence

Modified Peptide peptide sequence plus variable modifications denoted in brackets following modified residues

Peptide Length number of amino acid residues in the sequence

Charges observed precursor charge states for the peptide

Label Count number of potentially labeled sites within the peptide

[Light/Medium/Heavy] Modified Peptide peptide sequence showing variable modifications plus the positions of labels; blank if not found with the corresponding label

Log2 Ratio ML median-centered log2 ratio of Medium to Light intensities, from the maximum observed precursor abundance for each labeled state

Log2 Ratio HL median-centered log2 ratio of Heavy to Light intensities, from the maximum observed precursor abundance for each labeled state

Log2 Ratio HM median-centered log2 ratio of Heavy to Medium intensities, from the maximum observed precursor abundance for each labeled state

Best Pearson Correlation LM highest observed similarity in chromatographic profiles between Light- and Medium-labeled peptides

Best Pearson Correlation LH highest observed similarity in chromatographic profiles between Light- and Heavy-labeled peptides

Best Pearson Correlation MH highest observed similarity in chromatographic profiles between Medium- and Heavy-labeled peptides

Best [Light/Medium/Heavy] PeptideProphet Probability maximum PeptideProphet probability from all supporting PSMs for the corresponding label

Protein protein sequence header corresponding to the identified peptide; this will be the selected razor protein if the peptide maps to multiple proteins (in this case, other mapped proteins are listed in the ‘Mapped Proteins’ column)

Protein ID protein identifier (primary accession number) for the mapped protein

Entry Name entry name for the mapped protein

Gene gene name for the mapped protein

Protein Description name of the mapped protein

Mapped Genes additional genes the identified peptide may originate from (including any arising from I/L substitutions)

Mapped Proteins additional proteins the identified peptide maps to (including any arising from I/L substitutions)

protein_label_quant.tsv

`protein_label_quant.tsv` contains MS1-based isotopic quantification results from IonQuant summarized to the protein level. See the SILAC tutorial for more information. If only Light and Heavy labels are used, columns with ‘Medium’/‘M’ will be missing.

Protein protein sequence header

Protein ID protein identifier (primary accession number)

Entry Name entry name for the protein

Gene gene name for the protein

Protein Description name of the protein

Mapped Genes additional genes the supporting peptides map to (including any arising from I/L substitutions)

Mapped Proteins additional proteins the supporting peptides map to (including any arising from I/L substitutions)

Ratios ML number of Medium / Light abundance ratios

Ratios HL number of Heavy / Light abundance ratios

Ratios HM number of Heavy / Medium abundance ratios

Median Log2 Ratios ML median of ion-level log2 Medium / Light abundance ratios

Median Log2 Ratios HL median of ion-level log2 Heavy / Light abundance ratios

Median Log2 Ratios HM median of ion-level log2 Heavy / Medium abundance ratios
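The relationship between the ‘Ratios’ and ‘Median Log2 Ratios’ columns can be sketched by grouping ion-level log2 ratios by protein and reporting their count and median. This assumes that every ion with a ratio contributes once to its protein; any additional filtering IonQuant applies is not reflected, and the function name and data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def protein_ratio_summary(ion_ratios):
    """ion_ratios: iterable of (protein_id, log2_ratio or None)."""
    grouped = defaultdict(list)
    for protein_id, ratio in ion_ratios:
        if ratio is not None:  # ions without a ratio are ignored
            grouped[protein_id].append(ratio)
    return {
        protein_id: {"count": len(values), "median": median(values)}
        for protein_id, values in grouped.items()
    }


# Hypothetical ion-level log2 Heavy / Light ratios.
ions = [("P1", 1.0), ("P1", 3.0), ("P1", 2.0), ("P2", -0.5), ("P2", None)]
protein_summary = protein_ratio_summary(ions)
print(protein_summary)
```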
