The general strategy behind mass spectrometric analysis in SILAC quantitation experiments is rather simple. In the case of LC-ESI-MS, as the sample flows out of the LC system, the mass spectrometer continuously switches between full MS and MS/MS scanning. Typically, up to three or four precursors are selected from every MS scan for subsequent MS/MS experiments. During data interpretation, peptides and proteins are identified from the MS/MS scans, while the MS scans are used only for quantitation calculations (Figure 1). Since only peaks previously identified by MS/MS are used for quantitation, most of the information in the full MS scans is not utilized at all. In addition, precursor candidates are often selected from the most intense peaks in a particular scan. Altogether, this approach limits protein sequence coverage and therefore affects the quantitation results. It also dramatically decreases the chance of quantifying post-translational modifications and low-abundance proteins. Although this procedure is often used with very good results, a lot of information remains that can be utilized, especially if the data were measured with the high mass accuracy achievable on modern instruments such as FTICR-MS.
Figure 1: Schematic diagram of a standard data processing workflow in protein quantitation experiments.
Due to the sample complexity in protein quantitation experiments, a simple peptide mass fingerprinting approach cannot be applied to the MS scans when interpreting the data, even with the extreme mass accuracy of FTICR-MS. Once the proteins have been successfully identified from the MS/MS scans, however, the MS scans can be searched for peptides arising from these proteins only. This approach increases protein sequence coverage, which not only improves the relevance of the protein quantitation but also allows us to search for a number of different types of post-translational modifications. A software tool for high-accuracy data interpretation in protein quantitation experiments has been developed using the Python programming language. The overall application workflow is illustrated in Figure 2.
Raw data are imported as mzXML files with centroided peaks. This is an open, well-documented format, and converters exist for almost all manufacturers' native file formats. The data, as well as all processing results, are then stored locally in a simple SQLite database file. Once the data are imported, the charge state is calculated for all peak clusters in the MS scans using mass differences and the averagine intensity distribution. In the next step, the MS/MS scans are used to generate a Mascot query (in Mascot Generic Format), which is sent to a local Mascot server. Identified proteins are stored and a list of predefined modifications is applied to all the sequences. These modifications can be set as fixed (e.g. carbamidomethylation) or variable (e.g. acetylation). In addition, theoretical m/z values of positively identified peptides are used to re-calibrate all the MS scans to reduce the average mass error. Based on the sequences of the identified proteins and the specified digestion enzyme, a database of theoretical peptides is generated that includes all possible modification variants. The re-calibrated MS scans are then searched against this database within a specified tolerance. Matching peptides for which both the light and heavy forms were found are then used for subsequent quantitation calculations. The L/H ratio is calculated from the areas of the corresponding extracted ion chromatograms.
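The final step, computing an L/H ratio from chromatogram areas, can be illustrated with a minimal sketch. It assumes that the light and heavy extracted ion chromatograms are sampled at the same scans and that the areas are obtained by trapezoidal integration. Neither detail is stated for the software itself, and the function names and data are hypothetical.

```python
def trapezoid_area(times, intensities):
    """Integrate intensity over retention time with the trapezoidal rule."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * dt
    return area


def light_heavy_ratio(times, light, heavy):
    """Return the L/H area ratio of two chromatograms sampled at the same scans."""
    heavy_area = trapezoid_area(times, heavy)
    if heavy_area == 0:
        raise ValueError("heavy chromatogram has zero area")
    return trapezoid_area(times, light) / heavy_area


# Made-up example: retention times in seconds, intensities in arbitrary units.
times = [100.0, 102.0, 104.0, 106.0, 108.0]
light = [0.0, 400.0, 1000.0, 400.0, 0.0]
heavy = [0.0, 200.0, 500.0, 200.0, 0.0]
ratio = light_heavy_ratio(times, light, heavy)
print(f"L/H ratio: {ratio:.2f}")
```

Integrating over the whole elution profile, rather than comparing intensities in a single scan, makes the ratio less sensitive to scan-to-scan noise.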
Figure 2: Schematic diagram of our improved data processing workflow in protein quantitation experiments.
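Two steps of this workflow, generating theoretical peptides and searching the MS peaks for light/heavy pairs, can be sketched as follows. This is a simplified illustration, not the software's actual code. It assumes tryptic digestion without missed cleavages, carbamidomethylation of cysteine as the only (fixed) modification, and a 13C6 label on lysine and arginine (+6.020129 Da each). The enzyme, the label and all function names are assumptions chosen for the example, and variable modifications are omitted.

```python
# Monoisotopic residue masses in Da.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
CARBAMIDOMETHYL = 57.021464  # fixed modification on C
HEAVY_LABEL = 6.020129       # assumption: 13C6 label on each K and R


def digest(sequence):
    """Cleave after K or R unless followed by P; no missed cleavages."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mz(peptide, charge, heavy=False):
    """Theoretical m/z of a peptide at the given positive charge state."""
    mass = WATER + sum(RESIDUE[aa] for aa in peptide)
    mass += CARBAMIDOMETHYL * peptide.count("C")
    if heavy:
        mass += HEAVY_LABEL * (peptide.count("K") + peptide.count("R"))
    return (mass + charge * PROTON) / charge


def find_peak(peaks_mz, target, tol_ppm):
    """Return the observed m/z closest to target within tol_ppm, else None."""
    best = None
    for mz in peaks_mz:
        within = abs(mz - target) / target * 1e6 <= tol_ppm
        if within and (best is None or abs(mz - target) < abs(best - target)):
            best = mz
    return best


def match_pairs(sequence, peaks_mz, charges=(1, 2, 3), tol_ppm=5.0):
    """Keep only peptides for which both light and heavy forms are found."""
    matches = []
    for pep in digest(sequence):
        if not any(aa in "KR" for aa in pep):
            continue  # no labelled residue, so no light/heavy pair exists
        for z in charges:
            light = find_peak(peaks_mz, peptide_mz(pep, z), tol_ppm)
            heavy = find_peak(peaks_mz, peptide_mz(pep, z, heavy=True), tol_ppm)
            if light is not None and heavy is not None:
                matches.append((pep, z, light, heavy))
    return matches


# Made-up protein and peak list; two peaks are synthesized from theory.
protein = "MKACDRPEKGG"
peaks = [peptide_mz("ACDRPEK", 2), peptide_mz("ACDRPEK", 2, heavy=True), 500.0]
pairs = match_pairs(protein, peaks)
```

Restricting the search to peptides of proteins already identified by MS/MS keeps the candidate list small, which is what makes a narrow ppm tolerance usable on complex samples.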
The software consists of two main parts: a processing part and a results viewer. A typical quantitation experiment comprises several LC-MS runs, so it is efficient to process the data in batch mode. Many of the parameters can be specified and stored as a method in a human-readable XML file. After the data are processed, the results are shown in the viewer, where a summary of all identified proteins and matched peptides can be evaluated (Figure 3). It is generally good practice to validate the results of quantitation experiments manually. For this purpose, all matched peaks are listed in the viewer together with each peptide and can be inspected in the context of the corresponding MS scan. Although it is not usually the case, one m/z value can match more than one peptide. If so, all the co-matched peptides are listed once the peak is highlighted. A small window showing the peptide chromatogram also allows the results to be validated with respect to the peptide's sequence and retention time. Any peptide, or even an individual peak pair, can be discarded, and the overall quantitation is recalculated on the fly.
Figure 3: Screenshot of the software. The first (topmost) part of the main window contains a summary of all identified proteins. The second part lists the matched peptides for a selected protein, the third part lists the corresponding matched peaks, and the fourth part lists possible co-matched peptides of a selected peak pair. In addition, three small windows showing the MS scan of a selected peak pair, the chromatogram of a selected peptide and the protein sequence can also be displayed.
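The recalculation that follows discarding a peptide in the viewer can be pictured with a small sketch. How the software combines peptide ratios into a protein ratio is not described here, so the median used below is purely an assumption, and the function name, peptide sequences and ratios are made up for illustration.

```python
from statistics import median


def protein_ratio(peptide_ratios, discarded=()):
    """Median L/H ratio over the peptides that have not been discarded.

    Assumption: the protein-level value is the median of the peptide ratios.
    Returns None when no peptide is left.
    """
    kept = [r for pep, r in peptide_ratios.items() if pep not in discarded]
    if not kept:
        return None
    return median(kept)


# Hypothetical peptide-level L/H ratios for one protein.
ratios = {"ACDRPEK": 2.0, "LSSPATLNSR": 2.2, "VGAHAGEYGAEALER": 9.5}

before = protein_ratio(ratios)
after = protein_ratio(ratios, discarded={"VGAHAGEYGAEALER"})
print(before, after)
```

Because the protein value is recomputed from the kept peptide ratios only, discarding an entry needs no reprocessing of the raw data.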
The NovaQ software can be downloaded and used for non-commercial purposes only. A patent is also pending.
Utilization of high-accuracy FTICR-MS data in protein quantitation experiments
Strohalm M, Novak P, Pompach P, Man P, Kavan D, Witt M, Dzubak P, Hajduch M, Havlicek V
J Mass Spectrom 2009, DOI 10.1002/jms.1602