Bioinformatics Advance Access originally published online on May 11, 2007
Bioinformatics 2007 23(11):1394-1400; doi:10.1093/bioinformatics/btm083
Data reduction of isotope-resolved LC-MS spectra

1Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
*To whom correspondence should be addressed.
ABSTRACT
Motivation: Data reduction of liquid chromatography-mass spectrometry (LC-MS) spectra can be a challenge due to the inherent complexity of biological samples, noise and non-flat baseline. We present a new algorithm, LCMS-2D, for reliable data reduction of LC-MS proteomics data.
Results: LCMS-2D can reliably reduce LC-MS spectra with multiple scans to a list of elution peaks, and subsequently to a list of peptide masses. It is capable of noise removal, and deconvoluting peaks that overlap in m/z, in retention time, or both, by using a novel iterative peak-picking step, a rescue step, and a modified variable selection method. LCMS-2D performs well with three sets of annotated LC-MS spectra, yielding results that are better than those from PepList, msInspect and the vendor software BioAnalyst.
Availability: The software LCMS-2D is available under the GNU General Public License from http://www.bioc.aecom.yu.edu/labs/angellab/ as a standalone C program running on Linux.
Contact: pdu{at}us.ibm.com
1 INTRODUCTION
Liquid chromatography-mass spectrometry (LC-MS) has been an important technology in protein analysis, especially in differential protein profiling for biomarker discovery. In a typical experiment, proteins in the sample are extracted and digested, and the resulting peptides are separated by liquid chromatography. Eluent from the chromatography is measured by electrospray ionization (ESI) MS. Because LC-MS can generate gigabytes of spectra daily, thorough manual inspection is infeasible, and computational methods are necessary to reduce the raw data to peak lists or peptide lists. It is essential to perform reliable data reduction while preserving the weak peaks, because biologically important proteins are often in low abundance. Though simple in principle, the process of data reduction can be highly non-trivial for the following reasons. First, the sample can be very complex and have an enormous dynamic range (Corthals et al., 2000; Kettman et al., 2002). A sample with a large number of proteins implies that there will be many overlapping peaks in both the m/z and the retention time dimensions. In addition, both instrumental noise and chemical noise (e.g. Fig. 1a) complicate the spectra. Therefore, LC-MS spectra can be very crowded and have a non-flat baseline. The peak density for LC-MS spectra is often more than one peak per m/z unit in the 400–1000 m/z range (e.g. Fig. 1b), making charge assignment and deisotoping difficult to perform.
Software tools have been developed for LC-MS data visualization, peptide detection and quantitation, e.g. the proprietary BioAnalyst (AB, Foster City, CA, USA), Wang's method (2003), GISTool (Zhang et al., 2005), PepList (Li et al., 2005), Expression Informatics (Silva et al., 2005), MZmine (Katajamaa et al., 2006), MapQuant (Leptos et al., 2006) and msInspect (Bellew et al., 2006). Programs such as GISTool, Expression Informatics and PepList perform peak-picking, charge assignment and deisotoping on a scan-to-scan basis, i.e. the 1D approach, and do not make full use of the 2D nature of LC-MS data. For programs that perform 2D peak-picking (Bellew et al., 2006; Hastings et al., 2002; Katajamaa et al., 2006; Wang et al., 2003), the noise removal and the deconvolution of overlapping peaks or isotope series are either not described or leave room for improvement. Furthermore, some of the above programs have not been tested with annotated datasets of a reasonable size because spectrum annotation is time-consuming.
In this article, the algorithm LCMS-2D is introduced, which performs data reduction for LC-MS spectra using the 2D approach. It first reduces the spectra to a list of elution peaks, then further reduces the list of elution peaks to a list of peptide ions. We show that the fidelity of data reduction is very good by testing with three annotated LC-MS datasets and comparing with existing software packages.
2 METHODS
2.1 File preparation
Files are originally in the instrument format for Qstar Pulsar (Applied Biosystems, Foster City, CA, USA). They are first converted to mzXML format (Pedrioli et al., 2004) and then to text files using the programs mzStar and mzxml2other (sashimi.sourceforge.net), with an additional step to recover zero-intensity points based on the linear relationship between m/z spacing and the square root of m/z. A simple running-average smoothing procedure is then applied along the retention time axis to reduce scan-to-scan noise. The window size is 11 scans, chosen to remove scan-to-scan noise at the cost of ignoring most ions that appear in fewer than 5–6 scans. To suppress signals that appear in a single scan only ("one-shot wonders"), the window size should be set between three scans and the expected minimum peak width.
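The smoothing step can be illustrated with a minimal Python sketch. It is not taken from the LCMS-2D source code (which is written in C); the function name `smooth_along_rt` and the treatment of the first and last scans, where the window is simply truncated, are assumptions made for illustration.

```python
def smooth_along_rt(intensities, window=11):
    """Centered running average of one m/z channel across successive scans.

    `intensities` holds one value per scan. Near the first and last scans the
    window is truncated (an assumption; the text does not describe edge handling).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(intensities)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(sum(intensities[lo:hi]) / (hi - lo))
    return smoothed
```

With an 11-scan window, a signal present in only one scan is reduced to one-eleventh of its height, whereas a plateau spanning many scans is left essentially unchanged, which is the trade-off described above.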
2.2 Iterative detection of elution peaks
Peak-picking with noisy elution profiles, crowded elution peaks and a non-flat baseline can be highly non-trivial. The strategy is to find the strongest elution peak first, remove it from consideration, then look for the next strongest elution peak, and repeat until no more elution peaks can be found above the preset signal-to-noise (S/N) ratio, chosen to be 2.5. Initially, the spectrum for each scan is converted to a list of single-scan peaks, which may include chemical and instrumental noise. Then, the single-scan peaks for all scans are pooled to form a super list of single-scan peaks. Next, a seed is chosen to be the strongest single-scan peak. An extracted ion chromatogram (XIC) is constructed within a ±dm window around the m/z of the seed peak. The size of dm is set to the maximum spread of a peak centroid over multiple scans; dm = 0.15 for the three datasets tested. Then a peak-picking routine is applied to find elution peaks above the preset S/N ratio. If such an elution peak exists, all single-scan peaks contained in this elution peak, i.e. single-scan peaks within dm m/z of the seed and within the retention time range of this elution peak, are removed from the super list. In rare cases when no elution peak can be found, those single-scan peaks within dm m/z of the seed and within 5 scans of the seed peak are removed from the super list. In the next iteration, a new seed is chosen as the strongest single-scan peak in the current super list, and the process continues until no more elution peaks can be found above the preset S/N ratio.
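The iteration described above can be sketched in Python as follows. This is an illustrative simplification, not the LCMS-2D implementation: the peak-picking routine is replaced by a hypothetical `peak_bounds` helper that walks outwards from the seed while the XIC stays above a fixed noise level, the noise level is assumed to be supplied by the caller, and the 5–300 scan width limits are borrowed from the typical peptide peak width quoted in this section.

```python
def build_xic(peaks, seed_mz, dm, n_scans):
    """Extracted ion chromatogram: strongest peak per scan within +/- dm of seed_mz."""
    xic = [0.0] * n_scans
    for scan, mz, intensity in peaks:
        if abs(mz - seed_mz) <= dm:
            xic[scan] = max(xic[scan], intensity)
    return xic


def peak_bounds(xic, apex, noise):
    """Hypothetical stand-in for the peak-picking routine: extend while above noise."""
    start = end = apex
    while start > 0 and xic[start - 1] > noise:
        start -= 1
    while end < len(xic) - 1 and xic[end + 1] > noise:
        end += 1
    return start, end


def detect_elution_peaks(peaks, n_scans, noise, dm=0.15, min_sn=2.5,
                         min_width=5, max_width=300, fallback=5):
    """peaks: list of (scan, mz, intensity) single-scan peaks (the 'super list')."""
    remaining = list(peaks)
    found = []
    while remaining:
        # Seed: strongest single-scan peak still in the super list.
        seed_scan, seed_mz, seed_int = max(remaining, key=lambda p: p[2])
        if seed_int / noise < min_sn:
            break  # every later seed is weaker, so nothing can pass the cutoff
        xic = build_xic(remaining, seed_mz, dm, n_scans)
        start, end = peak_bounds(xic, seed_scan, noise)
        if min_width <= end - start + 1 <= max_width:
            found.append({"mz": seed_mz, "apex": seed_scan,
                          "start": start, "end": end, "height": seed_int})
        else:
            # No acceptable elution peak: discard peaks within 5 scans of the seed.
            start, end = seed_scan - fallback, seed_scan + fallback
        remaining = [(s, mz, i) for s, mz, i in remaining
                     if not (abs(mz - seed_mz) <= dm and start <= s <= end)]
    return found
```

Because the seed itself always falls inside the removed region, the super list shrinks in every iteration and the loop terminates.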
Because in theory any local maximum is a peak, the goal of peak-picking is not to find all local maxima, but to find elution peaks typical for peptides that are clearly distinguished from noise. A peptide elution peak typically spans 5–300 scans in the LC conditions used for LC-MS datasets described below in Section 2.4. An advantage of using preset peak width is that some solvent peaks which appear throughout the run are removed, e.g. Figure 1a. In practice, the range of peak width for expected peptide elution peaks may need to be adjusted to reflect different LC-conditions and noise levels.
2.3 Converting elution peaks to a peptide list
In this step, LCMS-2D finds elution peaks with similar retention times and performs charge assignment and deisotoping to reduce them to a list of peptide monoisotopic masses. This is made difficult by noise and by the possible presence of other isotope series with similar m/z and retention time. Initially, a super list of all elution peaks is created. Then LCMS-2D takes the strongest elution peak as the template peak, and builds a cluster of elution peaks by collecting all other elution peaks whose retention times are similar to that of the template peak or of any peak already in the cluster. Because similarity is chained in this way, two peaks in the same cluster may still have dissimilar retention times. Two elution peaks are defined as having similar retention times if the peak apexes are within 5 scans of each other, or if, for either peak, more than half of the peak area above half of the peak height overlaps with that of the other peak. This criterion is chosen such that it effectively reduces the cluster size, while avoiding separating isotopic peaks of the same peptide into different clusters.
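A minimal Python sketch of the similarity criterion and the chained clustering is given below. The data layout (a dict with `apex`, `height` and a scan-to-intensity `profile`) and the way the overlap of the area above half height is computed are assumptions; the text admits more than one reading of that overlap, and this is only one of them.

```python
def half_height_area(profile):
    """Part of the peak lying above half of its maximum, as {scan: excess intensity}."""
    half = max(profile.values()) / 2.0
    return {s: i - half for s, i in profile.items() if i > half}


def similar_rt(a, b, max_apex_gap=5):
    """True if apexes are within 5 scans, or if for either peak more than half
    of its area above half height lies on scans shared with the other peak."""
    if abs(a["apex"] - b["apex"]) <= max_apex_gap:
        return True
    top_a = half_height_area(a["profile"])
    top_b = half_height_area(b["profile"])
    shared = set(top_a) & set(top_b)
    for top in (top_a, top_b):
        total = sum(top.values())
        if total > 0 and sum(top[s] for s in shared) / total > 0.5:
            return True
    return False


def build_cluster(peaks):
    """Grow a cluster from the strongest peak; membership is chained through
    any peak already in the cluster. Returns (cluster, remaining peaks)."""
    template = max(peaks, key=lambda p: p["height"])
    cluster, frontier = [template], [template]
    rest = [p for p in peaks if p is not template]
    while frontier:
        current = frontier.pop()
        linked = [p for p in rest if similar_rt(current, p)]
        rest = [p for p in rest if not any(p is q for q in linked)]
        cluster.extend(linked)
        frontier.extend(linked)
    return cluster, rest
```

In this sketch, peaks with apexes at scans 100, 104 and 108 end up in one cluster even though the first and the last are 8 scans apart, which is why the deisotoping step must re-check retention-time similarity within a cluster.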
For peaks in the same cluster, charge assignment and deisotoping are performed with the variable selection method (Du and Angeletti, 2006), which aims to find the smallest number of peptide ions that explain the spectrum well according to the principle of variable selection. Because peaks in the same cluster may still have dissimilar retention times, the method is modified such that only peaks with similar retention times may be assigned to the same isotope series. All peptide ions reported are added to the list of detected peptides. All peaks in the cluster are then removed from the super list of elution peaks. On the other hand, peaks that are not explained well by detected peptides, e.g. with poor goodness of fit to the expected isotope profiles (Senko et al., 1995; Wehofsky et al., 2001), and peaks that belong to single-member clusters are added to a list of unexplained peaks. In the following iteration, the next strongest elution peak in the super list is used as a new template, and so on. Eventually the list of detected peptides contains a list of peptide masses. If applicable, peptides with the same mass and retention time but different charge states are combined, i.e. their intensities are summed together.
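The final combination of charge states can be sketched as follows. The conversion from m/z to neutral monoisotopic mass uses the proton mass and is standard; the tolerances `mass_tol` and `rt_tol`, the dict layout and the function names are assumptions for illustration, since the text does not state how "the same mass and retention time" is decided.

```python
PROTON = 1.007276  # proton mass in Da


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass of an ion observed at m/z with the given charge."""
    return charge * (mz - PROTON)


def combine_charge_states(ions, mass_tol=0.1, rt_tol=0.5):
    """Sum intensities of ions that share mass and retention time (within the
    assumed tolerances) but differ in charge state."""
    merged = []
    for ion in sorted(ions, key=lambda x: -x["intensity"]):
        for m in merged:
            if (abs(m["mass"] - ion["mass"]) <= mass_tol
                    and abs(m["rt"] - ion["rt"]) <= rt_tol
                    and ion["charge"] not in m["charges"]):
                m["intensity"] += ion["intensity"]
                m["charges"].append(ion["charge"])
                break
        else:
            merged.append({"mass": ion["mass"], "rt": ion["rt"],
                           "intensity": ion["intensity"],
                           "charges": [ion["charge"]]})
    return merged
```

The merged entry keeps the mass and retention time of its strongest member, a choice made here only to keep the sketch short.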
A rescue procedure is performed in an attempt to interpret strong peaks in the list of unexplained peaks. This step is useful because it looks for alternative interpretations of peaks that are not explained well, and can be used to interpret peaks for which neighboring isotopic peaks are not detected because their elution peaks overlap with others. Figure 2a shows an example where the elution peak of the second isotopic peak forms a shoulder peak at 70.3 min. For an unexplained elution peak, the basic idea of rescue is to perform 1D peak-picking, deisotoping and charge assignment at the scan of the elution peak apex, i.e. on the spectrum in Figure 2b. Therefore, even if the elution peak of any single isotopic peak is not detected by the 2D peak-picking procedure due to peak overlapping, the isotope series may still be detected as long as at least one of the isotopic peaks forms a clear elution peak. To avoid false positives, the rescue procedure is restricted to unexplained elution peaks for which the S/N is at least 5.
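The control flow of the rescue step can be sketched in Python as below. The 1D deisotoping here is deliberately crude and is not the variable selection method used by LCMS-2D: a hypothetical `guess_charge` helper only checks whether a neighboring isotopic peak sits at the spacing expected for each charge state, and it assumes the unexplained peak is the monoisotopic one. The spacing constant, the tolerance and the data layout are assumptions.

```python
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def guess_charge(apex_scan_mz, seed_mz, max_charge=4, tol=0.02):
    """Return the highest charge z for which a peak is found near
    seed_mz + ISOTOPE_SPACING / z in the apex-scan spectrum, else None.
    High charges are tried first so that the third isotopic peak of a doubly
    charged ion is not mistaken for the second peak of a singly charged ion."""
    for z in range(max_charge, 0, -1):
        target = seed_mz + ISOTOPE_SPACING / z
        if any(abs(mz - target) <= tol for mz in apex_scan_mz):
            return z
    return None


def rescue(unexplained, spectra, min_sn=5.0):
    """unexplained: elution peaks with 'mz', 'apex' (scan) and 'sn'.
    spectra: {scan: list of single-scan peak m/z values}."""
    rescued = []
    for peak in unexplained:
        if peak["sn"] < min_sn:
            continue  # restrict rescue to strong peaks to limit false positives
        z = guess_charge(spectra[peak["apex"]], peak["mz"])
        if z is not None:
            rescued.append({"mz": peak["mz"], "charge": z, "apex": peak["apex"]})
    return rescued
```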
The peptide list can be optionally filtered to improve reliability based on empirical criteria. Most chemical noise peaks can be removed from the list by removing singly charged ions with monoisotopic masses below 600 Da, at the risk of removing some peptide ions. In addition, ions of four or more charges without any corresponding lower charge states can also be filtered out.
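The two empirical filters can be expressed as a short Python sketch. The dict layout and the tolerances used to decide whether a lower charge state "corresponds" to a highly charged ion are assumptions; only the 600 Da limit and the charge threshold of four come from the text.

```python
def filter_peptides(ions, min_singly_mass=600.0, high_charge=4,
                    mass_tol=0.1, rt_tol=0.5):
    """Drop singly charged ions below 600 Da and ions with four or more
    charges that have no matching lower charge state in the list."""
    kept = []
    for ion in ions:
        if ion["charge"] == 1 and ion["mass"] < min_singly_mass:
            continue  # likely chemical noise, at the risk of losing small peptides
        if ion["charge"] >= high_charge:
            has_lower = any(
                other["charge"] < ion["charge"]
                and abs(other["mass"] - ion["mass"]) <= mass_tol
                and abs(other["rt"] - ion["rt"]) <= rt_tol
                for other in ions
            )
            if not has_lower:
                continue
        kept.append(ion)
    return kept
```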
2.4 Method evaluation with annotated data
To evaluate the performance of LCMS-2D, the LC-ESI-MS spectra of a mixture of standard peptides are used, referred to as 16-mix in this article. The 16-mix contains 16 known synthetic peptides with eight unique masses, i.e. for each unique mass there are two peptides. Their monoisotopic masses are 947.55, 948.54, 962.53, 963.51, 1023.58, 1024.57, 1038.56 and 1039.54. The mixture is known to have some contaminants.
The experimental conditions for the 16-mix have been described previously (Du and Angeletti, 2006). Briefly, samples are analyzed by a Qq-TOF mass spectrometer (Qstar Pulsar, AB, Foster City, CA, USA). Solvent A is 2% CH3CN/0.1% formic acid and solvent B is 80% CH3CN/0.1% formic acid. A C18 column of 300 µm i.d. × 15 cm (Dionex, CA, USA) is used for the separation. The following gradient is used for LC: 30 min at 5% solvent B (desalting) followed by 5–55% B from 30 to 80 min. The flow rate is 3 µl/min. A microelectrospray source with a 20 µm i.d. capillary is used for ESI. TOF-MS scans are performed over the m/z range 300–1800 with a scan time of 1 s. The instrument is calibrated using a CsI-peptide mixture as suggested by the vendor. A 50 µl volume of the 16-mix, containing each peptide at a concentration of 2.5 × 10⁻⁹ M, is injected.
A second annotated LC-MS dataset is a cell lysate fraction of the cell line FaDu (Rangan, 1972). Proteins are extracted from the cell line and digested. Subsequently, the peptides are injected into a strong cation exchange column with a step gradient at salt concentrations of 0, 10, 20, 30, 40, 60, 100, 200, 300 and 600 mM. Each fraction is then injected into the C18 column and the LC-MS spectra are collected under the same conditions as for the 16-mix. The LC-MS spectra of the fraction at 100 mM salt are manually annotated and used in this study. A total of 980 ions are found in the spectra by visual inspection.
A third dataset is used where two synthetic peptides are added as internal markers to cell lysate fractions of the cell line SCC25 (American Type Culture Collection, CRL-1628). The first peptide sequence is VFLQYLKN, with a monoisotopic mass of 1023.58. The second peptide shares the same sequence, except that the two leucines are labeled with 15N. The two peptides should co-elute at around 70 min, 2 Da apart. Except for the cell line, the SCC25 cell lysate fractions are prepared under the same conditions as those used for the second dataset. These fractions are also collected at 100 mM salt. The LC-MS conditions are the same as well. Nine replicates are performed for SCC25 (Sudha et al., in preparation). The two synthetic peptides are added to every replicate.
Results from other programs for the three datasets are also collected for comparison. PepList and msInspect require very few input parameters, except that the option Strategy = Feature Strategy Peak Clusters is chosen when running msInspect (revision 3296). The vendor software is Analyst QS 1.1 and BioAnalyst 1.1.5 (Applied Biosystems, Foster City, CA, USA). Peptide features are extracted from the 16-mix spectra with LCMS Reconstruct using the default S/N threshold of 10 and a mass tolerance of 0.15 Da instead of the default 0.2 Da. Feature extraction for the cell lysate fractions could not be performed with the vendor software BioAnalyst, perhaps because the file contains a few MS/MS spectra.
3 RESULTS AND DISCUSSION
3.1 Results of the 16-mix spectra
For the 16-mix spectra, the programs used and the total numbers of ions reported are: LCMS-2D, 636; msInspect, 240; PepList, 84; BioAnalyst, 2134. The differences in the number of ions may simply be due to the different S/N thresholds used by different programs. Since there are impurities in the 16-mix mixture, the goal is to check whether LCMS-2D or other programs can find the 16 peptides. Table 1 lists the results for all programs. LCMS-2D finds both the singly charged and doubly charged ions for all 16 peptides except for the singly charged ion for mass 1038.56 at a retention time of 72.6 min, because the two strongest isotopic peaks for that ion are both below the S/N of 2.5. The program msInspect misses the doubly charged ions of three peptides, and the singly charged ions of nine peptides. The program PepList misses the doubly charged ions of eight peptides, and the singly charged ions of 10 peptides. BioAnalyst misses both the singly and doubly charged ions of five peptides.
LCMS-2D stands out as the only program that detects almost all charge states for all 16 peptides. LCMS-2D detects all the strong doubly charged ions for the 16 peptides while no other program is able to. It is possible that some programs have difficulty separating overlapping isotope series, such as masses 947.55 and 948.55 Da, both at retention time of 67.4 min and 1 Da apart. It is also possible that overlapping elution peaks can lead to missing peptide ions for some programs, e.g. Figure 2. Even in the absence of overlapping isotope series or elution peaks, some programs still miss peptide ions. It should be noted that msInspect and PepList are not designed specifically for Qstar data.
LCMS-2D overcomes overlapping elution profiles by either picking up the overlapping elution peaks when the overlapping is not severe, or by triggering the rescue procedure when any of the isotopic peaks has a strong elution profile above S/N of 5. When the rescue procedure is turned off, three ions would be missing that would otherwise be detected with the procedure. An example of how one of the three ions is rescued is shown in Figure 2, where the second isotopic peak forms a shoulder peak at 70.3 min. The rescue procedure essentially is to use the scan-to-scan approach for unexplained strong elution peaks, which avoids the drawback of relying on foolproof elution peak detection. By using the rescue procedure, LCMS-2D effectively combines the advantages of both the 2D approach and the scan-to-scan approach. Visual examination shows that the rescue procedure does not generate unambiguous false positives.
Surprisingly, LCMS-2D has the best mass accuracy among all four programs used (Table 1). For 13 of the 16 peptides, masses calculated by LCMS-2D either are the most accurate, or tie with the most accurate. The fact that LCMS-2D gives the most accurate masses is not expected, because LCMS-2D simply uses the strongest scan for each elution peak to calculate the peptide mass. Because the 16-mix is a small dataset, whether LCMS-2D improves mass accuracy in general still remains to be tested with more data.
Though it is true that the test datasets are not blind to LCMS-2D but are blind to other programs, the difference in results may be explained by the algorithms used rather than by parameter selection. PepList is a scan-by-scan, or 1D, method which has limited capabilities in resolving overlapping peaks or handling noise. It calculates the elution profile of a peptide by summing up the intensity of all isotopic peaks for the peptide in each scan (Li et al., 2005). It will encounter problems when any of the isotopic peaks overlap with others during the entire elution peak, which often happens for a complex peptide mixture. PepList is therefore expected to have problems with the 16-mix, where peaks are close in m/z and in retention time to mimic the behavior of complex samples. The performance of msInspect is better than that of BioAnalyst. However, one of the reasons msInspect misses ions is probably that it assembles all the isotopes into groups that appear, maximize and then disappear at similar times, an indication of an eluting isotopic distribution potentially from the same peptide (Bellew et al., 2006). Therefore it is likely to run into the same problem as PepList. Moreover, it is unclear how msInspect treats unassigned peaks.
In addition, results of LCMS-2D on the 16-mix are not sensitive to parameter changes within a reasonable range, including the minimum S/N cutoff, the m/z window dm, and the criterion of retention-time similarity.
3.2 Results of the cell lysate fraction
To show the effects of sample complexity, noise and subtlety in the spectra of the digests of a large number of proteins, the LC-MS spectra for a cell lysate fraction are used to test the performance of LCMS-2D. The cell lysate spectra have been manually annotated to have 980 ions.
The list generated by LCMS-2D for the cell lysate spectra has 1597 ions above the S/N ratio of 2.5. To determine if LCMS-2D misses peptides in the annotated list, defined as false negatives assuming that the annotated list is correct, the first 500 strongest ions in the annotated list are compared to the ion list by LCMS-2D. The first 500 strongest ions are used because they are the most reliable of the ions. Results are summarized in Table 2. A total of 47 ions among the list of 500 annotated ions are missed by LCMS-2D. Of the 47 ions, 43 would have been detected if a lower S/N cutoff is used. For three of the 47 missing ions, the elution peaks of one of the two strongest isotopic peaks severely overlap with others, and the S/N of other isotopic peaks for the ions are not above 5 to trigger the rescue procedure. Specifically for the top two strongest missing ions, i.e. the doubly charged mass of 1314.78 at 65.5 min (monoisotopic peak at 658.39 m/z) and the singly charged mass of 657.34 at 65.86 min (monoisotopic peak at 658.34 m/z), the elution peaks of at least one of the strongest isotopic peaks overlap severely with others (Fig. 3). One of the 47 ions is missing because of ambiguous charge state.
The ion list by LCMS-2D is also compared to the annotated list to look for false positives, assuming the annotated list is correct. The first 500 strongest ions from the LCMS-2D list are compared to ions in the annotated list. The result is that 89 ions in the 500 list from LCMS-2D are not in the annotated list. None of the 89 ions is strong, which shows LCMS-2D does not generate false positives with strong intensities for this dataset. Of the 89 ions, 68 are actually real; 19 are ambiguous; and two weak ions are assigned the wrong charge states due to the interference of noise. The fact that 68 ions are real implies that manual annotation may miss real ions for the spectra of complex protein mixture, which is expected because manual annotation is not a systematic approach, and is subject to analyst fatigue. Despite that, manual annotation is still a valuable standard especially for measuring the false positive rate, which is usually harder to measure than the false negative rate because of the presence of chemical noise, protein modifications and impurities.
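Both comparisons (annotated list against the LCMS-2D list for false negatives, and the reverse for false positives) follow the same pattern, which can be sketched in Python as below. The matching tolerances and the requirement that the charge states agree are assumptions; the text does not state the criteria used to decide that two ions are the same.

```python
def strongest(ions, n):
    """The n most intense ions of a list."""
    return sorted(ions, key=lambda x: x["intensity"], reverse=True)[:n]


def is_match(a, b, mass_tol, rt_tol):
    return (a["charge"] == b["charge"]
            and abs(a["mass"] - b["mass"]) <= mass_tol
            and abs(a["rt"] - b["rt"]) <= rt_tol)


def unmatched(query, reference, top_n=500, mass_tol=0.15, rt_tol=0.5):
    """Ions among the top_n strongest of `query` with no counterpart in `reference`."""
    return [q for q in strongest(query, top_n)
            if not any(is_match(q, r, mass_tol, rt_tol) for r in reference)]
```

With hypothetical lists `annotated` and `detected`, `unmatched(annotated, detected)` gives the candidate false negatives and `unmatched(detected, annotated)` the candidate false positives; as the text notes, each candidate still needs visual inspection because the annotated list itself may be incomplete.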
As with the 16-mix spectra, the rescue procedure is also helpful in reducing the number of missing ions for the cell lysate spectra. With the rescue feature turned off, the number of false negatives increases to 62, compared to 47 when rescue is turned on.
For comparison, results for the same cell lysate spectra from other programs are also analyzed and shown in Table 2. A total of 406 ions are reported by msInspect. The number of ions among the top 500 strongest annotated ions that are missed by msInspect is 236, including at least a dozen strong-to-medium intensity ions. Strangely, the program PepList only reports 52 ions for the cell lysate spectra, of which 51 are in the annotated list. The only ion which is not real is assigned the wrong charge of four while the true charge state should be three. Results from BioAnalyst are not available for comparison.
3.3 Result of internal markers in SCC25 cell lysate fractions
In order to test whether LCMS-2D can detect peptides among complex peptide mixtures, the spectra of SCC25 cell lysate fractions at 100 mM salt are processed for all nine replicates. The goal is to see whether LCMS-2D can detect the two marker peptides in all replicates. Results are shown in Table 3. Except for replicate 9, where marker 1025.58 is too weak to detect, LCMS-2D finds all marker peptides. For most markers, both singly and doubly charged ions are found. In comparison, msInspect and PepList often miss the marker peptide of 1025.58 (data not shown). For this dataset the mass accuracy of LCMS-2D is similar to that of msInspect. Therefore, the fact that LCMS-2D has the best mass accuracy for the 16-mix could be due to chance.
3.4 Reducing LC-MS spectra to a list of elution peaks
Converting a list of elution peaks to a list of peptides is essentially a process of information compression. Due to reasons such as noise, overlapping isotope series, unknown isotope profiles of solvent ions and unknown atomic composition of peptides, this process is susceptible to error and information loss. An obvious way to eliminate the errors is to skip this step and use the list of elution peaks as the result. Another advantage is that information loss is minimized because even if only the strongest peak in an isotope series is detected, the peak is not discarded. The peak would be discarded if elution peaks had to be converted to peptides. As a result, using the elution peaks directly should increase the sensitivity and extend the dynamic range. For a small peptide of 400 Da, the sensitivity increase can be almost 5-fold because the intensity ratio of the two strongest isotopic peaks is about 5:1. A possible disadvantage of skipping this step is that interpretation becomes less straightforward since more than one elution peak corresponds to one peptide. A second disadvantage is that it becomes difficult to filter out non-peptide ions because isotope profiles are not checked against the expected isotope profiles for peptides. An additional concern is that using a single peak without sister isotopic peaks could increase the false positive rate. By examining the reproducibility of peaks across multiple replicates, however, it is possible to differentiate noise from peptide ions.
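The 5:1 figure can be checked with a rough Python estimate based on the averagine model of Senko et al. (1995), in which a peptide of a given mass is assigned an average elemental composition. The isotope abundances below are approximate textbook values and the function name is hypothetical; fractional atom counts are used directly, so the result is an estimate rather than an exact isotope distribution.

```python
# Averagine composition per 111.1254 Da residue unit (Senko et al., 1995)
AVERAGINE = {"C": 4.9384, "H": 7.7583, "N": 1.3577, "O": 1.4773, "S": 0.0417}
AVERAGINE_MASS = 111.1254

# Approximate natural abundances: (lightest isotope, isotope one mass unit heavier)
ABUNDANCE = {
    "C": (0.9893, 0.0107),
    "H": (0.999885, 0.000115),
    "N": (0.99636, 0.00364),
    "O": (0.99757, 0.00038),
    "S": (0.9499, 0.0075),
}


def mono_to_first_isotope_ratio(mass):
    """Estimated intensity ratio of the monoisotopic peak to the M+1 peak.

    For independent atoms, I(M+1)/I(M) is the sum over elements of
    (atom count) * (heavy abundance) / (light abundance)."""
    units = mass / AVERAGINE_MASS
    m1_over_m0 = sum(units * n * ABUNDANCE[e][1] / ABUNDANCE[e][0]
                     for e, n in AVERAGINE.items())
    return 1.0 / m1_over_m0
```

Under these assumptions the ratio for a 400 Da peptide comes out between 4:1 and 5:1, consistent with the "about 5:1" quoted above, and it falls as the peptide mass increases, so the sensitivity gain from keeping single elution peaks is largest for small peptides.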
3.5 The 2D approach versus the 1D approach
The advantage of the scan-to-scan, i.e. 1D, approach is that it does not rely on foolproof detection of elution peaks. However, it can be difficult or even impossible to distinguish peptide peaks from noise peaks based on an individual scan, whereas this may be possible with the 2D approach by examining the elution profiles, as noted by Hastings and coworkers (2002). For example, Figure 1a is the XIC of a solvent ion at 317.1 m/z showing no clear elution peaks and an elevated baseline, while Figure 4a shows typical XICs of peptide ions. An additional difficulty for the 1D approach is to perform deconvolution in regions of high peak density, e.g. at 432–438 m/z in Figure 1b, while the 2D approach may be able to tackle the complexity by examining only peaks with similar retention time apexes at any one time. Furthermore, the 1D approach may have trouble performing deisotoping with unexpected isotope profiles, while the 2D approach may stop after reducing the spectrum to a set of elution peaks without deisotoping. The 2D algorithm can produce a list of elution peaks, which avoids errors in deisotoping and charge assignment completely and can also increase the sensitivity and dynamic range. Yet another difficulty for the 1D approach is how to handle isotope profiles distorted by scan-to-scan noise. Figure 4 shows an example of distorted isotope profiles for three successive scans, while the expected intensity ratio of peaks 425.7 and 426.2 at the elution peak apex is 2:1. By only examining the elution peak apex, where the S/N is the highest, the 2D approach can alleviate the noise problem.
For a fair comparison between the 1D and 2D approaches, the program LCMS-1D by the authors (unpublished data), which uses the 1D approach, is tested with the three annotated spectra. The main difference between LCMS-1D and other programs based on the scan-to-scan approach is that it uses variable selection to separate overlapping isotope series and peaks (Du and Angeletti, 2006). For the 16-mix spectra, LCMS-1D can also find all 16 peptides. At the same time, it generates several false positives due to the scan-to-scan noise in regions of low S/N. For the top 500 strongest ions, its performance with the cell lysate spectra is roughly on par with that of LCMS-2D, except that it reports four strong solvent ions. It seems that either approach can detect strong ions for which the scan-to-scan noise and the peak density are relatively low, while LCMS-2D is less susceptible to noise (including chemical noise) than LCMS-1D.
3.6 Advantage of the iterative approach
LCMS-2D finds peaks iteratively from the strongest to the weakest. The iterative peak-picking algorithm is similar to that used by Z-SCORE (Zhang and Marshall, 1998), but applied in a 2D context. Mathematically, it is also conceptually similar to the forward selection approach of model building, which adds the most significant remaining variable to the model one at a time. The advantage of this approach is that it assigns high priority to strong peaks, so that when two peaks overlap, the stronger peak is picked first. Furthermore, the strongest peak is likely to have the most accurate m/z, which is used to construct the XIC. The iterative approach is also used to cluster elution peaks based on retention time using the strongest peak as the template, because the strongest elution peak has the highest S/N and is therefore the least likely to be distorted by noise.
4 CONCLUSION
We have developed a new method LCMS-2D for data reduction of LC-MS spectra of complex peptide mixtures. The novel aspects of LCMS-2D are: (1) a new iterative peak-picking method, (2) a modified variable selection method to perform charge assignment and deisotoping for elution peaks with similar retention times and (3) a method to alleviate the problem of overlapping elution peaks by either picking up the lightly overlapped peaks, or using the rescue procedure to detect the correct peptides. LCMS-2D performs well with the 16-mix spectra and the cell lysate fraction spectra, with very few unambiguous false negatives and false positives. The results are better than those from other programs including the vendor software. The program and the first two annotated datasets are available from the authors. We expect this data reduction method to be useful not only for protein profiling studies, but also for LC-MS in general.
ACKNOWLEDGEMENTS
We acknowledge NIH for financial support: CA101150 (RHA) and CA103547 (MBP). Thanks to Dr. Frank Suits for helpful discussions and careful review of the manuscript. Funding for open-access will be provided by "CA101150 (RHA)" from NIH.
Conflict of Interest: none declared.
FOOTNOTES
Present Address: IBM Computational Biology Center, P.O. Box 218, Yorktown Heights, NY 10598, USA. Associate Editor: Alfonso Valencia
Received on November 17, 2006; revised on February 9, 2007; accepted on February 28, 2007
REFERENCES
Bellew M, et al. A suite of algorithms for the comprehensive analysis of complex protein mixtures using high-resolution LC-MS. Bioinformatics (2006) 22:1902–1909.
Corthals GL, et al. The dynamic range of protein expression: a challenge for proteomic research. Electrophoresis (2000) 21:1104–1115.
Du P, Angeletti RH. Automatic deconvolution of isotope-resolved mass spectra using variable selection and quantized peptide mass distribution. Anal. Chem. (2006) 78:3385–3392.
Hastings CA, et al. New algorithms for processing and peak detection in liquid chromatography/mass spectrometry data. Rapid Commun. Mass Spectrom. (2002) 16:462–467.
Katajamaa M, et al. MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics (2006) 22:634–636.
Kettman JR, et al. Clonal proteomics: one gene – family of proteins. Proteomics (2002) 2:624–631.
Leptos KC, et al. MapQuant: open-source software for large-scale protein quantitation. Proteomics (2006) 6:1770–1782.
Li X, et al. A software suite for the generation and comparison of peptide arrays from sets of data collected by liquid chromatography-mass spectrometry. Mol. Cell. Proteomics (2005) 4:1328–1340.
Pedrioli PG, et al. A common open representation of mass spectrometry data and its application to proteomics research. Nat. Biotechnol. (2004) 22:1459–1466.
Rangan SR. A new human cell line (FaDu) from a hypopharyngeal carcinoma. Cancer (1972) 29:117–121.
Senko MW, et al. Determination of monoisotopic masses and ion populations for large biomolecules from resolved isotopic distributions. J. Am. Soc. Mass Spectrom. (1995) 6:229–233.
Silva JC, et al. Quantitative proteomic analysis by accurate mass retention time pairs. Anal. Chem. (2005) 77:2187–2200.
Sudha R, et al. (in preparation) Global proteomic analysis distinguishes biologic differences in head and neck squamous carcinoma cell lines.
Wang W, et al. Quantification of proteins and metabolites by mass spectrometry without isotopic labeling or spiked standards. Anal. Chem. (2003) 75:4818–4826.
Wehofsky M, et al. Isotopic deconvolution of matrix-assisted laser desorption/ionization mass spectra for substance-class specific analysis of complex samples. Eur. J. Mass Spectrom. (2001) 7:39–46.
Zhang X, et al. An automated method for the analysis of stable isotope labeling data in proteomics. J. Am. Soc. Mass Spectrom. (2005) 16:1181–1191.
Zhang Z, Marshall AG. A universal algorithm for fast and automated charge state deconvolution of electrospray mass-to-charge ratio spectra. J. Am. Soc. Mass Spectrom. (1998) 9:225–233.




