Bioinformatics Advance Access originally published online on March 22, 2005
Bioinformatics 2005 21(10):2200-2209; doi:10.1093/bioinformatics/bti370
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Ovarian cancer identification based on dimensionality reduction for high-throughput mass spectrometry data
1School of Electronics Engineering and Computer Science, Peking University China
2Institute for Genomics and Bioinformatics, Graz University of Technology 8010 Graz, Austria
3Department of Information Engineering, University of Padova Italy
4Electrical Engineering and Computer Science Department, Information and Telecommunication Center, University of Kansas USA
5Institute for Genomics and Bioinformatics, Christian-Doppler Laboratory for Genomics and Bioinformatics, Graz University of Technology 8010 Graz, Austria
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: High-throughput and high-resolution mass spectrometry instruments are increasingly used for disease classification and therapeutic guidance. However, the analysis of immense amount of data poses considerable challenges. We have therefore developed a novel method for dimensionality reduction and tested on a published ovarian high-resolution SELDI-TOF dataset.
Results: We have developed a four-step strategy for data preprocessing based on: (1) binning, (2) KolmogorovSmirnov test, (3) restriction of coefficient of variation and (4) wavelet analysis. Subsequently, support vector machines were used for classification. The developed method achieves an average sensitivity of 97.38% (sd = 0.0125) and an average specificity of 93.30% (sd = 0.0174) in 1000 independent k-fold cross-validations, where k = 2, ..., 10.
Availability: The software is available for academic and non-commercial institutions.
Contact: zlatko.trajanoski{at}tugraz.at
| 1 INTRODUCTION |
|---|
The novel biotechnology of high-throughput and high-resolution MALDI-TOF (matrix-assisted laser desorption and ionization time-of-flight) mass spectrometry (MS) makes it promising to explore the low-molecular-weight (LMW) region of the blood proteome for the diagnosis of significant patterns for various diseases (Lilien et al., 2003; Liotta et al., 2003; Petricoin and Liotta, 2003, 2004; Wulfkuhle et al., 2003). Molecular and statistical approaches to identifying ovarian cancer in the early stage are urgently needed, and much work has already been done (Anderson et al., 2003; Bao-Ling et al., 2002; Petricoin et al., 2002a,b; Qu et al., 2002; Vlahou et al., 2003; Wu et al., 2003; Yu et al., 2004). In this work we considered the SELDI-TOF (surface-enhanced laser desorption and ionization time-of-flight) low-resolution and high-resolution raw MS data provided by National Cancer Institute (NCI),1 relative to a study conducted to discriminate ovarian cancer from normal tissue. The published high-resolution data achieved with extensive quality control and assurance (QC/QA) analysis allow superior classification patterns when compared to those obtained with low-resolution instrumentation (Conrads et al., 2004). Recently NCI has published on its website the ovarian high-resolution QqTOF SELDI data mining results of the concordant m/z regions found by some particular classifications, with both sensitivity and specificity of almost 100%. Even now, universal robust methods for identifying ovarian cancer from MS data are still in development.
One of the best challenges is to keep the discriminatory features between two classes of interest while reducing the intolerable dimensionality (Duda et al., 2001). Petricoin et al. (2002a) used genetic algorithms and self-organizing clustering analysis to extract the discriminatory proteomic pattern from the low-resolution training set, achieving sensitivity and specificity of 100% simultaneously in some particular testing trials. Since the result of genetic algorithm converges to a local optimal solution, distinct random initializations of this iterative searching algorithm may lead to distinct solutions. This brings some problems when trying to identify significant biomarkers. Still, it is believed that there should be some interesting relationships between the extracted discriminatory patterns (Zhu et al., 2003). From the ovarian high-resolution MS data, Conrads et al. (2004) showed that the most frequent m/z ratios extracted by a similar method are 845.0895, 8602.237 and 8709.548. Vlahou et al. (2003) tested the method of classification and regression tree (CART) on the ovarian cancer discrimination from benign diseases and healthy controls, which resulted in a cross-validation accuracy of 81.5%.
In this work we have developed a novel method for dimensionality reduction and tested on a published ovarian high-resolution SELDI-TOF data set. We will show that the accuracy can be improved to 8590% on the binned MS data. Several statistical methods for the classification of ovarian cancer based on MS spectra have been compared in Wu et al. (2003), and the method of random forest was demonstrated to outperform other methods like linear discriminant analysis, quadratic discriminant analysis, k-nearest neighbor (k-NN), bagging (Bauer and Kohavi, 1999) and boosting classification trees.
| 2 SYSTEMS AND METHODS |
|---|
We make the following assumptions:
- If the intensity distributions of control and cancer are distinct at a specific m/z, this m/z ratio is supposed to be a useful feature to the classification.
- For the control and cancer data respectively, the m/z at which the random variable of intensity has a small coefficient of variation (CV) is representative.
The MS dataset can be written as S = {(xi,yi)|xi
m, yi = ±1, i = 1,2,...,n, where xi is an intensity vector according to a sorted sequence of m/z ratios and yi is the class label of xi (1 for the healthy, +1 for cancer). A binary optimal classifier is a function f:
m
±1 such that f(xi) = yi for both a training subset and testing subset of S. When the feature space is high-dimensional, feature selection becomes crucial as the first step towards pattern recognition. For the raw ovarian high-resolution SELDI-TOF dataset composed of 95 control samples and 121 cancer samples, the dimension of the original feature space is over 370 000.
To improve the performance of identifying ovarian cancer, we make use of a four-step data preprocessing procedure: (1) binning, (2) KolmogorovSmirnov (KS)-test based feature selection, (3) restriction of coefficient of variation and (4) discrete wavelet transformation. All the procedures do not depend on the particular classifier that will be used later, since they act just on the numerical characteristics of the MS data. Therefore, various classifiers could be trained and tested on the preprocessed data.
| 3 BINNING OF RAW MS DATA |
|---|
In the first step, binning of raw MS data is performed (Fig. 1). Since the length of the observed m/z sequence varies in the raw MS data, we prepared the data as follows:
- Align the study sets according to the sorted union of m/z ratios into an intensity frame with missing data.
- To reduce the dimensionality, we binned the frame, at a given bin length l > 0, into a matrix A of m-by-n, where n = 216 (121 ovarian cancer samples and 95 control samples) and m is determined by l (Table 1). Each bin is an interval of the form [b,b+l], where b,l
+. For the binned data, without ambiguity, the m/z ratio stands for the left boundary of an interval. After binning with b
, l = 1, the dimension is reduced from 373 401 to 11 301.
|
|
The missing data are ignored in binning. Using 01 cost and 10-fold cross-validation, the classification tree achieves a precision of 8590% on the binned MS data. But since CART is a greedy algorithm that decreases the Gini impurity the most at each step (Duda et al., 2001), in general it does not guarantee an optimal reduction of entropy. The precision of the decision tree is regarded as a reference base line for the ovarian cancer diagnosis.
| 4 KS-TEST BASED FEATURE SELECTION |
|---|
For each m/z ratio ri, we compare the distributions of values in data vectors Xi = (xi,1,...,xi,k) and
by a two-sided KS-test (i.e., the null hypothesis H0 is that Xi and
have the same distribution) with a given significance level
. For instance with
= 5%, H0 cannot be rejected at m/z = 703, but rejected at m/z = 8000. The dimension of feature space is reduced to 8094 after choosing only the features that do not pass the KS-test at 5% level.
More flexible feature selection can be based on the reasonably accurate p-values that are guaranteed by
, where ni and
are the sample sizes of Xi and
respectively (Lehmann, 1975). As a comparison, Figure 2a shows the distributions of raw m/z ratios selected by KS-tests with distinct p-values (ignoring the missing intensity observations) which are quite different from those of the binned data plotted in Figure 2(b).
|
As Liotta et al. (2003) pointed out, the traditional single biomarker for a particular cancer makes little sense from a biological perspective, with poor identification of early-stage cancer and the benign. For instance, the widely used biomarker of cancer antigen 125 (CA125) for ovarian cancer can only detect 5060% of patients with stage I ovarian cancer. In contrast to the traditional way, the LMW biomarkers foreshow a satisfiable clinical application to early diagnosis of ovarian cancer (Petricoin et al., 2002a, b). From the viewpoint of pattern recognition, the biomarkers are those key features that allow a well-done classification. Usually, classification and feature selection are entangled. To avoid the over-fitting problem, several trials of feature selection are suggested, independent of classifiers as much as possible. Also, in our opinion, the biomarkers could be many selected binned m/z ratios, not necessarily particular m/z ratios if they are not able to yield a satisfiable result (Diamandis, 2004).
| 5 RESTRICTION OF COEFFICIENT OF VARIATION |
|---|
For a positive random variable X, the coefficient of variation (CV) is defined as c = sd(X)/E(X), which can be estimated by
where s and
are the sample standard deviation and sample mean respectively. The m/z ratio with relatively small CV is considered as a useful feature for the classification. The CV of intensity for the healthy and cancerous will be considered separately. Given CV thresholds of intensity, for instance tH = 0.4, tC = 0.4 for the healthy and cancerous, the feature space dimension is reduced from the second to the fourth column in Table 2.
|
The estimated distributions of binned m/z ratios selected by the KS-test with distinct p-values and a CV restriction of t = tH = tC = 0.4 are illustrated in Figure 2b. We suggest a threshold such that 8595% of the binned m/z ratios are included by the respective empirical cumulative distribution function (CDF) of CV for control and cancer. By empirical CDFs of CV, one can choose the probabilities for control and cancer in advance, then get the corresponding CV thresholds (Fig. 3). In the following, we will set p = 0.05 and tH = tC = 0.4, which reduces the dimension of the feature space from the original 373 401 to 6757.
|
Alternatively, for normally distributed n control (or cancer) intensities at a particular binned m/z ratio, the point estimate of c can be replaced by a (1
)-confidence upper bound
conservatively, determined by the following equation:
![]() | (1) |
is the non-central t distribution function with freedom degree n 1 and non-central parameter
. Especially when
0.3, by McKay's theorem (McKay, 1932),
![]() | (2) |
)-confidence upper bound
![]() | (3) |
| 6 WAVELET TRANSFORMATION |
|---|
Wavelet analysis has achieved a broad and successful application to pattern recognition in the last decade, such as in image compression, turbulence or earthquake prediction. It is also an efficient way to compress self-similar data, localizing a signal in both time and frequency (Chui, 1992; Daubechies, 1992). Compared with Fourier transformation, wavelets have advantages in analyzing physical situations where the signal contains discontinuities and sharp spikes. Recently, there is a growing interest in applying wavelet analysis to biomolecular related signals (Liò, 2003).
After applying the pyramidal algorithm (Mallat, 1989) of discrete wavelet transformation (DWT, linear complexity), the binned selected MS spectrum is compressed further to a 3382-dimensional vector of approximation coefficients, which contains most key information for classification (Fig. 4). Considering the dimension reduction, the undecimated DWT without restriction on the length of signal (traditionally it must be a power of 2) is not preferred, although it is superior to the ordinary one in many statistical applications (Nason et al., 2000). The mother wavelet used in the experiment is Daubechies family (db4) and the boundary values are symmetrically padded. The vector of approximation coefficients acts as a fingerprint of the original raw MS data, with a compression rate of more than 100. What is more, the two samples are separated in a manner of keeping the main discriminatory information for classification. Theoretically, a heavier compression rate can be achieved, at the risk of losing some useful information, by choosing a higher level of approximation coefficients.
|
When applying the procedure recursively, a KS-test on the dataset of wavelet approximation coefficients, with a significance level
= 5%, can hardly reduce the dimension of the feature space any more, neither does the reasonable restriction of CV. Therefore, in a sense of data reduction, the vector of approximation coefficients inherits the key discriminatory traits of MS data.
For the high-resolution ovarian data, the vector of detail coefficients contains almost no information for the healthy, since SVMs identify all the data as cancers. In contrast, the detail coefficients calculated from the low-resolution ones lead to acceptable results (Table 3). Without KS-test based feature selection and restriction of CV, the original low-resolution ovarian dataset 8-7-02 (91 controls versus 162 cancers, http://ncifdaproteomics.com/lowresovarian.php) with 15 154 features are well classified by SVMs of Gaussian kernel parameterized by
= 2, C = 1 (Section 4). Compressed by db2, even the data of detail coefficients yield a good result, especially on the identification of ovarian cancer.
|
| 7 CLASSIFICATION BY SVM |
|---|
The SVM method is a widely used classification method of Statistical Learning Theory, originally started by Vapnik and Chervonenkis in the 1960s (Vapnik, 1998). In case that the training set S is linearly separable, the support vector (SV) classifier (Burges, 1998; Cortes and Vapnik, 1995; Cristianini and Shawe-Taylor, 2000; Schölkoft, 1997) is the hyperplane
(where w
m,b
) with the maximal margin separating the two classified subsamples of S. Generally in the linearly non-separable case, we reach at a soft margin allowing training errors (Shawe-Taylor and Cristianini, 2000), where the classifier H is the solution of the optimization problem that is solved by the method of quadratic programming.
Another approach to the linearly non-separable case is the kernel method (Shawe-Taylor and Cristianini, 2004; Schölkoft and Smola, 2002), by which the training data are mapped into a higher dimensional feature space
(may be infinite) and become more separable, provided that the map
makes
a kernel function guaranteed by the Mercer's condition (Vapnik, 1998).
Besides these techniques, there are still many significant methods of SVM in the published literature that are far more than what it would be appropriate to include here. Quite a few linkages to free SVM softwares or packages, implemented in C (or C++), Fortran, Java, Perl, R and MatLab are available at http://www.support-vector.net/software.html, for instance some popular ones like Joachims' SVMlight (Joachims, 1999) and Lin's libSVM (Lin, 1999).
Applied to the vectors of normalized wavelet approximation coefficients, the Gaussian radial basis function K(xi,xj) = exp(
||xi xj||2) was adopted as the kernel of the non-linear SV classifier.
The performance of classification, for a more comprehensive review, was examined by (1) k-fold cross-validation, where k = 2,3,...,10; (2) k-fold proportional validation: randomly select 100(1 1/k)% controls and 100(1 1/k)% cancers as the training set and test the classifier on the remaining samples; and (3) leave-one- out cross-validation. For each k-fold validation, the random experiment was repeated 1000 times independently (Fig. 5).
|
In leave-one-out cross-validation, six controls and two cancers were misclassified (NCI's high-resolution ovarian data contain 95 controls and 121 cancers). In addition, by the comparison of standard deviation between the classification precisions, all the k-fold (cross, proportional) validations show that the SV classifier is relatively stable to the cancer samples but mutable to those controls (Table 4), which coincides with our intuition about the nature of noise in control data. For each k-fold validation, the average precisions are estimated by the results of 1000 independent random experiments. Obviously, k-fold proportional validation is stricter than k-fold cross-validation, and better at surveying the robustness of classifier for each category. For instance, the worst specificity and sensitivity in 2-fold cross-validations were 85.96 and 90.30%, respectively, while for 2-fold proportional validations they were 76.60 and 86.67% [Fig. 5(a)]. Moreover, the fact of bigger and bigger deviations of control precision in k-fold proportional validations indicates that the SVM overfitting problem seems more serious for the control samples. The same thing also happens to the SV classification on the more reduced data by principal component analysis (PCA) discussed in the next section.
|
Another compelling application is the bagging (Bauer and Kohavi, 1999) of one-hidden-layer neural network. In leave-one-out cross-validation, each training set is resampled 100 times to estimate the output (Fig. 6, Table 5). Totally, only one cancer and four controls are misclassified. However, its complexity is inferior to SVMs.
|
|
Besides decreasing the computational complexity, the procedure of raw MS data
binned MS data
KS-test based feature selection
restriction of CV
wavelet analysis depurates the original data, and explores their category traits for the coming classification. Some other classifiers are able to benefit from the procedure as shown in the next section. | 8 ALTERNATIVE CLASSIFIERS |
|---|
The method illustrated in the previous section leads to a good result in classification, especially when dealing with cancer samples. In an effort to reach a better precision of identifying controls, we tested several algorithms (voted perceptron, discriminant analysis, decision trees, naïve Bayes, some meta learning schemes like bagging and decorate, random forest) on the preprocessed data as described in Sections 2 and 3, but the precisions were generally inferior to those obtained by SVMs (results are shown in Table 6).
|
Since the high number of features (3382) could affect the performance of the mentioned classification algorithms, we performed a PCA on the preprocessed data, in order to further decrease their dimensionality. Since finding the optimal number of components for accurate classification is a non-trivial task, we applied several algorithms to the PCA-reduced dataset with a different amount of components, and at last selected the first nine as the coordinates of the updated feature space, which explain 90% variance of the data.
In leave-one-out cross-validation, nine controls and two cancers are misclassified by the SVM with parameters
= 1.7 and C = 0.7. The other experimental results of this SVM are reported in Table 7. If the training set is large enough, for instance as big as those in 4-fold cross-validation, the PCA reduction affects the distribution of sensitivity little. But it is not the case for specificity (Fig. 7).
|
|
The classifier of Voted Perceptron (VP) (Freund and Schapire, 1999), a method of combining Rosenblastt's perceptron algorithm (Duda et al., 2001) with Helmbold and Warmuth's leave-one-out method (Helmbold and Warmuth, 1995), is also an approach to small sample analysis, taking advantage of the boundary data of largest margin just as SVM does. Using the more reduced data projected on the first nine components, VP achieves a best performance in k-fold cross-validations, compared with quadratic discriminant analysis (QDA), linear discriminant analysis (LDA), Mahalanobis discriminant analysis (MDA), k-NN, naïve Bayes (NB), bagging (bootstrap aggregating), ADtree and J48tree (Table 8). When the pooled covariance matrix of training data is not positive definite, a common solution is to randomly perturb the training data.
|
| 9 IMPLEMENTATION |
|---|
The software was implemented by OSU SVM toolbox for MATLAB, based on Dr Lin's libSVM-v.2.33. Most of the other classification algorithms were taken or re-adapted from the versions present in WEKA (Witten and Frank, 2000).
| 10 DISCUSSION |
|---|
We have developed an efficient method for dimensionality reduction from MS data based on a 4-step strategy: (1) binning; (2) two-sample KS test, (3) restriction of coefficient of variation and (4) wavelet analysis. By efficient preprocessing of high-resolution ovarian MS data, SVMs achieve a satisfiable performance of identifying cancer and the healthy. On the one hand, data preprocessing reduces the dimension of feature space; on the other hand it extrudes the most significant category traits for the coming classification.
Although low-resolution data also lead to a high precision in identifying cancers, a recent study (Baggerly et al., 2004) pointed out many of the problems that seem to characterize this dataset, partly connected with experimental procedure and design, partly with technical problems (especially calibration issues). One of the most relevant observations in our experiments is the fact that it is possible to get very good classification results employing just detail coefficients of DWT. This can either mean that noise is in fact still enough for classification purposes, or that the whole data is affected by some form of corruption that prevents achieving a perfect classification.
The classifier-independent data preprocessing of proteomic MS data shows a promising approach to the coming classification. More robust classifiers (such as Bayesian SVM and Bayesian neural network) are still urgently needed, as well as their ensemble. In addition, the precisions could be further improved by some resampling method (Gelman et al., 2004), which assigns every testing sample point a probability of being cancer.
| Acknowledgments |
|---|
This work was supported by the Austrian GEN-AU project BIN (Bioinformatics Integration Network), ÖAD (Austrian Exchange Service) and the EU Marie Curie Training Site grant Genomics of Lipid Metabolism.
| Footnotes |
|---|
1See http://home.ccr.cancer.gov/ncifdaproteomics/ppatterns.asp for details.
Received on September 14, 2004; revised on March 1, 2005; accepted on March 1, 2005
| REFERENCES |
|---|
Anderson, D.C., et al. (2003) A new algorithm for the evaluation of shotgun peptide sequencing in proteomics: support vector machine classification of peptide MS/MS spectra and SEQUEST scores. J. Proteome Res., 2, 137146[CrossRef][ISI][Medline].
Baggerly, K.A., et al. (2004) Reproducibility of SELDI-TOF protein patterns in serum: comparing datasets from different experiments. Bioinformatics, 20, 777785
Bao-Ling, A., et al. (2003) Serum protein fingerprinting coupled with a pattern-matching algorithm distinguishes prostate cancer from benign prostate hyperplasia and healthy men. Cancer Res., 62, 36093614.
Bauer, E. and Kohavi, R. (1999) An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Machine Learning, 36, 105139[CrossRef].
Burges, C.J.C. (1998) A tutorial on support vector machines for pattern recognition. Data Mining Knowledge Discov., 2, 121167[CrossRef].
Chui, C.K. An Introduction to Wavelets, (1992) , New York Academic Press.
Conrads, T.P., et al. (2004) High-resolution serum proteomic features for ovarian cancer detection. Endocrine-Related Cancer, 11, 163178[Abstract].
Cortes, C. and Vapnik, V.N. (1995) Support vector networks. Machine Learning, 20, 273297.
Cristianini, N. and Shawe-Taylor, J. An Introduction to Support Vector Machines (and Other Kernel-Based Learning Methods), (2000) , Cambridge Cambridge University Press.
Daubechies, I. Ten Lectures on Wavelets, (1992) , Philadelphia SIAM.
Diamandis, E.P. (2004) Mass spectrometry as a diagnostic and a cancer biomarker discovery tool: opportunities and potential limitations. Mol. Cell. Proteomics, 3, 367378
Duda, R.O., Hart, P.E., Stork, D.G. Pattern Classification, (2001) , New York John Wiley & Son, Inc.
Freund, Y. and Schapire, R.E. (1999) Large margin classification using the perceptron algorithm. Machine Learning, 37, 277296[CrossRef].
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B. Bayesian Data Analysis, (2004) 2nd edn Chapman & Hall/CRC Press.
Helmbold, D.P. and Warmuth, M.K. (1995) On weak learning. J. Comput. Syst. Sci., 50, 551573[CrossRef].
Joachims, T. (1999) Making large-scale SVM learning practical. In Schölkoft, B., Burges, C.J.C., Smola, A.J. (Eds.). Advances in Kernel MethodsSupport Vector Learning, MIT Press, pp. 169184.
Lehmann, E.L. Nonparametrics: Statistical Methods Based on Ranks, (1975) , San Francisco Hoden-Day.
Lilien, R.H., et al. (2003) Probabilistic disease classification of expression-dependent proteomic data from mass spectrometry of human serum. J. Comput. Biol., 10, 925946[CrossRef][Medline].
Technical report Lin, C.J. (1999) Formulations of support vector machines: a note from an optimization point of view. National Taiwan University, Dept. of Computer Science.
Liò, P. (2003) Wavelets in bioinformatics and computational biology: state of art and perspectives. Bioinformatics, 19, 29
Liotta, L.A., et al. (2003) Clinical proteomics: written in blood. Nature, 425, 905[CrossRef][Medline].
Mallat, S.G. (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Pattern Anal. Machine Intell., 11, 674693[CrossRef].
McKay, A.T. (1932) Distribution of the coefficient of variation and the extended t distribution. J. R. Statist. Soc., 95, 695698.
Nason, G.P., et al. (2000) Wavelet processes and adaptive estimation of the evolutionary spectrum. J. R. Statist. Soc., 62, 271292[CrossRef].
Petricoin, E.F. and Liotta, L.A. (2003) Mass spectrometry-based diagnostics: the upcoming revolution in disease detection. Clin. Chem., 49, 533534
Petricoin, E.F. and Liotta, L.A. (2004) SELDI-TOF-based serum proteomic pattern diagnostics for early detection of cancer. Curr. Opin. Biotechnol., 15, 2430[CrossRef][ISI][Medline].
Petricoin, E.F., et al. (2002a) Use of proteomic patterns in serum to identify ovarian cancer. Lancet, 359, 572577[CrossRef][ISI][Medline].
Petricoin, E.F., et al. (2002b) Proteomic patterns in serum and identification of ovarian cancer. Lancet, 360, 170171[ISI][Medline].
Qu, Y., et al. (2002) Boosted decision tree analysis of surface-enhanced laser desorption/ionization mass spectral serum profiles discriminates prostate cancer from noncancer patients. Clin. Chem., 48, 18351843
Schölkoft, B. Support Vector Learning, (1997) , Munich R. Oldenbourg Verlag.
Schölkoft, B. and Smola, A.J. Learning with Kernels, (2002) MIT Press.
Shawe-Taylor, J. and Cristianini, N. (2000) Margin distribution and soft margin. In Smola, A.J., Bartlett, P.L., Schölkoft, B., Schuurmans, D. (Eds.). Advances in Large Margin Classifiers, MIT Press, pp. 349358.
Shawe-Taylor, J. and Cristianini, N. (2004) Kernel Methods for Pattern Analysis. , Cambridge Cambridge University Press.
Vapnik, V.N. Statistical Learning Theory, (1998) , New York John Wiley & Son, Inc.
Vlahou, A., et al. (2003) Diagnosis of ovarian cancer using decision tree classification of mass spectral data. J. Biomed. Biotechnol., 5, 308314.
Witten, H. and Frank, E. Data Mining: Practical Machine Learning Tools with Java Implementations, (2000) , San Francisco Morgan Kaufmann.
Wu, B., et al. (2003) Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data. Bioinformatics, 19, 16361643
Wulfkuhle, J.D., et al. (2003) Proteomic applications for the early dectection of cancer. Nature, 3, 267275.
Yu, J.K., et al. (2004) An integrated approach to the detection of colorectal cancer utilizing proteomics and bioinformatics. World J. Gastroenterol., 10, 31273131[Medline].
Zhu, W., et al. (2003) Detection of cancer-specific markers amid massive mass spectral data. PNAS, 100, 1466614671
This article has been cited by other articles:
![]() |
M. Hilario and A. Kalousis Approaches to dimensionality reduction in proteomic biomarker studies Brief Bioinform, March 1, 2008; 9(2): 102 - 118. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Saeys, I. Inza, and P. Larranaga A review of feature selection techniques in bioinformatics Bioinformatics, October 1, 2007; 23(19): 2507 - 2517. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Baumgartner and D. Baumgartner Biomarker Discovery, Disease Classification, and Similarity Query Processing on High-Throughput MS/MS Data of Inborn Errors of Metabolism J Biomol Screen, February 1, 2006; 11(1): 90 - 99. [Abstract] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||








(Control SVs)/



