Bioinformatics Advance Access originally published online on February 15, 2005
Bioinformatics 2005 21(10):2403-2409; doi:10.1093/bioinformatics/bti324
Boosting proportional hazards models using smoothing splines, with applications to high-dimensional microarray data
1Rowe Program in Human Genetics, University of California Davis, CA 95616, USA
2School of Mathematics and Systematic Sciences, Shandong University Jinan, Shandong 250100, PRC
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: An important area of research in the postgenomics era is to relate high-dimensional genetic or genomic data to various clinical phenotypes of patients. Due to large variability in time to certain clinical events among patients, studying possibly censored survival phenotypes can be more informative than treating the phenotypes as categorical variables. Due to high dimensionality and censoring, building a predictive model for time to event is more difficult than the classification/linear regression problem. We propose to develop a boosting procedure using smoothing splines for estimating the general proportional hazards models. Such a procedure can potentially be used for identifying non-linear effects of genes on the risk of developing an event.
Results: Our empirical simulation studies showed that the procedure can indeed recover the true functional forms of the covariates and can identify important variables that are related to the risk of an event. Results from predicting survival after chemotherapy for patients with diffuse large B-cell lymphoma demonstrate that the proposed method can be used for identifying important genes that are related to time to death due to cancer and for building a parsimonious model for predicting the survival of future patients. In addition, there is clear evidence of non-linear effects of some genes on survival time.
Contact: hli{at}ucdavis.edu
| 1 INTRODUCTION |
|---|
With the completion of sequencing the human genome and with the development of high-throughput technologies, we are now able to obtain information about an individual's entire genome or the entire genomic profile of a tumor. Very high-dimensional genetic and genomic data are being generated in pharmaceutical industries and in biomedical and clinical research. Examples of high-throughput data include genome-wide SNP data, microarray-based gene-expression data and proteomic data. It is believed that understanding an individual's genetic makeup or an individual tumor's genomic profile would provide the key for explaining many clinical variations. For example, with DNA microarray technology, one can simultaneously measure expression levels for thousands of genes in cancer tissues, which offers the possibility of a powerful, genome-wide approach to study the genetic basis of different types of tumors. Several studies (Golub et al., 1999; Rosenwald et al., 2002) have demonstrated great success in predicting cancer class using gene-expression data. Different classes of cancer may correspond to different clinical outcomes of a given treatment. In addition, studies also demonstrated that additional predictive power can be obtained by incorporating genomic information in addition to the traditional predictive factors such as tumor grades, sizes and stages (Rosenwald et al., 2000).
Due to large variability in time to a clinical event such as cancer recurrence among cancer patients, studying possibly censored survival phenotypes can be more informative than treating the phenotypes as binary or categorical variables. Since the follow-up time is limited, some patients' exact survival times cannot be measured. For these patients, we only have their right-censored survival times. The emphasis of this paper is to develop methods for predicting a patient's time to clinical events using high-dimensional genetic or genomic data and to identify important genes and their effects on the risk of the event. The most popular method in regression analysis for censored survival data is the Cox regression model (Cox, 1972). However, due to the very high-dimensional space of the predictors, e.g. the genes with expression levels measured by microarray experiments, the standard maximum Cox partial likelihood method cannot be applied directly to obtain parameter estimates. There are two main solutions in the literature so far. One approach is based on dimension reduction such as singular value decomposition or the partial Cox regression (Bair and Tibshirani, 2004; Li and Gui, 2004). The other approach is to use the penalized partial likelihood, including both the L2 and the L1 penalization and threshold gradient descent methods (Li and Luan, 2003; Gui and Li, 2004, http://repositories.cdlib.org/cbmb/L1Cox; Gui and Li, 2005). Many of these procedures have been demonstrated to work well in building predictive models for survival of a future patient. However, it should be pointed out that all the methods listed here assume a simple linear risk score function for the predictive variables in the Cox proportional hazards model. A simple linearity assumption in the model may not fit the gene-expression data well and such an assumption is very hard to verify in high-dimensional settings.
Although methods are available in the biostatistics literature for estimating the score function non-parametrically (Hastie and Tibshirani, 1990; Gentleman and Crowley, 1991; O'Sullivan, 1988; Fan et al., 1997), these methods only apply in the very low-dimensional covariate space.
Boosting is one of the most successful and practical methods in the machine learning field (Schapire, 1990; Freund, 1995; Freund and Schapire, 1997) for building predictive models and has gained great popularity in the area of bioinformatics (Dettling and Bühlmann, 2003). Boosting was originally proposed as a multiple prediction and aggregation method for classification: a fitting method, called the base learner, is fitted multiple times on reweighted data and the final boosting estimator is then constructed via a linear combination of such multiple estimates (Schapire, 1990; Bühlmann, 2003). Friedman et al. (2000) showed that the boosting procedure is an optimization method for finding a classifier minimizing a particular exponential loss function, and Friedman (2001) proposed a general boosting framework for regression settings. Such an insight opens the door for many possible extensions of the boosting method. Bühlmann and Yu (2003) proposed a novel componentwise boosting procedure based on cubic smoothing splines for linear regression and logistic regression models with L2 loss functions and demonstrated that the boosting procedure works well in very high-dimensional settings (Bühlmann, 2003; Bühlmann, 2004). Most of these new developments in boosting methods are for classification and linear regression problems, where censoring is not an issue. In this paper, we propose to develop a boosting procedure with cubic smoothing splines for building a potentially non-linear predictive Cox proportional hazards model and for identifying potentially non-linear functional forms for the predictive variables. We apply the general gradient boosting procedure of Friedman (2001) to the general Cox proportional hazards model and provide a specific algorithm for fitting such models.
The rest of the paper is organized as follows. We first present the general Cox proportional hazards model and the gradient boosting algorithm using smoothing splines for estimating the predictive function non-parametrically. We then present simulation results and results of the application to a real microarray gene-expression dataset. We finally give a brief discussion of the methods and results.
| 2 STATISTICAL MODELS AND METHODS |
|---|
2.1 Cox proportional hazards model
Let us suppose that we have a sample of size n from which to estimate the relationship between the survival time and the genetic/genomic profiles such as the gene-expression levels $X_1,\ldots,X_p$ of p genes. Owing to censoring, for $i = 1,\ldots,n$, the i-th datum in the sample is denoted by $(t_i, \delta_i, x_{i1}, x_{i2}, \ldots, x_{ip})$, where $\delta_i$ is the censoring indicator, $t_i$ is the survival time if $\delta_i = 1$ or the censoring time if $\delta_i = 0$, and $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})'$ is the vector of the genetic/genomic profiles of p genes for the i-th sample. Our aim is to build the following Cox regression model for the hazard of cancer recurrence or death at time t,

$$\lambda(t \mid X) = \lambda_0(t)\exp\{F(X)\}, \tag{1}$$

where $\lambda_0(t)$ is the unspecified baseline hazard function and $F(X)$, called the risk score function, is an unknown p-dimensional smooth function of the gene-expression levels $X = (X_1,\ldots,X_p)$, satisfying the condition $F(0) = 0$, which makes model (1) identifiable. In practice, one often assumes a particular functional form for $F(X)$. For example, $F(X) = \psi(\beta'X)$, in which case the hazard function becomes

$$\lambda(t \mid X) = \lambda_0(t)\exp\{\psi(\beta'X)\}, \tag{2}$$

where $\psi(\cdot)$ is a one-dimensional link function. The standard Cox model corresponds to the case in which the link function in model (2) is the identity. When $\psi(\cdot)$ is known, model (2) can be treated as parametric and $\beta$ can be estimated by maximizing the likelihood function or partial likelihood function (Cox, 1972). However, one should note that in a high-dimensional space it is very difficult to specify the functional form of $\psi(\cdot)$ in model (2). The main goal of this paper is to extend the boosting procedure of Friedman (2001) to the censored data regression model (1) in order to estimate the function $F(X)$ non-parametrically. Such a procedure is important in high-dimensional regression settings, providing a flexible way to model the covariate effects.
2.2 The smoothing spline based boosting algorithm
We propose to use the boosting procedure for estimating the function $F(X)$ in model (1) non-parametrically. Boosting is essentially an iterative procedure that updates function estimators successively. Friedman (2001) developed a novel general framework, called the Gradient Boosting Machine, to obtain additive expansions adapted to any fitting criterion. The framework is quite general and works for various models. The algorithm involves initialization, projection of the gradient onto the learner, line search and iteration steps. For the general Cox model we consider, we use the negative of the log-partial likelihood as the loss function. Therefore, minimizing the loss function is equivalent to maximizing the partial likelihood. By using the partial likelihood, we separate the problem of estimating $F(x)$ from estimating the baseline hazard function $\lambda_0(t)$ in model (1). Following Bühlmann and Yu (2003), we use componentwise cubic smoothing splines as the base learner in the projection step. In the following, we provide some details of the boosting algorithm for the Cox model using cubic splines.
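Suppose we have collected samples $\{(t_i, \delta_i, x_i): i = 1,\ldots,n\}$ from model (1), where $t_i \in R^1$, $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})' \in R^p$, $\delta_i \in \{0,1\}$ and $p \ge 1$ is the dimension of the predictors. To estimate the function $F(x): R^p \to R^1$, we treat the negative of Cox's log-partial likelihood

$$L(F) = -\sum_{i=1}^{n} \delta_i \Big\{ F(x_i) - \log\Big( \sum_{j:\, t_j \ge t_i} \exp\{F(x_j)\} \Big) \Big\} \tag{3}$$

as the loss function and minimize it by the following algorithm.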
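As an illustration, the negative log-partial likelihood (3) can be evaluated directly from its definition. The following is a minimal Python sketch, not the authors' implementation: the function name and argument layout are assumptions, tied times are handled by placing every subject with $t_j \ge t_i$ in the risk set, and the quadratic-time loop favors clarity over speed.

```python
import math

def neg_log_partial_likelihood(times, events, scores):
    """Negative Cox log-partial likelihood for risk scores F(x_i).

    times  : observed survival or censoring times t_i
    events : censoring indicators delta_i (1 = event, 0 = censored)
    scores : current values of the risk score F(x_i)
    """
    loss = 0.0
    for t_i, d_i, f_i in zip(times, events, scores):
        if d_i == 1:
            # Risk set: subjects still under observation at time t_i.
            risk_sum = sum(math.exp(f_j)
                           for t_j, f_j in zip(times, scores) if t_j >= t_i)
            loss -= f_i - math.log(risk_sum)
    return loss
```

For three uncensored subjects with distinct times and all scores equal to zero, the risk sets have sizes 3, 2 and 1, so the loss is log 3 + log 2 + log 1 = log 6.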
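The algorithm

Initialization. Set $F^{(0)}(x) = 0$, $k = 0$, and pre-choose a small positive number $\nu$, called the learning rate. Usually $\nu = 0.05$ or $0.01$.

Step 1: Calculating the gradient. Compute $U_i$, $i = 1,\ldots,n$, the negative gradient directions of loss (3) at the samples with respect to the current estimator $F^{(k)}(x)$:

$$U_i = -\frac{\partial L}{\partial F(x_i)}\bigg|_{F = F^{(k)}} = \delta_i - \sum_{j=1}^{n} \delta_j \frac{I(t_i \ge t_j)\exp\{F^{(k)}(x_i)\}}{\sum_{m:\, t_m \ge t_j} \exp\{F^{(k)}(x_m)\}}.$$

Step 2: Fitting the gradient using univariate splines. Select an index $l_k \in \{1,\ldots,p\}$ such that the $l_k$-th component of the predictors best explains the directions $U_i$. More specifically, $l_k$ is chosen by the following equation:

$$l_k = \arg\min_{1 \le l \le p} \sum_{i=1}^{n} \{U_i - \hat g(x_{il}; l)\}^2,$$

where $\hat g(\cdot\,; l)$ is the conventional one-dimensional cubic smoothing spline fit to the responses $\{U_i : i = 1,\ldots,n\}$ using predictors $\{x_{il} : i = 1,\ldots,n\}$.

Step 3: Line search. Fit a linear proportional hazards model to the response $(t_i, \delta_i)$ with predictor $\hat g(x_{il_k}; l_k)$ and offset $F^{(k)}(x_i)$. The fitted regression coefficient is denoted by $\rho_k$.

Step 4: Updating the function. Update the boosted estimator of the score function, $F^{(k)}(\cdot) \to F^{(k+1)}(\cdot)$, by

$$F^{(k+1)}(x) = F^{(k)}(x) + \nu \rho_k g_k(x),$$

where $g_k(\cdot)$ denotes the function $g_k(x) = \hat g(x_{l_k}; l_k)$ and $x_{l_k}$ is the $l_k$-th component of $x \in R^p$. Then set $k = k + 1$.

Step 5: Iteration. Go to Step 1.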
In the function updating step, a learning rate parameter $\nu$ is introduced to regularize the resulting function estimate through shrinkage. At the k-th step, we obtain the estimate of the function, $F^{(k)}(X)$, which is a non-parametric additive function of each component of X, some of which are identically zero. It should be noted that when the iteration k increases by 1, one more term is added to the fitted function; however, this term may involve a component that is already in the model. Due to the dependence of this new term on the previous terms, the complexity of the fitted model is not increased by a constant amount.
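The control flow of the algorithm (gradient, componentwise selection, line search, shrunken update) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the base learner `fit_line` is a least-squares line swapped in for the cubic smoothing spline so the example needs no external library, and the line search is a one-coefficient Cox fit by Newton steps with step halving. Replacing `fit_line` with a smoothing-spline fit would give the componentwise spline version described in the text.

```python
import math

def cox_loss(times, events, scores):
    """Negative log-partial likelihood of the current risk scores."""
    loss = 0.0
    for t_i, d_i, f_i in zip(times, events, scores):
        if d_i == 1:
            risk = sum(math.exp(f) for t, f in zip(times, scores) if t >= t_i)
            loss -= f_i - math.log(risk)
    return loss

def negative_gradient(times, events, scores):
    """Step 1: negative gradient of the loss with respect to each F(x_i)."""
    n = len(times)
    w = [math.exp(f) for f in scores]
    u = [float(d) for d in events]
    for j in range(n):
        if events[j] == 1:
            denom = sum(w[m] for m in range(n) if times[m] >= times[j])
            for i in range(n):
                if times[i] >= times[j]:
                    u[i] -= w[i] / denom
    return u

def fit_line(x, u):
    """Stand-in base learner: a least-squares line, NOT a smoothing spline."""
    n = len(x)
    mx, mu = sum(x) / n, sum(u) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - mu) for a, b in zip(x, u)) / sxx if sxx > 0 else 0.0
    return lambda v: mu + slope * (v - mx)

def line_search(times, events, g, offset, iters=25):
    """Step 3: one-coefficient Cox fit with an offset, by damped Newton steps."""
    n = len(times)
    loss_at = lambda r: cox_loss(times, events, [o + r * gi for o, gi in zip(offset, g)])
    rho, current = 0.0, loss_at(0.0)
    for _ in range(iters):
        w = [math.exp(o + rho * gi) for o, gi in zip(offset, g)]
        score = info = 0.0
        for i in range(n):
            if events[i] == 1:
                risk = [m for m in range(n) if times[m] >= times[i]]
                s0 = sum(w[m] for m in risk)
                s1 = sum(w[m] * g[m] for m in risk) / s0
                s2 = sum(w[m] * g[m] ** 2 for m in risk) / s0
                score += g[i] - s1
                info += s2 - s1 ** 2
        if info <= 1e-12 or abs(score) < 1e-10:
            break
        step = score / info
        # Halve the Newton step until the loss no longer increases.
        while loss_at(rho + step) > current and abs(step) > 1e-12:
            step /= 2.0
        if loss_at(rho + step) > current:
            break
        rho, current = rho + step, loss_at(rho + step)
        if abs(step) < 1e-8:
            break
    return rho

def boost_cox(X, times, events, n_steps=50, nu=0.05):
    """Run a fixed number of boosting steps; nu is the learning rate."""
    n, p = len(X), len(X[0])
    scores = [0.0] * n                                   # F^(0)(x_i) = 0
    terms, losses = [], []
    for _ in range(n_steps):
        u = negative_gradient(times, events, scores)     # Step 1
        best = None
        for l in range(p):                               # Step 2
            col = [row[l] for row in X]
            g = fit_line(col, u)
            rss = sum((ui - g(v)) ** 2 for ui, v in zip(u, col))
            if best is None or rss < best[0]:
                best = (rss, l, [g(v) for v in col])
        _, l_k, fitted = best
        rho = line_search(times, events, fitted, scores)  # Step 3
        scores = [f + nu * rho * gi for f, gi in zip(scores, fitted)]  # Step 4
        terms.append((l_k, nu * rho))
        losses.append(cox_loss(times, events, scores))
    return scores, terms, losses
```

Because the partial likelihood is concave in the line-search coefficient, moving only a fraction `nu` of the way towards the fitted coefficient cannot increase the loss, so the returned `losses` trace is non-increasing in this sketch.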
2.3 The stopping rule
Based on the samples, we could get a series of boosted estimators $\{F^{(k)}(x): k = 1,2,\ldots\}$ by implementing the above algorithm. For a suitably chosen k, $F^{(k)}(X)$ is treated as the boosted estimate of the score function $F(X)$. It is therefore necessary to stop the boosting procedure at a suitable step to avoid over-fitting. Here the number of boosting iterations k works as the smoothing or regularization parameter. Ideally, one should select the best step k and the optimal learning rate $\nu$ using cross-validation. Such a cross-validation procedure can be computationally quite intensive since it requires running the boosting procedure many times. In our analysis, we fix $\nu$ at a small value and determine k. As pointed out by Friedman (2001), the predictive performance is usually not too sensitive to the choice of $\nu$, although the number of iterations is affected. Decreasing the value of $\nu$ increases the best value for k. Based on our simulations, we found that the log-likelihood profile versus boosting steps first increases rapidly, flattens towards its maximum point, and then decreases afterwards. This may be due to the fact that when the estimator approaches the best, i.e. the maximized point for Equation (4), the gradient becomes nearly null. The update of the gradient direction, based only on the samples (finite dimension), approximates the theoretical gradient (infinite dimension). At or near the maximum point, the theoretical gradient is almost null; consequently the approximations are likely to point in the wrong direction, which results in a decrease of the partial likelihood function. Based on this observation, we use an alternative procedure for stopping the boosting iteration and choose the boosting step at which the loss function flattens markedly or starts to level off. This is done by examining the plot of the likelihood function against the number of iterations. Our empirical studies show that this procedure gives almost the same performance as cross-validation.
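The stopping step in the text is chosen by inspecting a plot. One way to automate a similar choice is sketched below in Python; the rule itself, the function name and the `window` and `tol` defaults are all assumptions made for illustration and are not part of the paper. The loss trace is called flat at step k when the decrease over the previous `window` steps is less than `tol` times the total decrease achieved so far.

```python
def choose_stopping_step(losses, window=10, tol=1e-3):
    """Return the first step (1-based) at which the loss curve has levelled off.

    losses[k-1] is the loss after boosting step k. If the curve never
    flattens, the last available step is returned.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    start = losses[0]
    for k in range(window, len(losses)):
        total_drop = start - losses[k]
        recent_drop = losses[k - window] - losses[k]
        # A negative recent_drop (loss rising again) also counts as flat.
        if total_drop > 0 and recent_drop < tol * total_drop:
            return k + 1
    return len(losses)
```

A rule of this kind is only a proxy for visual inspection; cross-validated partial likelihood remains the reference when it is computationally affordable.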
| 3 SIMULATIONS |
|---|
In this section, we present simulation studies to demonstrate the effectiveness of the boosting algorithm in estimating the functional forms, in selecting important predictors and in building predictive models. In all the examples, the learning rate $\nu$ is fixed at $\nu = 0.05$.
We simulated survival time based on model (1) assuming the following risk score function:
![]() |
![]() |
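For readers who wish to set up a comparable experiment, survival data can be drawn from model (1) by inversion when the baseline hazard is constant. The Python sketch below is an illustration only: the function names, the uniform covariate distribution, the exponential censoring mechanism and, in particular, the risk score `example_score` are hypothetical placeholders and are not the settings used in these simulations.

```python
import math
import random

def simulate_cox(n, p, risk_score, baseline_hazard=1.0, censor_rate=0.3, seed=0):
    """Draw (t_i, delta_i, x_i) from lambda(t|x) = lambda_0 * exp(F(x)).

    With a constant baseline hazard the survival time is exponential with
    rate lambda_0 * exp(F(x)); censoring times are independent exponentials.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        rate = baseline_hazard * math.exp(risk_score(x))
        event_time = rng.expovariate(rate)
        censor_time = rng.expovariate(censor_rate)
        t = min(event_time, censor_time)
        delta = 1 if event_time <= censor_time else 0
        data.append((t, delta, x))
    return data

def example_score(x):
    # Hypothetical non-linear score with F(0) = 0; only x[0] and x[1] matter.
    return 2.0 * x[0] ** 2 + math.sin(math.pi * x[1])
```

Calling `simulate_cox(200, 6, example_score)` returns 200 triples in which the last four covariates are pure noise, mirroring the idea of a few relevant variables among irrelevant ones.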
3.1 Estimating the true functional forms
We first examine how well the boosting procedure can recover the true functional forms fi(X), i = 1,...,4, for variables X1–X4. Figure 1 shows the estimates of the functions f(X1)–f(X4) for 50 replications. Note that the estimates are biased, which is expected since the boosting procedure is a regularized procedure. In general, we see that the estimates tend to be shrunk towards zero, as commonly observed in shrinkage estimates. However, the estimated curves indeed capture the true non-linear functional forms of the variables quite well. As a comparison, we also present in this figure the estimates of f(X5) and f(X6), which have true values of zero. For these functions, the estimated functions are also close to zero.
3.2 Predictive performance
In order to assess the predictive performance of the boosting procedure, for each simulated training dataset of size 200, we also generated a testing set of size 100. We built the model with the training datasets and estimated the risk scores for the testing datasets. The p-values for testing whether or not the estimated score is related to survival for the testing datasets range from $10^{-13}$ to 0.02 with a median value of $10^{-7}$, indicating that the predicted scores are highly predictive of the survival times in the testing datasets. Figure 2a shows the Kaplan–Meier curves for 10 randomly selected testing datasets, where each testing dataset was divided into high- and low-risk groups using the median of the scores estimated by the boosting procedure. Clear differences in survival can be observed.
As a comparison, for each training dataset, we built a linear predictive model using the four variables X1–X4, which are known to be related to the risk of event. We similarly estimated the risk scores for individuals in the testing dataset based on the predictive model derived from the training dataset. Note that this is the best linear model one can build since only relevant variables are used. The p-values for testing whether or not the estimated score is related to survival for the testing datasets range from $10^{-9}$ to 0.29 with a median value of 0.0002, indicating that the predicted scores are less predictive of the survival times than the scores predicted by the boosting procedure. Figure 2b shows the Kaplan–Meier curves for 10 randomly selected testing datasets, where the testing datasets were divided into high- and low-risk groups using the median of the scores estimated by the linear Cox model. The differences in survival curves are not as large as those shown in Figure 2a. This comparison clearly demonstrates that the Cox model with a linear risk score function may not work well where there is non-linearity, and that the smoothing spline based boosting procedure can capture this non-linearity and therefore improve the predictive performance.
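The evaluation used in this subsection, splitting a testing set at the median predicted score and comparing Kaplan–Meier curves, can be sketched as follows. This is a minimal Python illustration with assumed function names; it computes the product-limit estimate only and does not include the significance test reported in the text.

```python
def median_split(scores):
    """Label each subject high-risk (True) if its score exceeds the median."""
    ordered = sorted(scores)
    n = len(ordered)
    if n % 2:
        median = ordered[n // 2]
    else:
        median = 0.5 * (ordered[n // 2 - 1] + ordered[n // 2])
    return [s > median for s in scores]

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    curve, surv = [], 1.0
    for t in sorted(set(t for t, d in zip(times, events) if d == 1)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, events) if u == t and d == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve
```

In use, `kaplan_meier` would be called once on the subjects flagged `True` by `median_split` and once on the remainder, and the two curves plotted together.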
3.3 Selection of important variables
We finally investigate how the boosting procedure performs in selecting important variables from among a large number of predictors. Suppose the predictor X is p-dimensional, $X = (X_1,\ldots,X_p)' \in R^p$, and the approximation or estimator is $\hat F(X)$. Following Friedman (2001), we define the relative influence of input variable $X_j$ as

$$I_j = \left( E_X\left[\frac{\partial \hat F(X)}{\partial X_j}\right]^2 \cdot \mathrm{var}_X[X_j] \right)^{1/2}.$$
In order to examine whether the boosting procedure and the measurement (I) can be used to select the relevant variables that are related to survival risk, we generated 100 independent replicates based on the model described above with a sample size of 200, and for each replicate, we sorted the variables based on their influence scores Ij. Figure 3(a–d) shows the frequencies, over 100 replications, with which each gene appears among the top 3, 5, 7 and 10 genes with the highest influence scores. For example, with a probability over 70%, three out of four variables are among the top seven individual influence scores, two of them with a probability >0.98 (Fig. 3c). However, note that the fourth predictor, X4, was hardly selected. This is due to the fact that the signal-to-noise ratio of this particular variable is relatively low as compared with the other three variables. From Figure 1, we see that the true function f4(X4) looks more like the functions f5(X5) or f6(X6) than the first three functions [Fig. 1(a–c)].
| 4 APPLICATION TO DLBCL DATASET |
|---|
We applied the boosting procedure to the microarray gene-expression dataset reported in Rosenwald et al. (2002) to assess the performance of our procedure in real data analysis. The dataset contains a total of 240 patients with diffuse large B-cell lymphoma (DLBCL), including 138 deaths during follow-up. Gene-expression measurements of 7399 features are available for each patient. Among the 7399 genes, only 434 have no missing values across the 240 patients. We used the same nearest neighbor method as in Li and Gui (2004) and Gui and Li (2005) to fill in the missing expression levels.
4.1 Feature pre-selection
In order to assess the predictive performance of the method, we randomly split the 240 patients into a training set of 160 patients and a testing set of 80 patients, where the training dataset is used for building the predictive model and the testing dataset is used to evaluate the model. Since only a small set of genes is related to certain clinical phenotypes, one should expect that the original gene-expression data have a much lower inherent dimension and therefore dimension reduction prior to fitting the regression models is necessary. Since the boosting procedure for the Cox model incorporates multivariate feature selection, it does not critically depend on preliminary gene filtering or selection by univariate methods. Based on the training set, we calculated the partial score statistic for each of the 7399 features and chose the genes with the top scores as potential predictors. The number of genes to be used for building the model can be chosen using cross-validated partial likelihood. We tried various numbers of genes, including 25, 50 and 75 genes, and observed quite similar predictive results. In the following subsection, we give a detailed analysis using the 50 genes with the highest Cox scores.
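A univariate screening step of this kind can be sketched in Python as below. This is an assumption-laden illustration rather than the authors' code: it takes the "partial score statistic" to be the usual Cox score test statistic $U^2/V$ evaluated at a zero coefficient, the function names are hypothetical, and ties are handled by including all subjects with $t_j \ge t_i$ in each risk set.

```python
def cox_score_statistic(x, times, events):
    """Univariate Cox score test statistic U^2 / V evaluated at beta = 0."""
    u = v = 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if d_i == 1:
            risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
            mean = sum(risk) / len(risk)
            u += x_i - mean                                   # score contribution
            v += sum(r * r for r in risk) / len(risk) - mean * mean  # information
    return u * u / v if v > 0 else 0.0

def top_genes(X, times, events, k):
    """Indices of the k columns of X with the largest score statistics."""
    p = len(X[0])
    stats = [cox_score_statistic([row[j] for row in X], times, events)
             for j in range(p)]
    return sorted(range(p), key=lambda j: stats[j], reverse=True)[:k]
```

To keep the assessment honest, the screening should be computed on the training set only, as described above, and the selected column indices then applied unchanged to the testing set.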
4.2 Selection of genes related to risk of death
Using 50 genes, the partial likelihood for the training dataset attained its maximum at the 1145th boosting step. Of the 24 genes selected by the boosting procedure, 12 showed clear non-linear functional forms (Fig. 4 shows the plots of the functional forms for these 12 genes). Table 1 presents the descriptions of these 24 genes, which have the highest influence scores. Among these genes, LC_25054 and LC_24432 increase the risk of death only when the genes are over-expressed. As a comparison, other genes such as LC_26146, AA748786, AA769543 and AA824616 increase the risk when the genes are either upregulated or downregulated. Such bell-shaped functional forms cannot be modeled by simple linear risk score functions, and in turn they can potentially provide insights into the mechanisms that are related to death from DLBCL. This may suggest that changes of expression in either direction among these genes can result in disturbance of the underlying signaling pathways and subsequently increase the risk of death. Finally, it is worth noting that the influence scores do not correspond to the Cox partial scores, indicating the importance of studying the genetic effects in a multivariate model rather than in a marginal model.
4.3 Prediction on testing dataset
Based on the predictive model built with the training set, we calculate the risk scores for the 80 patients in the testing dataset from their expression levels. These scores are highly predictive of the observed times to death after treatment (p = 0.002). If we divide the 80 patients into two groups by the median of the predicted risk scores, we observe that the death-free survival curves are significantly different between the two groups (p = 0.002, Fig. 5), indicating that clinically relevant groups of patients can indeed be identified by the proposed model. These results indicate that the model built by the boosting procedure can be used for predicting the risk of developing an event in future patients.
| 5 DISCUSSION |
|---|
We have presented a componentwise smoothing spline based boosting procedure for estimating the Cox proportional hazards models with the aim of building a potentially non-linear predictive model for future prediction and for studying the effects of gene expression on the risk of a clinical event. Such a method does both variable selection and shrinkage, a property that is very useful in the analysis of high-dimensional data. The procedure is related to the LARS–Cox of Gui and Li (2004) and the threshold gradient descent procedure of Gui and Li (2005), but differs from them because no linearity is assumed for the predictive variables. We have demonstrated that the procedure can indeed recover the underlying non-linear structures, even in a high-dimensional predictor space. In addition, the procedure can potentially be used for identifying important genes whose expressions are related to the risk of developing an event. Although some available methods can predict future survival well, it is important to emphasize that the procedure presented in this paper provides a way to detect potentially non-linear relationships between the gene expressions and the risk of an event. Our simulation results indicate that when some predictors have truly non-linear effects on the hazard function, the boosting procedure can be more predictive than the Cox model with a simple linear risk score.
Owing to the high dimensionality and low inherent dimensionality of microarray gene-expression data, some kind of pre-selection of genes is necessary in order to reduce the computational demand of the proposed procedure and to reduce noise. In our analysis of the lymphoma dataset of Rosenwald et al. (2002), we first screen the genes based on univariate Cox scores, and then use the boosting procedure to further select important genes and their functional forms in building the predictive model. While gene pre-selection based on univariate analysis is commonly used in relating gene-expression data to various clinical phenotypes, such a procedure may not be optimal since some genes may not be significant in marginal univariate analysis but can be significant in combination with other genes. Other procedures for gene pre-selection might be used to obtain better predictive performance. In addition, we only consider the additive components in the risk score function by using componentwise smoothing splines. The methods can be extended to include two-way or even higher-order interactions between genes by using thin-plate splines in Step 2 of the algorithm. Such an extension provides the possibility of studying non-linear interactions among genes. This deserves further investigation.
In this paper, we consider only the proportional hazards models, which are most commonly used in analyzing censored survival data. However, the general gradient boosting procedure proposed in Friedman (2001) can be applied to other regression models for censored survival data. For example, we can consider the accelerated failure time models and use the censoring distribution weighted residual sum of squares as the loss function (Robins and Rotnitsky, 1992) in the boosting procedure. Alternatively, we can also consider the more general semi-parametric transformation models (Cheng et al., 1995). We are currently investigating the theoretical properties and empirical performance of generalizing the boosting algorithm to these models and will present the results in another paper. As a final comment, traditional covariates such as demographic and pathological variables can be analyzed together with the expression profiles in the proposed modeling framework. Alternatively, we can consider a semi-parametric model where the traditional predictors are modeled in parametric forms and the expression levels are modeled non-parametrically. We can iterate between updating the parametric component by fitting a Cox model with an offset term and updating the non-parametric term using the boosting procedure proposed in this paper.
In conclusion, we have proposed and investigated a boosting procedure with smoothing splines for estimating the general Cox models. Such a procedure does not assume any linear or parametric forms for the predictive variables and thus provides a way of estimating potential non-linear effects of gene expressions on the risk of developing a clinical event.
| Acknowledgments |
|---|
This work was supported by the NIH grant ES009911 (HL) and the National Natural Science Foundation of China grant 10441004 (YL).
Received on December 16, 2004; revised on January 30, 2005; accepted on February 9, 2005
| REFERENCES |
|---|
Bair, E. and Tibshirani, R. (2004) Semi-supervised methods for predicting patient survival from gene expression data. PLoS Biol., 2, 5011–5022.
Bühlmann, P. (2003) Boosting methods: why they can be useful for high-dimensional data. In Hornik, K., Leisch, F. and Zeileis, A. (eds), Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003).
Bühlmann, P. (2004) Boosting for high-dimensional linear models. Technical Report. ETH Zürich (In press).
Bühlmann, P. and Yu, B. (2003) Boosting with the L2-Loss: regression and classification. J. Amer. Stat. Assoc., 98, 324–339.
Cheng, S.C., et al. (1995) Analysis of transformation models with censored data. Biometrika, 82, 835–845.
Cox, D.R. (1972) Regression models and life-tables (with discussion). J. R. Stat. Soc. B, 34, 187–220.
Dettling, M. and Bühlmann, P. (2003) Boosting for tumor classification with gene expression data. Bioinformatics, 19, 1061–1069.
Fan, J., et al. (1997) Local likelihood and local partial likelihood in hazard regression. Ann. Stat., 25, 1661–1690.
Freund, Y. (1995) Boosting a weak learning algorithm by majority. Inform. Comput., 121, 256–285.
Freund, Y. and Schapire, R. (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci., 55, 119–139.
Friedman, J. (2001) Greedy function approximation: a gradient boosting machine. Ann. Stat., 29, 1189–1232.
Friedman, J., et al. (2000) Additive logistic regression: a statistical view of boosting (with discussion). Ann. Stat., 28, 337–407.
Gentleman, R. and Crowley, J. (1991) Local full likelihood estimation for the proportional hazards model. Biometrics, 47, 1283–1296.
Golub, T.R., et al. (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286, 531–537.
Gui, J. and Li, H. (2004) Penalized Cox regression analysis in the high-dimensional and low sample size settings, with applications to microarray gene expression data. Technical Report, UCSF. Center for Bioinformatics and Molecular Biostatistics, Paper L1Cox.
Gui, J. and Li, H. (2005) Threshold gradient descent method for censored data regression, with applications in pharmacogenomics. Pac. Symp. Biocomput., 10, 272–283.
Hastie, T. and Tibshirani, R. (1990) Exploring the nature of covariate effects in the proportional hazards model. Biometrics, 46, 1005–1016.
Li, H. and Gui, J. (2004) Partial Cox regression analysis for high-dimensional microarray gene expression data. Bioinformatics, 20, i208–i215.
Li, H. and Luan, Y. (2003) Kernel Cox regression models for linking gene expression profiles to censored survival data. Pac. Symp. Biocomput., 8, 65–76.
O'Sullivan, F. (1988) Nonparametric estimation of relative risk using splines and cross-validation. SIAM J. Sci. Stat. Comput., 9, 531–542.
Robins, J. and Rotnitsky, A. (1992) Recovery of information and adjustment for dependent censoring using surrogate markers. In Jewell, N., Dietz, K. and Farewell, I. (eds), AIDS Epidemiology: Methodological Issues. Birkhäuser, Boston, MA, pp. 297–331.
Rosenwald, A., et al. (2002) The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N. Engl. J. Med., 346, 1937–1947.
Schapire, R. (1990) The strength of weak learnability. Machine Learning, 5, 197–227.