Skip Navigation


Bioinformatics Advance Access originally published online on July 20, 2007
Bioinformatics 2007 23(18):2470-2476; doi:10.1093/bioinformatics/btm364
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Supplementary data
Right arrow All Versions of this Article:
23/18/2470    most recent
btm364v2
btm364v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Li, Y.
Right arrow Articles by Zhu, J.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Li, Y.
Right arrow Articles by Zhu, J.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Analysis of array CGH data for cancer studies using fused quantile regression

Youjuan Li and Ji Zhu *

Department of Statistics, University of Michigan, Michigan, USA

*To whom correspondence should be addressed.


    ABSTRACT
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 APPLICATIONS
 4 CONCLUSION
 ACKNOWLEDGEMENTS
 REFERENCES
 

Motivation: The identification of DNA copy number changes provides insights that may advance our understanding of initiation and progression of cancer. Array-based comparative genomic hybridization (array-CGH) has emerged as a technique allowing high-throughput genome-wide scanning for chromosomal aberrations. A number of statistical methods have been proposed for the analysis of array-CGH data. In this article, we consider a fused quantile regression model based on three motivations: (1) quantile regression may provide a more comprehensive picture for the ratio profile of copy numbers than the standard mean regression approach; (2) for simplicity, most available methods assume uniform spacing between neighboring clones, while incorporating the information of physical locations of clones may be helpful and (3) most current methods have a set of tuning parameters that must be carefully tuned, which introduces complexity to the implementation.

Results: We formulate the detection of regions of gains and losses in a fused regularized quantile regression framework, incorporating physical locations of clones. We derive an efficient algorithm that computes the entire solution path for the resulting optimization problem, and we propose a simple estimate for the complexity of the fitted model, which leads to convenient selection of the tuning parameter. Three published array-CGH datasets are used to demonstrate our approach.

Availability: R code are available at http://www.stat.lsa.umich.edu/~jizhu/code/cgh/

Contact: jizhu{at}umich.edu

Supplementary information: Supplementary data are available at Bioinformatics online.


    1 INTRODUCTION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 APPLICATIONS
 4 CONCLUSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
Chromosomal aberrations such as deletions, amplifications and structural rearrangements are hallmarks of cancer (Lengauer et al., 1998; Pinkel and Albertson, 2005). Therefore, identifying genomic regions associated with systematic aberrations provides insights into the initiation and progression of cancer, and improves the diagnosis, prognosis and therapy strategies.

In recent years, array-based comparative genomic hybridization (array-CGH) has emerged as a technique offering genome-wide scanning for chromosomal aberrations. In this approach, genomic DNA sequences from tumor and normal reference samples are labeled by different color fluorochrome, and then hybridized to the array consisting of hundreds to thousands of cloned DNA fragments with known exact chromosomal locations. If both samples contain the same quantity of target sequences, they bind equally to the complementary sequences, resulting fluorescent signal ratio equal to 1. In contrast, when the tumor sample contains DNA with duplication (deletion) or other gain (loss) in a particular genomic region, more (less) tumor DNA bind to the complementary sequence, and the ratio is greater (smaller) than 1. Therefore, the ratio represents the relative DNA aberration at a particular genomic region.

A number of statistical methods have been developed to analyze the ratio profile from array-CGH. For instance, a method based on robust locally weighted regression (lowess) has been previously used in Beheshti et al. (2003). A quantile smoothing method using the total variation as the roughness penalty has been shown to give desirable visualization of the CGH profile in Eilers and Menezes (2004). Other smoothing algorithms that denoise the array-CGH data and appear suited for handling abrupt changes in the profile include adaptive weights smoothing (Hupe et al., 2004; Polzehl and Spokoiny, 2000), wavelets (Hsu et al., 2005) and lasso (Huang et al., 2005). Jong et al. (2004) use a genetic local search algorithm to examine the CGH profile by maximizing a likelihood with a penalty term containing the number of breakpoints. Olshen et al. (2004) propose a circular binary segmentation (CBS) method that detects aberrations by recursively using a likelihood ratio test statistics. Other methods include the hidden Markov model (HMM) which intends to model the possible dependence of a clone with its near neighbors (Fridlyand et al., 2004), and a cluster along chromosomes (CLAC) method which builds hierarchical clustering-style trees along the chromosome and then selects the ‘interesting’ clusters via certain criterion (Wang et al., 2005).

One piece of information not considered in most of these methods is the physical position of the clone along the genome, and uniform spacing between neighboring clones is often assumed. However, incorporating this information can only help if done correctly (Lai et al., 2005). We also note that most of these methods involve tuning parameters that need to be carefully selected. A key issue in the implementation is how to appropriately select these tuning parameters. If an algorithm is very sensitive to the values of the tuning parameters or if the complexity of the algorithm does not allow the user to determine the tuning parameters easily, it may be viewed as a weakness of the algorithm (Lai et al., 2005).

In this article, we modify the fused quantile regression model in Eilers and Menezes (2004) by incorporating the physical distance between adjacent clones. To solve the resulting optimization problem, we derive an efficient algorithm that computes the entire solution path for all values of the tuning parameter. Furthermore, we propose a convenient measure for the complexity of the fitted model, which facilitates selection of the tuning parameter.

Let Formula denote the Formula -ratio of the normalized signal between the tumor and the reference samples at clone i, and Formula be the physical position of the corresponding clone. If the entire hybridization and measurement process are well behaved, the signal ratio will reflect the relative copy numbers, i.e.


Formula 1

(1)
where µi is the true Formula -ratio of DNA copy numbers of tumor over reference samples at position i, and {varepsilon}i is a random noise.

Similar (but not identical) to Eilers and Menezes (2004), we formulate the detection of DNA copy number changes as a fused quantile regression problem (fused-QR). Let {tau}isin(0,1) be the quantile of interest, and fi be the smooth series that approximate yi, specifically, the 100{tau} % quantile of the Formula -ratio at clone xi. We then estimate fi via:


Formula 2

(2)
We explain the three parts of criterion (2) as follows:

  • The first term {rho}{tau}(.) is the so-called check function that measure the closeness of fi to yi (Koenker and Bassett, 1978):


    Formula 3

    (3)
    A median curve with {tau} = 0.5 can show the trend of the ratio profile, and the lower and upper quantile curves can give the spread of the ratios.

  • The second term is a so-called penalty that measures the smoothness of fi. Since copy number changes involve chromosome segments, the ratios of adjacent clones should be similar. Hence, we discourage changes in adjacent clones by penalizing |fifi 1|, adjusted by the distance between clones, i.e. |xi xi – 1|. A similar penalty without |xixi 1| is called the fused penalty in Tibshirani et al. (2005), and it has also been used in Eilers and Menezes (2004) and Huang et al. (2005) for smoothing array-CGH data.The fused penalty in (2) offers parsimony: making {lambda} sufficiently large will cause some of the fitted |fifi – 1| to be exactly 0. When |fifi – 1| is not equal to 0, it corresponds to a ‘jump’ in the ratio profile. So the fitted ratio profile will consist of flat plateaus, flat valleys and sharp jumps. A sharp jump corresponds to the beginning or the end of a copy number aberration.
  • {lambda} > 0 is a tuning parameter that controls the balance between the smoothness of the fitted model and its fidelity to the data: the larger the {lambda}, the smoother the fi, but at the cost of worse fit to the data, and vice versa. Hence, selecting an appropriate value of the tuning parameter is crucial for the performance of the fitted model.

The remainder of the article is organized as follows. In Section 2, we first present an equivalent version of (2), then derive an algorithm for computing the entire solution path for this model, and also propose a convenient method for selecting the tuning parameter. In Section 3, we apply the proposed approach to three published array-CGH datasets. We summarize the article in Section 4.


    2 METHODS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 APPLICATIONS
 4 CONCLUSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
2.1 Statistical model
We denote β0 = f1, βj = (fj + 1fj) / (xj + 1xj), j = 1,...,n – 1, then


Formula 4

(4)
Using this new set of parameters Formula , (2) can be re-written as:


Formula 5

(5)
or equivalently,


Formula 6

(6)
where s is a tuning parameter equivalent to {lambda}. Note that (5) and (6) are in a form of L1 loss plus L1 penalty. For a given value of the tuning parameter, the optimization can be transformed into a linear programming problem, hence solved efficiently by most commercial packages (Eilers and Menezes, 2004). However, instead of solving the problem for one value of the tuning parameter, here we are interested in solving for the entire solution path for all values of the tuning parameter. This will facilitate the selection of the tuning parameter. In the next section, we show that the solution path Formula is piecewise linear in s, which allows us to derive an efficient algorithm to compute the entire Formula .

Before delving into technical details, we illustrate the concept of piecewise linearity of the solution path by a simple example. We simulate a ratio profile of thirty clones:


Formula 7

(7)
where {varepsilon}i are independent Gaussian noise with {sigma} = 0.15. We let µi = 1 for i = 11,...,20, indicating gains at the second 10 clones, and µi = 0 for all other clones. We fit the fused-QR model with {tau} = 0.5. Figure 1 shows the solution path of Formula as a function of s, and the final fitted median curve with an appropriately chosen s.


Figure 1
View larger version (21K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 1. Illustrative example ({tau} = 0.5). Upper panel: the solution path Figure 1 as a function of s, which is piecewise linear. Lower panel: the simulation data (dots), the fitted median curve with an appropriately tuned s (solid), and the truth (dashed).

 
2.2 The path algorithm
In this section, we outline the essential components of the path algorithm and leave all details in the Supplementary Material.

We re-write (6) as:


Formula 8

(8)
The above gives the Lagrangian primal function:


Formula 9

(9)
where {lambda}, {alpha}i, {gamma}i, {kappa}i, {eta}i are non-negative Lagrangian multipliers. Setting the derivatives of Lp to zero, we achieve


Formula 10

(10)


Formula 11

(11)


Formula 12

(12)


Formula 13

(13)
where dj = xj + 1xj, and the Karush–Kuhn–Tucker (KKT) conditions are


Formula 14

(14)


Formula 15

(15)


Formula 16

(16)


Formula 17

(17)

Since the Lagrange multipliers must be non-negative, we conclude from (12) and (13) that both Formula and Formula . We can also see that when yifi > 0 (hence {xi}i > 0), we have {alpha}i = {tau} and {gamma}i = 0; when yifi < 0 ({zeta}i > 0), we have {alpha}i = 0 and {gamma}i = 1 – {tau}. These lead to the following relationships:


Formula

Let {theta}i = {alpha}i{gamma}i. Using these relationships, we can define the following four sets which are useful for the path algorithm:

  • Formula (elbow)
  • Formula (left of the elbow)
  • Formula (right of the elbow)
  • Formula (active set)

Since our goal is to compute the solution path Formula , we are interested in how the KKT conditions change when the parameter s increases. As s increases, we define an event to be

  1. a data point hits the elbow, i.e. a residual yi fi changes from non-zero to zero, or
  2. a coefficient βj changes from non-zero to zero, i.e. an index leaves Formula .

Notice that these two events correspond to the non-smooth points of {sum}i{rho}{tau}(yifi) and Formula , respectively. Then we can see:

  • As s increases, the sets Formula , Formula , Formula and Formula will not change (or equivalently, the KKT conditions will not change), unless an event happens. When the KKT conditions do not change, for uniqueness of the solution, we have the number of non-zero coefficients equal to the number of observations on the elbow, i.e. Formula according to (10) and (11).
  • As s increases, points in Formula stay at the elbow, unless an event happens. Therefore, βj satisfy:


    Formula 18

    (18)
    Since Formula , there is one free unknown in this set of equations, which allows Formula to change linearly when s increases, unless an event happens. Thus, overall the entire solution path Formula is piecewise linear.

The basic idea of our algorithm is as follows: we start with s = 0 and increase it, keeping track of the location of all data points relative to the elbow and also of the magnitude of the fitted coefficients along the way. As s increases, by continuity, points in Formula must linger on the elbow. Since all points at the elbow have yifi = 0, we can establish a path for Formula . The elbow set will stay stable until either some other point comes to the elbow or one non-zero fitted coefficient becomes zero. The path terminates when we reach the interpolating solution.

2.3 Selection of the tuning parameter
Our path algorithm facilitates selection of the tuning parameter among all possible values. In this section, we propose a convenient approach for this selection. We choose to use the Schwarz information criterion (SIC) (Schwarz, 1978), which is common in the quantile regression literature (Koenker et al. 1994):


Formula 19

(19)
where df is a measure of the complexity of the fitted model. Stein (Stein, 1981), Efron (Efron, 1986) and several other researchers argue that df can be estimated by Formula . We have been able to prove that in the fused-QR setting, Formula is equal to the number of non-zero βj's in the fitted model (details of the proof are in the Supplementary Material). Furthermore, according to (4), a non-zero βj corresponds to a jump between adjacent clones in the fitted model. Therefore, the number of fitted jumps gives a convenient estimate for df.

In practice, we can first use the path algorithm to compute the entire solution path, then select s that minimizes the SIC.


    3 APPLICATIONS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 APPLICATIONS
 4 CONCLUSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
In this section, we apply our method to three real datasets.

3.1 Fibroblast cell line data
This is a widely analyzed BAC array dataset provided by Snijders et al. (2001). The arrays consist of approximately 2400 BAC clones and provide precise measurement with an SD of ~0.05–0.1 Formula -ratio. Single experiments were conducted on 15 fibroblast cell lines and the true copy number alterations were previously characterized by cytogenetics. One attractive feature of using this dataset is to prove the validation of our method since the true copy number changes are known.

We applied our fused-QR algorithm to four chromosomes: chromosome 3 and 9 on GM03563, and chromosome 10 and 11 on GM05296. Figure 2 shows the fitted median curves for the four datasets. As we can see, all four chromosomes show partial chromosomal alterations. These alterations detected by the fused-QR median curves, compared with the results from Snijders et al. (2001), agree with the cytogenetic analysis very well.


Figure 2
View larger version (11K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 2. Examples of the application of the fused-QR to the fibroblast cell lines ({tau} = 0.5): (a) a copy gain at one side of the chromosome; (b) two altered points at one end; (c) a copy gain in the middle of the chromosome and (d) a copy deletion in the middle of the chromosome.

 
3.2 Colorectal cancer data
The development and progression of colorectal cancer is a multi-step process leading to genomic alterations in tumors. Nakao et al. (2004) report the results of array-CGH in a set of 125 primary colorectal tumors. In the study, DNA was extracted from 125 frozen colorectal cancer samples obtained from the University of Barcelona Hospital Clinic in Barcelona, Spain. The arrays used in this study consisted of 2463 BAC clones that covered the human genome at a 1.5 mb resolution. After applying certain spot exclusion criteria, there were 2120 clones in the final dataset.

Following Eilers and Menezes (2004), we applied our method to samples X59, X524, X186 and X204 with a focus on chromosome 1. For a given quantile {tau}, we first computed the entire fused-QR solution path using our algorithm, then selected the tuning parameter using the SIC criterion. Figure 3 shows the ratio profiles for the four samples: the solid lines are the fitted median curves, and the dashed lines are upper/lower 15 % quantile curves.


Figure 3
View larger version (16K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 3. Chromosome 1 from samples X59, X524, X186 and X204. Horizontal axis: physical position (mb) of the clones. Vertical axis: Figure 3-ratio of intensities from the tumor versus reference samples. Solid lines represent the fitted median curves, and the dashed lines are fitted upper and lower 15 % quantile curves.

 
The interesting thing is that, when using our path algorithm to select the tuning parameter s and when using the actual physical locations of the clones, the results do not completely agree with Eilers and Menezes (2004). In Eilers and Menezes (2004), uniform spacing between adjacent clones was assumed, and cross-validation was used to select the tuning parameter from a pre-specified grid of values. On the other hand, we utilized the physical locations of the clones in our model, and selected the tuning parameter from all possible values for s, as opposed to from a grid.

In particular, Figure 4 compares the results for sample X524 from three methods: (1) Eilers and Menezes (2004); (2) fused-QR using uniform spacing but the path algorithm for selecting s and (3) fused-QR using both the actual clone positions and the path algorithm. As we can see from the median curves, Eilers and Menezes (2004) missed a region of gain and a region of loss around the position of 70 Mb. Fused-QR using uniform spacing detected the region of gain but failed to detect the region of loss at this position, while fused-QR using actual physical locations captured both the gain and the loss.


Figure 4
View larger version (12K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 4. Chromosome 1 from sample X524. (a) The scatterplot of observed ratios. (b) Results from Eilers and Menezes (2004). (c) Results from fused-QR using uniform spacing. (d) Results from fused-QR using the clones' positions. The solid lines are the fitted median curves.

 
3.3 Breast tumor data
Pollack et al. (2002) used cDNA array-based CGH to profile DNA copy number alterations in a series of breast cancer cell lines and primary tumors. Specifically, they performed CGH on 44 breast tumor samples and 10 breast cancer cell lines, using cDNA microarrays containing 6691 different mapped human genes. To demonstrate our fused-QR algorithm for identifying copy number aberrations, we focused on 10 chromosomes of the breast cancer cell line MDA157, which is known to have a large number of alterations. The fitted profiles are shown in Figure 5. When compared with the results from the CBS method (Olshen et al., 2004) and the CLAC method (Wang et al., 2005), as summarized in (Tibshirani and Wang 2007), our results seem promising, in terms of both resisting outlier measurements (chromosome 3, 5 and 9) and detecting weak alterations (chromosome 7 and 15).


Figure 5
View larger version (33K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
Fig. 5. Array-CGH profiles of 10 chromosomes of breast cancer cell line MDA157. The solid lines are fitted median curves and the dashed lines are fitted lower/upper 15 % quantile curves.

 

    4 CONCLUSION
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 APPLICATIONS
 4 CONCLUSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
We have proposed a fused quantile regression algorithm for smoothing array-CGH data. It is well known that quantile regression enjoys several attractive features. For example, quantile curves tend to present a more complete picture of the data than a single mean curve: the median curve shows the overall trend, while the upper and lower quantile curves show the spread. Quantile regression is also known to be not subject to the distributional assumption of the noise term in (8), while several other methods (Beheshti et al., 2003; Hsu et al., 2005; Olshen et al., 2004) rely on the constant variance assumption.

In our work, we have made three additional contributions:

  • We developed an efficient algorithm that computes the entire solution path for the fused-QR model. This facilitates selection of the tuning parameter. For example, Eilers and Menezes (2004) pointed out that the choice of the tuning parameter is ‘unlikely to be unique’ in real applications, and it is worth to try different values of the tuning parameter to ‘see whether strong patterns present themselves’. Our path algorithm allows biologists to play with the tuning parameter more easily and see how pattern changes with the value of the tuning parameter.
  • We proposed a very convenient estimate for the complexity of the fitted model, i.e. the number of jumps in the fitted curve. This further allows convenient selection of the tuning parameter.
  • We have incorporated the physical location of clones into the modeling procedure. Clone locations contain important information for understanding the aberrations. For example, if two regions indicating the same direction of aberrations are far apart, the chance that they refer to the same aberration should be lower than when they were closer. As we have seen in Section 3.2, using the physical location information can result in different fitted models in identifying copy number changes. However, we note that further complementary experiments are needed to test whether these differences are biologically meaningful.

However, we note that similar to most current algorithms, we did not address the issue of how to set a threshold to call for detection, i.e. we return only a fitted profile of the copy number changes, without calling the detected regions as significant or not. In practice, we may use the false discovery rate (FDR) to help use decide an appropriate threshold for the detection. Tibshirani and Wang (2007) proposed two ways to estimate the FDR for a given threshold, one when normal reference samples are available, and the other when normal reference samples are not available. We plan to explore along the same line in our future studies.


    ACKNOWLEDGEMENTS
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 APPLICATIONS
 4 CONCLUSION
 ACKNOWLEDGEMENTS
 REFERENCES
 
We would like to thank the Editor, the Associate Editor and three reviewers for their thoughtful and useful comments. Y.L. and J.Z. are partially supported by grant DMS-0505432 from the National Science Foundation.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Jonathan Wren

Received on March 25, 2007; revised on June 12, 2007; accepted on July 8, 2007

    REFERENCES
 TOP
 ABSTRACT
 1 INTRODUCTION
 2 METHODS
 3 APPLICATIONS
 4 CONCLUSION
 ACKNOWLEDGEMENTS
 REFERENCES
 

    Beheshti B, et al. Chromosomal localization of DNA amplifications in neuroblastoma tumors using cDNA microarray comparative genomic hybridization. Neoplasia (2003) 5:53–62.[Web of Science][Medline]

    Eilers PHC, Menezes RX. Quantile smoothing of array CGH data. Bioinformatics (2004) 21:1146–1153.[CrossRef][Web of Science][Medline]

    Efron B. How biased is the apparent error rate of a prediction rule? J. Am. Stat. Assoc (1986) 81:461–470.[CrossRef][Web of Science]

    Fridlyand J, et al. Hidden Markov models approach to the analysis of array CGH data. J. Multivariate Anal (2004) 90:132–153.[CrossRef]

    Hsu L, et al. Denoising array-based comparative genomic hybridization data using wavelets. Biostatistics (2005) 6:211–226.[Abstract]

    Huang T, et al. Detection of DNA copy number alterations using penalized least squares regression. Bioinformatics (2005) 21:3811–3817.[Abstract/Free Full Text]

    Hupe P, et al. Analysis of array CGH data: from signal ratio to gain and loss of DNA regions. Biostatistics (2004) 20:3413–3422.

    Jong K, et al. Breakpoint identification and smoothing of array comparative genomic hybridization data. Bioinformatics (2004) 20:3636–3637.[Abstract/Free Full Text]

    Koenker R, Bassett G. Regression quantiles. Econometrica (1978) 46:33–50.[CrossRef][Web of Science]

    Koenker R, et al. Quantile smoothing splines. Biometrika (1994) 81:673–680.[Abstract/Free Full Text]

    Lai WR, et al. Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data. Bioinformatics (2005) 21:3763–3770.[Abstract/Free Full Text]

    Lengauer C, et al. Genetic instabilities in human cancers. Nature (1998) 396:643–649.[CrossRef][Medline]

    Nakao K, et al. High-resolution analysis of DNA copy number alterations in colorectal cancer by array-based comparative genomic hybridization. Carcinogenesis (2004) 25:1345–1357.[Abstract/Free Full Text]

    Olshen AB, et al. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics (2004) 5:557–572.[Abstract]

    Pinkel D, Albertson D. Array comparative genomic hybridization and its applications in cancer. Nat. Genet (2005) 37:s11–s17.[CrossRef][Web of Science][Medline]

    Pollack JR, et al. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc. Natl Acad. Sci (2002) 99:12963–12968.[Abstract/Free Full Text]

    Polzehl J, Spokoiny S. Adaptive weights smoothing with applications to image restoration. J. R. Stat. Soc. Ser. B (2000) 62:335–354.[CrossRef]

    Schwarz G. Estimating the dimension of a model. Annu. Stat (1978) 3:98–108.

    Snijders AM, et al. Assembly of microarrays for genome-wide measurement of DNA copy number. Nat. Genet (2001) 29:263–264.[CrossRef][Web of Science][Medline]

    Stein C. Estimation of the mean of a multivariate normal distribution. Annu. Stat (1981) 9:1135–1151.[CrossRef]

    Tibshirani R, Wang P. Spatial smoothing and hot spot detection for CGH data using the fused lasso. Biostatistics (2007) 1–12.

    Tibshirani R, et al. Sparsity and smoothness via the fused Lasso. J. R. Stat. Soc. Ser. B (2005) 67:91–108.[CrossRef]

    Wang P, et al. A method for calling gains and losses in array CGH data. Biostatistics (2005) 6:45–58.[Abstract]


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
BiostatisticsHome page
C. D. Greenman, G. Bignell, A. Butler, S. Edkins, J. Hinton, D. Beare, S. Swamy, T. Santarius, L. Chen, S. Widaa, et al.
PICNIC: an algorithm to predict absolute allelic copy number variation with microarray cancer data
Biostat., October 15, 2009; (2009) kxp045v1.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
E. Budinska, E. Gelnarova, and M. G. Schimek
MSMAD: a computationally efficient method for the analysis of noisy array CGH data
Bioinformatics, March 15, 2009; 25(6): 703 - 713.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow Supplementary data
Right arrow All Versions of this Article:
23/18/2470    most recent
btm364v2
btm364v1
Right arrow Comments: Submit a response
Right arrow Alert me when this article is cited
Right arrow Alert me when Comments are posted
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Li, Y.
Right arrow Articles by Zhu, J.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Li, Y.
Right arrow Articles by Zhu, J.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?