Bioinformatics Advance Access originally published online on January 31, 2007
Bioinformatics 2007 23(6):747-754; doi:10.1093/bioinformatics/btm010
An ensemble approach to microarray data-based gene prioritization after missing value imputation
1Department of Computer Science, The George Washington University, 801 22nd Street, Suite 704 and 2Department of Statistics and Biostatistics Center, The George Washington University, 2140 Pennsylvania Avenue, N.W. Washington, DC 20052, USA
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Motivation: Microarrays have been widely used to discover novel disease related genes. Some types of microarray, such as cDNA arrays, usually contain a considerable portion of missing values. When missing value imputation and gene prioritization are sequentially conducted, it is necessary to consider the distribution space of prioritization scores due to the existence of missing values. We propose an ensemble approach to address this issue. A bootstrap procedure enables us to generate a resample multivariate distribution of the prioritization scores and then to obtain the expected prioritization scores.
Results: We used a published two-sample microarray data set to illustrate our approach. We focused on the following issues after missing value imputation: (i) concordance of gene prioritization and (ii) control of true and false positives. We compared our approach with the traditional non-ensemble approach to missing value imputation. We also evaluated the performance of the non-imputation approach when the theoretical test distribution was available. The results showed that the ensemble imputation approach clearly improved both the concordance of gene prioritization and the control of true/false positives, especially when sample sizes were about 5–10 per group and missing rates were about 10–20%, a common situation for cDNA microarray studies.
Availability: The Matlab codes are freely available at http://home.gwu.edu/~ylai/research/Missing.
Contact: ylai{at}gwu.edu
| 1 INTRODUCTION |
|---|
Microarrays enable us to simultaneously monitor gene expression at a genomic scale (Der et al., 1998). They provide a tool for extracting biologically significant information, such as changes in the expression profiles of genes between distinct conditions (e.g. normal versus cancer). This has led to their use in studies across a broad range of biological disciplines, including cancer classification (Golub et al., 1999), identification of the unknown effects of a specific therapy (Perou et al., 2000), identification of genes relevant to a certain diagnosis or therapy (Cho et al., 2003) and cancer prognosis (Shipp et al., 2002; van't Veer et al., 2002). Due to their relatively high cost, the sample sizes of microarray studies are generally small, which may lead to considerable false positive rates. Since microarrays are widely used in pilot studies before follow-up large-sample validation studies, it is crucial to control the false positives among genes prioritized by microarray studies.
Some types of microarray, such as cDNA arrays, usually contain a considerable portion of missing values. These missing values arise for various reasons, including insufficient resolution, image corruption, dust or scratches on the slides, or experimental error during the laboratory process (Kim et al., 2005). Effective imputation methods, which aim to recover these missing values, are important: on the one hand, it is costly to repeat the experiments; on the other hand, repeating the experiments cannot guarantee data completeness.
Before data analysis, missing value imputation is generally required. Many algorithms for gene expression data analysis, such as support vector machines (Vapnik, 1995), and multivariate statistical analysis methods such as principal component analysis (Golub and van Loan, 1996), singular value decomposition (Alter et al., 2000) and generalized singular value decomposition (Alter et al., 2003), require a complete data set as the input. This can also be the case when microarrays are used in pilot studies for gene prioritization. Scores from a certain statistical test are generally used to rank genes. The sample sizes of different genes must be uniform so that the test scores of different genes are comparable. One may consider using the corresponding P-values to rank genes, in which case the sample sizes of different genes can differ. This approach is feasible when we know the theoretical test distribution. If such a distribution is unknown, which is usually the case in practice, we have to consider the permutation method for evaluating P-values. However, it is generally difficult to accurately evaluate the P-values for genes with missing observations when their sample sizes are small.
Recently, many methods have been developed for missing value imputations. These include a SVD-based method and a weighted k-nearest neighbors imputation (Troyanskaya et al., 2001), Bayesian approaches (Oba et al., 2003; Zhou et al., 2003), a fixed rank approximation algorithm (FRAA) (Friedland et al., 2003), a least squares method (Bo et al., 2004), a local least squares imputation (Kim et al., 2005), a collateral imputation method (Sehgal et al., 2005) and a SVM and orthogonal coding scheme based method (Wang et al., 2006). Furthermore, Kim et al. (2004) proposed to reuse the imputed data to improve missing value estimation, Tuikkala et al. (2006) proposed to consider gene ontology information for improving missing value estimation, and Gan et al. (2006) proposed to consider a set theoretic framework and biological knowledge to improve missing value estimation.
The impact of missing value imputation on differentially expressed gene identification has also been recently studied (Jornsten et al., 2005; Scheel et al., 2005). For multi-sample microarray data, gene prioritization is equivalent to detecting differentially expressed genes. When missing value imputation and gene prioritization are sequentially conducted, it is necessary to consider the distribution space of prioritization scores due to the existence of missing values. However, this issue has not been addressed since all the aforementioned missing value imputation methods only provide one estimate for each missing observation.
Ensemble methods, such as boosting (Freund and Schapire, 1997) and random forest (Breiman, 2001), have been widely used in the field of machine learning. When the number of predictor variables is relatively large, these methods can usually achieve satisfactory classification performance through combining a group of weak classifiers. In this study, we propose an ensemble approach to address the issue of microarray data based gene prioritization after missing value imputation. We first describe some preliminaries. Then, we detail a bootstrap based procedure. A two-sample microarray data set is used to illustrate our approach.
| 2 METHODS |
|---|
2.1 Preliminaries
Throughout the article, we will use $G \in \mathbb{R}^{m \times n}$ to represent a gene expression data matrix with m genes (rows) and n experiments (columns). $G$ may consist of the observed component $X$ and the missing component $Y$:
$$G = X \cup Y$$
The existing imputation methods designed for microarray missing value estimation focus on a once-for-all estimation of the missing values based on the observed gene expression values. The subsequent gene prioritization and other analyses are entirely separated from the missing value imputation. Typically, a complete matrix $G'$ is obtained by replacing the missing component $Y$ with the estimate $\hat{Y}$. Then, the complete data matrix $G'$ is used for the subsequent analyses, which no longer depend on the imputation process once $G'$ is constructed. We will use $C$ to denote the operation of the subsequent analyses. The traditional process can be represented by $C(X \cup \hat{Y})$, which is a non-ensemble approach.
In this article, we introduce a novel ensemble approach, which is capable of incorporating any imputation method for missing value estimation, where the target outcome for the missing component $Y$ is treated as a random component $Y^{*}$ that follows a multivariate distribution. We walk through this distribution, i.e. the instantiations of $Y^{*}$, through a bootstrap procedure. In this way, we can obtain an estimate for the operation $C$ through an ensemble over $C(X \cup Y^{*})$. The whole process can be represented by $E[C(X \cup Y^{*})]$. Generally, $E[C(X \cup Y^{*})] \neq C(X \cup \hat{Y})$.
We evaluate the proposed approach using a two-sample microarray data set for various sample sizes and missing rates. The proposed approach is a framework that can incorporate any imputation method I and any analysis operation C. We choose the local least squares imputation coupled with an L2-norm-based similarity measure (referred to as LLSimpute/L2) because of its satisfactory performance (Kim et al., 2005), although other imputation methods can be selected in practice. We employ the simple Student's t-test or the widely used SAM t-test for gene prioritization as the operation C.
Our approach can be summarized in three steps: (i) bootstrapping imputation for individual samples; (ii) constructing resample complete matrices; (iii) averaging resample prioritization vectors. Figure 1 gives a flow chart for this approach. The details are described as follows.
2.2 Bootstrapping imputation for individual samples
The bootstrap method, first proposed by Efron (1979), enables us to generate a resample distribution of estimates in a non-parametric manner. Since different samples are generally unrelated, we perform a bootstrap procedure for each column (sample) to generate a resample distribution of missing value estimates.
Without loss of generality, we describe the bootstrap procedure for the first column $g_{\cdot 1}$. First, we sample $n-1$ numbers $s_2, s_3, \ldots, s_n$ from $\{2, 3, \ldots, n\}$ with replacement. Then, we use these $n-1$ resampled columns $(g_{\cdot s_2}, g_{\cdot s_3}, \ldots, g_{\cdot s_n})$ to impute the missing values in the column $g_{\cdot 1}$ (see Section 2.5 for the description of the imputation method). The above two steps are repeated b times, and we obtain b resample columns for $g_{\cdot 1}$. Notice that the non-missing values in the original column are not changed in these resample columns.
After performing the above procedure for all individual columns, we obtain $n \times b$ resample vectors, with b resample replicates for each sample (column). b will be referred to as the size of bootstrap in the rest of the article.
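The per-column bootstrap described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: missing values are assumed to be encoded as NaN, columns are indexed from zero, and `mean_impute_column` is a hypothetical placeholder imputer (row means of the resampled columns) standing in for the method of Section 2.5. The function and parameter names are assumptions made for this example.

```python
import numpy as np

def mean_impute_column(target, donors):
    """Placeholder imputer (an assumption, not LLSimpute): fill each missing
    entry of `target` with the row mean of the donor columns."""
    filled = target.copy()
    missing = np.isnan(target)
    # nanmean skips entries that are also missing in the donor columns
    filled[missing] = np.nanmean(donors[missing, :], axis=1)
    return filled

def bootstrap_impute_column(G, j, b, impute=mean_impute_column, rng=None):
    """Return an (m, b) array holding b resample replicates of column j."""
    rng = np.random.default_rng(rng)
    m, n = G.shape
    others = np.delete(np.arange(n), j)      # indices of the other n - 1 columns
    replicates = np.empty((m, b))
    for k in range(b):
        # draw n - 1 column indices with replacement from the other columns
        picked = rng.choice(others, size=n - 1, replace=True)
        # observed entries of column j are copied unchanged by the imputer
        replicates[:, k] = impute(G[:, j], G[:, picked])
    return replicates
```

Calling `bootstrap_impute_column(G, j, b)` for every column j yields the n × b resample vectors; any imputer with the same `(target, donors)` signature can be passed in place of the placeholder.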
2.3 Constructing resample complete matrices
For each individual column $j$, $j = 1, 2, \ldots, n$, we randomly select one resample replicate from the b resample replicates generated by the above procedure. In this way, we construct a resample matrix. (There are $b^n$ possible combinations.) We obtain r resample matrices by repeating the above step r times. (Notice that the non-missing values in the original matrix are not changed in these resample matrices.) r will be referred to as the size of ensemble in the rest of the article.
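As a rough sketch of this construction step, the function below draws one replicate index per column and assembles r resample matrices. The array layout `(n, m, b)` for the stored replicates and the name `build_resample_matrices` are assumptions made for this example only.

```python
import numpy as np

def build_resample_matrices(replicates, r, rng=None):
    """replicates: array of shape (n, m, b) with b imputed replicates per column.
    Returns an (r, m, n) array of r resample complete matrices."""
    rng = np.random.default_rng(rng)
    n, m, b = replicates.shape
    matrices = np.empty((r, m, n))
    for i in range(r):
        # one of the b**n possible combinations: a replicate index per column
        choice = rng.integers(0, b, size=n)
        for j in range(n):
            matrices[i, :, j] = replicates[j, :, choice[j]]
    return matrices
```

Because each replicate keeps the observed entries of its column, every resample matrix agrees with the original matrix wherever data were observed.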
2.4 Averaging resample prioritization vectors
Using the method for gene prioritization (see Section 2.6 for details), we can obtain a resample score vector for each resample matrix. We consider the average of these r resample vectors as the estimate of the prioritization vector. When the size of ensemble r is large, we consider averaging by mean. However, this is usually time-consuming. Generally, we set r = 100 and consider averaging by median, which is less sensitive to outliers and is especially useful when P-values are used as prioritization scores. Notice that if there is no missing value in the original data, then this average vector will be the same as the prioritization vector calculated based on the original data.
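The averaging step can be illustrated with a short sketch. Here `score_fn` is a hypothetical callable that maps a complete m × n matrix to a length-m score vector; the function name `ensemble_scores` is likewise an assumption for this example.

```python
import numpy as np

def ensemble_scores(matrices, score_fn, use_median=True):
    """Combine the score vectors of r resample matrices gene by gene."""
    scores = np.array([score_fn(M) for M in matrices])   # shape (r, m)
    if use_median:
        return np.median(scores, axis=0)   # less sensitive to outlying resamples
    return np.mean(scores, axis=0)
```

When all resample matrices are identical, as happens when the original data contain no missing values, both the median and the mean reduce to the score vector of the original matrix.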
2.5 Imputing missing values
The LLSimpute/L2 method proposed by Kim et al. (2005) is used for missing value imputation. We briefly describe it as follows. Without loss of generality, we consider the first gene and calculate the L2-norm-based similarity measures between this gene and the remaining m – 1 genes. These m – 1 genes are ranked according to the calculated similarity measures, and the top k genes are identified as the k nearest neighbors. The missing values in the first gene vector are estimated through the least squares method. We can perform this procedure for all gene vectors and have all missing values imputed. In this study, we follow the convention and choose k = 10.
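A simplified sketch of local least squares imputation for a single gene is given below. It is not the published LLSimpute implementation: as a simplifying assumption, only rows without any missing value may serve as neighbors, missing values are encoded as NaN, and the name `lls_impute_gene` is hypothetical. The target gene's observed values are regressed on the neighbors' values at the same positions, and the fitted coefficients are applied to the neighbors' values at the missing positions.

```python
import numpy as np

def lls_impute_gene(G, i, k=10):
    """Impute the NaN entries of row i from its k nearest complete rows."""
    target = G[i]
    miss = np.isnan(target)
    if not miss.any():
        return target.copy()
    # Simplifying assumption: candidate neighbors are rows with no missing value.
    complete = np.where(~np.isnan(G).any(axis=1))[0]
    # L2 distance to the target, computed on the target's observed positions
    dist = np.linalg.norm(G[complete][:, ~miss] - target[~miss], axis=1)
    nbrs = complete[np.argsort(dist)[:k]]
    A = G[nbrs][:, ~miss]        # neighbors at the observed positions
    B = G[nbrs][:, miss]         # neighbors at the missing positions
    # least squares: find x minimizing ||A^T x - observed target values||
    x, *_ = np.linalg.lstsq(A.T, target[~miss], rcond=None)
    filled = target.copy()
    filled[miss] = B.T @ x
    return filled
```

If the target gene is an exact linear combination of its neighbors, this sketch recovers the missing entries exactly; with noisy data it returns the least squares estimate.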
2.6 Prioritizing genes
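For a two-sample data set, we first consider the Student's t-test for prioritizing genes. We use $n_1$ and $n_2$ to denote the sample sizes of the first and the second groups, respectively, with $n_1 + n_2 = n$. For each gene, we use $x_{11}, x_{12}, \ldots, x_{1n_1}$ and $x_{21}, x_{22}, \ldots, x_{2n_2}$ to denote its measurements (observed for non-missing or imputed for missing) in the first and the second groups, respectively. The Student's t-test is given by $t = \bar{d}/s$, where $\bar{d} = (\bar{x}_1 - \bar{x}_2)$ and $s = s_p\sqrt{1/n_1 + 1/n_2}$; $\bar{x}_1 = \frac{1}{n_1}\sum_{j=1}^{n_1} x_{1j}$, $\bar{x}_2 = \frac{1}{n_2}\sum_{j=1}^{n_2} x_{2j}$ and
$$s_p^2 = \frac{\sum_{j=1}^{n_1}(x_{1j}-\bar{x}_1)^2 + \sum_{j=1}^{n_2}(x_{2j}-\bar{x}_2)^2}{n_1+n_2-2}$$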
We also consider the SAM t-test (Tusher et al., 2001) for prioritizing genes. This test adds a fudge factor to the denominator of the Student's t-test: $t = \bar{d}/(s + s_0)$, where $s_0$ is numerically determined according to the given data. This test has been widely used in two-sample microarray data analyses since it can generally improve the control of false positives by excluding genes with relatively small variances. However, the theoretical distribution of this test is unknown, and we have to use the permutation method to evaluate the P-values of tests. Since it is difficult to evaluate the P-values for genes with missing observations (missing values are actually not allowed in the implemented R function sam), the performance of the non-imputation approach cannot be evaluated.
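The two scores can be computed row-wise as in the sketch below. It assumes the pooled-variance form of the two-sample Student's t statistic, with the first `n1` columns forming the first group. The fudge factor `s0` is taken as a user-supplied number here, whereas SAM determines it from the data; that selection step is not reproduced. The name `pooled_t` is hypothetical.

```python
import numpy as np

def pooled_t(G, n1, s0=0.0):
    """Row-wise two-sample t score; s0 > 0 gives a SAM-style score d / (s + s0)."""
    x1, x2 = G[:, :n1], G[:, n1:]
    n2 = x2.shape[1]
    d = x1.mean(axis=1) - x2.mean(axis=1)
    # within-group sums of squared deviations, pooled over both groups
    ss = ((x1 - x1.mean(axis=1, keepdims=True)) ** 2).sum(axis=1) \
        + ((x2 - x2.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    s = np.sqrt((1.0 / n1 + 1.0 / n2) * ss / (n1 + n2 - 2))
    return d / (s + s0)
```

A positive `s0` shrinks the scores of genes whose standard error `s` is very small, which is the mechanism by which the SAM statistic down-weights low-variance genes.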
Remark: Although test scores and their corresponding P-values have equivalent effects in prioritizing genes when genes have uniform sample sizes, we recommend using P-values since they provide additional information on the significance of the tests.
| 3 RESULTS |
|---|
We use the two-sample ZAP-70 dataset (Wiestner et al., 2003), which is publicly available at http://llmpp.nih.gov/cll/, to evaluate our ensemble imputation approach against the non-ensemble imputation approach as well as the non-imputation approach. We focus on the following two questions with the consideration of various sample sizes and missing rates:
- Will the ensemble imputation approach provide better concordant prioritization of genes?
- Will the ensemble imputation approach provide improved control of true and false positives?
3.1 ZAP-70 Dataset
There are 12447 genes and 107 cases in this two-sample data set, which was collected for a study identifying a chronic lymphocytic leukemia subtype with unmutated immunoglobulin (Ig) genes, inferior clinical outcome and distinct gene expression profile. The sample sizes of the Ig-mutated and Ig-unmutated groups are 79 ($n_1$) and 28 ($n_2$), respectively. The overall missing rate is 12.1%. There are 2508 genes with no missing value. We denote this subset as $G \in \mathbb{R}^{2508 \times 107}$ and use it for our evaluation study with different sample sizes (5+5, 10+10 and 15+15) and missing rates (5, 10 and 20%).
Without loss of generality, we briefly describe the procedure for data matrix generation for the sample size 5+5 coupled with a 5% missing rate. First, we randomly choose five columns (samples) from each group to form a new complete data matrix $G_c \in \mathbb{R}^{2508 \times 10}$. Then, we randomly knock out entries as missing with 5% probability. The newly constructed incomplete matrix $G_m \in \mathbb{R}^{2508 \times 10}$, which contains missing values, is used for imputation. Based on these matrices, we consider both the ensemble and the non-ensemble approaches to gene prioritization after missing value imputation. We use $v_C = (v_{C1}, v_{C2}, \ldots, v_{Cm})^T \in \mathbb{R}^{m \times 1}$, $v_E = (v_{E1}, v_{E2}, \ldots, v_{Em})^T \in \mathbb{R}^{m \times 1}$ and $v_S = (v_{S1}, v_{S2}, \ldots, v_{Sm})^T \in \mathbb{R}^{m \times 1}$ to denote the prioritization score vectors generated by the complete data matrix $G_c$, the ensemble imputation approach and the non-ensemble imputation approach based on the incomplete data matrix $G_m$, respectively. For the analysis based on the Student's t-test, since theoretical P-values are used for gene prioritization, it is feasible to consider the non-imputation approach: prioritization scores are calculated based only on the observed data in $G_m$. We use $v_N = (v_{N1}, v_{N2}, \ldots, v_{Nm})^T \in \mathbb{R}^{m \times 1}$ to denote this score vector.
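The data generation procedure can be sketched as follows, assuming the columns of the full matrix are ordered so that the first `n1` belong to the first group. The function name `make_incomplete` and its parameters are hypothetical, and knocked-out entries are marked with NaN.

```python
import numpy as np

def make_incomplete(G, n1, size1, size2, missing_rate, rng=None):
    """Subsample columns from each group, then knock out entries at random."""
    rng = np.random.default_rng(rng)
    n = G.shape[1]
    cols1 = rng.choice(np.arange(n1), size=size1, replace=False)
    cols2 = rng.choice(np.arange(n1, n), size=size2, replace=False)
    Gc = G[:, np.concatenate([cols1, cols2])].copy()   # complete matrix
    Gm = Gc.copy()
    # each entry is independently set to missing with probability missing_rate
    Gm[rng.random(Gc.shape) < missing_rate] = np.nan
    return Gc, Gm
```

For the 5+5 configuration with a 5% missing rate this would be called as `make_incomplete(G, 79, 5, 5, 0.05)`. The realized fraction of missing entries varies around the nominal rate from draw to draw.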
To answer the aforementioned two questions, we conduct the following two evaluations. We set the size of bootstrap b = 100, which enables us to generate a possibly huge ensemble. In order to determine the size of ensemble, for different sizes of ensemble, we calculated the Pearson correlation between the prioritization vectors vC and vE . Both the Student's and SAM t-tests were considered for prioritizing genes. Figure 2 shows the relationship between the correlation and size of ensemble: the correlation tends to be stable when the size of ensemble is close to 100 for all different configurations of sample size and missing rate. Therefore, we fix the size of ensemble r = 100 for the following evaluations.
When the SAM t-test is considered for prioritizing genes, it is difficult to evaluate the performance of the non-imputation approach. Therefore, we only compare the ensemble imputation approach with the non-ensemble imputation approach.
3.2 Concordance of gene prioritization
One issue about gene prioritization after missing value imputation is whether the prioritization vector, which is used to select genes, is concordant with the one obtained when no values are missing. We use the Pearson correlation coefficient (Pearson, 1894) to measure the concordance. The improvement from the ensemble approach is measured by the Pearson Correlation Improvement Rate (PCIR) defined as follows. We first calculate the Pearson correlation coefficients $\rho_E$ between $v_E$ and $v_C$, and $\rho_S$ between $v_S$ and $v_C$. If $v_N$ is available, then $\rho_N$ between $v_N$ and $v_C$ is also calculated. Since these correlations are usually large, even a small improvement will be considered significant. Therefore, we define the PCIR between $\rho_E$ and $\rho_S$ as:
$$\mathrm{PCIR} = \frac{\rho_E - \rho_S}{1 - \rho_S} \times 100\%$$
The PCIR between $\rho_E$ and $\rho_N$ can be calculated similarly. As shown in Table 1, the proposed ensemble imputation approach consistently outperforms the non-ensemble imputation approach as well as the non-imputation approach: we always obtain positive PCIRs. Compared to the non-ensemble imputation approach, the ensemble imputation approach achieves 50% and higher PCIRs when the sample size is 5+5, and 20% and higher PCIRs when the sample sizes are 10+10 and 15+15. These are observed when either the Student's or SAM t-test is used for prioritizing genes.
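A small sketch of the concordance measure follows. It assumes that the PCIR is the gain in Pearson correlation relative to the remaining gap to perfect correlation, that is (ρ_new − ρ_ref) / (1 − ρ_ref); this form is an assumption chosen because it makes small gains visible when correlations are already close to one. The function name `pcir` is hypothetical, and the result is returned as a fraction rather than a percentage.

```python
import numpy as np

def pcir(v_c, v_new, v_ref):
    """Improvement of v_new over v_ref in correlation with the complete-data
    vector v_c, relative to the gap 1 - rho_ref (assumed PCIR form)."""
    rho_new = np.corrcoef(v_c, v_new)[0, 1]
    rho_ref = np.corrcoef(v_c, v_ref)[0, 1]
    return (rho_new - rho_ref) / (1.0 - rho_ref)
```

Under this form, a value of 0.5 means that half of the correlation lost by the reference approach has been recovered; for example, moving from a correlation of 0.90 to 0.95 gives 0.5.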
3.3 Control of true and false positives
When genes are selected after missing value imputation for the follow-up studies, it is necessary to understand the impact of missing value imputation on the control of true and false positives. Since only the selected genes will be used for the follow-up large sample validation studies, it is crucial to control the false positives in genes selected from microarray studies. Controlling the true positives is also important since it is undesirable to miss too many truly differentially expressed genes.
One difficulty in addressing this issue is that we do not know which genes are truly differentially or non-differentially expressed. In this study, for each configuration, we define gold standards based on the prioritization vector $v_C$ from the complete data $G_c$. Previous studies showed that an accurate estimate of the number (N) of differentially expressed genes can be obtained when the sample size is relatively large (Lai, 2006). Therefore, we first use the original large and complete data $G \in \mathbb{R}^{2508 \times 107}$ to estimate N with a recently proposed method (Lai, 2006). Then, we define the genes with the top N ranks as gold standard positives and the remaining genes as gold standard negatives. With these gold standards defined, we can generate the widely used receiver operating characteristic (ROC) curves (true positive rate against false positive rate for different cutoff points) to compare different approaches. Since different results may be obtained when different test statistics are used for gene prioritization, the above procedure is performed separately for the Student's and the SAM t-tests.
When the Student's t-test is used for gene prioritization, we obtain the estimate N = 1013 (40.4%); when the SAM t-test is used for gene prioritization, we obtain the estimate N = 984 (39.3%). Figures 3 and 4 show the ROC curves for different configurations when the Student's and the SAM t-test are used as test statistic. Although the curves become lower as missing rate increases, the ensemble imputation approach consistently outperforms the non-ensemble imputation and non-imputation approaches. The advantage is especially distinct when the sample sizes are about 5–10 per group and the missing rates are about 10–20%, which is a common situation for cDNA microarray studies.
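The ROC construction against rank-based gold standards can be sketched as below. The sketch assumes scores for which larger values mean higher priority (for example absolute t scores; P-values would be negated first) and takes N as given rather than estimating it. The function name `roc_points` is hypothetical.

```python
import numpy as np

def roc_points(gold_scores, test_scores, n_pos):
    """Return (fpr, tpr) arrays; the n_pos top-ranked genes under gold_scores
    are treated as gold standard positives."""
    m = len(gold_scores)
    is_pos = np.zeros(m, dtype=bool)
    is_pos[np.argsort(-gold_scores)[:n_pos]] = True
    order = np.argsort(-test_scores)       # genes in decreasing test priority
    tp = np.cumsum(is_pos[order])          # true positives among the top-k genes
    fp = np.cumsum(~is_pos[order])         # false positives among the top-k genes
    tpr = np.concatenate([[0.0], tp / n_pos])
    fpr = np.concatenate([[0.0], fp / (m - n_pos)])
    return fpr, tpr
```

Each point on the curve corresponds to selecting the top k genes under the approach being evaluated, for k from 0 to m; plotting `tpr` against `fpr` for the different approaches gives curves of the kind compared in Figures 3 and 4.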
| 4 DISCUSSION |
|---|
We proposed an ensemble approach to address the issue of gene prioritization after missing value imputation. Compared with the traditional missing value imputation methods, which only provide one estimate for each missing observation, our approach considers the distribution space of prioritization scores due to the existence of missing values. We simulated the distribution space through a bootstrap procedure. To compare different approaches, we evaluated their performances in the concordance of gene prioritization and the control of true/false positives. A published two-sample microarray gene expression data set was used for our evaluations. The results confirmed the advantages of our proposed approach; the results also allowed us to compare the non-ensemble imputation approach with the non-imputation approach. From Table 1 and Figure 3, the non-imputation approach showed comparable or even better performances in many cases when it was compared with the non-ensemble imputation approach.
We also evaluated the classification performance of genes selected from a pilot study. A support vector machine (Vapnik, 1995) was used as the classifier, and different numbers of top-ranked genes were selected. The results showed that a relatively low classification error rate could be achieved, regardless of the choice of approach to missing value imputation, when a certain number (30–100) of genes were included. This is not surprising since the classification performance depends not only on the differentiability of the selected genes but also on the combination of these genes.
Our proposed ensemble approach is novel. It is capable of incorporating any imputation method for missing value estimation and any statistical test for gene prioritization. In this study, for simplicity, we chose to use the Student's or SAM t-test for gene prioritization. Because of its satisfactory performance (Kim et al., 2005), we chose the local least squares coupled with L2 norm based similarity measure for missing value imputation. Kim et al. (2005) also proposed an automatic estimator for the number k of nearest neighbor genes, and showed its satisfactory performances. However, this procedure is very time consuming. We actually performed this procedure for some of our evaluations and observed similar results. Since our purpose was to introduce an ensemble approach to gene prioritization after missing value imputation, we simply followed the convention and fixed k = 10 to save our computation time.
It should be noted that the benefits of our proposed approach may be affected by the method chosen for imputing missing values. Many imputation methods have been proposed, and each one has its advantage in a certain situation. It is necessary to conduct further studies so that the impacts of different imputation methods can be well understood. It should also be noted that the current approach may not be applicable to time series microarray data (Gan et al., 2006) because of the dependence structure among observations from different time points. We are currently investigating a further development so that we can generalize this ensemble idea to integrate missing value imputation with gene prioritization for time series microarray data.
If the missing rate is almost zero, then there will be no clear difference between the ensemble and the non-ensemble approaches; if almost all the data are missing, then the missing value imputation will not work well and both approaches will perform poorly. If the sample size is extremely small, then the missing value imputation will not work well and there will be no clear difference between the two approaches; if the sample size is relatively large, then the result of gene prioritization will be relatively robust and insensitive to the missing values, and therefore there will be no clear difference between the two approaches. The above discussion implies that there are certain ranges of the sample size and the missing rate within which the ensemble approach will provide considerable improvements over the non-ensemble approach (as we observed in Table 1 and Figs 3 and 4). To better understand the performance of the ensemble approach, it is necessary to conduct more theoretical and simulation studies.
| ACKNOWLEDGEMENT |
|---|
We thank the associate editor and two anonymous reviewers for their valuable comments. This work was supported by an NIH grant DK-75004.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Golan Yona
Received on August 31, 2006; revised on December 26, 2006; accepted on January 14, 2007
| REFERENCES |
|---|
Alter O, et al. Singular value decomposition for genome-wide expression data processing and modeling. Proc. Natl Acad. Sci. USA (2000) 97:10101–10106.
Alter O, et al. Generalized singular value decomposition for comparative analysis of genome-scale expression datasets of two different organisms. Proc. Natl Acad. Sci. USA (2003) 100:3351–3356.
Bo TH, et al. LSimpute: accurate estimation of missing values in microarray data with least squares methods. Nucleic Acids Res. (2004) 32:e34.
Breiman L. Random forests. Mach. Learn. (2001) 45:5–32.
Cho JH, et al. New gene selection method for classification of cancer subtypes considering within class variation. FEBS Lett. (2003) 551:3–7.
Der SD, et al. Identification of genes differentially regulated by interferon alpha, beta, or gamma using oligonucleotide arrays. Proc. Natl Acad. Sci. USA (1998) 95:15623–15628.
Efron B. Bootstrap methods: another look at the jackknife. Ann. Stat. (1979) 7:1–26.
Friedland S, et al. A simultaneous reconstruction of missing data in DNA microarrays. (2003) Institute for Mathematics and its Applications Preprint Series No. 1948.
Freund Y, Schapire RE. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. (1997) 55:119–139.
Gan X, et al. Microarray missing data imputation based on a set theoretic framework and biological knowledge. Nucleic Acids Res. (2006) 34:1608–1619.
Golub GH, van Loan CF. Matrix Computations (1996) Baltimore, MD: Johns Hopkins University Press.
Golub TR, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science (1999) 286:531–537.
Jornsten R, et al. DNA microarray data imputation and significance analysis of differential expression. Bioinformatics (2005) 21:4155–4161.
Kim H, et al. Missing value estimation for DNA microarray gene expression data: local least squares imputation. Bioinformatics (2005) 21:187–198.
Kim KY, et al. Reuse of imputed data in microarray analysis increases imputation efficiency. BMC Bioinformatics (2004) 5:160.
Lai Y. A statistical method for estimating the proportion of differentially expressed genes. Comput. Biol. Chem. (2006) 30:193–202.
Oba S, et al. A Bayesian missing value estimation method for gene expression profile data. Bioinformatics (2003) 19:2088–2096.
Pearson K. Contributions to the mathematical theory of evolution. Phil. Trans. R. Soc. London (1894) 185:71–110.
Perou CM, et al. Molecular portraits of human breast tumors. Nature (2000) 406:747–752.
Scheel I, et al. The influence of missing value imputation on detection of differentially expressed genes from microarray data. Bioinformatics (2005) 21:4272–4279.
Sehgal MSB, et al. Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data. Bioinformatics (2005) 21:2417–2423.
Shipp MA, et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat. Med. (2002) 8:68–74.
Troyanskaya O, et al. Missing value estimation methods for DNA microarrays. Bioinformatics (2001) 17:520–525.
Tuikkala J, et al. Improving missing value estimation in microarray data with gene ontology. Bioinformatics (2006) 22:566–572.
Tusher VG, et al. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl Acad. Sci. USA (2001) 98:5116–5121.
van't Veer LJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature (2002) 415:530–536.
Vapnik V. The Nature of Statistical Learning Theory (1995) New York: Springer-Verlag.
Wang X, et al. Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme. BMC Bioinformatics (2006) 7:32.
Wiestner A, et al. ZAP-70 expression identifies a chronic lymphocytic leukemia subtype with unmutated immunoglobulin genes, inferior clinical outcome, and distinct gene expression profile. Blood (2003) 101:4944–4951.
Zhou X, et al. Missing-value estimation using linear and non-linear regression with Bayesian gene selection. Bioinformatics (2003) 19:2302–2307.