Bioinformatics Advance Access originally published online on April 6, 2005
Bioinformatics 2005 21(11):2766-2772; doi:10.1093/bioinformatics/bti416
Common denominator procedure: a novel approach to gene-expression data mining for identification of phenotype-specific genes
1Xantos Biomedicine AG Max-Lebsche-Platz 31, 81377 München, Germany
2Rechenzentrum der Universität Hannover Schloßwender Straße 5, 30159 Hannover, Germany
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: We have established a novel data mining procedure for the identification of genes associated with pre-defined phenotypes and/or molecular pathways. Based on the observation that these genes are frequently expressed in the same place or in close proximity at about the same time, we have devised an approach termed Common Denominator Procedure. One unusual feature of this approach is that the specificity and probability to identify genes linked to the desired phenotype/pathway increase with greater diversity of the input data.
Result: To show the feasibility of our approach, the Cancer Genome Anatomy Project expression data combined with a defined set of angiogenic factors was used to identify additional and novel angiogenesis-associated genes. Many of these additional genes were already known to be associated with angiogenesis according to published data, verifying our approach. For some of the remaining candidate genes, application of a high-throughput functional genomics platform (XantoScreen™) provided further experimental evidence for association with angiogenesis.
Availability: Software available on request from the authors.
Contact: s.roehrig{at}xantos.de
| 1 INTRODUCTION |
|---|
Despite the progress made by various data mining procedures (Huminiecki and Bicknell, 2000; Vasmatzis et al., 1998), there is still a large gap between the amount of available expression data and the availability of data related to the function of those genes. More than 5 million ESTs from 8000 cDNA libraries have been compiled in the database of NCI's Cancer Genome Anatomy Project (CGAP) (Strausberg et al., 2002) for Homo sapiens alone (as of 06/2004). These human ESTs have been clustered into 115 000 unique gene (UniGene) clusters (Wheeler et al., 2004). Recently, novel technologies have been set up that speed up the functional assignment of genes. Examples of these techniques are the functional genomics screens that have been devised by Human Genome Sciences, Inc. (Fiscella et al., 2003) or by Xantos Biomedicine AG (Grimm and Kachel, 2002; Koenig-Hoffmann et al., 2005; Zitzler et al., 2004). However, improved in silico analyses are desired to complement the existing in silico and experimental approaches and to enable and accelerate the discovery of disease-relevant target genes. Here we report a novel in silico approach for extraction of phenotype-associated genes from gene-expression centered databases. As an example, the in silico identification of novel angiogenesis-associated genes is described. Underlying our approach is the observation that proteins participating in a molecular pathway linked to a particular phenotype are frequently expressed in the same place or in close proximity at about the same time. Accordingly, our method for identifying phenotype- or pathway-associated genes via data mining utilizes the fact that any given tissue sample should contain the mRNA encoding proteins participating in the active pathways of that tissue. Provided with data from a sufficient number of different tissue samples with the same pathway activated, it should be possible to identify proteins participating in that pathway.
Particular combinations of pathways manifest themselves in corresponding phenotypes. Therefore, it should be possible to detect not only pathway-related but also phenotype-related genes owing to their co-expression. Consequently, we developed a Java™ application, which uses the CGAP expression data to rank UniGene clusters by their co-occurrence with pre-defined phenotype- or pathway-specific genes. Well-defined pro-angiogenic factors were used as input for our in silico procedure with the objective of discovering novel genes that are also associated with angiogenesis. A comparison of the resulting candidate gene list of this in silico approach with the results of an experimental high-throughput screen for pro-angiogenic factors showed the feasibility of our approach.
| 2 MATERIALS AND METHODS |
|---|
2.1 Input data
CGAP expression data as of June 2004 was used (ftp://ftp1.nci.nih.gov/pub/CGAP/Hs_ExprData.dat) for all analyses. Many sequencing efforts do not routinely extend beyond a limited redundancy in the ESTs and therefore do not comprehensively reflect the distribution of mRNAs. For our analysis we prefer to see even sparsely expressed genes in a redundant manner. To select libraries and sequence sets with reasonable mRNA (EST) numbers, complexity and distribution, we included only libraries that met the following criteria: (1) the average number of ESTs per expressed UniGene cluster observed in the library had to be >3; (2) assuming that >1000 genes are expressed in a given tissue at a given time (Adams et al., 1995), the library had to represent at least 1000 different UniGene clusters; (3) the library had to represent at most 5000 different UniGene clusters, assuming that larger numbers may be attributed to mixed, pooled or inhomogeneous samples. These requirements were met by 75 libraries (Supplementary Table S1).
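The three library selection criteria can be sketched as a small filter. This is only an illustration in Python, not the authors' Java application; the function name `select_libraries` and the input layout (a dictionary mapping each library to its per-cluster EST counts) are assumptions, since the format of the CGAP data file is not described here.

```python
def select_libraries(est_counts, min_ratio=3.0, min_clusters=1000, max_clusters=5000):
    """Return the libraries that satisfy the three selection criteria.

    est_counts is a hypothetical layout: {library_id: {unigene_cluster: est_count}}.
    """
    selected = []
    for library, clusters in est_counts.items():
        n_clusters = len(clusters)
        if n_clusters == 0:
            continue  # an empty library cannot satisfy any criterion
        # Criterion 1: average number of ESTs per expressed cluster must exceed the ratio.
        average_ests = sum(clusters.values()) / n_clusters
        # Criteria 2 and 3: the number of distinct clusters must lie in the allowed range.
        if average_ests > min_ratio and min_clusters <= n_clusters <= max_clusters:
            selected.append(library)
    return selected
```

With the default arguments the filter applies the thresholds stated above (>3 ESTs per cluster, 1000 to 5000 clusters); smaller thresholds can be passed for toy data.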
Our procedure is based on the determination of presence or absence of a user-defined set of known angiogenesis-associated genes. The indicator genes that we have chosen were all known pro-angiogenic factors (IndicatorGeneSet): HIF1A (Hs.412416), DDIT4 (Hs.111244), VEGF (Hs.73793), IGFR1 (Hs.239176), ECGF1 (Hs.435067) and EPAS1 (Hs.8136) (Shoshani et al., 2002; Stoeltzing et al., 2003; Tsukagoshi et al., 2003; Wiesener et al., 1998). An additional set of genes associated with angiogenesis, termed AngioTestGroup, was selected in a semiautomatic manner, based on publicly available gene annotation. To this end, three consecutive steps were applied: (1) all genes annotated in Gene Ontology (Camon et al., 2004; Harris et al., 2004) with angiogenesis, response to hypoxia or any of their children in the Gene Ontology hierarchy were selected; (2) all genes with a GRIF (Pruitt and Maglott, 2001) description containing phrases hypoxia, hypoxic, angiogenic or angiogenesis were added; (3) negative modulators of angiogenesis were manually eliminated. For the assessment of the quality of a given LibraryProfile (which needs to be composed of at least 8 libraries, 10% of the 75 CGAP libraries), genes present in <8 libraries were removed from the AngioTestGroup. Likewise, genes present in >50% of the chosen 75 CGAP libraries were removed, owing to lack of specificity. A total of 73 human UniGene clusters met these criteria and comprised the AngioTestGroup.
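The frequency filter applied to the AngioTestGroup (at least 8 libraries, at most 50% of the 75 selected libraries) could look like the following sketch. The function name `filter_test_group` and the `presence` mapping from cluster to the set of libraries containing it are hypothetical.

```python
def filter_test_group(presence, candidates, n_libraries=75, min_libs=8, max_fraction=0.5):
    """Keep candidate clusters that are neither too rare nor too widespread.

    presence is a hypothetical layout: {unigene_cluster: set_of_library_ids}.
    """
    kept = []
    for cluster in candidates:
        n_present = len(presence.get(cluster, ()))
        # Remove clusters seen in fewer than min_libs libraries and clusters
        # seen in more than max_fraction of all selected libraries.
        if min_libs <= n_present <= max_fraction * n_libraries:
            kept.append(cluster)
    return kept
```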
2.2 Data mining procedure
The generation of a LibraryProfile is based on the determination of presence or absence of genes from the IndicatorGeneSet in individual libraries. All combinations of 3–6 of these six genes were used to generate 42 LibraryProfiles (20 + 15 + 6 + 1 = 42 combinations). Libraries were ranked according to the number of chosen indicator genes they contained. First, all libraries containing every chosen indicator gene were added to the nascent LibraryProfile. Second, if the nascent LibraryProfile consisted of <8 libraries, we continued adding all libraries containing one less indicator gene, as long as the libraries contained at least two indicator genes (to ensure a minimal co-expression of the observed indicator gene subset). The finished LibraryProfile is a set of libraries with a common or similar presence/absence pattern with respect to the chosen indicator genes.
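The profile construction described above can be sketched as follows. The Python code is an illustration under stated assumptions, not the authors' implementation: the names `indicator_subsets` and `build_library_profile` and the `library_genes` mapping (library to the set of clusters it contains) are hypothetical, and gene symbols stand in for UniGene cluster identifiers.

```python
from itertools import combinations

INDICATOR_GENES = ["HIF1A", "DDIT4", "VEGF", "IGFR1", "ECGF1", "EPAS1"]

def indicator_subsets(genes, min_size=3):
    """All combinations of min_size up to len(genes) indicator genes."""
    return [subset
            for size in range(min_size, len(genes) + 1)
            for subset in combinations(genes, size)]

def build_library_profile(subset, library_genes, min_libraries=8, min_indicators=2):
    """Collect libraries by decreasing number of indicator genes they contain.

    library_genes is a hypothetical layout: {library_id: set_of_clusters}.
    """
    hits = {lib: sum(gene in genes for gene in subset)
            for lib, genes in library_genes.items()}
    profile = []
    # Start with libraries containing every chosen indicator gene, then relax
    # by one gene at a time, never going below min_indicators genes.
    for required in range(len(subset), min_indicators - 1, -1):
        profile.extend(sorted(lib for lib, h in hits.items() if h == required))
        if len(profile) >= min_libraries:
            break
    return profile
```

Calling `len(indicator_subsets(INDICATOR_GENES))` gives 42, matching the number of LibraryProfiles. Note that a profile built this way may still hold fewer than 8 libraries if too few libraries contain at least two indicator genes; how such profiles were handled is not stated in the text.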
For each of the 42 LibraryProfiles, a GeneScore was calculated for every human UniGene cluster. This GeneScore was the percentage of LibraryProfile libraries containing the particular UniGene cluster. Ubiquitously expressed genes have a high probability of achieving a high GeneScore independently of the chosen LibraryProfile. Therefore, all clusters with a probability of >36.8% (simp < 1.0, see below) of having reached their GeneScore by chance were subsequently eliminated from our analyses. The probability of achieving at least the obtained GeneScore by chance was reflected by the discrete score improbability simp. It was defined as the negative natural logarithm of the probability to reach at least GeneScore s, rounded to the first decimal place. It was calculated by the following equation, with N representing the number of selected libraries (75 in this case); n, the number of libraries from N containing the observed UniGene cluster; k, the number of LibraryProfile libraries; x, the number of libraries from n contained in k; s, the GeneScore of the observed UniGene cluster; and s_i := i/k for i = 0...k.
If s > 0, n ≥ k and N − k ≥ n, then:
$$\mathrm{simp} = -\ln\left(\sum_{i=x}^{k} P(s_i)\right), \qquad P(s_i) = \frac{\binom{n}{i}\binom{N-n}{k-i}}{\binom{N}{k}}$$
In all other cases (s = 0, n < k or N − k < n), simp was not calculated. These clusters were excluded from further analyses.
Inspired by Monte Carlo sampling (Metropolis et al., 1953), random control profiles were used to distinguish between nonspecific LibraryProfiles and angiogenesis-associated profiles (AngioProfiles). This is possible without making any assumptions on the underlying distribution of the LibraryProfiles with regard to GeneScore, simp and the number of libraries in the LibraryProfile. Therefore, three random UniGene cluster control profiles and three random library control profiles were created for every LibraryProfile. To this end, the number of libraries from the selected 75 libraries in which a given indicator gene is present was calculated. Then, for each indicator gene used for the generation of the LibraryProfile, one random UniGene cluster present in the same number of libraries (out of the 75) was selected. The IndicatorGeneSet and the AngioTestGroup were excluded from this selection. These random UniGene clusters were then used to generate three control profiles for each LibraryProfile. Random library control profiles were generated using a random sampling from the 75 libraries: under the rationale that LibraryProfiles may be composed of different numbers of libraries for each combination of indicator genes, matching numbers of random libraries were used to generate an additional three control profiles for each LibraryProfile. Subsequently, the mean value of the GeneScore over all AngioTestGroup genes was calculated for each LibraryProfile and its corresponding six control profiles. Sixteen LibraryProfiles had a higher mean value than the highest mean value of their six control profiles. These were termed AngioProfiles.
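As a sketch of how GeneScore and simp might be computed, the code below assumes that the chance probability of finding a cluster in at least x of the k profile libraries follows a hypergeometric distribution over the N selected libraries. That model and the function names are assumptions made for illustration; the symbols N, n, k and x are those defined in the text.

```python
from math import comb, log

def gene_score(x, k):
    """GeneScore: percentage of the k profile libraries containing the cluster."""
    return 100.0 * x / k

def simp(N, n, k, x):
    """Score improbability: -ln P(at least x of k profile libraries contain the cluster).

    Assumes a hypergeometric model (k libraries drawn from N, of which n contain
    the cluster). Returns None when simp is not calculated.
    """
    if x == 0 or n < k or N - k < n:
        return None
    # Upper tail of the hypergeometric distribution, summed from x to k.
    tail = sum(comb(n, i) * comb(N - n, k - i) for i in range(x, k + 1))
    probability = tail / comb(N, k)
    return round(-log(probability), 1)
```

For a toy case with N = 4, n = 2 and k = 2, a cluster found in both profile libraries has chance probability 1/6, so `simp(4, 2, 2, 2)` returns 1.8. Under this model a cluster would be kept only if its simp is at least 1.0.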
Following the selection of AngioProfiles, candidate genes were defined for each of the 16 AngioProfiles. First, low scoring genes (GeneScore cutoff: 34%) were removed. For a given GeneScore, all remaining UniGene clusters were ranked into ProbabilityGroups according to their discrete simp. Next, a simp cutoff criterion was determined for each GeneScore. UniGene clusters having a particular GeneScore with a simp above that threshold were considered candidate genes. The threshold for each GeneScore was determined by the highest simp meeting the following constraints: (1) a threshold (of a lower GeneScore) must not be lower than one of a higher GeneScore; (2) to cap the overall number of candidate genes, ProbabilityGroups must contain <50 UniGene clusters; (3) the percentage of UniGene clusters with equal or higher score and simp in at least one matching control profile must be <33%, ignoring ProbabilityGroups containing <10 genes for statistical reasons. After completing this procedure independently for all GeneScores, the remaining list of genes was further pruned by removing genes with better or equal score and simp in at least one of the profile-specific controls.
Multiple occurrences of the same candidate gene in different AngioProfiles indicate a higher probability that the corresponding UniGene cluster is associated with angiogenesis. Therefore, candidate genes were ranked according to their multiplicity of occurrence in the AngioProfiles, yielding our final candidate gene list. To compare this list with the experimental results from our high-throughput screen for pro-angiogenic factors, a BLAST (Altschul et al., 1990) search of sequences from the screen hits against all UniGene clusters (identity threshold 98%, length threshold 250 nt) was performed. It identified 611 UniGene clusters that were not members of the AngioTestGroup or the IndicatorGeneSet.
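Ranking candidates by multiplicity amounts to counting, for each UniGene cluster, the number of AngioProfiles in which it is a candidate. A minimal sketch, assuming a hypothetical mapping from profile identifier to its candidate list:

```python
from collections import Counter

def rank_by_multiplicity(candidates_per_profile):
    """Return (cluster, multiplicity) pairs, highest multiplicity first.

    candidates_per_profile is a hypothetical layout: {profile_id: iterable_of_clusters}.
    """
    counts = Counter(cluster
                     for candidates in candidates_per_profile.values()
                     for cluster in set(candidates))  # count each profile once
    # Ties are broken alphabetically so that the ranking is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```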
The rationale behind the setting of the parameters of our procedure (e.g. threshold settings, library exclusion, indicator gene selection) is to be stringent enough to ensure sufficient specificity, while at the same time preventing exclusion of too many datasets from further processing. The parameter and stringency settings used here were found to be suitable for the processing of EST data from CGAP libraries. Using a different source of expression data necessitates the use of different parameter and stringency settings.
| 3 RESULTS |
|---|
Our novel data mining procedure is based on determination of presence or absence of a user-defined set of known angiogenesis-associated genes (IndicatorGeneSet) in cDNA libraries. Genes are represented by UniGene clusters, whose expression in particular libraries is listed in the CGAP expression data. To realize the procedure we have applied the following steps (Fig. 1): (1) input data for our procedure were defined, including the IndicatorGeneSet, an additional test set of angiogenesis-associated genes (AngioTestGroup) as well as a selection of libraries which are suitable for our procedure; (2) LibraryProfiles were automatically defined as sets of libraries with common or similar presence/absence patterns with respect to the IndicatorGeneSet (Fig. 2A); (3) for each of these LibraryProfiles, UniGene clusters were automatically ranked according to similarity to the IndicatorGeneSet with respect to their presence/absence patterns in libraries of the LibraryProfile, with similarity expressed as a GeneScore for each UniGene cluster (Fig. 2B); (4) the most suitable LibraryProfiles for identification of angiogenesis-associated genes were automatically selected, and the LibraryProfiles which best resembled the distribution of angiogenesis-associated UniGene clusters in comparison with random control profiles were termed AngioProfiles (Fig. 2C); (5) candidate genes were automatically defined for each AngioProfile. Repeated specific appearance of UniGene clusters in independent AngioProfiles suggests a higher probability for an angiogenesis-associated phenotype. Therefore, candidate genes were ranked according to their multiplicity of occurrence in the AngioProfiles in our resulting candidate gene list (Fig. 2D).
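Step (4), the selection of AngioProfiles against random controls, can be illustrated with the sketch below. It covers only the random library controls (not the random UniGene cluster controls), and all function names, the `library_genes` mapping and the seeding scheme are assumptions rather than details of the original Java application.

```python
import random
from statistics import mean

def mean_test_score(profile, test_genes, library_genes):
    """Mean GeneScore (in percent) of the test genes over one profile."""
    k = len(profile)
    return mean(100.0 * sum(gene in library_genes[lib] for lib in profile) / k
                for gene in test_genes)

def random_library_controls(profile, all_libraries, n_controls=3, seed=0):
    """Random control profiles with the same number of libraries as the profile."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return [rng.sample(all_libraries, len(profile)) for _ in range(n_controls)]

def is_angio_profile(profile, controls, test_genes, library_genes):
    """True if the profile's mean score exceeds that of every control profile."""
    own = mean_test_score(profile, test_genes, library_genes)
    best_control = max(mean_test_score(c, test_genes, library_genes) for c in controls)
    return own > best_control
```

In the procedure described in the text, `controls` would hold six profiles per LibraryProfile: three built from random UniGene clusters and three from random libraries.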
3.1 Internal control
Since the IndicatorGeneSet was used for selection of the 42 LibraryProfiles and the AngioTestGroup for selection of the 16 AngioProfiles, the AngioTestGroup and the IndicatorGeneSet are not independent of the procedure. Enrichment of these genes should be observable. The candidate gene list described above harbored 2031 candidate genes that occurred in at least one of the 16 AngioProfiles, i.e. had a multiplicity of at least one (Table 1). Five of the 6 UniGene clusters from the IndicatorGeneSet and 18 of the 73 UniGene clusters from the AngioTestGroup were contained in those 2031 candidate genes, comprising 0.25 and 0.89%, respectively, of the candidate genes. The percentage of angiogenesis-associated genes increased with greater multiplicity: for example, of the 55 candidate genes that were present in eight or more AngioProfiles, two (3.6%) were still indicator genes and one (1.8%) was an AngioTestGroup gene.
3.2 Experimental validation
As an independent assessment of the candidate gene list, we compared our in silico defined candidate lists (without genes from the IndicatorGeneSet and the AngioTestGroup) with genes tested positive in a high-throughput screen for factors inducing human umbilical vein endothelial cell (HUVEC) proliferation. This experimental screen had previously been performed at Xantos Biomedicine AG. This screen identified well known angiogenesis factors, such as VEGF or FGF, as well as a list of 466 novel target candidates. These were mapped to 611 UniGene clusters which were not members of the AngioTestGroup or the IndicatorGeneSet, using BLAST (see Materials and methods section). The comparison of both lists confirmed the specificity of our in silico method: at a multiplicity of one, 4.4% of the 2031 candidate genes were members of the experimentally defined list of angiogenesis-associated candidate genes. This fraction increased to >13.5% at higher multiplicity (candidates present in eight or more AngioProfiles). At this multiplicity, 13 (25.0%) of the 52 remaining candidate genes were either experimentally verified by the high-throughput screen or previously known to modulate angiogenesis. This increased further to 50.0% for candidate genes occurring in at least 12 of the 16 selected profiles.
3.3 Selected candidates
Table 2 shows a list of genes that we identified by our procedure as genes with a proposed angiogenesis-associated phenotype, grouped by functional and structural categories. For example, the group matrix and motility includes the known angiogenesis-related genes ITGB5 (Nisato et al., 2003), FN1 (Krishnamachary et al., 2003), CAPN1 (Su et al., 2004), ADAM15 (Horiuchi et al., 2003; Trochon-Joseph et al., 2004) and LAMA5 (Sasaki and Timpl, 2001). Interestingly (or rather consequently), our in silico screen also points to LGALS3BP as being angiogenesis-associated. LGALS3BP is a protein which binds FN1, ITGB5 and COL6A2 (Marchetti et al., 2002). COL6A2 is known to modulate angiogenesis (Daniels et al., 1996; Iyengar et al., 2003), and was also a candidate gene, albeit only in 4 of the 16 AngioProfiles. Other functional gene groups include soluble factors and surface receptors, like the known angiogenesis modulators TGFBI (Thorey et al., 2004) and CD151 (Wright et al., 2004). Another soluble factor is SVAP1, which is associated with vascular biology (Davis et al., 2002). Of interest for further research are also those candidate genes to which no function has been assigned so far. For example, the HUVEC proliferation screen hit LOC56270 may be particularly exciting.
| 4 DISCUSSION AND CONCLUSION |
|---|
CGAP expression data contains information of UniGene clusters in different libraries. Most of these libraries are individually derived from defined tissues (e.g. from biopsies). In accordance with our rationale, these libraries represent gene sets that are expressed at the same place (in the same sample) and at the same time (time of biopsy). If a phenotype or pathway of interest has been active at the time the sample was taken, a multitude of genes associated with the pathway or phenotype should have been expressed in the sample and thus, be present in the library. On the other hand, such samples and derived libraries not only represent the expressed genes of the phenotype or pathway of interest, but in fact of many other pathways.
The challenge is the enrichment and extraction of the specific phenotype-associated genes from the multitude of genes that are additionally expressed in these samples. We addressed this challenge by the application of a novel procedure that we have termed Common Denominator Procedure. The basis of the procedure is the definition of a set of libraries, as many and as diverse as possible, that share the desired phenotype. This can be achieved by automatically detecting the presence or absence of a set of well-defined indicator genes (IndicatorGeneSet) that are known to be closely associated with the phenotype. Provided a sufficient number of these libraries with the phenotype of interest can be defined, extraction of the desired genes from the whole set of genes should be possible because these genes are the common denominator. The more libraries are used as input and the more diverse these libraries are (e.g. from different tissues, diseases and stages/grades), the smaller and hence more specific the common denominator should be. The latter feature of our concept, i.e. that increased diversity of input data leads to increased specificity, is where our Common Denominator Procedure deviates from (and in some aspects is superior to) other approaches.
Such already established procedures include, e.g. application of clustering, differentiation and filtering steps for identification of disease-specific genes from EST, SAGE or proteomics data (Becquet et al., 2002; Cai et al., 2004; Krieg et al., 2004; Vasmatzis et al., 1998), as well as from expression data generated via hybridization analyses [e.g. Affymetrix chips (Nishizuka et al., 2003; Wei et al., 2004)]. However, all these approaches benefit from or even require the use of data from well-defined samples. In fact, presence of any tissue with divergent identity (e.g. not of disease origin) within the samples to be analyzed will taint these in silico analyses. Accordingly, elaborate sample description (pathology, e.g. grade, stage of disease, gender, age) is beneficial and needed for these approaches (Brazma et al., 2001). In addition to sample description, great efforts have been invested on the technical side. To ensure sample homogeneity, elaborate procedures such as tissue microdissection were applied to remove all non-disease components from library sources. This technique separates desired cells from infiltrating lymphocytes, matrix/stroma components, nerves or vasculature, hence leading to library sources of excellently defined origins (Burgemeister et al., 2003; Fend et al., 1999).
In contrast, our Common Denominator Procedure is performed completely independent of any pathological characterization or classification. It will, at least in theory, perform better the more diverse the original source of data, i.e. the biological samples, are. This includes diversity within the samples from which libraries were made. For example, it is known that the generation of many phenotypes relies not only on pathways within one defined cell type, but on complex interactions between cells and adjacent tissues. The phenotype tumor angiogenesis provides a good example as it involves not only the induction of classical intracellular signaling pathways like hypoxia regulation but also feedback loops within cells, as well as cross-talk between cancer cells, endothelial cells, neighboring tissues and adjacent stroma. These cross talks can be assigned to trans-acting factors, e.g. secreted proteins, which modulate the interplay between matrix, endothelium, tumor tissue, and external stimuli like hypoxia. Because of this complex modulation of tumor angiogenesis, it is of advantage to include into the analysis not only one specific tissue type, but instead many cells, such as the aforementioned cancer cells, stroma, endothelial cells and possibly even infiltrating lymphocytes.
One of the limitations of our procedure is the need for reliable information about the expression of genes in tissues. To ensure this, a good representation of the mRNA expression by the EST composition of the library prepared from the tissue sample is essential. However, the mRNA representation of a CGAP library can be quite low. More than 80% contained <100 different UniGene clusters. Therefore, we had to restrict our input data to a subset with good mRNA representation. This carries the risk of losing valuable information contained in the other libraries.
For the validation of our procedure and estimation of the specificity of our candidate gene list, we determined two independent parameters: (1) presence of genes known to modulate angiogenesis; (2) presence of experimentally validated, previously unknown angiogenesis-associated genes. We performed a literature search for the 55 candidate genes that occurred in at least eight AngioProfiles, i.e. those with the highest ranking. Three of them were genes from the AngioTestGroup or the IndicatorGeneSet (5.4%). This enrichment can be considered a positive internal control. Of the remaining 52 UniGene clusters, eight (15.4%) were also known modulators of angiogenesis. For all 44 remaining genes, modulation of angiogenesis has not yet been described, although some are clearly linked to vascular biology (e.g. sVAP1, MCM6). Xantos previously performed a high-throughput screen for the identification of angiogenesis, based on a HUVEC proliferation assay. Comparison of the remaining candidate genes with the hits of the proliferation screen revealed two (3.8%) already known angiogenesis modulators and a further five (11.4%) UniGene clusters to be associated with angiogenesis. This corresponds to a 22-fold enrichment considering all 103 281 human UniGene clusters in the CGAP expression data (0.58%) or still a 4-fold enrichment considering the restriction of CGAP libraries (see above) and the number of ESTs per UniGene cluster within this restricted dataset (3.4%, Fig. 3). In summary, 13 (25.0%) of the 52 candidate genes occurring in at least eight AngioProfiles were either previously known to modulate angiogenesis or experimentally verified. This increased to 50.0% for candidate genes occurring in at least 12 of the 16 AngioProfiles.
The literature-based and experimental validation of the candidate genes shows that our common denominator procedure can identify angiogenesis-associated genes. Since its phenotype specificity is solely based on a small set of angiogenesis-indicator genes (IndicatorGeneSet), the procedure can easily be extended to other phenotypes by definition of different phenotype-associated IndicatorGeneSets. On account of its high specificity, our common denominator procedure is suitable as primary screen for target discovery. It can also be combined with functional genomics techniques for identification of target genes for the diagnosis and therapy of human diseases.
| Acknowledgments |
|---|
We would like to thank Bettina Ehring, Alexander Felber, Beate Gawin, Johannes Görl, Kerstin König-Hoffmann, Rolf Schäfer, Alexander Spychaj and Peter Buckel for thoughtful comments and discussions.
Received on February 3, 2005; revised on March 10, 2005; accepted on March 29, 2005
| REFERENCES |
|---|
Adams, M.D., et al. (1995) Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. Nature, 377, 3–174.
Altschul, S.F., et al. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
Becquet, C., et al. (2002) Strong-association-rule mining for large-scale gene-expression data analysis: a case study on human SAGE data. Genome Biol., 3, RESEARCH0067.
Brazma, A., et al. (2001) Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat. Genet., 29, 365–371.
Burgemeister, R., et al. (2003) High quality RNA retrieved from samples obtained by using LMPC (laser microdissection and pressure catapulting) technology. Pathol. Res. Pract., 199, 431–436.
Cai, L., et al. (2004) Clustering analysis of SAGE data using a Poisson approach. Genome Biol., 5, R51.
Camon, E., et al. (2004) The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res., 32, D262–D266.
Daniels, K.J., et al. (1996) Expression of type VI collagen in uveal melanoma: its role in pattern formation and tumor progression. Lab Invest., 75, 55–66.
Davis, L.S., et al. (2002) Inflammation, immune reactivity, and angiogenesis in a severe combined immunodeficiency model of rheumatoid arthritis. Am. J. Pathol., 160, 357–367.
Fend, F., et al. (1999) Immuno-LCM: laser capture microdissection of immunostained frozen sections for mRNA analysis. Am. J. Pathol., 154, 61–66.
Fiscella, M., et al. (2003) TIP, a T-cell factor identified using high-throughput screening increases survival in a graft-versus-host disease model. Nat. Biotechnol., 21, 302–307.
Grimm, S. and Kachel, V. (2002) Robotic high-throughput assay for isolating apoptosis-inducing genes. Biotechniques, 32, 670–677.
Harris, M.A., et al. (2004) The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res., 32, D258–D261.
Horiuchi, K., et al. (2003) Potential role for ADAM15 in pathological neovascularization in mice. Mol. Cell. Biol., 23, 5614–5624.
Huminiecki, L. and Bicknell, R. (2000) In silico cloning of novel endothelial-specific genes. Genome Res., 10, 1796–1806.
Iyengar, P., et al. (2003) Adipocyte-secreted factors synergistically promote mammary tumorigenesis through induction of anti-apoptotic transcriptional programs and proto-oncogene stabilization. Oncogene, 22, 6408–6423.
Koenig-Hoffmann, K., et al. (2005) High throughput functional genomics: identification of novel genes with tumor suppressor phenotypes. Int. J. Cancer, 113, 434–439.
Krieg, R.C., et al. (2004) ProteinChip Array analysis of microdissected colorectal carcinoma and associated tumor stroma shows specific protein bands in the 3.4 to 3.6kDa range. Anticancer Res., 24, 1791–1796.
Krishnamachary, B., et al. (2003) Regulation of colon carcinoma cell invasion by hypoxia-inducible factor 1. Cancer Res., 63, 1138–1143.
Marchetti, A., et al. (2002) Expression of 90K (Mac-2 BP) correlates with distant metastasis and predicts survival in stage I non-small cell lung cancer patients. Cancer Res., 62, 2535–2539.
Metropolis, N., et al. (1953) Equation of state calculation by fast computing machines. J. Chem. Phys., 21, 1087–1092.
Nisato, R.E., et al. (2003) alphav beta 3 and alphav beta 5 integrin antagonists inhibit angiogenesis in vitro. Angiogenesis, 6, 105–119.
Nishizuka, S., et al. (2003) Diagnostic markers that distinguish colon and ovarian adenocarcinomas: identification by genomic, proteomic, and tissue array profiling. Cancer Res., 63, 5243–5250.
Pruitt, K.D. and Maglott, D.R. (2001) RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res., 29, 137–140.
Sasaki, T. and Timpl, R. (2001) Domain IVa of laminin alpha5 chain is cell-adhesive and binds beta1 and alphaVbeta3 integrins through Arg-Gly-Asp. FEBS Lett., 509, 181–185.
Shoshani, T., et al. (2002) Identification of a novel hypoxia-inducible factor 1-responsive gene, RTP801, involved in apoptosis. Mol. Cell. Biol., 22, 2283–2293.
Stoeltzing, O., et al. (2003) Regulation of hypoxia-inducible factor-1alpha, vascular endothelial growth factor, and angiogenesis by an insulin-like growth factor-I receptor autocrine loop in human pancreatic cancer. Am. J. Pathol., 163, 1001–1011.
Strausberg, R.L., et al. (2002) An international database and integrated analysis tools for the study of cancer gene expression. Pharmacogenomics J., 2, 156–164.
Su, Y., et al. (2004) Cigarette smoke extract inhibits angiogenesis of pulmonary artery endothelial cells: the role of calpain. Am. J. Physiol. Lung Cell. Mol. Physiol., 287, L794–L800.
Thorey, I.S., et al. (2004) Transgenic mice reveal novel activities of growth hormone in wound repair, angiogenesis, and myofibroblast differentiation. J. Biol. Chem., 279, 26674–26684.
Trochon-Joseph, V., et al. (2004) Evidence of antiangiogenic and antimetastatic activities of the recombinant disintegrin domain of metargidin. Cancer Res., 64, 2062–2069.
Tsukagoshi, S., et al. (2003) Thymidine phosphorylase-mediated angiogenesis regulated by thymidine phosphorylase inhibitor in human ovarian cancer cells in vivo. Int. J. Oncol., 22, 961–967.
Vasmatzis, G., et al. (1998) Discovery of three genes specifically expressed in human prostate by expressed sequence tag database analysis. Proc. Natl Acad. Sci. USA, 95, 300–304.
Wei, J.S., et al. (2004) Prediction of clinical outcome using gene expression profiling and artificial neural networks for patients with neuroblastoma. Cancer Res., 64, 6883–6891.
Wheeler, D.L., et al. (2004) Database resources of the National Center for Biotechnology Information: update. Nucleic Acids Res., 32, D35–D40.
Wiesener, M.S., et al. (1998) Induction of endothelial PAS domain protein-1 by hypoxia: characterization and comparison with hypoxia-inducible factor-1alpha. Blood, 92, 2260–2268.
Wright, M.D., et al. (2004) Characterization of mice lacking the tetraspanin superfamily member CD151. Mol. Cell. Biol., 24, 5978–5988.
Zitzler, J., et al. (2004) High-throughput functional genomics identifies genes that ameliorate toxicity due to oxidative stress in neuronal HT-22 cells: GFPT2 protects cells against peroxide. Mol. Cell. Proteomics, 3, 834–840.