Bioinformatics 2005 21(Suppl 1):i159-i168; doi:10.1093/bioinformatics/bti1022

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

Clustering short time series gene expression data

Jason Ernst 1,*, Gerard J. Nau 2 and Ziv Bar-Joseph 1

1 Center for Automated Learning and Discovery, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
2 Department of Molecular Genetics and Biochemistry, University of Pittsburgh School of Medicine, 200 Lothrop Street, Pittsburgh, PA 15261, USA

*To whom correspondence should be addressed.

Motivation: Time series expression experiments are used to study a wide range of biological systems. More than 80% of all time series expression datasets are short (8 time points or fewer). These datasets present unique challenges. Because so many genes are profiled (often tens of thousands) over so few time points, many patterns are expected to arise at random. Most clustering algorithms are unable to distinguish between real and random patterns.

Results: We present an algorithm specifically designed for clustering short time series expression data. Our algorithm works by assigning genes to a predefined set of model profiles that capture the potential distinct patterns that can be expected from the experiment. We discuss how to obtain such a set of profiles and how to determine the significance of each of these profiles. Significant profiles are retained for further analysis and can be combined to form clusters. We tested our method on both simulated and real biological data. Using immune response data we show that our algorithm can correctly detect the temporal profile of relevant functional categories. Using Gene Ontology analysis we show that our algorithm outperforms both general clustering algorithms and algorithms designed specifically for clustering time series gene expression data.
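The two steps described above, assigning genes to model profiles and testing each profile for significance, can be illustrated with a minimal sketch. The abstract does not say which similarity measure or significance test the method uses, so the choices here are assumptions made only for illustration: Pearson correlation for the assignment, and a permutation of the time points to estimate how many genes a profile would attract by chance. The function names (`assign`, `profile_counts`, `permutation_p_values`) and the toy profiles are hypothetical and are not taken from the authors' Java implementation.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def assign(genes, profiles):
    """For each gene, return the index of the most correlated model profile.

    Assumption: correlation is the similarity measure. Ties go to the
    lowest profile index.
    """
    return [
        max(range(len(profiles)), key=lambda k: pearson(g, profiles[k]))
        for g in genes
    ]


def profile_counts(genes, profiles):
    """Count how many genes are assigned to each profile."""
    counts = [0] * len(profiles)
    for k in assign(genes, profiles):
        counts[k] += 1
    return counts


def permutation_p_values(genes, profiles, n_perm=200, seed=0):
    """Empirical p-value per profile from shuffled time points.

    Assumption: in each round one random reordering of the time points is
    applied to every gene, genes are reassigned, and we record whether a
    profile attracts at least as many genes as it did in the real data.
    """
    rng = random.Random(seed)
    observed = profile_counts(genes, profiles)
    n_points = len(profiles[0])
    exceed = [0] * len(profiles)
    for _ in range(n_perm):
        order = list(range(n_points))
        rng.shuffle(order)
        shuffled = [[g[i] for i in order] for g in genes]
        null = profile_counts(shuffled, profiles)
        for k in range(len(profiles)):
            if null[k] >= observed[k]:
                exceed[k] += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, [(e + 1) / (n_perm + 1) for e in exceed]


if __name__ == "__main__":
    # Toy model profiles over four time points: rising, falling, transient peak.
    profiles = [[0, 1, 2, 3], [0, -1, -2, -3], [0, 1, 1, 0]]
    genes = [
        [0.1, 1.0, 2.2, 2.9],
        [0.0, -0.8, -2.1, -3.2],
        [0.0, 0.9, 1.1, 0.1],
    ]
    counts, p_values = permutation_p_values(genes, profiles)
    print(counts, p_values)
```

In this sketch, profiles whose p-value falls below a chosen threshold would be the ones "retained for further analysis"; how significant profiles are then combined into clusters is not covered here.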

Availability: Information on obtaining a Java implementation with a graphical user interface (GUI) is available from http://www.cs.cmu.edu/~jernst/st/

Contact: jernst{at}cs.cmu.edu

Supplementary information: Available at http://www.cs.cmu.edu/~jernst/st/


Received on January 15, 2005; accepted on March 27, 2005


