Bioinformatics Advance Access originally published online on November 17, 2005
Bioinformatics 2006 22(2):251-252; doi:10.1093/bioinformatics/bti787

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oxfordjournals.org

Gene Time Eχpression Warper: a tool for alignment, template matching and visualization of gene expression time series

Jo Criel 1 and Elena Tsiporkova 2,*

1Devgen N.V., Science IT Technologiepark 30, B-9052 Ghent, Belgium
2Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology (VIB), Ghent University Technologiepark 927, B-9052, Ghent, Belgium

*To whom correspondence should be addressed.


    ABSTRACT

Summary: An application tool for alignment, template matching and visualization of gene expression time series is presented. The core algorithm is based on dynamic time warping techniques used in the speech recognition field. These techniques allow for non-linear (elastic) alignment of temporal sequences of feature vectors and consequently enable detection of similar shapes with different phases.

Availability: The Java program, examples and a tutorial are available at http://www.psb.ugent.be/cbd/papers/gentxwarper/

Contact: eltsi{at}psb.ugent.be


    1 INTRODUCTION
Detecting patterns in gene expression time-series data is a challenging knowledge discovery task because biological processes may unfold at different rates in response to different experimental conditions or within different organisms and individuals. Classical distance metrics, such as the Euclidean distance and its variations, fail to capture this temporal variation: they are very sensitive to small distortions in the time axis and consequently produce poor similarity measures between time series. Dynamic time warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis (Fig. 1).


Fig. 1 A DTW alignment between two time series.

 
The DTW alignment algorithm was originally developed for speech recognition (Sakoe and Chiba, 1978). It aligns two sequences of feature vectors by warping the time axis iteratively until an optimal match (according to a suitable metric) between the two sequences is found. Because of its flexibility, DTW is widely used in many scientific disciplines and business applications. In a pilot study, Aach and Church (2001) investigated the stability of the DTW algorithm on Saccharomyces cerevisiae cell cycle expression data, focusing mainly on the alignment of the expression profiles of the class ptg50 (990 genes) in two different time series. For this purpose they used four command-line executable C++ programs that implement the classical and interpolated DTW algorithms and generate PostScript files containing visualizations of the alignment.
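The dynamic programming idea behind DTW can be illustrated with a minimal sketch. It is not the GenTχWarper implementation: it uses the textbook recurrence with equally weighted steps and the absolute difference as the local distance, whereas the symmetric variant of Sakoe and Chiba (1978) uses its own step weighting and normalization, which are omitted here. The function name `dtw` and the two toy profiles are assumptions made for illustration.

```python
from math import inf

def dtw(a, b):
    """Return (total cost, warping path) for two numeric sequences.

    Illustrative textbook DTW with unit step weights (an assumption,
    not the exact scheme used by GenTχWarper).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between two time points
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    # Trace the optimal warping path back from the last cell to the first.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# Two hypothetical profiles with the same shape, one delayed by a time point.
peak_early = [0, 1, 3, 1, 0, 0]
peak_late = [0, 0, 1, 3, 1, 0]

distance, path = dtw(peak_early, peak_late)
pointwise = sum(abs(x - y) for x, y in zip(peak_early, peak_late))
print(distance, pointwise)
```

In this toy case the point-by-point comparison penalizes the delay, while the warped alignment pairs the two peaks with each other.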

We present here GenTχWarper, a gene time expression warping tool: a Java-based program with a powerful graphical user interface that enables easy and flexible alignment, template matching and visualization of time-series data. The original symmetric DTW algorithm (Sakoe and Chiba, 1978) has been extended with new features, such as the possibility of defining an anchor point in the alignment and of performing partial alignments by sliding the time series against each other along the time axis. Additionally, some typical microarray data transformations are provided, together with several distance metrics that can be applied between the feature vectors at each time point. To our knowledge, GenTχWarper is the first user-friendly DTW tool available to the biological community.


    2 METHODS AND IMPLEMENTATION
GenTχWarper operates in two main modes: aligning datasets and template matching. As the name suggests, the aligning datasets mode finds the best time alignment between two sets of gene expression time series. It can be useful for comparative studies of the temporal behaviour of a set of genes in different experimental conditions (e.g. cell cycle expression data generated with different synchronization techniques) or in different organisms (yeast, plants, human, etc.). The two profile sets are supplied in two separate files, and the aligned suite can be saved to a file, which may subsequently be analysed further with other microarray analysis tools. The template matching mode (Fig. 2) mines gene expression time series for the patterns that best fit a template expression profile. It thereby facilitates the identification of a cluster of genes whose expression profiles are related, possibly with a non-linear time shift, to the profile of a gene supplied as a template. An additional feature computes a pairwise gene DTW distance matrix for a complete microarray dataset. The template matching mode can be employed in studies requiring gene-centric approaches. For instance, Zhu et al. (2002) demonstrated that transcription-factor-centric clustering can be successful in identifying transcription factor binding sites, even when limited to linear time delay.
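Template matching and the pairwise distance matrix can be sketched as follows. This is an illustration under stated assumptions, not the tool's code: the DTW recurrence is the textbook one with the absolute difference as local distance, the fit score is taken to be the raw DTW cost (lower is better), and the gene names, profiles and function names are hypothetical.

```python
from math import inf

def dtw_distance(a, b):
    """Textbook DTW cost between two numeric sequences (illustrative)."""
    n, m = len(a), len(b)
    prev = [0.0] + [inf] * m  # row 0 of the cumulative cost table
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]

def best_matches(template_name, profiles, top_n):
    """Rank all other genes by DTW distance to the template profile."""
    template = profiles[template_name]
    scores = [(dtw_distance(template, profile), name)
              for name, profile in profiles.items()
              if name != template_name]
    return sorted(scores)[:top_n]

def distance_matrix(profiles):
    """Pairwise DTW distances for every pair of genes in the dataset."""
    names = sorted(profiles)
    matrix = [[dtw_distance(profiles[r], profiles[c]) for c in names]
              for r in names]
    return names, matrix

# Hypothetical expression profiles sampled at six time points.
profiles = {
    "geneA": [0, 1, 3, 1, 0, 0],
    "geneB": [0, 0, 1, 3, 1, 0],  # same shape as geneA, delayed
    "geneC": [3, 2, 1, 0, 0, 0],  # different shape
    "geneD": [0, 1, 3, 1, 0, 1],  # same as geneA except the last point
}

matches = best_matches("geneA", profiles, top_n=2)
names, matrix = distance_matrix(profiles)
print(matches)
```

Computing the full matrix requires one DTW run per gene pair, so its cost grows quadratically with the number of genes.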


Fig. 2 The graphical interface of the template matching mode. The genes whose expression profiles can be selected as a template are shown to the left, and the N best matching genes with their fit scores to the right. The middle window visualizes the compared expression profiles in real time (top) and their DTW alignment (bottom).

 
In both modes the behaviour of the core alignment algorithm can be modified via several parameters: data adjustment, metric, warping window, offset and anchor point. The ‘data adjustment’ option enables z- and log2-transformations of the input expression profiles before alignment. Both are essential for comparing gene expression time series between experiments and between species. The ‘metric’ parameter offers a choice between four distance measures: Manhattan, Euclidean, Chebyshev and Pearson correlation. The ‘warping window’ constraint reduces the search space and consequently speeds up the processing of large datasets. An overly restrictive window, however, may reduce the accuracy of the final alignment.
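The data adjustment, metric and warping window parameters can be sketched as below. Several details are assumptions, since they are not specified above: the z-transformation uses the population standard deviation, the Pearson correlation is turned into a distance as 1 - r, and the warping window is a band of fixed half-width around the diagonal. All function names and the toy data are hypothetical.

```python
from math import inf, log2, sqrt

def log2_transform(profile):
    # Assumes strictly positive expression values.
    return [log2(x) for x in profile]

def z_transform(profile):
    # Assumes a non-constant profile; population standard deviation.
    mean = sum(profile) / len(profile)
    sd = sqrt(sum((x - mean) ** 2 for x in profile) / len(profile))
    return [(x - mean) / sd for x in profile]

def manhattan(u, v):
    return sum(abs(x - y) for x, y in zip(u, v))

def euclidean(u, v):
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def chebyshev(u, v):
    return max(abs(x - y) for x, y in zip(u, v))

def pearson_distance(u, v):
    # 1 - r is one common way to turn a correlation into a distance.
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    norm = sqrt(sum((x - mu) ** 2 for x in u) * sum((y - mv) ** 2 for y in v))
    return 1.0 - cov / norm

def windowed_dtw(a, b, metric, window):
    """DTW over sequences of feature vectors, restricted to |i - j| <= window."""
    n, m = len(a), len(b)
    window = max(window, abs(n - m))  # the band must reach the final cell
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = metric(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical datasets: one feature vector (two genes) per time point.
set_a = [(0, 0), (1, 2), (3, 1), (1, 0), (0, 0)]
set_b = [(0, 0), (0, 0), (1, 2), (3, 1), (1, 0)]

print(windowed_dtw(set_a, set_b, manhattan, window=1))  # narrow band
print(windowed_dtw(set_a, set_b, manhattan, window=0))  # diagonal only, no warping
```

With a window of zero the alignment degenerates to a point-by-point comparison, which illustrates why an overly tight constraint can hurt accuracy.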

The ‘offset’ function enables sliding the time series against each other along the time axis. This has several applications. (1) Many biological processes, such as the cell cycle, are conserved between species, and for a given gene with a known function in one species, one may attempt to identify a set of genes in another species with a potentially similar function. However, the duration of the different cell cycle phases may vary considerably between species. One way to correct for this is to apply an offset, possibly in combination with an anchor point (see below), that positions the corresponding cell cycle phases of interest against each other. (2) An offset is also essential when the biological process under study displays a phase shift due to the design of the experiment. For instance, cell cycle progression is usually studied via genome-wide expression profiling of synchronized cell suspension cultures, and different synchronization methods usually cause the cells to resume the cell cycle in different phases. (3) Additionally, the offset parameter can be useful for performing ‘causality searches’, as they were named by Aach and Church (2001). By specifying a non-zero offset one may slide the sets of expression profiles against each other along the time axis, in this way discovering genes with similar trajectories that are shifted in time. Thus putative targets of a known transcription factor can be identified by using its profile as a template and evaluating the list of the best matching genes for different offset values (see Zhu et al., 2002).
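A scan over offset values can be sketched as follows. How GenTχWarper treats the non-overlapping ends of the shifted series is not described above, so this sketch simply truncates both series to their overlap, which is an assumption. The helper names, the textbook DTW recurrence and the two toy profiles are likewise hypothetical.

```python
from math import inf

def dtw_distance(a, b):
    """Textbook DTW cost between two numeric sequences (illustrative)."""
    n, m = len(a), len(b)
    prev = [0.0] + [inf] * m
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]

def overlap(a, b, offset):
    """Slide b against a and keep only the overlapping parts.

    offset > 0 pairs a[0] with b[offset];
    offset < 0 pairs a[-offset] with b[0].
    """
    if offset >= 0:
        a_part, b_part = a, b[offset:]
    else:
        a_part, b_part = a[-offset:], b
    length = min(len(a_part), len(b_part))
    return a_part[:length], b_part[:length]

def scan_offsets(template, candidate, offsets):
    """DTW cost of the overlapping parts for each offset value."""
    return {k: dtw_distance(*overlap(template, candidate, k)) for k in offsets}

# Hypothetical profiles: the candidate repeats the template's peak
# two time points later.
template = [3, 1, 0, 0, 0, 0]
candidate = [0, 0, 3, 1, 0, 0]

costs = scan_offsets(template, candidate, range(0, 4))
best_offset = min(costs, key=costs.get)
print(best_offset, costs[best_offset])
```

Raw costs computed over overlaps of different lengths are not strictly comparable, so a practical scan might normalize each cost, for example by the overlap length.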

The ‘anchor point’ option makes it possible to explicitly align a time point from one time series with a time point from another, and can be used in cases similar to those listed above for the offset. For instance, setting an anchor point may be very useful when there is detailed information about the exact times at which the compared biological processes pass through some fixed state, or when one of the time series to be compared is sampled over a considerably shorter time interval than the other.
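One way to force a warping path through an anchor point is to split the problem at the anchor and solve two ordinary DTW alignments, one before and one after it. The sketch below follows that idea; it is an assumption about how such a constraint could be realized, not a description of the GenTχWarper internals, and the function names and toy profiles are hypothetical.

```python
from math import inf

def dtw_distance(a, b):
    """Textbook DTW cost between two numeric sequences (illustrative)."""
    n, m = len(a), len(b)
    prev = [0.0] + [inf] * m
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]

def anchored_dtw(a, b, anchor_a, anchor_b):
    """DTW cost of the best path that pairs a[anchor_a] with b[anchor_b].

    The path is split at the anchor: the left part must end at the anchor
    and the right part must start there. The anchor's local distance is
    counted in both parts, so it is subtracted once.
    """
    left = dtw_distance(a[:anchor_a + 1], b[:anchor_b + 1])
    right = dtw_distance(a[anchor_a:], b[anchor_b:])
    return left + right - abs(a[anchor_a] - b[anchor_b])

# Hypothetical profiles whose peaks lie at indices 2 and 3 respectively.
series_a = [0, 1, 3, 1, 0, 0]
series_b = [0, 0, 1, 3, 1, 0]

peaks_aligned = anchored_dtw(series_a, series_b, anchor_a=2, anchor_b=3)
peaks_misaligned = anchored_dtw(series_a, series_b, anchor_a=2, anchor_b=2)
print(peaks_aligned, peaks_misaligned)
```

An anchor that agrees with the unconstrained optimum leaves the cost unchanged, while an anchor that contradicts it can only increase the cost.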

GenTχWarper is supplied with a powerful graphical interface. Visualization panels provide a comparative view of (1) the original expression profiles and (2) their DTW-aligned counterparts. In addition, the alignment mode interface reports the DTW table, with the optimal warping path through it indicated in red, and the final fit score. The interface of the template matching mode (Fig. 2) enables the user to select the expression profile of a gene of interest as a template by simply scrolling through a list of gene names.

Useful case studies with GenTχWarper can be found at http://www.psb.ugent.be/cbd/papers/gentxwarper/casestudy/

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Steen Knudsen

Received on July 27, 2005; revised on September 20, 2005; accepted on November 15, 2005

    REFERENCES

    Aach, J. and Church, G.M. (2001) Aligning gene expression time series with time warping algorithms. Bioinformatics, 17, 495–508.

    Sakoe, H. and Chiba, S. (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Signal Process., 26, 43–49.

    Zhu, Z., et al. (2002) Computational identification of transcription factor binding sites via a transcription-factor-centric clustering algorithm. J. Mol. Biol., 318, 71–81.





