Bioinformatics Vol. 17 no. 12 2001
Pages 1105-1112
© 2001 Oxford University Press

Tricross: using dot-plots in sequence-id space to detect uncataloged intergenic features

William C. Ray 1,2,*, Robert S. Munson Jr 1,2,3 and Charles J. Daniels 3

1 Children’s Research Institute
2 The Department of Pediatrics, The Ohio State University, 700 Childrens Dr Columbus, OH 43205, USA
3 The Department of Microbiology, The Ohio State University, 484 West 12th Ave, Columbus, OH 43210, USA

Received on April 13, 2001; revised on June 15, 2001; accepted on June 26, 2001

Motivation: Several factors confound the process of determining the functional sequence content of an organism. Large protein-coding sequences are relatively easy to find by statistical methods. Smaller proteins, however, may escape detection because their size falls below some arbitrary researcher-defined minimum cutoff, or because a promoter or translational start cannot be precisely defined (Delcher et al., Nucleic Acids Res., 27, 4636–4641, 1999). Promoter and regulatory sequences are themselves difficult to define, owing to a significant amount of allowable sequence variation and to the probable lack, to date, of any completely accurate whole-organism gene catalog. Finally, certain genes coding for functional RNAs may have insufficient structural or sequence constraints to be detectable by normal sequence structure/pattern searching methods (Eddy and Rivas, Bioinformatics, 16, 583–605, 2000).

Where multiple closely related organisms have been sequenced, additional information is available for investigating sequence content: functional sequences may be conserved between the organisms.

We present a method that uses this conservation to detect genes and other potentially functional sequences that may be missed by standard ORF-calling, RNA-finding, and pattern-matching software. The tricross programs produce a multi-way cross comparison of three sets of sequences, determine which sequences are conserved in all three sets, and produce a graphical representation (Virtual Reality Modelling Language, VRML; ISO/IEC 14772-1:1997, VDC, 1997) as well as alignments of all sequence triples found. The software can also be applied to a pair of sequence sets, though the noise in the results increases.
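The three-way comparison described above can be illustrated with a minimal sketch. The abstract does not state how tricross scores similarity, so the k-mer Jaccard measure, the threshold, and the function names below are all assumptions made for illustration; they are not the tricross implementation. The sketch only shows the general idea of keeping a triple when all three pairwise comparisons agree.

```python
from itertools import product


def kmers(seq, k=4):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=4):
    """Toy similarity: Jaccard index of k-mer sets (an assumption,
    standing in for whatever comparison tricross actually performs)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def similar_pairs(set_x, set_y, threshold):
    """All (x_id, y_id) pairs whose sequences meet the threshold."""
    return {
        (x_id, y_id)
        for (x_id, x_seq), (y_id, y_seq) in product(set_x.items(), set_y.items())
        if similarity(x_seq, y_seq) >= threshold
    }


def conserved_triples(set_a, set_b, set_c, threshold=0.5):
    """Triples (a, b, c) supported by all three pairwise comparisons."""
    ab = similar_pairs(set_a, set_b, threshold)
    bc = similar_pairs(set_b, set_c, threshold)
    ac = similar_pairs(set_a, set_c, threshold)
    triples = []
    for a_id, b_id in sorted(ab):
        for b2_id, c_id in sorted(bc):
            if b_id == b2_id and (a_id, c_id) in ac:
                triples.append((a_id, b_id, c_id))
    return triples


if __name__ == "__main__":
    # Hypothetical intergenic sequences keyed by made-up identifiers.
    set_a = {"a1": "ACGTACGTGGCC", "a2": "TTTTTTTTTTTT"}
    set_b = {"b1": "ACGTACGTGGCA", "b2": "GGGGGGGGGGGG"}
    set_c = {"c1": "ACGTACGTGGCC", "c2": "CCCCCCCCCCCC"}
    print(conserved_triples(set_a, set_b, set_c))
```

Requiring agreement in all three pairwise comparisons is one plausible reason a three-set search would be less noisy than a two-set search: a chance match between two sets is unlikely to be confirmed by the third.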

Results: Tricross has been used to examine the intergenic-sequence content of the three archaeal Pyrococcus genomes to determine the most highly related sequences remaining between the annotated protein and RNA coding sequences. With relatively stringent similarity requirements set for the search, tricross found 101 intergenic sequences conserved among the three organisms. Interestingly, 29 of these appear to contain members of a family of small RNA molecules (Kiss-Laszlo et al., EMBO J., 17, 797–807, 1998) only recently discovered in the Archaea (Armbruster, OSU, Diss., 1988; Omer et al., Science, 288, 517–522, 2000; Gaspin et al., J. Mol. Biol., 297, 895–906, 2000). While some of the remaining 72 appear to be individual highly conserved promoter sequences, others have no currently known biological significance. Although tricross was originally developed to facilitate the examination of intergenic sequences, none of its logic is inherently specific to them. The software can also be applied to gene sequences, and has been used to produce inter-genomic gene order dot-plots for Haemophilus influenzae (Fleischmann et al., Science, 269, 496–512, 1995) versus H. ducreyi (unpublished data), and Neisseria meningitidis Z2491 (serogroup A) (Parkhill et al., Nature, 404, 502–506, 2000) versus Neisseria meningitidis Z58 (serogroup B) (Tettelin et al., Science, 287, 1809–1815, 2000) versus Neisseria gonorrhoeae (Lewis et al., http://micro-gen.ouhsc.edu/, 2000).
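A gene order dot-plot in sequence-id space, as mentioned above, can be sketched as follows: each sequence is placed by its ordinal position in its genome rather than by nucleotide coordinate, and a dot marks each matched pair. The function names, identifiers, and the text rendering are hypothetical; tricross itself produces VRML output, which this sketch does not attempt to reproduce.

```python
def id_space_dots(order_x, order_y, matches):
    """Map matched (x_id, y_id) pairs to (column, row) positions given
    the order in which the sequences occur in each genome."""
    index_x = {seq_id: i for i, seq_id in enumerate(order_x)}
    index_y = {seq_id: j for j, seq_id in enumerate(order_y)}
    return sorted(
        (index_x[x_id], index_y[y_id])
        for x_id, y_id in matches
        if x_id in index_x and y_id in index_y
    )


def render(dots, width, height):
    """Draw the dots on a character grid: genome X runs left to right,
    genome Y runs top to bottom."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for col, row in dots:
        grid[row][col] = "*"
    return "\n".join("".join(line) for line in grid)


if __name__ == "__main__":
    # Hypothetical gene orders and matches between two genomes.
    order_x = ["x1", "x2", "x3"]
    order_y = ["y1", "y2", "y3"]
    matches = {("x1", "y1"), ("x2", "y3"), ("x3", "y2")}
    dots = id_space_dots(order_x, order_y, matches)
    print(render(dots, len(order_x), len(order_y)))
```

In such a plot, conserved gene order appears as a diagonal run of dots, while rearrangements show up as off-diagonal or reversed runs.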

Availability: The tricross software package is available from http://www.biosci.ohio-state.edu/~ray/bioinformatics/tricross.html.

Contact: ray{at}biosci.ohio-state.edu; daniels.7{at}osu.edu; munsonr{at}pediatrics.ohio-state.edu

Supplementary information: Additional data from the cross-genomic comparisons examined in the discussion section are linked from http://www.biosci.ohio-state.edu/~ray/bioinformatics/tricross.html.

* To whom correspondence should be addressed.




This article has been cited by other articles:

A. Harrison, D. W. Dyer, A. Gillaspy, W. C. Ray, R. Mungur, M. B. Carson, H. Zhong, J. Gipson, M. Gipson, L. S. Johnson, et al.
Genomic Sequence of an Otitis Media Isolate of Nontypeable Haemophilus influenzae: Comparative Study with H. influenzae Serotype d, Strain KW20
J. Bacteriol., July 1, 2005; 187(13): 4627-4636.

R. S. Munson Jr., A. Harrison, A. Gillaspy, W. C. Ray, M. Carson, D. Armbruster, J. Gipson, M. Gipson, L. Johnson, L. Lewis, et al.
Partial Analysis of the Genomes of Two Nontypeable Haemophilus influenzae Otitis Media Isolates
Infect. Immun., May 1, 2004; 72(5): 3002-3010.

W. C. Ray and C. J. Daniels
PACRAT: a database and analysis system for archaeal and bacterial intergenic sequence features
Nucleic Acids Res., January 1, 2003; 31(1): 109-113.


