Improved tools for DNA comparison and clustering
Genome Sequencing Center, Washington University School of Medicine, 4444 Forest Park Avenue, St Louis, MO 63108, USA. Email: jparsons{at}watson.wustl.edu
DNA sequence clustering is an effective aid to the comprehension, summarization and compression of DNA sequence databases. Previous work produced programs suitable for the comparison and clustering of cDNA sequences; new, enhanced programs have now been written to cluster genomic DNA fragments, large EST projects and entire DNA databases. Three new programs (ICAtools) are discussed: ICAass, N2tool and ICAmatches. ICAass has been used to compress the EMBL database by hiding or removing sequences with various degrees of redundancy. It also has the fastest database querying mode. N2tool provides fast and sensitive clustering of genomic fragment databases on the basis of small areas of local similarity. N2tool has proven useful for discovering contaminating vector or other artefactual sequence when the potential contaminant is not otherwise known. ICAmatches is a new cluster analysis program that uses a novel alignment style to present multiple alignment summaries. All the tools are convenient to use because they share a common memory-frugal index format and accept most DNA sequence formats directly.
Received on May 1, 1995; accepted on September 13, 1995
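The idea of clustering fragments "on the basis of small areas of local similarity" can be illustrated with a minimal sketch. This is not the N2tool algorithm, which the abstract does not describe; it is a hypothetical stand-in that assumes two sequences belong to the same cluster whenever they share at least one exact k-mer, and it merges clusters with a simple union-find. The function names and the default `k=8` are illustrative assumptions, and reverse-complement matches are ignored for brevity.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_by_shared_kmers(seqs, k=8):
    """Single-linkage clustering: sequences sharing any exact k-mer are merged.

    Illustrative only; a real tool would tolerate mismatches and
    consider both strands.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_owner = {}  # k-mer -> index of the first sequence containing it
    for idx, seq in enumerate(seqs):
        for km in kmers(seq, k):
            if km in first_owner:
                # Shared k-mer: merge this sequence's cluster with the owner's.
                parent[find(idx)] = find(first_owner[km])
            else:
                first_owner[km] = idx

    clusters = defaultdict(list)
    for idx in range(len(seqs)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())
```

Under these assumptions, an unexpected cluster that pulls together otherwise unrelated fragments is the kind of signal that could point to a shared contaminant, such as vector sequence, without the contaminant being known in advance.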