Bioinformatics, Vol 14, 151-156, Copyright © 1998 by Oxford University Press
DK Smith and H Xue
MOTIVATION: Summarizing and displaying the information contained in a set
of aligned sequences is an important aid to identifying patterns within the
sequences. A variety of forms of consensus sequences have been used
previously to provide this information. However, these methods can cause a
loss of information or introduce ambiguities into the consensus sequence,
and some graphical approaches may become difficult to interpret due to
visual distortion. RESULTS: We have developed a method to present a more
precise and graphically clear view of a consensus sequence by using an
approach based on defining the major components at each position in a
sequence set. The major components are given in an ordered list and their
frequencies are shown as histograms which can be colour coded to reflect
conservative groupings. Minor components, a one-line character-based
consensus sequence and information statistics can also be presented. As
well as identifying the dominant sources of variation and conservation in
the sequence set, the method also enables similarities and differences
between subgroups of a sequence set to be readily assessed. AVAILABILITY:
On request from the authors. CONTACT: bcsmith@usthk.ust.hk, hxue@usthk.ust.hk
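
The idea of listing the major components at each alignment position, ordered by frequency, can be illustrated with a minimal sketch. This is not the authors' program: the function name `major_components` and the `min_fraction` cutoff are assumptions made here for illustration, since the abstract does not state how a major component is defined. Gap characters, if present, are counted like any other symbol.

```python
from collections import Counter


def major_components(alignment, min_fraction=0.2):
    """Return, for each column, the residues whose frequency is at least
    min_fraction, ordered from most to least frequent.

    min_fraction is a hypothetical cutoff, not a value taken from the paper.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    columns = []
    for position in range(length):
        # Count how often each residue occurs in this column.
        counts = Counter(seq[position] for seq in alignment)
        total = sum(counts.values())
        # Order by descending count, then alphabetically to break ties.
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        columns.append(
            [(residue, n / total) for residue, n in ranked
             if n / total >= min_fraction]
        )
    return columns


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "AGGT", "ACTT"]
    for index, components in enumerate(major_components(toy_alignment), start=1):
        print(index, components)
```

The per-position lists of (residue, frequency) pairs are the kind of data a histogram display could be drawn from; residues falling below the cutoff would correspond to minor components.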
A major component approach to presenting consensus sequences
Biochemistry Department, Hong Kong University of Science and Technology, Kowloon, Hong Kong.
This article has been cited by other articles:
L. M. Jakt, L. Cao, K. S. E. Cheah, and D. K. Smith. Assessing Clusters and Motifs from Gene Expression Data. Genome Res., January 1, 2001; 11(1): 112-123.
