Bioinformatics Advance Access originally published online on October 6, 2007
Bioinformatics 2007 23(22):3093-3094; doi:10.1093/bioinformatics/btm489

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

BiasViz: visualization of amino acid biased regions in protein alignments

Matthew R. Huska 1, Henrik Buschmann 2 and Miguel A. Andrade-Navarro 1,3,*

1Molecular Medicine, Ottawa Health Research Institute, 501 Smyth Road, Ottawa, ON, Canada K1H 8L6, 2Department of Cell and Developmental Biology, John Innes Centre, Norwich UK and 3Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Canada

*To whom correspondence should be addressed.


    ABSTRACT

Summary: About a third of all protein sequences have at least one composition biased region (CBR). Such regions might act as linkers between protein domains but often confer specific binding to various molecules; therefore, their characterization in terms of their boundaries and over-represented residues is important. Analysis of CBRs in a particular sequence can be time-consuming if several types of bias have to be explored and their positions visualized. The significance of the detected CBRs can be assessed by comparison with homologous protein sequences. To assist this procedure, we have developed BiasViz, a tool for graphically studying local amino acid composition in the protein sequences of a multiple sequence alignment.

Availability: The BiasViz Java applet and source code can be accessed from http://biasviz.sourceforge.net

Contact: matthuska{at}alumni.uwaterloo.ca


    1 INTRODUCTION
Most protein sequences are a complex series of amino acids with side chains of varied properties. However, about a third of protein sequences contain composition biased regions (CBRs), also described as regions of low complexity, which are unusually rich in one amino acid or in amino acids with similar properties (Wootton, 1994). CBRs can act as flexible linkers (spacers) between compact domains, in which case their precise sequence is unimportant, but they have also been reported to function in the binding of proteins and other substrates (see e.g. Sim and Creamer, 2004; Ulbert et al., 2006; Wootton, 1994 and references therein). It is therefore important to characterize the extent and composition properties of such regions.

In some cases, the characterization of a CBR is straightforward (e.g. the N-terminal polyglutamine tract of mammalian Huntingtins, which in the human sequence is a series of 23 consecutive glutamine residues). However, functionally relevant composition biases are often small (e.g. a frequency of 30% for a given amino acid in a region, in contrast to 10% in the unbiased regions), and the property of the amino acids involved in the bias might not be obvious at first sight: the bias can be produced by a particular amino acid, such as lysine, or by amino acids with similar properties, such as having positive charge or being polar.
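As a rough illustration of such a comparison, the sketch below computes the frequency of a chosen residue set inside a candidate region and in the rest of the sequence. The function names, the residue groups and the toy sequence are assumptions made for this example; they are not taken from BiasViz.

```python
# Hypothetical residue groups for illustration. Whether histidine counts as
# positively charged is a modelling choice, not something the text specifies.
GROUPS = {
    "lysine": "K",
    "positive": "KRH",
}


def fraction(seq, residues):
    """Fraction of positions in seq whose residue is in the given set."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(c in residues for c in seq) / len(seq)


def region_vs_rest(seq, start, end, residues):
    """Return (fraction inside seq[start:end], fraction in the remainder)."""
    region = seq[start:end]
    rest = seq[:start] + seq[end:]
    return fraction(region, residues), fraction(rest, residues)


# Invented toy sequence: a short basic stretch flanked by alanines.
toy = "AAAAA" + "KKRK" + "AAAAAAAAAA" + "K"
inside, outside = region_vs_rest(toy, 5, 9, GROUPS["positive"])
lys_inside, _ = region_vs_rest(toy, 5, 9, GROUPS["lysine"])
```

In this toy case the stretch is only partly lysine, yet it is entirely positively charged, which is the kind of bias that is easy to miss when looking at single residues.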

Programs have been developed to detect low-complexity regions [e.g. seg (Wootton and Federhen, 1996)]. These programs are routinely used to filter out such regions before sequence analysis, to avoid false positives in pairwise sequence comparisons (Bork and Koonin, 1998), and they do not report the significance or the composition bias of the region. Pairwise sequence comparison algorithms can assess the statistical significance of the similarity between phylogenetically divergent proteins, but they assume that local amino acid composition is close to random and therefore cannot be used to characterize CBRs by sequence similarity (Altschul et al., 1994). As a result, CBRs tend to escape homology detection by pairwise sequence comparisons. An alternative way to assess the significance of a CBR in a particular protein sequence is to examine whether the CBR is present in some of its homologous sequences at equivalent positions of their multiple sequence alignment (MSA) (Sim and Creamer, 2004).

We recently applied this idea to analyze the CBR of AIR9-like proteins (characterized by a basic serine/threonine-rich region) implicated in microtubule binding (Buschmann et al., 2006). CBRs of AIR9-like proteins from plants show little homology in linear sequence alignments, but display a conserved bias for basic and hydroxylated residues. This became obvious after an MSA of the family was studied and plots of sequence composition of the plant members were compared with those of other sequences of the family (Buschmann et al., 2007). Following these ideas, we have developed a tool, BiasViz, which allows the interactive visualization of amino acid composition biases of protein sequences in a multiple sequence alignment at variable ranges.


    2 JAVA TOOL
The input to BiasViz is a multiple sequence alignment in FASTA format (one example is preloaded at the BiasViz web site). The alignment is entered into a simple web form which, when submitted, launches the BiasViz applet. Once the applet is loaded, the alignment is displayed along with controls that change how sequence composition is visualized, including the amino acids of interest and the window size.
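For readers who want to prepare or check such input outside the applet, the following is a minimal sketch of reading a FASTA-formatted alignment. The function name `parse_fasta_alignment` and the toy alignment are hypothetical, and the equal-length check is an assumption about what a valid alignment should look like rather than documented BiasViz behaviour.

```python
def parse_fasta_alignment(text):
    """Parse FASTA alignment text into a list of (name, aligned_sequence)."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    # Rows of an alignment, gaps included, should all have the same length.
    if len({len(seq) for _, seq in records}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return records


# Invented two-sequence alignment; "-" marks alignment gaps.
example = """>seqA
MKP-PPLA
>seqB
MK--PALA
"""
alignment = parse_fasta_alignment(example)
```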

The user can select any combination of amino acids for composition analysis and view the sum of their local frequencies in the sequences contained in the alignment. The visualization is generated by running a sliding window across each sequence (excluding the gaps inserted in the alignment) and recording the fraction of amino acids within the window that belong to the set of amino acids that the user has selected. This information can be displayed on a scale from white (100%) to black (0%), or scaled up so that the location with the highest value is displayed as white (Fig. 1a). A threshold can be set so that intensity values above a cutoff are displayed as white and those below it as black (Fig. 1c). Alignment gaps are represented in red. Output from the program can be saved as a comma-delimited table containing the currently displayed intensity values at each location in each sequence, which can be used for further graphing (Fig. 1b).
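The sliding-window calculation described above can be sketched as follows. This is not the BiasViz source code: the window is assumed to be centred on each residue and truncated at the sequence ends, gaps are assumed to be written as "-", and the CSV layout (one row per sequence, empty cells at gaps) is a guess at a convenient format. The names `window_fractions` and `to_csv` are hypothetical.

```python
import csv
import io


def window_fractions(aligned_seq, residues, window, gap="-"):
    """One value per alignment column: the fraction of selected residues in a
    window around that residue, computed on the ungapped sequence.
    Gap columns get None."""
    if window < 1:
        raise ValueError("window must be at least 1")
    selected = set(residues.upper())
    ungapped = [c for c in aligned_seq.upper() if c != gap]
    # Prefix sums of hits make each window count a constant-time lookup.
    prefix = [0]
    for c in ungapped:
        prefix.append(prefix[-1] + (c in selected))
    half = window // 2
    values = []
    i = 0  # index into the ungapped sequence
    for c in aligned_seq:
        if c == gap:
            values.append(None)
            continue
        # Assumed convention: centred window, clipped at the sequence ends.
        start = max(0, i - half)
        end = min(len(ungapped), i - half + window)
        values.append((prefix[end] - prefix[start]) / (end - start))
        i += 1
    return values


def to_csv(rows):
    """Serialize (name, values) rows as comma-delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for name, values in rows:
        writer.writerow([name] + ["" if v is None else f"{v:.3f}" for v in values])
    return buf.getvalue()


# Invented toy row: proline fraction with a window of 3 residues.
vals = window_fractions("PP-AP", "P", 3)
table = to_csv([("s1", vals)])
```

Because the window runs over the ungapped sequence, residues on either side of an alignment gap are treated as neighbours, which matches the description of gaps being excluded from the calculation.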


Fig. 1. BiasViz analysis of 15 ADAM proteins. BiasViz was run using as input a ClustalW (Thompson et al., 1994) multiple sequence alignment of human ADAM19 and its 14 closest human homologs. The 15 proteins are members of the ADAM (a disintegrin and metalloproteinase) protein family (Wolfsberg et al., 1995). (a) BiasViz representation of the complete alignment. Gaps are represented in red and sequence in a gray scale representing the fraction of prolines in a window, here of 36 amino acids, with white for the maximum value in the plot (0.45) and black for 0.00. The amino acid position is indicated at the top. C-terminal proline-rich regions are apparent in some of the sequences. (b) Left, phylogenetic tree generated by ClustalW. The sequences marked with stars are known to bind to SH3 domains via proline-rich regions: ADAM9 and ADAM15 bind to the SH3 domains of endophilin I and SH3PX1 (Howard et al., 1999), ADAM12 to the SH3 domain of Src (Kang et al., 2000) and ADAM19 to the SH3 domain of Abi2/ArgBP1 (Huang et al., 2002). These four sequences do not belong to the same branch of the tree, indicating that the SH3-binding property could not be deduced from a simple sequence comparison (e.g. using BLAST). Right, plots representing the fraction of prolines in a window of 36 amino acids along two sequences: the SH3-binding protein ADAM19 has a region with values above 0.3 at the C-terminus; in contrast, the homolog ADAM11 never reaches 0.3 and its C-terminus barely exceeds 0.1. (c) BiasViz representation of the complete alignment using a cutoff. Proline fraction values above 0.30 are represented in white and all other values in black. Accordingly, only the four sequences that bind SH3 domains have highlighted regions, which are C-terminal, in agreement with current knowledge.
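The two display modes used in panels (a) and (c) can be illustrated with a small sketch that maps window fractions to 8-bit gray levels. The function names, the 0 to 255 range and the sample values are assumptions made for this example; only the scaling to the maximum and the cutoff of 0.30 come from the figure description.

```python
def gray_levels(values, scale_to_max=False):
    """Map fractions to gray levels (0 = black, 255 = white).
    None marks an alignment gap, which the tool draws in red."""
    present = [v for v in values if v is not None]
    # Either use the absolute scale (1.0 is white) or stretch to the maximum.
    top = max(present, default=0.0) if scale_to_max else 1.0
    levels = []
    for v in values:
        if v is None:
            levels.append(None)
        elif top == 0:
            levels.append(0)
        else:
            levels.append(round(255 * min(v / top, 1.0)))
    return levels


def apply_cutoff(values, cutoff):
    """White for values strictly above the cutoff, black otherwise."""
    return [None if v is None else (255 if v > cutoff else 0) for v in values]


# Invented proline fractions for a handful of alignment columns.
sample = [0.0, 0.12, None, 0.31, 0.45]
scaled = gray_levels(sample, scale_to_max=True)   # 0.45 becomes white, as in (a)
binary = apply_cutoff(sample, 0.30)               # cutoff used in (c)
```

Whether a value exactly equal to the cutoff is shown as white or black is not stated in the text; the strict comparison here is one possible choice.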

 
2.1 Technical specifications
BiasViz is implemented as a Java 1.5 applet and includes a small PHP form used to submit the multiple sequence alignment to be visualized. As such, the program runs on any platform in a standard browser that has the Java plug-in installed; BiasViz itself requires no installation. The source code is licensed under the permissive MIT open source license (http://www.opensource.org/licenses/mit-license.php).


    3 CONCLUSION
BiasViz has been developed to assist the analysis of amino acid composition bias in sets of protein sequences arranged in a multiple sequence alignment. BiasViz fills a gap left by algorithms that study sequence complexity and by pairwise sequence comparison methods. We expect that this tool will serve molecular biologists wishing to explore and describe composition bias in protein families and to produce graphical representations for the communication of results.


    ACKNOWLEDGEMENTS
M.A.A. is a recipient of a Canada Research Chair in Bioinformatics. H.B. was supported by a BBSRC grant to Clive W. Lloyd (John Innes Centre, Norwich, UK).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Limsoon Wong

Received on July 25, 2007; revised on July 25, 2007; accepted on September 14, 2007

    REFERENCES

    Altschul SF, et al. Issues in searching molecular sequence databases. Nat. Genet. (1994) 6:119–129.

    Bork P, Koonin EV. Predicting functions from protein sequences – where are the bottlenecks? Nat. Genet. (1998) 18:313–318.

    Buschmann H, et al. Microtubule-associated AIR9 recognizes the cortical division site at preprophase and cell-plate insertion. Curr. Biol. (2006) 16:1938–1943.

    Buschmann H, et al. Homologues of Arabidopsis microtubule-associated AIR9 in trypanosomatid parasites: hints on evolution and function. Plant Signal. Behav. (2007) 16:1938–1943.

    Howard L, et al. Interaction of the metalloprotease disintegrins MDC9 and MDC15 with two SH3 domain-containing proteins, endophilin I and SH3PX1. J. Biol. Chem. (1999) 274:31693–31699.

    Huang L, et al. Screen and identification of proteins interacting with ADAM19 cytoplasmic tail. Mol. Biol. Rep. (2002) 29:317–323.

    Kang Q, et al. Metalloprotease-disintegrin ADAM 12 binds to the SH3 domain of Src and activates Src tyrosine kinase in C2C12 cells. Biochem. J. (2000) 352(Pt 3):883–892.

    Sim KL, Creamer TP. Protein simple sequence conservation. Proteins (2004) 54:629–638.

    Thompson JD, et al. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. (1994) 22:4673–4680.

    Ulbert S, et al. Direct membrane protein-DNA interactions required early in nuclear envelope assembly. J. Cell Biol. (2006) 173:469–476.

    Wolfsberg TG, et al. ADAM, a novel family of membrane proteins containing the disintegrin and metalloprotease domain: multipotential functions in cell-cell and cell-matrix interactions. J. Cell Biol. (1995) 131:275–278.

    Wootton JC. Sequences with ‘unusual’ amino acid compositions. Curr. Opin. Struct. Biol. (1994) 4:413–421.

    Wootton JC, Federhen S. Analysis of compositionally biased regions in sequence databases. Meth. Enzymol. (1996) 266:554–571.

