Bioinformatics Advance Access published online on November 15, 2007
Bioinformatics, doi:10.1093/bioinformatics/btm547
Determination and validation of principal gene products
1 Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre, Madrid, Spain
2 HAVANA Group, The Sanger Institute, Cambridge, UK
3 The Goldman Group, The European Bioinformatics Institute, Cambridge, UK
4 Faculty of Life Sciences, University of Manchester, Manchester, UK
*To whom correspondence should be addressed. Dr. Michael Tress, E-mail: mtress{at}cnio.es
Abstract
Motivation: Alternative splicing has the potential to generate a wide range of protein isoforms. For many computational applications and for experimental research it is important to be able to concentrate on the isoform that retains the core biological function. For many genes, however, it is far from clear which isoform this is.
Results: We have combined five methods into a pipeline that allows us to detect the principal variant for a gene. Most of the methods were based on conservation between species, at the level of both gene and protein. The five methods used were the conservation of exonic structure, the detection of non-neutral evolution, the conservation of functional residues, the existence of a known protein structure and the abundance of vertebrate orthologues. The pipeline was able to determine a principal isoform for 83% of a set of well-annotated genes with multiple variants.
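As an illustration only, the idea of pooling five independent lines of evidence into one call per gene can be sketched as below. The abstract does not state how the pipeline combines its methods, so the strict majority-style vote, the function name `principal_isoform` and the method identifiers are all assumptions made for this sketch, not the authors' implementation.

```python
# Hypothetical identifiers for the five evidence sources named in the abstract.
METHODS = (
    "exon_structure_conservation",
    "non_neutral_evolution",
    "functional_residue_conservation",
    "known_protein_structure",
    "vertebrate_orthologue_abundance",
)


def principal_isoform(votes):
    """Return the isoform favoured by the most methods, or None if undecided.

    `votes` maps a method name to the isoform id that method favours,
    or to None when the method makes no call for this gene.
    The voting rule is an assumption; the abstract does not specify one.
    """
    counts = {}
    for method in METHODS:
        choice = votes.get(method)
        if choice is not None:
            counts[choice] = counts.get(choice, 0) + 1
    if not counts:
        return None  # no method made a call
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # top two isoforms are tied, so no principal isoform
    return ranked[0][0]


example_votes = {
    "exon_structure_conservation": "isoform_A",
    "non_neutral_evolution": "isoform_A",
    "functional_residue_conservation": "isoform_B",
    "known_protein_structure": None,
    "vertebrate_orthologue_abundance": "isoform_A",
}
print(principal_isoform(example_votes))
```

Under this assumed rule, a gene whose methods disagree evenly, or make no call at all, is left without a principal isoform, which is one way a pipeline of this kind could fail to decide for some genes.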
Associate Editor: Dr. Alex Bateman
Received on August 17, 2007; revised on October 17, 2007; accepted on October 22, 2007