Bioinformatics Advance Access originally published online on November 15, 2007
Bioinformatics 2008 24(1):11-17; doi:10.1093/bioinformatics/btm547
Determination and validation of principal gene products
¹Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre, Madrid, Spain, ²HAVANA Group, The Sanger Institute, ³The European Bioinformatics Institute, Cambridge and ⁴Faculty of Life Sciences, University of Manchester, Manchester, UK
*To whom correspondence should be addressed.
Abstract
Motivation: Alternative splicing has the potential to generate a wide range of protein isoforms. For many computational applications and for experimental research, it is important to be able to concentrate on the isoform that retains the core biological function. For many genes, however, it is far from clear which isoform that is.
Results: We have combined five methods into a pipeline that allows us to detect the principal variant for a gene. Most of the methods were based on conservation between species, at the level of both gene and protein. The five methods used were the conservation of exonic structure, the detection of non-neutral evolution, the conservation of functional residues, the existence of a known protein structure and the abundance of vertebrate orthologues. The pipeline was able to determine a principal isoform for 83% of a set of well-annotated genes with multiple variants.
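As an illustration only, the idea of pooling evidence from the five methods can be sketched in Python. The abstract does not state how the pipeline combines its methods, so the rule used here (a plurality vote with a minimum level of support and no ties) is an assumption, as are the function name `pick_principal`, the `min_support` parameter and the isoform labels.

```python
from collections import Counter
from typing import Mapping, Optional

# The five evidence sources named in the abstract.
METHODS = (
    "exon_structure_conservation",
    "non_neutral_evolution",
    "functional_residue_conservation",
    "known_protein_structure",
    "vertebrate_orthologue_abundance",
)


def pick_principal(
    votes: Mapping[str, Optional[str]], min_support: int = 2
) -> Optional[str]:
    """Return the isoform favoured by the methods, or None if undecided.

    `votes` maps a method name to the isoform it favours, or to None when
    the method cannot discriminate between isoforms. The decision rule is
    hypothetical: the most-voted isoform wins only if it has at least
    `min_support` votes and is not tied with another isoform.
    """
    unknown = set(votes) - set(METHODS)
    if unknown:
        raise ValueError(f"unknown methods: {sorted(unknown)}")

    # Ignore methods that abstained.
    cast = [isoform for isoform in votes.values() if isoform is not None]
    if not cast:
        return None

    (top, top_count), *rest = Counter(cast).most_common()
    tied = bool(rest) and rest[0][1] == top_count
    if tied or top_count < min_support:
        return None
    return top


# Hypothetical gene with two isoforms; one method abstains.
example_votes = {
    "exon_structure_conservation": "iso-1",
    "non_neutral_evolution": "iso-1",
    "functional_residue_conservation": None,
    "known_protein_structure": "iso-2",
    "vertebrate_orthologue_abundance": "iso-1",
}
print(pick_principal(example_votes))
```

Returning `None` rather than forcing a choice reflects the fact that the pipeline did not assign a principal isoform to every gene; a stricter rule, such as requiring unanimity among the methods that vote, would leave more genes undecided.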
Contact: mtress{at}cnio.es
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Alex Bateman
Received on August 17, 2007; revised on October 17, 2007; accepted on October 22, 2007