Bioinformatics Advance Access originally published online on March 30, 2006
Bioinformatics 2006 22(12):1437-1439; doi:10.1093/bioinformatics/btl116
PseudoPipe: an automated pseudogene identification pipeline
1 Banting and Best Department of Medical Research, Donnelly CCBR, University of Toronto 160 College Street, Toronto, ON M5S 3E1, Canada
2 Department of Computer Science, Yale University New Haven, CT 06520, USA
3 Department of Molecular Biophysics and Biochemistry New Haven, CT 06520, USA
4 Department of Biology 506 Wartik Pennsylvania State University, University Park PA 16802, USA
5 Department of Biology, McGill University Stewart Biology Building, 1205 Dr Penfield Avenue, Montreal, QC, H3A 1B1, Canada
*To whom correspondence should be addressed.
Motivation: Mammalian genomes contain many genomic fossils, i.e. pseudogenes. These are disabled copies of functional genes that arose through gene duplication or retrotransposition events and have been retained in the genome. Pseudogenes are important resources for understanding the evolutionary history of genes and genomes.
Results: We have developed a homology-based computational pipeline (PseudoPipe) that can search a mammalian genome and identify pseudogene sequences in a comprehensive and consistent manner. The key steps in the pipeline involve using BLAST to rapidly cross-reference potential "parent" proteins against the intergenic regions of the genome and then processing the resulting "raw hits" -- i.e. eliminating redundant ones, clustering together neighbors, and associating and aligning clusters with a unique parent. Finally, pseudogenes are classified based on a combination of criteria including homology, intron-exon structure, and existence of stop codons and frameshifts.
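The post-processing of raw hits described above can be illustrated with a minimal sketch. This is not PseudoPipe's actual code: the `Hit` record, the function names, the greedy overlap filter, and the `max_gap` threshold are all assumptions made for illustration, and the final classification step is left out because its criteria are not given in detail here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hit:
    """One raw BLAST hit of a parent protein against intergenic DNA (hypothetical record)."""
    chrom: str
    strand: str
    start: int     # genomic start, inclusive
    end: int       # genomic end, inclusive
    parent: str    # ID of the query ("parent") protein
    evalue: float  # BLAST E-value; lower is better


def overlaps(a: Hit, b: Hit) -> bool:
    """True if two hits share at least one base on the same chromosome and strand."""
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start <= b.end and b.start <= a.end)


def remove_redundant(hits: List[Hit]) -> List[Hit]:
    """Greedy filter (an assumed strategy): visit hits from best to worst
    E-value and keep a hit only if it overlaps no hit already kept."""
    kept: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    return kept


def cluster_neighbors(hits: List[Hit], max_gap: int = 60) -> List[List[Hit]]:
    """Group hits that share chromosome, strand and parent and are separated
    by at most max_gap bases (the default of 60 is an arbitrary placeholder)."""
    clusters: List[List[Hit]] = []
    ordered = sorted(hits, key=lambda h: (h.chrom, h.strand, h.parent, h.start))
    for hit in ordered:
        prev = clusters[-1][-1] if clusters else None
        if (prev is not None
                and (prev.chrom, prev.strand, prev.parent)
                == (hit.chrom, hit.strand, hit.parent)
                and hit.start - prev.end - 1 <= max_gap):
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


def find_candidates(hits: List[Hit], max_gap: int = 60) -> List[Tuple[str, str, str, int, int]]:
    """Run both steps and summarise each cluster as
    (parent, chrom, strand, start, end)."""
    clusters = cluster_neighbors(remove_redundant(hits), max_gap)
    return [(c[0].parent, c[0].chrom, c[0].strand,
             min(h.start for h in c), max(h.end for h in c))
            for c in clusters]


if __name__ == "__main__":
    raw_hits = [
        Hit("chr1", "+", 100, 200, "P1", 1e-30),
        Hit("chr1", "+", 150, 250, "P2", 1e-10),  # overlaps a better P1 hit
        Hit("chr1", "+", 230, 300, "P1", 1e-20),  # close neighbour of the first hit
        Hit("chr2", "-", 500, 600, "P3", 1e-5),
    ]
    for candidate in find_candidates(raw_hits):
        print(candidate)
```

In this simplified version each cluster has a single parent by construction, since only the best-scoring hit survives at any locus and clusters are built per parent. A real pipeline would additionally realign each cluster to its parent and inspect the alignment for stop codons, frameshifts and intron-exon structure before classifying it.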
Availability: The PseudoPipe program is implemented in Python and can be downloaded at http://pseudogene.org/
Contact: Mark.Gerstein{at}yale.edu or zhaolei.zhang{at}utoronto.ca
Received on December 28, 2005; revised on March 1, 2006; accepted on March 22, 2006