Bioinformatics Advance Access published online on June 9, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn283
Sequence-specific reconstruction from fragmentary databases using seed sequences: implementation and validation on SAGE, proteome and generic sequencing data
1Instituto do Coração - USP, Av. Prof. Enéas de Carvalho Aguiar 44, Bloco 2, 10o andar, 05403-000, São Paulo SP, Brazil.
2Departamento de Parasitologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo SP, 05508-000, Brazil.
*To whom correspondence should be addressed. Prof. Arthur Gruber, E-mail: argruber{at}usp.br
Abstract
Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of genomic nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.
Availability: GenSeed is available under the GNU General Public License at http://www.coccidia.icb.usp.br/genseed/
Contact: argruber{at}usp.br
Supplementary information: http://www.coccidia.icb.usp.br/genseed/
Associate Editor: Prof. John Quackenbush
Received on December 13, 2007; revised on May 25, 2008; accepted on June 8, 2008
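
The seed-driven cycle described in the Motivation paragraph (similarity search, read selection, assembly, repeated until the seed stops growing) can be illustrated with a small toy sketch. This is not GenSeed's code, and every function name and parameter here (`seed_assemble`, `min_overlap`, `max_cycles`) is hypothetical. It assumes error-free reads on a single strand, uses exact string matching in place of a real similarity search, and uses a greedy longest-extension rule in place of a real assembler.

```python
def right_extension(contig, read, min_overlap):
    """Return the part of `read` that reaches past the right end of `contig`.

    Toy stand-in for similarity search plus read selection: the read is
    accepted only if it contains the last `min_overlap` bases of the contig
    and the bases upstream of that anchor agree with the contig.
    """
    anchor = contig[-min_overlap:]
    pos = read.find(anchor)
    if len(anchor) < min_overlap or pos == -1:
        return ""
    head = read[:pos + min_overlap]  # portion of the read lying on the contig
    if contig.endswith(head) or head.endswith(contig):
        return read[pos + min_overlap:]
    return ""


def extend_right(contig, reads, min_overlap):
    """Append the longest right-hand extension offered by any read."""
    candidates = [right_extension(contig, r, min_overlap) for r in reads]
    return contig + max(candidates, key=len, default="")


def extend_left(contig, reads, min_overlap):
    """Extend leftwards by mirroring the strings and reusing extend_right."""
    mirrored = extend_right(contig[::-1], [r[::-1] for r in reads], min_overlap)
    return mirrored[::-1]


def seed_assemble(seed, reads, min_overlap=5, max_cycles=50):
    """Grow `seed` in both directions until a cycle adds no new bases."""
    contig = seed
    for _ in range(max_cycles):
        extended = extend_right(contig, reads, min_overlap)
        extended = extend_left(extended, reads, min_overlap)
        if extended == contig:  # no read extended either end: stop
            break
        contig = extended  # the extended contig is the seed of the next cycle
    return contig


if __name__ == "__main__":
    # Hypothetical 30-base target cut into overlapping 12-base reads,
    # plus one unrelated read that should never be selected.
    target = "ATGGCGTACGTTAGCCGATTGACCTAGGCT"
    reads = [target[i:i + 12] for i in range(0, 19, 6)] + ["CCCCCCCCCCCC"]
    print(seed_assemble("GCCGATT", reads))
```

In this sketch only reads that touch the current contig ends are ever considered, which is the point of a target-specific assembly: the unrelated read is ignored and no all-against-all comparison is needed. The `max_cycles` cap is a guard against repeats that would otherwise let a greedy extension loop indefinitely.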
