Bioinformatics Vol. 19 no. 11 2003
Pages 1391-1396
© 2003 Oxford University Press

Alignment of BLAST high-scoring segment pairs based on the longest increasing subsequence algorithm

Hongyu Zhang *

Celera Genomics, 45 West Gude Drive, Rockville, MD 20850, USA

Received on September 16, 2002; revised on January 2, 2003; accepted on February 14, 2003

Motivation: The popular BLAST algorithm is based on a local similarity search strategy, so its high-scoring segment pairs (HSPs) carry no global alignment information. When scientists use BLAST to search for a target protein or DNA sequence in a huge database such as the human genome map, repeated fragments, homologues or pseudogenes in the genome often fill the BLAST result with redundant HSPs. A computational strategy is therefore needed to alleviate this problem.

Results: In the gene discovery group of Celera Genomics, I developed a two-step method, a BLAST step followed by an LIS step, to align thousands of cDNA and protein sequences to the human genome map. The LIS step is based on a well-established algorithm, the Longest Increasing Subsequence (LIS) algorithm. The idea is to use the LIS algorithm to find the longest series of consecutive HSPs in the BLAST output. This BLAST+LIS strategy can be used as an independent alignment tool or as a complement to other alignment programs such as Sim4 and GeneWise. It can also serve as a general-purpose BLAST result processor for all sorts of BLAST searches. Two examples from Celera are presented in this paper.
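The LIS step can be illustrated with a minimal sketch. It is not the author's implementation. It assumes each HSP is reduced to two start coordinates (the hypothetical fields `q_start` on the query and `s_start` on the subject), that all HSPs lie on the same strand, and that "consecutive" means both coordinates strictly increase along the chain. The abstract does not say how scores, overlaps or strands are handled, so none of that is modelled here. Under those assumptions, sorting by query position and taking a longest increasing subsequence of subject positions keeps the largest collinear set of HSPs:

```python
from bisect import bisect_left
from typing import List, NamedTuple


class HSP(NamedTuple):
    """Hypothetical, simplified HSP record (not a BLAST output format)."""
    q_start: int  # start position on the query, e.g. a cDNA
    s_start: int  # start position on the subject, e.g. the genome


def longest_increasing_chain(hsps: List[HSP]) -> List[HSP]:
    """Return a longest chain of HSPs in which both starts strictly increase."""
    # Sort by query start. Ties are broken by descending subject start so
    # that two HSPs sharing a query start can never both enter the chain.
    ordered = sorted(hsps, key=lambda h: (h.q_start, -h.s_start))

    tails: List[int] = []      # tails[k]: smallest s_start ending a chain of length k + 1
    tail_idx: List[int] = []   # index into `ordered` of that chain end
    prev = [-1] * len(ordered) # back-pointers used to recover the chain

    for i, h in enumerate(ordered):
        k = bisect_left(tails, h.s_start)  # O(log n) search per HSP
        if k == len(tails):
            tails.append(h.s_start)
            tail_idx.append(i)
        else:
            tails[k] = h.s_start
            tail_idx[k] = i
        prev[i] = tail_idx[k - 1] if k > 0 else -1

    # Walk the back-pointers from the end of the longest chain.
    chain: List[HSP] = []
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    chain.reverse()
    return chain


if __name__ == "__main__":
    # Made-up coordinates: four collinear HSPs plus two off-diagonal hits.
    demo = [HSP(1, 100), HSP(50, 5000), HSP(120, 260),
            HSP(300, 420), HSP(200, 9000), HSP(400, 610)]
    for hsp in longest_increasing_chain(demo):
        print(hsp)
```

This version runs in O(n log n) time and treats every HSP as equally valuable. A practical variant would probably weight HSPs by score or length and run the chaining separately for each strand and subject sequence.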

Contact: me{at}hongyu.org

* Present address: Ceres Inc., 3007 Malibu Canyon Road, Malibu, CA 90265, USA.




