
Bioinformatics Vol. 18 no. 1 2002
Pages 36-38
© 2002 Oxford University Press

Calculating the SNP-effective sample size from an alignment

Bernhard Haubold 1,2 and Thomas Wiehe 1,*

1 Max-Planck-Institut für Chemische Ökologie, Carl-Zeiss-Promenade 10, D-07745 Jena, Germany

Received on May 11, 2001; revised on June 27, 2001; accepted on August 7, 2001

Motivation: The number of Single Nucleotide Polymorphisms (SNPs) detectable in an alignment is a function of the length and the number of the aligned sequences. The latter is called the sample size. However, a typical alignment, for instance one obtained from a BLAST search of a query sequence against an EST database, does not cover the query sequence evenly. Therefore, it is usually not clear what the actual sample size is.

Results: We present a method to calculate the effective sample size, called n_eff, for a given BLAST alignment. This method takes into account that multiple coverage contributes only logarithmically to the SNP yield of a given sequence stretch. We show that the effective sample size n_eff is usually much smaller than would be expected for a given amount of coverage and illustrate this with two typical examples.
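The logarithmic effect of coverage can be illustrated with a minimal Python sketch. It is not the NEFF algorithm, whose details the abstract does not give. It assumes the standard neutral-model result that the expected number of segregating sites in n sequences is proportional to the harmonic sum a_n = 1 + 1/2 + ... + 1/(n-1), which grows like log n. The function names, the 0-based half-open hit intervals, and the definition of the effective sample size (written n_eff below) as the smallest n whose a_n reaches the mean per-site value are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from typing import List, Tuple


def coverage_per_site(query_length: int, hits: List[Tuple[int, int]]) -> List[int]:
    """Count how many aligned sequences cover each query position.

    `hits` holds 0-based, half-open (start, end) intervals on the query
    (a hypothetical input format, not BLAST's own output).
    """
    diff = [0] * (query_length + 1)
    for start, end in hits:
        diff[start] += 1  # a hit begins covering here
        diff[end] -= 1    # and stops covering here
    coverage, running = [], 0
    for delta in diff[:query_length]:
        running += delta
        coverage.append(running)
    return coverage


def harmonic(n: int) -> Fraction:
    """a_n = 1 + 1/2 + ... + 1/(n-1); zero when n < 2 (no SNP detectable)."""
    return sum((Fraction(1, i) for i in range(1, n)), Fraction(0))


def effective_sample_size(coverage: List[int]) -> int:
    """Smallest n whose a_n reaches the mean per-site a_c.

    This definition is an assumption made for illustration only.
    """
    if not coverage:
        return 0
    counts = Counter(coverage)
    total = sum((k * harmonic(c) for c, k in counts.items()), Fraction(0))
    target = total / len(coverage)
    n = 1
    while harmonic(n) < target:
        n += 1
    return n
```

Under these assumptions, ten sequences that cover only half of the query (`[10] * 5 + [0] * 5`) give n_eff = 3, although the mean coverage is 5, whereas a uniform coverage of 5 gives n_eff = 5. Exact fractions are used so that the comparison with the target is not affected by floating-point rounding.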

Availability: The algorithm is implemented in NEFF, a program written in FORTRAN90 that is accessible at http://soft.ice.mpg.de/neff. The source code, except for two subroutines protected by copyright, and a compiled Linux executable can also be downloaded from this site.

Contact: twiehe{at}ice.mpg.de

* To whom correspondence should be addressed.

2 Present address: LION Bioscience AG, Heidelberg, Germany.

