Bioinformatics Vol. 16 no. 3 2000
Pages 212-221
© 2000 Oxford University Press
Net Nearest Neighbor Analysis (NNNA) summarizes non-compensated dinucleotides within gene sequences
1 Department of Computer Science, University of Abertay-Dundee, Bell Street, Dundee, DD1 1HG, UK
Received on May 27, 1999; revised on July 29, 1999; accepted on October 13, 1999
Motivation: Net Nearest Neighbor Analysis (NNNA) measures a previously unexamined aspect of dinucleotide frequency: the non-compensated, non-repetitive dinucleotides in a sequence. Non-compensated dinucleotides are those in excess of their corresponding reverse dinucleotides.
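The definition of non-compensated dinucleotides can be illustrated with a minimal Python sketch. It is not the authors' software: the function name `net_dinucleotides` is hypothetical, and the sketch assumes that dinucleotides are counted as overlapping adjacent pairs and that the net count of a dinucleotide XY is its count minus the count of its reverse YX, with repetitive pairs (AA, CC, GG, TT) excluded.

```python
from collections import Counter


def net_dinucleotides(seq):
    """Return the non-compensated, non-repetitive dinucleotides of seq.

    Assumption (not taken from the paper): dinucleotides are counted as
    overlapping adjacent pairs, and the net count of XY is
    count(XY) - count(YX), kept only when positive.
    """
    seq = seq.upper()
    # Count every overlapping pair of adjacent residues.
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))

    net = {}
    for pair, n in counts.items():
        first, second = pair
        if first == second:
            # Repetitive dinucleotides (e.g. AA) equal their own reverse.
            continue
        excess = n - counts.get(second + first, 0)
        if excess > 0:
            net[pair] = excess
    return net


print(net_dinucleotides("ACACG"))  # {'AC': 1, 'CG': 1}
```

In the sequence ACACG, AC occurs twice and CA once, so one AC remains non-compensated; CG has no reverse GC and is also retained. Under these assumptions, a sequence such as ACGCA yields no net dinucleotides, because every pair is matched by its reverse.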
Results: NNNA regards dinucleotides as vector quantities, making it possible to summarize any sequence as a group of circuits and tags. The results of NNNA are found to be consistent with traditional analytic methods, yet reveal additional characteristics of the sequences. The NNNA circuits and tags uniquely identify each tRNA in Escherichia coli K-12 and certain structural components of each tRNA, extract function-specific characteristics for each of the sequences involved in the formation of insulin from preinsulin, and exhibit species-specific phylogenetic characterization (demonstrated with Monilinia).
Availability: Nearest neighbor analysis software has been available for many years and is a component of most gene analysis software packages, including the Staden Package, which is available at no charge to academic users (http://www.mrc-1mb.cam.ac.uk/pubseq/).
*Address for correspondence: 2538 Great Highway, San Francisco, CA 94116, USA.