Bioinformatics Vol. 18 no. 1 2002
Pages 215-217
© 2002 Oxford University Press
Letter
Symmetry observations in long nucleotide sequences: a commentary on the Discovery Note of Qi and Cuticchia
Department of Biochemistry, Queen's University, Kingston, Ontario, Canada K7L 3N6
Received on June 24, 2001; revised on August 20, 2001; accepted on August 23, 2001
The relative quantities of bases in DNA were determined chemically many years before sequencing technologies permitted direct counting of bases. Apparently unaware of the rich literature on the topic, bioinformaticists are today rediscovering the wheels of Chargaff, Wyatt and other biochemists. It follows from Chargaff's second parity rule (%A = %T, %G = %C for single-stranded DNA) that the symmetries observed for the two pairs of complementary mononucleotide bases should also apply to the eight pairs of complementary dinucleotide bases, the thirty-two pairs of complementary trinucleotide bases, etc. This was made explicit by Prabhu in 1993 in a study of complete genomes and long genome segments from a wide range of taxa, and was rediscovered by Qi and Cuticchia in 2001 in a study of complete genomes. It follows from Chargaff's GC-rule (%GC tends to be uniform and species specific) that, within a species, oligonucleotides of the same %GC will be present in approximately equal quantities in single-stranded DNA. Thus, for example, while quantities of CAT and ATG (reverse complements) will be closely correlated because of both of the above Chargaff rules, CAT and GTA (forward complements) will show some correlation only because of the latter rule. The need for complete genomic sequences in bioinformatic analyses may have been somewhat overplayed.
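
The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used by any of the cited authors: the function names (`kmer_counts`, `pair_counts`, and so on) are hypothetical, overlapping windows on a single strand are an assumed counting convention, and the six-base toy sequence is made up for demonstration.

```python
from collections import Counter
from itertools import product

# Watson-Crick complement table (assumes an uppercase A/C/G/T alphabet).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Complement each base and reverse the order, e.g. CAT -> ATG."""
    return word.translate(COMPLEMENT)[::-1]


def forward_complement(word):
    """Complement each base without reversing, e.g. CAT -> GTA."""
    return word.translate(COMPLEMENT)


def gc_content(word):
    """Number of G or C bases in an oligonucleotide."""
    return sum(base in "GC" for base in word)


def kmer_counts(seq, k):
    """Count overlapping k-mers along one strand only."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def pair_counts(seq, k, partner=reverse_complement):
    """Map each (word, partner) pair to its two single-strand counts.

    Words that are their own partner are skipped, and each pair is
    listed once (keyed by the alphabetically smaller word first).
    """
    counts = kmer_counts(seq, k)
    pairs = {}
    for letters in product("ACGT", repeat=k):
        word = "".join(letters)
        mate = partner(word)
        if word < mate:
            pairs[(word, mate)] = (counts[word], counts[mate])
    return pairs


# Made-up toy sequence; a real analysis would read a long genomic segment.
toy = "ATGCAT"
trinucleotide_pairs = pair_counts(toy, 3)
cat_atg = trinucleotide_pairs[("ATG", "CAT")]
```

Passing `partner=forward_complement` gives the CAT/GTA style of comparison instead, and `gc_content` can be used to group words of equal %GC. On a long natural sequence the two counts in each reverse-complement pair would be expected to be close rather than identical.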