Bioinformatics Advance Access originally published online on December 16, 2007
Bioinformatics 2008 24(3):383-388; doi:10.1093/bioinformatics/btm621
A new algorithm for cluster analysis of genomic methylation: the Helicobacter pylori case
¹Engineering Faculty, Portuguese Catholic University, Estrada Octávio Pato, 2635-631 Rio de Mouro, Portugal and ²CECF (iMed.UL), Faculty of Pharmacy, University of Lisbon, Av. Forças Armadas, 1649-003 Lisboa, Portugal
*To whom correspondence should be addressed.
Abstract
Motivation: Genomic methylation analysis is useful for typing bacteria that express a large number of type II methyltransferases. Methyltransferases usually belong to Restriction and Modification (R-M) systems, in which the restriction endonuclease imposes strong pressure on the expression of the cognate methyltransferase, which hinders R-M system loss. Conventional clustering methods do not reflect this tendency. An algorithm was developed for dendrogram construction that reflects the propensity for conservation of type II R-M systems.
Results: The new algorithm was applied to 52 Helicobacter pylori strains from different geographical regions and compared with conventional clustering methods. The algorithm works by first grouping strains that share a common minimum set of R-M systems and then gradually adding strains according to the number of R-M systems acquired. Dendrograms revealed a cluster of African strains, which suggests that R-M systems have been present in the H. pylori genome since its human host migrated from Africa.
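The grouping idea described above can be illustrated with a minimal sketch. It is not the published implementation: it assumes that each strain is represented as a set of R-M system labels and that the "common minimum set" is the intersection of all strains' sets. The function name `group_by_acquired_systems`, the strain names and the `RM1`..`RM5` labels are hypothetical.

```python
from collections import defaultdict


def group_by_acquired_systems(strains):
    """Order strains by how many R-M systems they carry beyond a shared core.

    strains: dict mapping strain name -> iterable of R-M system labels.
    Returns (core, levels), where core is the set of systems common to all
    strains (an assumed reading of "common minimum set") and levels is a
    list of (n_acquired, [strain names]) sorted by n_acquired.
    """
    sets = {name: frozenset(systems) for name, systems in strains.items()}
    # Assumption: the shared core is the intersection over all strains.
    core = frozenset.intersection(*sets.values())

    levels = defaultdict(list)
    for name, systems in sets.items():
        # Number of systems the strain has in addition to the core.
        levels[len(systems - core)].append(name)

    return core, [(n, sorted(levels[n])) for n in sorted(levels)]


# Hypothetical toy data, not taken from the study.
example = {
    "A": {"RM1", "RM2"},
    "B": {"RM1", "RM2", "RM3"},
    "C": {"RM1", "RM2", "RM3", "RM4"},
    "D": {"RM1", "RM2", "RM5"},
}
core, levels = group_by_acquired_systems(example)
print(sorted(core))
print(levels)
```

In this toy case, strain A carries only the core, B and D each add one system, and C adds two. A real dendrogram would also need a rule for joining strains within a level, which this sketch does not attempt.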
Availability: The software files are available at http://www.ff.ul.pt/paginas/jvitor/Bioinformatics/MCRM_algorithm.zip
Contact: filipavale{at}fe.ucp.pt
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Martin Bishop
Received on October 19, 2007; revised on December 13, 2007; accepted on December 13, 2007