Bioinformatics Advance Access published online on December 16, 2007
Bioinformatics, doi:10.1093/bioinformatics/btm621
A new algorithm for cluster analysis of genomic methylation: the Helicobacter pylori case
1Engineering Faculty, Portuguese Catholic University, Estrada Octávio Pato, 2635-631 Rio de Mouro, Portugal
2CECF (iMed.UL), Faculty of Pharmacy, University of Lisbon, Av. Forças Armadas, 1649-003 Lisboa, Portugal
*To whom correspondence should be addressed. Dr. Filipa Vale, E-mail: filipavale{at}fe.ucp.pt
Abstract
Motivation: Genomic methylation analysis is useful for typing bacteria that express a large number of type II methyltransferases. Methyltransferases usually belong to Restriction and Modification (R-M) systems, in which the restriction endonuclease imposes strong pressure on expression of the cognate methyltransferase, which hinders loss of the R-M system. Conventional clustering methods do not reflect this tendency. An algorithm was developed for dendrogram construction that reflects the propensity for conservation of type II R-M systems.
Results: The new algorithm was applied to 52 Helicobacter pylori strains from different geographical regions and compared with conventional clustering methods. The algorithm first groups strains that share a common minimum set of R-M systems and then gradually adds strains according to the number of R-M systems acquired. The dendrograms revealed a cluster of African strains, which suggests that R-M systems have been present in the H. pylori genome since its human host migrated from Africa.
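The grouping idea described in the Results can be illustrated with a minimal Python sketch. It is not the published MCRM implementation. It assumes that each strain is represented by the set of R-M systems it carries and that the "common minimum set" is the intersection of all profiles. The function name `group_by_acquired_systems`, the strain names, and the R-M system labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple


def group_by_acquired_systems(
    profiles: Dict[str, FrozenSet[str]],
) -> Tuple[FrozenSet[str], List[Tuple[int, List[str]]]]:
    """Group strains by how many R-M systems they carry beyond a shared core.

    Assumption (not from the paper): the core is the intersection of all
    strain profiles, and strains are added in order of increasing number
    of systems outside that core.
    """
    if not profiles:
        raise ValueError("at least one strain profile is required")

    # Systems shared by every strain form the assumed common minimum set.
    core = frozenset.intersection(*profiles.values())

    # Bucket strains by the count of systems acquired beyond the core.
    levels = defaultdict(list)
    for strain, systems in profiles.items():
        levels[len(systems - core)].append(strain)

    # Return the buckets from fewest to most acquired systems.
    return core, [(n, sorted(levels[n])) for n in sorted(levels)]


# Hypothetical strains and R-M system labels, for illustration only.
example = {
    "strain_A": frozenset({"rm1", "rm2"}),
    "strain_B": frozenset({"rm1", "rm2", "rm3"}),
    "strain_C": frozenset({"rm1", "rm2"}),
    "strain_D": frozenset({"rm1", "rm2", "rm3", "rm4"}),
}
core, levels = group_by_acquired_systems(example)
```

Here `core` holds the systems found in every strain, and `levels` lists the strains in the order in which such a scheme would add them, starting with those that carry only the core set. Building a dendrogram from this ordering would require additional linkage rules that the abstract does not specify.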
Availability: The software files are available at http://www.ff.ul.pt/paginas/jvitor/Bioinformatics/MCRM_algorithm.zip.
Contact: filipavale{at}fe.ucp.pt
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Prof. Martin Bishop
Received on October 19, 2007; revised on December 13, 2007; accepted on December 13, 2007