Bioinformatics Advance Access originally published online on May 18, 2006
Bioinformatics 2006 22(14):1788-1789; doi:10.1093/bioinformatics/btl186
TRFMA: a web-based tool for terminal restriction fragment length polymorphism analysis based on molecular weight
Department of Preventive Dentistry, Faculty of Dental Science, Kyushu University, Fukuoka-shi, Japan
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Summary: TRFMA provides a Web environment for analyzing T-RFLP results based on the molecular weights of the fragments, rather than the numbers of nucleotides, to increase accuracy. The 16S rRNA data are stored as an XML file containing around 650 sequences (light version) and as a MySQL database containing around 50 000 sequences (full version); both are connected to a Web server via PHP5 and operated from a Web browser.
Availability: TRFMA is freely available at http://myamagu.dent.kyushu-u.ac.jp/bioinformatics/trfma/index.html and can be downloaded from the same site.
Contact: yosh{at}dent.kyushu-u.ac.jp
| 1 INTRODUCTION |
|---|
Terminal restriction fragment length polymorphism (T-RFLP) targeting the 16S rRNA gene is a very effective tool for analyzing bacterial communities, including unculturable species. Community analysis using T-RFLP offers a compromise between sample throughput and phylogenetic resolution (Marsh, 2005). The gene of interest is amplified from bacterial chromosomal DNA by PCR using a fluorescently labeled primer, and the amplicon mixture is then digested with a restriction enzyme, generating fragments of different sizes. The DNA fragments are separated by capillary electrophoresis; a laser reader detects the labeled fragments and generates a profile based on fragment lengths. Bacterial species can be predicted by comparing the observed fragment lengths with the lengths calculated from the sequences. In many instances the same T-RF length is predicted for multiple species of bacteria, but specificity can be increased by analyzing digests with several enzymes. The fragment lengths are usually calculated in bases, but the order of fragments by length in bases does not always agree with their order by molecular weight (Dunbar et al., 2001; Kaplan and Kitts, 2003).
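The disagreement between ordering by length and ordering by molecular weight can be illustrated with a minimal Python sketch. It assumes commonly quoted average residue masses for single-stranded DNA and ignores the mass of the fluorescent label; the constants, the function name `fragment_mass` and the deliberately exaggerated toy fragments are illustrative assumptions, not values or code taken from TRFMA.

```python
# Average masses (g/mol) of nucleotide residues in single-stranded DNA.
# Common approximations, assumed here; not values taken from TRFMA.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Assumed end correction for an unphosphorylated strand; the dye mass is ignored.
TERMINAL_CORRECTION = -61.96


def fragment_mass(seq: str) -> float:
    """Approximate molecular weight of a single-stranded DNA fragment."""
    return sum(RESIDUE_MASS[base] for base in seq.upper()) + TERMINAL_CORRECTION


# Exaggerated toy fragments: the shorter one is purine-rich, the longer one is not.
purine_rich = "G" * 100
pyrimidine_rich = "C" * 110

for fragment in (purine_rich, pyrimidine_rich):
    print(len(fragment), "bases:", round(fragment_mass(fragment), 2))
```

In this sketch the 100-base fragment is heavier than the 110-base fragment, so sorting by bases and sorting by molecular weight give opposite orders.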
A few Web databases are available that can tell users which archived sequences have the same or similar terminal fragment sizes for a given endonuclease (Kent et al., 2003; Smith et al., 2005). However, sequence-dependent variability in the observed sizes prevents the identification of species in samples (Kaplan and Kitts, 2003). A plus or minus seven-base range would, in many cases, include far too many species to make identification even a remote possibility (Marsh, 2005). Here we describe a Web-based tool, TRFMA, for identifying bacterial species by calculating the molecular weights of all T-RFs. This tool successfully reduced the variability to an average discrepancy of less than one base and displayed candidates for the bacterial species in samples.
| 2 PROGRAM OVERVIEW |
|---|
Requirements. TRFMA requires an HTTP server (Apache 1.3 or 2.0) and PHP 5.0 with the GD library module. The large database containing 50 000 sequences was constructed using the relational database management system MySQL.
Small database. The database for 650 species of oral bacteria is stored as an XML file that includes the names of species and strains, the nucleotide sequences starting with the forward primer sequence D88 (5'-GAG AGT TTG ATY MTG GCT CAG-3') (Paster et al., 2001) and the RDP accession numbers. The PHP script reads the XML file, finds each restriction site in each sequence and shows the intersecting groups of restriction enzymes and T-RFs.
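The read-and-digest step can be sketched as follows. TRFMA itself uses a PHP script; this Python sketch is only an illustration. The XML layout, element and attribute names, toy sequences and accession numbers are hypothetical, and HaeIII and MspI are used merely as examples of tetrameric enzymes, since the text does not list the enzymes TRFMA supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout and toy sequences; not the real TRFMA schema or data.
XML_TEXT = """
<database>
  <entry name="Example species A" accession="X00001">
    <sequence>GAGAGTTTGATCCTGGCTCAGAATTGGCCTTAACCGGTT</sequence>
  </entry>
  <entry name="Example species B" accession="X00002">
    <sequence>GAGAGTTTGATTATGGCTCAGCCGGAATTAAGGCCTTAA</sequence>
  </entry>
</database>
"""

# Recognition site and cut offset on the labeled strand (GG^CC, C^CGG).
ENZYMES = {"HaeIII": ("GGCC", 2), "MspI": ("CCGG", 1)}


def load_entries(xml_text):
    """Return (name, accession, sequence) tuples from the XML text."""
    root = ET.fromstring(xml_text.strip())
    return [
        (e.get("name"), e.get("accession"), e.findtext("sequence").strip().upper())
        for e in root.iter("entry")
    ]


def terminal_fragment(seq, enzyme):
    """Return the 5' terminal fragment left after the first cut by `enzyme`."""
    site, offset = ENZYMES[enzyme]
    pos = seq.find(site)
    if pos == -1:
        return seq  # no recognition site: the amplicon stays uncut
    return seq[: pos + offset]


for name, accession, seq in load_entries(XML_TEXT):
    lengths = {enz: len(terminal_fragment(seq, enz)) for enz in ENZYMES}
    print(name, accession, lengths)
```

Because every stored sequence starts with the forward primer, the terminal fragment is simply the prefix up to the first cut position.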
Large database. To construct the large database, 50 000 sequences were selected from the RDP version 9.35 data by removing those containing >10 unidentified nucleotides (N) or lacking the primer sequences. The molecular weights of the T-RFs were predicted from the 16S rRNA sequences using the forward primer D88 and a set of tetrameric restriction enzymes. Each entry can be assigned to a subgroup, e.g. oral bacteria or soil bacteria.
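The selection rule described above can be sketched in Python as a simple filter. The degenerate bases in D88 (Y = C or T, M = A or C) are expanded into a regular expression; the function name `keep_sequence` and the choice to trim each kept sequence so that it starts at the primer are assumptions made for illustration.

```python
import re

# IUPAC codes needed for the D88 primer (Y = C or T, M = A or C).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "M": "[AC]"}
D88 = "GAGAGTTTGATYMTGGCTCAG"
PRIMER_RE = re.compile("".join(IUPAC[base] for base in D88))

MAX_UNIDENTIFIED = 10  # sequences with more than 10 N are removed


def keep_sequence(seq):
    """Return the sequence trimmed to start at the primer, or None if rejected."""
    seq = seq.upper()
    if seq.count("N") > MAX_UNIDENTIFIED:
        return None  # too many unidentified nucleotides
    match = PRIMER_RE.search(seq)
    if match is None:
        return None  # primer sequence is missing
    return seq[match.start():]


print(keep_sequence("ttttGAGAGTTTGATCATGGCTCAGACGT"))
print(keep_sequence("ACGTACGTACGT"))
```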
Matching fragments to species. Users input three or more sets of T-RF molecular weights with the corresponding restriction enzymes in a text field (Fig. 1A). When the small database is used, the PHP script splits each line into the enzyme name and the molecular weight, finds the recognition sites in each rRNA sequence in the database, calculates the molecular weight of the fluorescently labeled terminal fragment containing the forward primer D88, and compares the calculated molecular weight with the query. In the large database, the molecular weights of all T-RFs derived from the sequences are pre-calculated and stored, and MySQL finds matching sequences based on molecular weight ranges. By default, the match window is ±330 for molecular weights smaller than 66 000 and ±0.5% for molecular weights over 66 000. Users can change these parameters. The intersection of the species sets calculated from each peak is extracted and displayed to the user. Usually, only a limited range of bacterial species is of interest. For example, when analyzing oral bacteria in saliva, strains isolated from a lake in Antarctica are not of interest. Therefore, the 16S rRNA genes can be marked with keywords, and the search can be restricted by keyword. When all 50 000 genes are searched, 90 s are required to display the candidate flora to the user; when 650 oral bacteria are targeted, only 3 s are required. Fewer than 5 s are needed to search the small XML database.
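The default match window and the intersection step can be sketched in Python as follows. The table `PREDICTED`, its species names and molecular weights, and the "enzyme weight" line format of the query are hypothetical; only the ±330 and ±0.5% defaults and the 66 000 threshold come from the description above.

```python
def match_window(query_mw, abs_tol=330.0, rel_tol=0.005, threshold=66000.0):
    """Return (low, high): +/-330 below the threshold, +/-0.5% at or above it."""
    tol = abs_tol if query_mw < threshold else query_mw * rel_tol
    return query_mw - tol, query_mw + tol


# Hypothetical pre-calculated T-RF molecular weights per species and enzyme.
PREDICTED = {
    "Species A": {"HaeIII": 32800.0, "MspI": 45100.0, "HhaI": 70000.0},
    "Species B": {"HaeIII": 32900.0, "MspI": 52000.0, "HhaI": 70200.0},
    "Species C": {"HaeIII": 40000.0, "MspI": 45000.0, "HhaI": 70100.0},
}


def candidates(query_text):
    """Intersect the species sets matched by each 'enzyme weight' query line."""
    result = None
    for line in query_text.strip().splitlines():
        enzyme, mw_text = line.split()
        low, high = match_window(float(mw_text))
        hits = {
            name
            for name, frags in PREDICTED.items()
            if enzyme in frags and low <= frags[enzyme] <= high
        }
        result = hits if result is None else result & hits
    return result or set()


query = """
HaeIII 32850
MspI 45050
HhaI 70050
"""
print(candidates(query))
```

At a molecular weight of exactly 66 000 the two rules give the same tolerance (0.5% of 66 000 is 330), so the window is continuous across the threshold. In the toy data each single peak matches two or three species, and only the intersection over all three enzymes narrows the result to one candidate.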
Grouping. Some groups in RDP have the same sequence, and the T-RFLP analysis shows such groups as candidates. When there are too many candidates, it is difficult to distinguish such similar sequences from other independent species. The gene groups with a common T-RF sequence can be shown as a group in another window by running a recalculation on the MySQL server with a query that uses SELECT, DISTINCT, COUNT and ORDER BY on the count (Fig. 1B). This recalculation has two options. One option combines peaks with identical molecular weights, and the other also uses peaks of other groups within ±100. This tool places the many uncategorized uncultured bacteria into groups.
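The first grouping option (identical molecular weights) might look roughly like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in for the MySQL server, and the table name, column names and rows are hypothetical. The query uses GROUP BY with COUNT and ORDER BY on the count, which is one plausible reading of the recalculation described above, not the actual TRFMA query.

```python
import sqlite3


def group_profiles(records):
    """Group candidates sharing identical T-RF weights, largest group first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE candidate (name TEXT, mw_haeiii INTEGER, mw_mspi INTEGER)"
    )
    conn.executemany("INSERT INTO candidate VALUES (?, ?, ?)", records)
    rows = conn.execute(
        """
        SELECT mw_haeiii, mw_mspi, COUNT(*) AS n
        FROM candidate
        GROUP BY mw_haeiii, mw_mspi
        ORDER BY n DESC
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical candidate list with integer molecular weights per enzyme.
RECORDS = [
    ("uncultured bacterium clone 1", 32859, 45010),
    ("uncultured bacterium clone 2", 32859, 45010),
    ("uncultured bacterium clone 3", 32859, 45010),
    ("Example species A", 31748, 52200),
    ("Example species B", 31748, 52200),
    ("Example species C", 40120, 61000),
]

for row in group_profiles(RECORDS):
    print(row)
```

Each output row is one T-RF profile with the number of sequences that share it, so a long list of near-identical uncultured clones collapses into a single line.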
Information on each sequence. Even if this system determines a good phylogenetic assignment from T-RF profiles, users may want to reconfirm the T-RF profile by digestion with additional restriction enzymes. With TRFMA, the user can select a candidate species from among those returned by the program or from the species list, and then view the T-RFLP pattern calculated from the nucleotide sequence of its gene and the recognition sites of the restriction enzymes, the base composition of the fragment, and the nucleotide sequence with all recognition sites highlighted in color (Fig. 1C). Using this information, the user can choose the most suitable restriction enzymes for identifying the strain.
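A text-only approximation of this per-sequence view is sketched below in Python. Square brackets stand in for the color highlighting, and the function names and the toy sequence are hypothetical.

```python
from collections import Counter


def mark_sites(seq, site):
    """Bracket every non-overlapping occurrence of a recognition site."""
    return seq.replace(site, "[" + site + "]")


def base_composition(fragment):
    """Count A, C, G and T in a fragment (absent bases are reported as 0)."""
    counts = Counter(fragment)
    return {base: counts.get(base, 0) for base in "ACGT"}


sequence = "GAGAGTTTGATCCTGGCTCAGAATTGGCCTTAACCGGTT"  # toy sequence
print(mark_sites(sequence, "GGCC"))
print(base_composition(sequence[:27]))  # terminal fragment up to the GG^CC cut
```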
| 3 CONCLUSION |
|---|
TRFMA offers a more accurate T-RFLP analysis by basing its calculations on the molecular weights of the fragments. Large-scale T-RFLP analysis of oral microflora using this system is currently in progress.
| Acknowledgments |
|---|
This work was supported in part by Grants-in-Aid for Scientific Research 16209063 (Y.N.) and 16390618 (Y.Y.) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Joaquin Dopazo
Received on March 6, 2006; revised on April 25, 2006; accepted on May 10, 2006
| REFERENCES |
|---|
Dunbar, J., et al. (2001) Phylogenetic specificity and reproducibility and new method for analysis of terminal restriction fragment profiles of 16S rRNA gene from bacterial communities. Appl. Environ. Microbiol., 67, 190-197.
Kaplan, C.W. and Kitts, C.L. (2003) Variation between observed and true terminal restriction fragment length is dependent on true TRF length and purine content. J. Microbiol. Methods, 54, 121-125.
Kent, A.D., et al. (2003) Web-based phylogenetic assignment tool for analysis of terminal restriction fragment length polymorphism profiles of microbial communities. Appl. Environ. Microbiol., 69, 6768-6776.
Marsh, T.L. (2005) Culture-independent microbial community analysis with terminal restriction fragment length polymorphism. Methods Enzymol., 397, 308-329.
Paster, B.J., et al. (2001) Bacterial diversity in human subgingival plaque. J. Bacteriol., 183, 3770-3783.
Smith, C.J., et al. (2005) T-Align, a web-based tool for comparison of multiple terminal restriction fragment length polymorphism profiles. FEMS Microbiol. Ecol., 54, 375-380.