Bioinformatics Advance Access originally published online on July 31, 2006
Bioinformatics 2006 22(20):2532-2538; doi:10.1093/bioinformatics/btl417
Local similarity analysis reveals unique associations among marine bacterioplankton species and environmental factors
1 Department of Mathematics, University of Southern California, 3620 Vermont Avenue, KAP 108, Los Angeles, CA 90089-2532, USA
2 Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, 1050 Childs Way, MCB 201, Los Angeles, CA 90089-2910, USA
3 Department of Biological Sciences, University of Southern California, 3616 Trousdale Pkwy, AHF 107, Los Angeles, CA 90089-0371, USA
*To whom correspondence should be addressed.
Motivation: Characterizing the diversity of microbial communities and understanding the environmental factors that influence community diversity are central tenets of microbial ecology. The development and application of cultivation-independent molecular tools have allowed rapid surveying of microbial community composition at unprecedented resolution and frequency. There is a growing need to discern robust patterns and relationships within these datasets that provide insight into microbial ecology. Pearson correlation coefficient (PCC) analysis is commonly used to identify linear relationships between two species, or between species and environmental factors. However, this approach may not capture more complex interactions that occur in situ; thus, alternative analyses were explored.
Results: In this paper we introduce local similarity analysis (LSA), a technique that can identify more complex dependence associations among species, as well as associations between species and environmental factors, without requiring significant data reduction. To illustrate its ability to identify relationships that PCC may miss, we first applied LSA to simulated data. We then applied LSA to a marine microbial observatory dataset and identified unique, significant associations that were not detected by PCC analysis. LSA results, combined with results from PCC analysis, were used to construct a theoretical ecological network that allows easy visualization of the most significant associations. Biological implications of the significant associations detected by LSA are discussed. We also identify additional applications where LSA would be beneficial.
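To make the contrast between PCC and a local similarity score concrete, the following Python sketch may help. It is an illustration only and not the authors' Splus/R implementation: the abstract does not specify the algorithm, so the z-score normalization, the dynamic-programming recurrence, the `max_delay` parameter, and all function names are assumptions made for this example. Significance testing is omitted.

```python
import math

def zscore(xs):
    """Standardize a series to mean 0 and (population) standard deviation 1."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]

def pearson(xs, ys):
    """Pearson correlation coefficient over the whole series."""
    zx, zy = zscore(xs), zscore(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

def local_similarity(xs, ys, max_delay=1):
    """Hypothetical local similarity score (assumed recurrence, see lead-in).

    Finds the contiguous pair of subintervals, offset by at most max_delay
    time points, whose summed products of standardized values are largest
    in magnitude. The sign indicates a positive or negative association.
    """
    zx, zy = zscore(xs), zscore(ys)
    n = len(zx)
    pos = [[0.0] * (n + 1) for _ in range(n + 1)]  # positive association
    neg = [[0.0] * (n + 1) for _ in range(n + 1)]  # negative association
    best, sign = 0.0, 1
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if abs(i - j) > max_delay:
                continue  # outside the allowed time lag
            prod = zx[i - 1] * zy[j - 1]
            pos[i][j] = max(0.0, pos[i - 1][j - 1] + prod)
            neg[i][j] = max(0.0, neg[i - 1][j - 1] - prod)
            if pos[i][j] > best:
                best, sign = pos[i][j], 1
            if neg[i][j] > best:
                best, sign = neg[i][j], -1
    return sign * best / n

# Toy series: y tracks x in the first half and mirrors it in the second half.
x = [0, 1] * 8
y = [0, 1] * 4 + [1, 0] * 4

print(pearson(x, y))                        # whole-series correlation
print(local_similarity(x, y, max_delay=0))  # strongest local association
```

On this toy input the positive and negative halves cancel, so the PCC is zero, while the local score is 0.5 because the series agree perfectly over half of the time points. This is the kind of association that a whole-series linear measure can miss.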
Availability: The algorithms are implemented in Splus/R and are available upon request from the corresponding author.
Contact: fsun{at}usc.edu
Received on May 21, 2006; revised on July 22, 2006; accepted on July 26, 2006