
Bioinformatics Vol. 16 no. 4 2000
Pages 341-357
© 2000 Oxford University Press

SPLASH: structural pattern localization analysis by sequential histograms

Andrea Califano 1

1 IBM TJ Watson Research Center, PO Box 704, Yorktown Heights, NY 10598, USA

Received on February 22, 1999; revised on September 10, 1999; accepted on October 16, 1999

The results in the section "The Statistical Significance of Patterns" were obtained in collaboration with Gustavo Stolovitzky. They will appear independently in a separate joint publication.

Motivation: The discovery of sparse amino acid patterns that match repeatedly in a set of protein sequences is an important problem in computational biology. Statistically significant patterns, that is, patterns that occur more frequently than expected, may identify regions that have been preserved by evolution and that may therefore play a key functional or structural role. Sparseness can be important because a handful of non-contiguous residues may play a key role, while the residues in between may be changed without significant loss of function or structure. Similar arguments may be applied to conserved DNA patterns. Available sparse pattern discovery algorithms are either inefficient or impose limitations on the type of patterns that can be discovered.

Results: This paper introduces a deterministic pattern discovery algorithm, called Splash, which can find sparse amino or nucleic acid patterns matching identically or similarly in a set of protein or DNA sequences. Sparse patterns of any length, up to the size of the input sequence, can be discovered without significant loss in performance.
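
To make the notion of a sparse pattern concrete, the following minimal Python sketch counts how many sequences in a set contain an identical match of a given pattern. It is not the Splash algorithm, which this abstract does not describe in detail; it is a naive scan intended only as an illustration. The "." wildcard notation, the function names, and the toy sequences are all assumptions introduced here.

```python
from typing import List

# Assumption: "." marks a position where any residue is allowed.
WILDCARD = "."


def matches_at(pattern: str, sequence: str, offset: int) -> bool:
    """Return True if the sparse pattern matches the sequence at this offset."""
    if offset < 0 or offset + len(pattern) > len(sequence):
        return False
    return all(
        p == WILDCARD or p == sequence[offset + i]
        for i, p in enumerate(pattern)
    )


def match_offsets(pattern: str, sequence: str) -> List[int]:
    """Return every offset at which the pattern matches (naive linear scan)."""
    last_start = len(sequence) - len(pattern)
    return [o for o in range(last_start + 1) if matches_at(pattern, sequence, o)]


def support(pattern: str, sequences: List[str]) -> int:
    """Count the sequences that contain at least one match of the pattern."""
    return sum(1 for s in sequences if match_offsets(pattern, s))


# Hypothetical toy sequences, not taken from any real protein family.
toy_sequences = ["MKACDLEAG", "TTKWCQLRR", "GGKPCELAA", "MMMMMMMM"]

# Three conserved residues (K, C, L) separated by two free positions.
toy_pattern = "K.C.L"
toy_support = support(toy_pattern, toy_sequences)
```

Here the pattern fixes only three of its five positions, so it can match sequences whose intervening residues differ. A discovery algorithm has the harder task of finding such patterns without being given them, and of judging whether their support is higher than expected by chance.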

Splash is extremely efficient and embarrassingly parallel by nature. Large databases, such as a complete genome or the non-redundant SWISS-PROT database, can be processed in a few hours on a typical workstation. Alternatively, a protein family or superfamily with low overall homology can be analyzed to discover common functional or structural signatures. Some examples of biologically interesting motifs discovered by Splash are reported for the histone I and G-Protein Coupled Receptor families. Due to its efficiency, Splash can be used to systematically and exhaustively identify conserved regions in protein family sets. These can then be used to build accurate and sensitive PSSM or HMM models for sequence analysis.

Availability: Splash is available to non-commercial research centers upon request, conditional on the signing of a test field agreement.

Contact: acal@us.ibm.com; Splash main page: http://www.research.ibm.com/splash






