Bioinformatics Vol. 18 no. 7 2002
Pages 922-933
© 2002 Oxford University Press

Target space for structural genomics revisited

Jinfeng Liu 1,2 and Burkhard Rost 2,3,*

1 Department of Pharmacology, Columbia University, 630 West 168th Street, New York, USA
2 CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street BB217, New York, NY 10032, USA
3 Columbia University Center for Computational Biology and Bioinformatics (C2B2), Russ Berrie Pavilion, 1150 St. Nicholas Avenue, New York, NY 10032, USA

Received on June 5, 2001; revised on January 4, 2002; accepted on February 7, 2002

Motivation: Structural genomics ultimately aims to determine structures for all proteins. Initially, however, experimentalists are likely to focus on globular proteins to achieve rapid basic coverage of protein sequence space. How many proteins will structural genomics have to target? How many proteins will be excluded because structural information is already available for them or because they are not globular? We have to answer these questions in the context of our target selection for the North-East Structural Genomics Consortium (NESG).

Results: We estimated that structural information is available for about 6–38% of all proteins: 6% if we require high accuracy in comparative modelling, 38% if we are satisfied with a rough idea of the fold. Excluding all regions that are not globular, we found that structural genomics may have to target about 48% of all proteins. This corresponded to a similar percentage of the residues in the entire proteomes (52%). We explored a number of different strategies for clustering protein space in order to find the number of families representing this 48% of structurally unknown proteins. For the subset of all entirely sequenced eukaryotes, we found over 18 000 fragment clusters, each of which may be a suitable target for structural genomics.

Availability: All data are available from the authors; most results are summarized at http://cubic.bioc.columbia.edu/genomes/RES/2002_bioinformatics/

Contact: rost@columbia.edu

* To whom correspondence should be addressed.
