Bioinformatics Advance Access originally published online on December 1, 2005
Bioinformatics 2006 22(3):257-263; doi:10.1093/bioinformatics/bti767

Published by Oxford University Press 2005.

The ASCH superfamily: novel domains with a fold related to the PUA domain and a potential role in RNA metabolism

Lakshminarayan M. Iyer, A. Maxwell Burroughs and L. Aravind*

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

*To whom correspondence should be addressed.


    ABSTRACT

Several studies show that transcription coactivators are often bi-functional ribonucleoprotein complexes that also regulate pre-mRNA processing and splicing decisions. Using sensitive sequence profile searches and structural comparisons we show that the C-terminal domain of the human coactivator protein ASC-1 defines a novel superfamily, the ASC-1 homology (ASCH) domain. The approximately 110 amino acid long ASCH domains are widely represented in all the three superkingdoms of life and several prokaryotic viruses. We show that the ASCH superfamily adopts a beta-barrel fold similar to the PUA domain superfamily. Using multiple lines of evidence, we suggest that members of the ASCH superfamily are likely to function as RNA-binding domains in contexts related to coactivation, RNA-processing and possibly prokaryotic translation regulation. Structural analysis of ASCH domains reveals the presence of a potential RNA-binding cleft associated with a conserved sequence motif, which is characteristic of this superfamily. Despite their similar structure, the ASCH and PUA domains appear to occupy distinct functional niches, with the former domains typically occurring in a standalone form in polypeptides, and the latter domains showing fusions to a variety of RNA-modifying enzymes.

Contact: aravind@ncbi.nlm.nih.gov

Supplementary information: A complete alignment of all ASCH domains in the NR-database and other domains found fused to the ASCH can be retrieved from ftp://ftp.ncbi.nih.gov/pub/aravind/


    INTRODUCTION
Systematic analyses of the proteins involved in RNA metabolism have suggested that, despite the complexity of this system, the majority of proteins are constructed from a relatively small set of conserved globular domains (for a summary see Anantharaman et al., 2002a). The phyletic profiles of these conserved domains, derived from large-scale comparative analyses of genomes from the three superkingdoms of life, show certain interesting features (Anantharaman et al., 2002a; Koonin and Mushegian, 1996). Many of the RNA-binding domains, typically those present in ribosomal proteins, translation factors and tRNA- and rRNA-modifying enzymes, are widely represented across the three superkingdoms of life. These appear to be ancient innovations, which were originally utilized in core RNA metabolism processes that are likely to have been already present in the last universal common ancestor (LUCA) of all cellular life forms. In some cases, a subset of these ancient domains also appears to have been secondarily recruited to many of the unique eukaryotic innovations such as splicing, post-transcriptional gene silencing, mRNA capping and polyadenylation, and nonsense-mediated RNA decay (Anantharaman et al., 2002a; Clissold and Ponting, 2000). Identification of these ancient RNA-binding domains has helped considerably in uncovering aspects of RNA–protein interactions that hold good across a wide range of biological functional contexts, and in clarifying the roles of uncharacterized conserved proteins from phylogenetically distant organisms (e.g. see Cerutti et al., 2000; Fatica et al., 2004; Ishitani et al., 2002; Korber et al., 1999; Reid et al., 1999).

Given these antecedents, we were interested in identifying any potentially novel ancient conserved domains that might throw light on poorly understood ribonucleoprotein complexes that have been identified in the cellular transcription apparatus. Activating signal cointegrator 1, also known as thyroid hormone receptor interactor protein 4 (ASC-1/TRIP4), is a transcriptional coactivator that is widely conserved in eukaryotes and is part of a potential RNA-interacting protein complex (Jung et al., 2002; Kim et al., 1999). ASC-1 directly interacts with a wide range of unrelated transcription factors such as the serum response factor, NFκB, AP-1 and nuclear hormone receptors, and has been shown to be part of a protein complex that bridges these specific transcription factors to the basal transcriptional apparatus (Jung et al., 2002). One of the proteins of this coactivator complex is an RNA helicase, while the other one has an RNA-binding KH domain fused to a 2H RNA phosphoesterase (Jung et al., 2002; Mazumder et al., 2002). ASC-1 itself contains a conserved cysteine-rich Zn-chelating domain, which binds transcription factors (Jung et al., 2002), and a conserved C-terminal domain, which has thus far not been characterized.

Using sensitive sequence profile searches and structural comparisons, we show that the C-terminal domain of ASC-1 defines a superfamily of domains that is widely distributed across the three superkingdoms of life. We show that this superfamily assumes a protein fold that was originally observed in the RNA-binding PUA domain. Our findings suggest that this unique β-barrel fold, which is encountered both in the new superfamily of domains typified by the C-terminal domain of ASC-1 and in the PUA superfamily, defines an ancient structural theme in RNA–protein interactions.


    SYSTEMS AND METHODS
The non-redundant (NR) database of protein sequences (National Center for Biotechnology Information, NIH, Bethesda) was searched using the BLASTPGP program (Altschul et al., 1997). Iterative sequence profile searches were done using the PSI-BLAST program with either a single sequence or an alignment as the query, with a profile inclusion expectation (E) value threshold of 0.01, and were iterated until convergence (Altschul et al., 1997). For all searches with compositionally biased proteins, the statistical correction for this bias was employed (Schaffer et al., 2001). Multiple alignments were constructed using the T_Coffee program, followed by manual correction based on the PSI-BLAST results (Notredame et al., 2000). Hidden Markov models (HMMs) were built from alignments using the hmmbuild program and searches were carried out using the hmmsearch program from the HMMer package (Eddy, 1998). Protein secondary structure was predicted using a multiple alignment as the input for the JPRED and PHD programs (Cuff and Barton, 2000; Cuff et al., 1998; Rost et al., 1994). Preliminary clustering of proteins was done using the BLASTCLUST program with empirically determined length and score threshold cut-off values (for documentation see ftp://ftp.ncbi.nih.gov/blast/documents/blastclust.html). Previously known conserved domains were identified using PSI-BLAST derived profiles with the RPS-BLAST program (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) (Schaffer et al., 1999). Structure similarity searches were conducted using the DALI program (Holm and Sander, 1995). Structure manipulations and the construction of ribbon and surface diagrams were performed using the PyMOL program (Delano, 2002). Gene neighborhoods were obtained by isolating all conserved genes in the neighborhood of the gene under consideration that showed a separation of less than 70 nt between their termini. Genes fulfilling this criterion were considered likely to form operons. Gene neighborhoods were determined by searching the NCBI PTT tables (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome) with an in-house Perl script.
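
The 70 nt separation criterion can be illustrated with a minimal Python sketch. This is not the in-house script used in the study: the Gene record, the predict_operons function and the coordinates in the usage note are hypothetical, the requirement that neighboring genes lie on the same strand is an added assumption, and the conservation filter applied to the neighborhoods is not modeled.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


MAX_GAP_NT = 70  # termini separated by fewer than 70 nt suggest cotranscription


def predict_operons(genes: List[Gene], max_gap: int = MAX_GAP_NT) -> List[List[Gene]]:
    """Group adjacent genes whose termini are separated by fewer than max_gap nt.

    Assumption (not stated in the text): grouped genes must share a strand.
    """
    operons: List[List[Gene]] = []
    for gene in sorted(genes, key=lambda g: g.start):
        if operons:
            previous = operons[-1][-1]
            # Number of intergenic nucleotides; negative when the genes overlap.
            gap = gene.start - previous.end - 1
            if gene.strand == previous.strand and gap < max_gap:
                operons[-1].append(gene)
                continue
        # Otherwise this gene starts a new candidate operon.
        operons.append([gene])
    return operons
```

With made-up coordinates, predict_operons([Gene("a", 1, 300, "+"), Gene("b", 350, 900, "+"), Gene("c", 980, 1500, "+")]) places a and b in one group (49 nt apart) and c in a separate group (79 nt from b).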


    RESULTS AND DISCUSSION
Identification of the ASCH domain
The ASC-1 proteins from animals are relatively large proteins (around 580–650 amino acids), and the only characterized globular domain in them is a unique Zn-chelating domain with 7 cysteines and 1 histidine. This domain was shown to be critical for the interaction of ASC-1 with specific transcription factors and is likely to form a binuclear metal cluster chelating two Zn atoms (Jung et al., 2002). Given that other polypeptides of the ASC-1-containing complex have characteristic RNA-interaction domains, we further investigated the ASC-1 proteins to identify potential links to RNA interaction. Analysis of the human ASC-1 protein with the SEG program revealed that it contains additional globular segments, including a C-terminal globular segment (gi: 6013191, 434–581), which in searches of the NR database with the BLASTPGP program gave significant hits to the proteins SAP1p60 from the bacterium Streptomyces avermitilis and Mbur03000455 from the archaeon Methanococcoides burtonii (E = 10⁻⁵ and 10⁻³, respectively). This region of similarity did not map to any previously published protein domain and more or less encompassed the entire length of the prokaryotic proteins, suggesting that it might define a novel protein domain. Further iterations of the search retrieved a large number of uncharacterized proteins from vertebrates, prokaryotes and bacteriophages, such as LOC541578 from Homo sapiens (iteration 3; E = 10⁻³), gp69 from the Mycobacteriophage Che9c (iteration 2; E = 10⁻⁶), PF0238 from Pyrococcus furiosus (iteration 3; E = 10⁻⁴) and the TTC18981 protein from Thermus thermophilus (iteration 3; E = 10⁻³), whose crystal structure has been determined (pdb id: 1wk2). All the sequences showed a highly conserved GxKxxxxR motif that they shared with the ASC-1 protein. At convergence, the search also retrieved several proteins with the GxKxxxxR motif with E-values of borderline significance (E > 0.01). 
In order to retrieve all possible homologs for a comprehensive analysis, we conducted transitive sequence profile searches seeded with several homologs of the ASC-1 protein that were recovered in the above search. As a result, we recovered several additional significant hits (E < 10⁻²) from diverse species from all three superkingdoms, including proteins whose structures have been determined as part of various structural genomics projects, such as the uncharacterized proteins YqfB (pdb: 1TE7) from Escherichia coli, PF0455 (pdb: 1S04) from P. furiosus, and EF3133 (pdb: 1T62) from Enterococcus faecalis (Fig. 1). Some of these proteins had been classified into separate families of domains of unknown function, DUF437, DUF984 and DUF1530, in the PFAM database (Bateman et al., 2004).
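
As a rough illustration of what the GxKxxxxR signature means at the sequence level, the sketch below scans a protein sequence for the pattern with a regular expression. It is only a convenience for inspecting candidate sequences, not the profile-based detection used in this study; the function name find_gxk_motif and the sequence in the usage note are invented for the example.

```python
import re

# GxKxxxxR: glycine, any residue, lysine, any four residues, arginine.
# The lookahead allows overlapping occurrences to be reported as well.
GXK_MOTIF = re.compile(r"(?=G.K.{4}R)")


def find_gxk_motif(sequence: str) -> list:
    """Return the 0-based start position of every GxKxxxxR occurrence."""
    return [match.start() for match in GXK_MOTIF.finditer(sequence.upper())]
```

For the invented sequence "MAGSKLLEVRTT", find_gxk_motif returns [2]. A plain pattern match of this kind will also hit unrelated proteins by chance, which is why the searches described above rely on sequence profiles rather than on the motif alone.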


Fig. 1 Multiple alignment of members of the ASCH superfamily. Proteins are shown with their gene name, species abbreviations and genbank ID (gi) numbers separated by underscores. The pdb codes of proteins with X-ray crystal or NMR structures are shown in brackets after the gi number. Columns in the alignment are colored based on the residue conservation profile at 90 and 70% consensus. Sample operons and domain architectures of interest are shown to the right of the alignment. The domains in the architectures are separated by a ‘+’ symbol, whereas genes in operons are separated by ‘–>’ symbol with the ‘>’ pointing from the 5' to the 3' directions of the coding sequence. X1 and X2 refer to uncharacterized domains, which were found fused with certain ASCH domains. The Pfam domains of unknown function, DUF437, DUF1530 and DUF984, include some of the representatives, respectively, from families 1, 3 and 4 defined by us. The consensus for residue conservation and the coloring scheme are as follows: h, hydrophobic residues (ACFILMVWY), shaded yellow; b, big residues (LIYERFQKMW), shaded gray; s, small residues (AGSVCDN) colored green; p, polar residues (STEDKRNQHC) colored magenta. The lysine residue that is characteristic of the ASCH superfamily is shaded red. 
Species abbreviations are as follows: Aful, Archaeoglobus fulgidus; Ana, Nostoc sp.; Aper, Aeropyrum pernix; Aple, Actinobacillus pleuropneumoniae; Asp., Acinetobacter sp.; Atha, Arabidopsis thaliana; BP315.4, Streptococcus pyogenes phage 315.4; BPChe9c, Mycobacteriophage Che9c; BPHK022, Enterobacteria phage HK022; BPP2, Enterobacteria phage P2; Bpbacteriophage, Stx1 converting bacteriophage; BPpsiM2, Methanobacterium phage psiM2; Bant, Bacillus anthracis; Bbac, Bdellovibrio bacteriovorus; Bcep, Burkholderia cepacia; Bcla, Bacillus clausii; Bhal, Bacillus halodurans; Blin, Brevibacterium linens; Blon, Bifidobacterium longum; Bmel, Brucella melitensis; Bthe, Bacteroides thetaiotaomicron; Cace, Clostridium acetobutylicum; Ccre, Caulobacter crescentus; Cele, Caenorhabditis elegans; Dhaf, Desulfitobacterium hafniense; Dmel, Drosophila melanogaster; Ecar, Erwinia carotovora; Ecol, Escherichia coli; Efae, Enterococcus faecalis; Exsp, Exiguobacterium sp.; Hinf, Haemophilus influenzae; Hsap, Homo sapiens; Laci, Lactobacillus acidophilus; Lgas, Lactobacillus gasseri; Linn, Listeria innocua; Lint, Leptospira interrogans; Ljoh, Lactobacillus johnsonii; Llac, Lactococcus lactis; Lmes, Leuconostoc mesenteroides; Lmon, Listeria monocytogenes; Lpla, Lactobacillus plantarum; Mbur, Methanococcoides burtonii; Mflo, Mesoplasma florum; Mgri, Magnaporthe grisea; Mjan, Methanocaldococcus jannaschii; Mkan, Methanopyrus kandleri; Mmar, Methanococcus maripaludis; Mmar, Moritella marina; Mpen, Mycoplasma penetrans; Mthe, Methanothermobacter thermautotrophicus; Ncra, Neurospora crassa; Nequ, Nanoarchaeum equitans; Oihe, Oceanobacillus iheyensis; Ooen, Oenococcus oeni; Osat, Oryza sativa; Paby, Pyrococcus abyssi; Pfur, Pyrococcus furiosus; Phor, Pyrococcus horikoshii; Plum, Photorhabdus luminescens; Pmul, Pasteurella multocida; Psp., Pseudomonas sp.; Pyae, Pyrobaculum aerophilum; Rgel, Rubrivivax gelatinosus; Saga, Streptococcus agalactiae; Scoe, Streptomyces coelicolor; Sent, Salmonella 
enterica; Smut, Streptococcus mutans; Sone, Shewanella oneidensis; Spne, Streptococcus pneumoniae; Ssui, Streptococcus suis; Tbru, Trypanosoma brucei; Tery, Trichodesmium erythraeum; Tkod, Thermococcus kodakaraensis; Tthe, Thermus thermophilus; Upar, Ureaplasma parvum; Vcho, Vibrio cholerae; Vvul, Vibrio vulnificus; Ypes, Yersinia pestis; Ypse, Yersinia pseudotuberculosis; Zmob, Zymomonas mobilis.
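
The residue classes listed in the legend can be applied to a single alignment column along the following lines. This is a hedged Python sketch rather than the procedure used to prepare the figure: the function name column_classes is hypothetical, gap characters are counted in the column length (an assumption), and the legend does not say how a column satisfying several classes is resolved, so all qualifying classes are returned.

```python
# Residue classes as given in the Fig. 1 legend.
RESIDUE_CLASSES = {
    "h": set("ACFILMVWY"),   # hydrophobic
    "b": set("LIYERFQKMW"),  # big
    "s": set("AGSVCDN"),     # small
    "p": set("STEDKRNQHC"),  # polar
}


def column_classes(column: str, threshold: float) -> list:
    """Return the class labels covering at least `threshold` of the column.

    `column` holds one character per aligned sequence; `threshold` is a
    fraction such as 0.9 or 0.7.
    """
    residues = column.upper()
    if not residues:
        return []
    labels = []
    for label, members in RESIDUE_CLASSES.items():
        count = sum(1 for residue in residues if residue in members)
        if count / len(residues) >= threshold:
            labels.append(label)
    return labels
```

For the made-up column "LLIVMLLIVF", column_classes returns ["h"] at a threshold of 0.9 and ["h", "b"] at 0.7, because valine is hydrophobic but not in the big class.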

 
The sequence affinities between the proteins recovered in the above searches were also independently corroborated by searches with HMMs derived using a seed alignment of the originally detected set of ASC-1 homologs. Furthermore, comparisons of the predicted secondary structures for different subgroups of these homologous domains with the above-mentioned proteins with X-ray or NMR structures showed complete congruence, indicating that these 103–120 amino acid long domains define a novel monophyletic superfamily (Fig. 1). We refer to this superfamily, containing over 180 distinct representatives in the NR database from viruses and cellular organisms belonging to all three superkingdoms of life, as the ASC-1-homology (ASCH) superfamily. Structure similarity searches with members of the ASCH superfamily showed that it contains a fold that was previously noted in the PUA domain (Fig. 2) (DALI Z-scores 4.5–6). For example, DALI searches with the Thermus TTC18981 protein (pdb id: 1wk2) retrieved the PUA domains from pseudouridine synthase (pdb id: 1k8w, Z-score 4.8), ATP sulfurylase (1g8f, Z-score 4.8) and Archaeosine tRNA-guanine transglycosylase (1k8w, Z-score 3.6), in addition to the bona fide ASCH proteins derived from structural genomics projects (pdb ids: 1t62, 1xne, 1zce, 1t5y, 1nxz; Z-scores 4.6–5.8). The PUA domain is an ancient RNA-binding domain, which is fused to the catalytic domains of a variety of RNA-modifying enzymes such as pseudouridine synthetases of the TruB family, the archaeosine transglycosylase, Rossmann fold methylases, YggJ-type SPOUT domain RNA methylases and thiouridine synthases, and also occurs as standalone forms (Anantharaman et al., 2002b; Aravind and Koonin, 1999; Forouhar et al., 2003). However, PUA domains were not recovered in any of the sequence profile searches seeded with the ASCH domain or vice versa, suggesting that these two classes of domains form distinct sequence superfamilies, despite sharing a common fold. 
We propose that the fold be renamed the PUA-ASCH fold to reflect the two distinct superfamilies of the fold. The ASCH domains contain a conserved core of five strands that form a β-barrel, and a characteristic helix between strand-1 and strand-2 (Fig. 2). Additionally, most versions of the ASCH domain, unlike the majority of PUA domains, contain a long insert between strand-4 and strand-5 that usually forms two or more helical segments (Fig. 2). In terms of sequence conservation, the most characteristic feature of the ASCH superfamily is a GxK motif (where x is any amino acid) that is found in the distinctive turn between the core helix and strand-2 (Figs 1 and 2). Members of the ASCH superfamily also contain a highly conserved polar position, two residues downstream of this GxK motif, which is typically occupied by either glutamate or threonine (Figs 1 and 2).


Fig. 2 Structures of members and domain architectures of the ASCH and PUA superfamilies. Cartoon representations of X-ray and NMR structures of the ASCH and the PUA superfamilies are mapped on a tree showing the inferred higher order relationships between the two superfamilies. The clustering was derived using distances derived from pairwise DALI Z-scores. Each structure is labeled with its Protein Data Bank (PDB) identifier. Conserved beta-strands are shown in light blue while the characteristic conserved α-helix is shown in red. Variable helical inserts located between strand-4 and strand-5 are colored tan. The structures are shown with strand-2 vertical and approximately central to the depiction. S1 and S5, which are the first and the last strands of the PUA-ASCH fold, are labeled. This view places the RNA-binding cleft in between the conserved helix and strand-2. Key conserved residues lining this cleft in the ASCH superfamily are shown in the ball and stick format. Domain architectures of members of the ASCH superfamily and those of a subset of the PUA superfamily are shown as cartoon representations in the top right and top left panels, respectively. Proteins are labeled as in Figure 1.

 
Predicted functions of members of ASCH superfamily
In order to obtain functional insights regarding members of the ASCH superfamily, we used the combined evidence gleaned from different forms of contextual connections, namely physical interactions, gene fusions and conserved operons. In different Gram-positive bacteria such as Mycoplasma, Ureaplasma and Lactococcus lactis, members of the ASCH superfamily are embedded in or associated with the ribosomal protein operon (Fig. 1). Specifically, in Mycoplasma penetrans the ASCH domain is fused to the ribosomal protein S3, whereas in Ureaplasma parvum it is fused to ribosomal protein L22 (Fig. 2). Other members of the ASCH family are also found tightly linked with genes encoding RNA-binding proteins with RRM (e.g. in Acinetobacter, gene ACIAD0497) or R3H (e.g. Listeria, gene lmo2852) domains, implying that they are cotranscribed and probably functionally cooperate. These associations with ribosomal and RNA-metabolism proteins are consistent with the physical interactions of the vertebrate ASC-1 with proteins involved in RNA processing and the potential requirement for RNA–protein interactions for transcriptional coactivation by the ASC-1 containing complex (Jung et al., 2002). A study of the available structures of four distinct members of the ASCH superfamily indicates that they contain a prominent cleft, whose scaffold is formed by the conserved helix and the downstream strand-2 (Figs 2 and 3). The above-described conserved residues of the ASCH superfamily, such as the lysine from the GxK motif and other polar residues associated with strand-2, line this cleft, forming a positively charged surface (Fig. 3). A similarly positioned cleft has been observed in the structures of the PUA domain found in the Archaeosine tRNA-guanine transglycosylase, Pseudouridine synthase II TruB and the predicted RNA methylase (Hoang and Ferre-D'Amare, 2001; Ishitani et al., 2002; Pan et al., 2003), and is likely to form its RNA-binding surface. 
Taken together, the above observations suggest that the ASCH domains are likely to possess RNA-binding activity.


Fig. 3 Molecular surfaces of observed binding cleft in ASCH superfamily. X-ray structure of Family 1 member of the ASCH superfamily (PDB: 1WK2) is depicted in four different ways. In A, B and C the protein is oriented to expose the potential binding cleft, located between the helix and strand 2. In the top left (A), the predicted three-dimensional surface of the protein is shown with the conserved residues lining the binding cleft of family 1 colored in red while other surfaces are colored in blue. On the top right (B), cartoons indicating secondary structure features are shown against the transparent outline of the predicted molecular surface of the protein colored in dark blue. Again, the surfaces of the residues lining the putative binding cleft of this family are colored in red (D20, G21, R22, K23, E26, R28, R29) and the most highly conserved residues found along the cleft are rendered as ball and sticks and are colored in green (G21, K23, E26). Beta-strands rendered as cartoons in the protein are colored yellow, the conserved helix is colored red, and coil regions are colored gray. In the bottom left (C) and right (D) the front and back views of the predicted molecular surface are shown. Surfaces of residues are colored according to consensus conservation across the entire ASCH superfamily; red denotes positions with at least 90% conservation, while yellow denotes positions with at least 70% conservation. The residue conservation was calculated using the residue grouping as indicated in the consensus shown in Figure 1. Panels A, B and C represent the molecule in the same orientation, while it is rotated by 180° around the Z-axis in panel D. The scale bar in-between surfaces A and B represents the approximate width of the core of a nucleotide in single-stranded RNA.

 
Over the past few years, a number of studies have shown that coactivator complexes are often bi-functional: they not only coactivate transcription mediated by specific transcription factors, like nuclear hormone receptors, but also participate in pre-mRNA processing and regulation of splicing (Auboeuf et al., 2004; Dowhan et al., 2005; Maniatis and Reed, 2002). Furthermore, a regulatory pseudouridylated RNA termed the steroid receptor coactivator RNA (SRA), together with specific RNA-binding proteins with which it interacts, has been shown to be a part of coactivator complexes that couple nuclear hormone receptors to the basal transcription machinery (Lanz et al., 1999; Shi et al., 2001; Zhao et al., 2004). Given these observations, it is likely that the ASCH domain mediates some of the interactions between RNA and the ASC-1 coactivator complex. Its RNA partner could either be the pre-mRNA generated from the transcription of its target genes or a regulatory RNA like SRA. The association with the ribosomal proteins might indicate that some of the prokaryotic versions are involved in translational regulation.

The prokaryotic and phage ASCH domains, with a few exceptions, occur as standalone versions (Fig. 1), which are encoded by genes in predicted cotranscribed arrays containing a wide variety of other genes. In several of these cases they are found adjacent to a gene encoding a helix–turn–helix protein, which is the transcriptional regulator of the predicted operon (Fig. 1). In Brucella an ASCH domain is fused to a cI-like HTH domain within the same polypeptide (Fig. 2). These associations suggest that solo ASCH proteins of prokaryotes functionally cooperate with transcription regulators, probably by binding the transcripts generated from particular operons, and thereby regulate their expression.

Evolutionary diversity of ASCH domains and general conclusions
The ASCH superfamily encompasses considerable diversity and can be subdivided into several families that are unified by specific sequence signatures. The ASC-1 proper family is typified by a unique insert between strand-3 and strand-4. It is present in animals (two paralogous versions, with and without a fusion to the Zn-chelating domain, are seen in vertebrates, typified by human ASC-1 and LOC541578, respectively; Fig. 1), plants and trypanosomes among the eukaryotes, and in certain cyanobacteria, actinobacteria and their phages, Burkholderia and the archaeon Methanococcoides. The two copies in the vertebrates appear to have emerged from a relatively recent duplication in the common ancestor of the extant vertebrates with sequenced genomes. Related to the ASC-1 family is family 1, typified by the Thermus protein TTC1891 (termed DUF437 in PFAM), which is present in Thermus, Pyrococcus and Archaeoglobus. Family 2 (typified by the standalone ASCH domain protein ZM00922 from Zymomonas) is predominantly found in bacteria and archaea, with isolated eukaryotic representatives from filamentous fungi such as Neurospora and Magnaporthe (Fig. 1). Likewise, sporadic eukaryotic representatives from plants are seen in the otherwise prokaryotic family typified by the Pyrococcus protein PH0447 (family 3). All the other families of ASCH domains, such as families 4 (DUF984, e.g. EF3133), 5, 6, 7, 8 and 9, are restricted to prokaryotes and their phages. This phyletic pattern of the ASCH superfamily suggests that it diversified in the prokaryotes, followed by multiple lateral transfers to the eukaryotes. The Zn-chelating domain and a predicted globular segment immediately downstream of it (Fig. 2) in ASC-1 are conserved in all eukaryotes, and occur as a standalone unit independent of the ASCH domain in basal eukaryotes like Giardia (Supplementary data). 
Hence, the transfer of the ASCH domain from prokaryotes that gave rise to eukaryotic ASC-1 appears to have happened after the divergence of the basal eukaryotic lineages like Giardia, followed by a fusion to the above-mentioned standalone unit. This was followed by losses of the ASCH domain in crown group eukaryotes, such as in the fungi. In addition to the emergence of ASC-1, there appear to have been independent sporadic transfers of other prokaryotic ASCH family members to specific lineages of crown group eukaryotes (Fig. 1).

In terms of phyletic patterns, the PUA domains can be confidently traced back to the LUCA of all cellular life forms. The ancient versions of the PUA domain include those fused to key RNA metabolism enzymes such as the pseudouridine synthetase, which are conserved in all the three superkingdoms of life (Anantharaman et al., 2002b; Hoang and Ferre-D'Amare, 2001). In the case of the ASCH domain, no single family is conserved across the three superkingdoms of life, making it unclear whether it was present in LUCA. However, its broad phyletic range in the prokaryotes suggests that the ASCH domain emerged very early in the evolution of the prokaryotic superkingdoms. It is, however, not universally represented in all prokaryotic genomes and has been lost in some eukaryotes such as the fungi. This suggests that ASCH domains are likely to belong to the more easily dispensable regulatory apparatus rather than the core aspects of RNA metabolism. No ASCH domain occurs as multiple repeats in the same polypeptide, unlike many other RNA-binding domains such as the KH or the RRM domains. This suggests that it is likely to form single isolated contacts with specific features on RNA rather than extended multi-site contacts with long RNA molecules. Furthermore, unlike the structurally similar PUA domains, which typically occur in multi-domain proteins fused to other RNA-modifying or RNA-interacting domains (Anantharaman et al., 2002a; Aravind and Koonin, 1999; Forouhar et al., 2003), the ASCH domains typically occur as the sole globular domain in the polypeptide (Figs 1 and 2). The conserved residues on the surface of the predicted cleft are also distinct in the PUA and ASCH superfamilies, suggesting that they bind very different types of target RNAs. 
The PUA domain appears to have mainly colonized core functional niches related to rRNA and tRNA modification, while the ASCH domains appear to have been recruited to a distinct set of functional niches, including transcription coactivation and regulation of translation. Thus, the ASCH and PUA domains appear to have emerged from a common RNA-binding precursor and subsequently diversified to perform distinct functional roles, probably as a result of the diversification of their binding clefts.


    NOTE ADDED IN PROOF
The recently published structure of the Lon ATPase N-terminal (LAN) domain is a novel branch of the PUA-like fold that is outside of the clade including the PUA and the ASCH domains.


    Acknowledgments
 
The authors gratefully acknowledge the Intramural research program of the National Library of Medicine, National Institutes of Health, USA for funding their research.

Conflict of interest: none declared.


    FOOTNOTES
 
Associate Editor: Chris Stoeckert

Received on May 4, 2005; revised on November 1, 2005; accepted on November 4, 2005

    REFERENCES

    Altschul, S.F., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

    Anantharaman, V., et al. (2002a) Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res., 30, 1427–1464.

    Anantharaman, V., et al. (2002b) SPOUT: a class of methyltransferases that includes spoU and trmD RNA methylase superfamilies, and novel superfamilies of predicted prokaryotic RNA methylases. J. Mol. Microbiol. Biotechnol., 4, 71–75.

    Aravind, L. and Koonin, E.V. (1999) Novel predicted RNA-binding domains associated with the translation machinery. J. Mol. Evol., 48, 291–302.

    Auboeuf, D., et al. (2004) CoAA, a nuclear receptor coactivator protein at the interface of transcriptional coactivation and RNA splicing. Mol. Cell. Biol., 24, 442–453.

    Bateman, A., et al. (2004) The Pfam protein families database. Nucleic Acids Res., 32, D138–D141.

    Cerutti, L., et al. (2000) Domains in gene silencing and cell differentiation proteins: the novel PAZ domain and redefinition of the Piwi domain. Trends Biochem. Sci., 25, 481–482.

    Clissold, P.M. and Ponting, C.P. (2000) PIN domains in nonsense-mediated mRNA decay and RNAi. Curr. Biol., 10, 888–890.

    Cuff, J.A. and Barton, G.J. (2000) Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins, 40, 502–511.

    Cuff, J.A., et al. (1998) JPred: a consensus secondary structure prediction server. Bioinformatics, 14, 892–893.

    DeLano, W.L. (2002) The PyMOL Molecular Graphics System. DeLano Scientific, San Carlos, CA, USA.

    Dowhan, D.H., et al. (2005) Steroid hormone receptor coactivation and alternative RNA splicing by U2AF65-related proteins CAPERalpha and CAPERbeta. Mol. Cell, 17, 429–439.

    Eddy, S.R. (1998) Profile hidden Markov models. Bioinformatics, 14, 755–763.

    Fatica, A., et al. (2004) PIN domain of Nob1p is required for D-site cleavage in 20S pre-rRNA. RNA, 10, 1698–1701.

    Forouhar, F., et al. (2003) Functional assignment based on structural analysis: crystal structure of the yggJ protein (HI0303) of Haemophilus influenzae reveals an RNA methyltransferase with a deep trefoil knot. Proteins, 53, 329–332.

    Hoang, C. and Ferre-D'Amare, A.R. (2001) Cocrystal structure of a tRNA Psi55 pseudouridine synthase: nucleotide flipping by an RNA-modifying enzyme. Cell, 107, 929–939.

    Holm, L. and Sander, C. (1995) Dali: a network tool for protein structure comparison. Trends Biochem. Sci., 20, 478–480.

    Ishitani, R., et al. (2002) Crystal structure of archaeosine tRNA-guanine transglycosylase. J. Mol. Biol., 318, 665–677.

    Jung, D.J., et al. (2002) Novel transcription coactivator complex containing activating signal cointegrator 1. Mol. Cell. Biol., 22, 5203–5211.

    Kim, H.J., et al. (1999) Activating signal cointegrator 1, a novel transcription coactivator of nuclear receptors, and its cytosolic localization under conditions of serum deprivation. Mol. Cell. Biol., 19, 6323–6332.

    Koonin, E.V. and Mushegian, A.R. (1996) Complete genome sequences of cellular life forms: glimpses of theoretical evolutionary genomics. Curr. Opin. Genet. Dev., 6, 757–762.

    Korber, P., et al. (1999) A new heat shock protein that binds nucleic acids. J. Biol. Chem., 274, 249–256.

    Lanz, R.B., et al. (1999) A steroid receptor coactivator, SRA, functions as an RNA and is present in an SRC-1 complex. Cell, 97, 17–27.

    Maniatis, T. and Reed, R. (2002) An extensive network of coupling among gene expression machines. Nature, 416, 499–506.

    Mazumder, R., et al. (2002) Detection of novel members, structure-function analysis and evolutionary classification of the 2H phosphoesterase superfamily. Nucleic Acids Res., 30, 5229–5243.

    Notredame, C., et al. (2000) T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol., 302, 205–217.

    Pan, H., et al. (2003) Structure of tRNA pseudouridine synthase TruB and its RNA complex: RNA recognition through a combination of rigid docking and induced fit. Proc. Natl Acad. Sci. USA, 100, 12648–12653.

    Reid, R., et al. (1999) Exposition of a family of RNA m(5)C methyltransferases from searching genomic and proteomic sequences. Nucleic Acids Res., 27, 3138–3145.

    Rost, B., et al. (1994) PHD--an automatic mail server for protein secondary structure prediction. Comput. Appl. Biosci., 10, 53–60.

    Schaffer, A.A., et al. (1999) IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices. Bioinformatics, 15, 1000–1011.

    Schaffer, A.A., et al. (2001) Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res., 29, 2994–3005.

    Shi, Y., et al. (2001) Sharp, an inducible cofactor that integrates nuclear receptor repression and activation. Genes Dev., 15, 1140–1151.

    Zhao, X., et al. (2004) Regulation of nuclear receptor activity by a pseudouridine synthase through posttranscriptional modification of steroid receptor RNA activator. Mol. Cell, 15, 549–558.





