Pre-mRNA Splicing in Eukaryotic Cells
Xiang-Dong Fu
Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651
Contact information: Phone: 858-534-4937 Fax: 858-534-8549 Email:
[email protected]
Summary
Gene expression in eukaryotic cells is the collective outcome of transcription, RNA processing, and protein translation. In this chapter, I focus on the mechanism and regulation of gene expression at the level of RNA metabolism. I begin by introducing the different kinds of RNA expressed in mammalian cells, including mRNAs, rRNAs, tRNAs, small RNAs, miRNAs, and RNAs of unknown function. Because the RNA world is too large a topic for a single chapter, I concentrate on the pathway and regulation of pre-mRNA processing, which is arguably the most important step in gene expression and regulation. Readers are referred to outstanding reviews on other RNA categories for detailed information. This chapter also discusses the integration of RNA processing with other critical nuclear events and the genomics of splicing in the current post-genome era. Most of the information covered here is condensed from a number of lectures given annually to graduate students in Beijing and Shanghai as part of the Molecular and Cell Biology course coordinated by members of the Ray Wu society.
1. Overview of the RNA world
The central dogma in gene expression is the flow of genetic information from DNA to RNA to protein. DNA serves to store and pass on genetic information, but cannot function as an enzyme during the expression and replication of genetic information. Proteins, on the other hand, are the enzymes and structural components of cells and organisms, but do not have the capacity to store genetic information. RNA is the bridge between DNA and protein. Remarkably, RNA is capable of both carrying genetic information and possessing catalytic functions much like protein enzymes, a property suspected to represent the earliest form of life (Cech, 1985).
1.1. Synthesis and processing of the three major RNA classes
Traditionally, RNAs are roughly divided into three major classes, each transcribed by one of the three RNA polymerases in the cell: rRNAs by Pol I, mRNAs by Pol II, and tRNAs by Pol III. rRNAs are transcribed as a long 45S precursor transcript, which is then processed into 28S, 18S, and 5.8S rRNAs by a large number of proteins, including endo- and exonucleases and specificity factors (Lafontaine and Tollervey, 1995). A separate 5S rRNA gene, however, is transcribed by Pol III. Many sites within rRNAs are modified (e.g. methylation and pseudouridylation), which is essential for their function in translation. tRNAs are individually transcribed as pre-tRNA precursors. The 5’-end leader sequence is removed by RNase P, which is a ribozyme consisting of a catalytic RNA component and a structural protein component (Kirsebom, 2002). The 3’-end of pre-tRNA is first processed by RNase-mediated cleavage followed by CCA tri-nucleotide addition, which
is catalyzed in the absence of a template (Schurer et al., 2001). A subclass of tRNAs also contains a short intron, which is removed by the concerted actions of endonucleases and ligases (Deutscher, 1984). tRNAs are extensively modified to become competent for amino acid charging by aminoacyl-tRNA synthetases (Martinis et al., 1999). mRNAs are first transcribed as intron-containing pre-mRNAs. A pre-mRNA is processed into a mature and functional mRNA in three steps: (1) capping, (2) splicing, and (3) polyadenylation. Capping takes place co-transcriptionally: a monomethylated guanylate (G) is linked to the first nucleotide at the 5’ end via a 5’-5’ triphosphate bridge (Shatkin and Manley, 2000). Splicing, the main topic of this chapter, occurs in the spliceosome, a multi-component complex, to remove intervening sequences (or introns). Polyadenylation initiates with (1) the recognition of the polyadenylation signal (AAUAAA), which lies approximately 30 nt upstream of the poly(A) addition site, by cleavage and polyadenylation specificity factors (CPSFs), and (2) the binding of the GU-rich sequences downstream of the poly(A) site by cleavage stimulation factors (CstFs) (Keller and Minvielle-Sebastia, 1997). The poly(A) polymerase, with the aid of poly(A) binding protein II, which keeps the length of the poly(A) tail relatively constant, catalyzes the addition of approximately 200 adenosines. Splicing and polyadenylation can take place co-transcriptionally or after the precursor transcript is released from the chromatin into the nucleoplasm (Minvielle-Sebastia and Keller, 1999).
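The position of the polyadenylation signal relative to the cleavage site can be illustrated computationally. The following Python sketch is only an illustration under stated assumptions: the function name find_polya_signals and the toy sequence are hypothetical, and the search looks only for the canonical AAUAAA hexamer, ignoring variant signals and the downstream GU-rich element.

```python
def find_polya_signals(rna, signal="AAUAAA"):
    """Return the 0-based start of every (possibly overlapping) signal match.

    Accepts RNA or DNA letters; T is converted to U before searching.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    start = rna.find(signal)
    while start != -1:
        hits.append(start)
        start = rna.find(signal, start + 1)
    return hits


# Toy 3' end (hypothetical): a signal, a 20-nt spacer, then the cleavage site.
toy = "GCCUGC" + "AAUAAA" + "C" * 20 + "A"
cleavage_site = len(toy) - 1  # index of the last nucleotide before the tail
for hit in find_polya_signals(toy):
    # Report where the hexamer starts and how far upstream of the site it is.
    print(hit, cleavage_site - hit)
```

A real annotation pipeline would score signal variants and downstream elements together; this sketch only makes the distance relationship concrete.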
1.2. Non-coding RNAs
Besides the three major RNA classes, many small non-coding RNAs are expressed in eukaryotic cells, and their diverse biological roles are being increasingly
recognized and appreciated. Small nuclear RNAs (or snRNAs) have been extensively studied as part of the RNA processing machinery (see further details below). snoRNAs are a special class of small RNAs localized in the nucleolus. These snoRNAs, which may be transcribed from individual genes or from introns of pre-mRNAs, play important roles in guiding site-specific modifications of rRNAs and snRNAs (Decatur and Fournier, 2003). More recently, the world of microRNAs (miRNAs) has brought forward a class of ~21 nt small RNAs involved in a variety of gene expression paradigms (Novina and Sharp, 2004). miRNA genes are found in intergenic regions or within introns of other genes. The organization and transcription of miRNAs are poorly understood, although recent evidence suggests that they are transcribed by Pol II (Lee et al., 2004). miRNAs are the final products processed from transcribed precursors in two key steps: (1) the initial transcripts, known as primary miRNAs (pri-miRNAs), are processed by the Drosha-containing complex (Drosha is an RNase) into pre-miRNAs, which all have a similar stem-loop structure, and (2) pre-miRNAs are processed by Dicer (another RNase) into final single-stranded miRNAs (Cullen, 2004). Mature miRNAs are incorporated into the RNA-induced silencing complex (RISC) to mediate target degradation (if a miRNA base-pairs perfectly with a target, similar to the action of RNAi) or, in most cases, translational repression (if a miRNA forms partial base-pairing with its target). miRNAs have been implicated in the regulation of development, cell proliferation, and apoptosis (Hartmann et al., 2004). Another class of small RNAs is referred to as small heterogeneous RNAs (or shRNAs) because their lengths are not as confined as those of miRNAs. This class of non-coding
RNAs is mostly transcribed from repeat-containing intragenic regions. shRNAs help to eliminate transposable elements (an innate cellular defense mechanism against genomic instability) and induce the formation of heterochromatin, a mechanism thought to be critical in programming cell differentiation (Mochizuki et al., 2002; Taverna et al., 2002; Volpe et al., 2002; Volpe et al., 2003).
1.3. TUFs: A large number of RNAs to be understood
Sequencing the genomes of many model organisms has brought attention to the observation that the number of genes in individual genomes cannot explain the complexity and functional diversity of these organisms. In fact, it is difficult to determine the total number of genes encoded in a given genome. For example, the tiling of total RNA from human cells onto a DNA chip detected a far higher gene count than previously reported (Kapranov et al., 2002). Some of these “extra” counts may be pseudogenes; however, many clearly correspond to previously unrecognized genes, some of which show a multi-exon arrangement typical of a eukaryotic gene. In a recent genome-decoding consortium meeting, these unknown transcripts were named transcripts of unknown function (TUFs) (The ENCODE Project Consortium). It will be interesting to learn how many of these TUFs actually correspond to real genes and whether some of them represent new classes of genes that have escaped recognition by conventional molecular biology.
2. Pre-mRNA Splicing: Pathway and Factors
Pre-mRNA splicing has been extensively studied in the past three decades, beginning with the discovery of “split genes” in the 1970s (Sharp, 1994). The development of in vitro splicing systems and the power of yeast genetics allowed the splicing mechanism to be dissected.
2.1. Consensus splicing signals and the splicing pathway
In the majority of eukaryotic genes, exons are relatively short, typically ranging from 100 to 300 nt, whereas introns are relatively long and variable in length, reaching up to 100 kb. Key splicing signals mostly reside on the intron side of the exon/intron junction. As shown in Fig. 1, the 5’ splice site is composed of an invariant GT dinucleotide flanked by conserved nucleotides, the most important being a G in the fifth position on the intron side. The 3’ splice site is more complicated and can be divided into three important regions: the branchpoint sequence, the polypyrimidine tract, and the 3’ invariant AG dinucleotide (Sharp and Burge, 1997). These splicing signals are loosely conserved in mammalian genes. In yeast, however, the splicing signals are much more conserved and the branchpoint sequence shows no variation. The consensus sequence shown in Fig. 1 represents the majority of introns, conveniently referred to as major introns. In addition to major introns, there exists a minor class of introns, which is characterized by conserved AT and AC dinucleotides at the 5’ and 3’ splice sites, respectively (Burge et al., 1998). The major class is thus referred to as GT-AG introns and the minor class as AT-AC introns. The major and minor classes utilize both overlapping and distinct factors to build the spliceosome (see below).
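The GT-AG versus AT-AC naming can be made concrete with a few lines of Python. This is a minimal sketch, not a real intron classifier: the function names classify_intron and has_plus5_g are hypothetical, the input is assumed to be the intron sequence alone (5’ to 3’, DNA or RNA letters), and only the terminal dinucleotides and the fifth intron position described above are inspected. A practical tool would also evaluate the branchpoint and the rest of the splice site consensus.

```python
def classify_intron(intron):
    """Name an intron by its first and last dinucleotides."""
    seq = intron.upper().replace("U", "T")
    if len(seq) < 4:
        raise ValueError("intron too short to carry both dinucleotides")
    ends = (seq[:2], seq[-2:])
    if ends == ("GT", "AG"):
        return "GT-AG (major)"
    if ends == ("AT", "AC"):
        return "AT-AC (minor)"
    return "other"


def has_plus5_g(intron):
    """True if the fifth intron nucleotide is G, as in the 5' consensus."""
    return len(intron) >= 5 and intron.upper()[4] == "G"


# Hypothetical toy introns, far shorter than real ones.
for toy in ("GTAAGTCCTTTTTCAG", "ATATCCTTTTCCAC"):
    print(toy, classify_intron(toy), has_plus5_g(toy))
```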
Major and minor classes of introns follow the same chemical pathway for intron removal. As shown in Fig. 2, the splicing reaction proceeds in two steps. In the first step, the 5’ splice site is cleaved through a nucleophilic attack by the 2’-OH group of the branchpoint nucleotide (adenosine being the most common), resulting in the release of the first exon and the formation of a lariat intermediate. In the second step, the 3’ splice site is cleaved through a second nucleophilic attack, this time by the 3’-OH group of the last nucleotide of the released 5’ exon, resulting in a lariat intron and ligated exons. The lariat intron is quickly degraded (with the aid of a debranching enzyme) in the nucleus and the ligated exons are exported from the nucleus after the removal of all introns from the pre-mRNA.
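The bookkeeping of the two steps can be followed on a toy sequence. The Python sketch below is an illustration only: the function name splice, its index conventions, and the example sequence are assumptions made here, and the code tracks which nucleotides end up in which product without modeling any chemistry.

```python
def splice(pre_mrna, five_ss, branch, three_ss):
    """Track the products of the two transesterification steps.

    five_ss  -- index of the first intron nucleotide
    branch   -- index of the branchpoint nucleotide
    three_ss -- index of the first nucleotide of the 3' exon
    """
    if not (0 <= five_ss < branch < three_ss <= len(pre_mrna)):
        raise ValueError("expected five_ss < branch < three_ss")
    # Step 1: the branchpoint attacks the 5' splice site. Exon 1 is released;
    # the intron stays attached to exon 2 as a lariat intermediate.
    exon1 = pre_mrna[:five_ss]
    lariat_intermediate = pre_mrna[five_ss:]
    # Step 2: the free end of exon 1 attacks the 3' splice site, joining the
    # exons and releasing the intron as a lariat.
    intron = lariat_intermediate[: three_ss - five_ss]
    exon2 = lariat_intermediate[three_ss - five_ss:]
    return {
        "mrna": exon1 + exon2,
        "lariat_intron": intron,
        "loop_length": branch - five_ss + 1,   # 5' end through branchpoint
        "tail_length": three_ss - branch - 1,  # after the branchpoint
    }


# Hypothetical toy pre-mRNA: exon 1, a 19-nt intron, exon 2.
exon1, intron, exon2 = "AAAC", "GTAAGTCTAACTTTTTCAG", "GGGT"
result = splice(exon1 + intron + exon2,
                five_ss=len(exon1),
                branch=len(exon1) + 9,
                three_ss=len(exon1) + len(intron))
print(result["mrna"], result["lariat_intron"])
```

The loop and tail lengths always sum to the intron length, which is a convenient sanity check when the indices come from real annotations.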
2.2. Exon definition
In mammalian systems, exons are short whereas introns are long, and splicing signals flanking each exon are recognized first by the exon definition mechanism (Berget, 1995). Interactions between specific factors binding independently at the 3’ and downstream 5’ splice sites result in each exon being recognized as a unit. As predicted from the exon definition model, a functional downstream 5’ splice site was found to stimulate the upstream splicing event (Hoffman and Grabowski, 1992). Interactions between specific factors at the two splice sites are limited by the physical distance between the factors; therefore, the length of the exon has a major impact. In addition to the mechanism of exon definition, splicing factors can also interact across an intron if the intron is short enough (intron definition). These two mechanisms work in parallel: small exons are recognized by exon definition whereas small introns are recognized by intron definition.
2.3. snRNP and non-snRNP splicing factors
Besides consensus splicing signals at the 5’ and 3’ splice sites, other sequence elements within exons and introns can positively and negatively influence splicing efficiency. Exonic sequences that stimulate or inhibit splicing are referred to as exonic splicing enhancers (ESEs) or exonic splicing silencers (ESSs), respectively. Likewise, intronic sequences that have positive or negative effects on splicing are called intronic splicing enhancers (ISEs) or intronic splicing silencers (ISSs), respectively. Small nuclear ribonucleoprotein particles (snRNPs) and non-snRNP protein factors mediate the recognition of consensus splicing signals and regulatory sequences (Kramer, 1996). Many non-snRNP protein factors play essential roles during splicing while others are only involved in alternative splicing. Splicing of major introns is mediated by U1, U2, U4/6, and U5 snRNPs, while splicing of minor introns is mediated by U11, U12, U4atac/6atac, and U5 (Tarn and Steitz, 1997; Patel and Steitz, 2003). Thus, only U5 snRNP is common to both classes of introns. Individual snRNPs consist of one uridine-rich small nuclear RNA (snRNA) and a group of associated proteins, with the exception of the U4/6 and U4atac/6atac di-snRNPs. These consist of two snRNAs packed in one snRNP particle because of extensive base-pairing between the two snRNAs, which is disrupted during splicing to establish an RNA-based catalytic core (see below). Non-snRNP splicing factors are individual protein factors that are not part of snRNP complexes. Many essential non-snRNP splicing factors are RNA binding proteins while others are RNA helicases. The family of SR proteins, which is characterized by one or two RNA Recognition Motifs (RRM) at the N-terminus and an
arginine/serine dipeptide repeat (RS domain) at the C-terminus (Fu, 1995; Graveley, 2000), is a well-characterized group of non-snRNP splicing factors. RNA helicases play a central role in RNA rearrangement along the splicing pathway (Staley and Guthrie, 1998). Mass spectrometry has successfully identified proteins associated with purified spliceosomes and subspliceosomal complexes (Gottschalk et al., 1999; Stevens, 2000; Zhou et al., 2002). Many newly identified spliceosome-associated proteins, however, remain to be functionally characterized.
2.4. Spliceosome assembly
Splicing takes place in the spliceosome, which is assembled in a step-wise manner as illustrated in Fig. 3. RNA binding proteins rapidly bind to RNA when an RNA is mixed with nuclear extracts, forming heterogeneous nuclear ribonucleoprotein particles known as hnRNPs (or H complex). U1 snRNP base-pairs with the 5’ splice site, forming the E complex (E for early) (Michaud and Reed, 1993). The E complex is a commitment complex, meaning that the pre-mRNA in the complex is committed to the splicing pathway in an irreversible manner. In the next step, U2 snRNP joins the E complex by base-pairing with the branchpoint sequence, resulting in the formation of the A complex. The mechanism for A complex assembly is more complicated than the formation of the E complex and requires several key factors. First, U2AF, a heterodimer, binds to the 3’ splice site (Zamore and Green, 1991). The large subunit U2AF65 binds to the polypyrimidine tract while the small subunit U2AF35 contacts the conserved AG dinucleotide (Wu et al., 1999). A number of other protein factors (SF1, SF3, and a number of spliceosome
associated proteins or SAPs) are also important for 3’ splice site specification. SF1, which was later renamed the branchpoint binding protein BBP, directly binds to the branchpoint sequence (Abovich and Rosbash, 1997; Berglund et al., 1998). In contrast to U1 binding to the 5’ splice site, U2 base-pairing with the branchpoint is an ATP-dependent process. The requirement for ATP likely reflects the essential role of the RNA helicase UAP56 in facilitating U2 binding to the branchpoint sequence (Fleckner et al., 1997; Kistler and Guthrie, 2001). The A complex, containing U1 and U2, is conditioned for further spliceosome assembly via the addition of the U4/6.U5 tri-snRNP, resulting in the formation of the B complex, known as the mature spliceosome. U4 and U6 are extensively base-paired and jointly packaged into a di-snRNP particle. U5, however, can exist as a single snRNP particle, and before joining the spliceosome, it forms a complex with U4/6 to generate a tri-snRNP particle. U1 is released from the A complex when the tri-snRNP particle joins the spliceosome; the tri-snRNP then interacts with the pre-mRNA and U2, thereby forming an RNA-based catalytic core. A number of RNA helicases are involved in this process. As a result, U4 is released from the spliceosome, giving rise to the active spliceosome or the C complex (Ares and Weiser, 1995; Staley and Guthrie, 1998). The two-step splicing reaction then takes place in the C complex. In the end, U2/5/6 snRNPs are released with the lariat intron, thereby allowing ligated exons to dissociate from snRNPs in preparation for nuclear export.
2.5. Role of SR proteins in constitutive splicing
Many of the protein factors involved in spliceosome assembly have been briefly described above. The family of SR proteins deserves further attention. In baker’s yeast, intron-containing genes account for approximately 5% of the total number of genes encoded in the genome, with the vast majority containing a single intron. Fewer protein factors are needed for the recognition of yeast splice sites because they are strongly conserved. As a result, SR proteins are not present in yeast, except for a few SR-like RNA binding proteins. Thus, SR proteins are essential for splicing only in higher eukaryotic cells. The participation of SR proteins in a number of critical steps during constitutive splicing leads to the conclusion that they are essential splicing factors (Fu, 1995). It is interesting that, although SR proteins are collectively essential, the majority of splicing events can take place in the presence of just one SR protein. This built-in functional redundancy may allow the vast number of constitutive splicing events to proceed in different cell types and under a variety of growth conditions where SR proteins are differentially expressed. SR proteins become essential at the very earliest step of spliceosome assembly. In fact, SR proteins are sufficient to commit pre-mRNAs to the splicing pathway (Fu, 1993). Although SR proteins are not essential for U1 binding to the 5’ splice site, they were found to facilitate efficient and accurate selection of functional 5’ splice sites and to avoid cryptic 5’ splice sites frequently found in intronic sequences. During the formation of the A complex, SR proteins play an important role in the recruitment of U2 to the 3’ splice site (Fu and Maniatis, 1992). This is mediated by the interaction between SR proteins and the U2AF heterodimer (Wu and Maniatis, 1993). The joining of the U4/6.U5 tri-snRNP to the A complex also depends on SR proteins (Roscigno and Garcia-Blanco, 1995). SR
protein-dependent snRNP recruitment is likely mediated by RS domain-mediated protein-protein interactions (Yeakley et al., 1999). Interestingly, the mammalian orthologs of RNA helicases involved in splicing all carry an RS domain. More recent studies suggest that the RS domain of SR proteins may also interact with RNA in assembled spliceosomes (Shen et al., 2004). Phosphorylation is essential for SR proteins to function during spliceosome assembly (Mermoud et al., 1992; Mermoud et al., 1994). SR proteins are phosphorylated by two families of SR protein specific kinases known as SRPKs and Clks (Gui et al., 1994b; Colwill et al., 1996a; Colwill et al., 1996b). During splicing in the assembled spliceosome, however, SR proteins are dephosphorylated. Prevention of SR protein dephosphorylation blocks splicing, suggesting that dephosphorylation of SR proteins is essential for spliceosome resolution (Tazi et al., 1993; Xiao and Manley, 1998; Prasad et al., 1999). It is important to point out here that a full phosphorylation/dephosphorylation cycle has to take place after each splicing reaction: a particular phosphorylated SR protein may be used to initiate spliceosome assembly while a different SR protein, upon dephosphorylation, may drive the splicing reaction to completion (Xiao and Manley, 1998). At the cellular level, phosphorylation appears to mediate SR protein trafficking from “storage” sites within the nucleus to nascent transcripts whereas dephosphorylation seems to be responsible for the reversal of this process (Gui et al., 1994a; Misteli et al., 1998).
2.6. RNA catalysis
During spliceosome assembly and within the fully assembled spliceosome, RNA elements in pre-mRNA and in snRNPs engage in base-pairing and non-Watson/Crick interactions to establish a core for RNA-based catalysis. This concept has yet to receive full experimental proof; however, mounting evidence points in this direction. One line of evidence comes from the similar splicing pathways of group II introns and pre-mRNAs (Sharp, 1994). Group II introns are self-splicing introns found mostly in lower eukaryotic cells, such as nematodes and trypanosomes, and in mitochondria and chloroplasts of higher eukaryotic cells. Splicing of group II introns, which can take place in the absence of protein factors, follows a two-step chemical reaction identical to that of pre-mRNA splicing, lending strong support for an RNA-based catalytic core in both splicing pathways. Further evidence for RNA-based catalysis comes from elegant yeast genetics combined with biochemical approaches, such as induced RNA-RNA cross-linking and mapping of cross-linked sites. These approaches led to the elucidation of conserved and functionally important RNA-RNA interactions in the spliceosome (Guthrie, 1991; Guthrie, 1994). For example, as illustrated in Fig. 4, the U1-5’ splice site base-pairing established initially in early splicing complexes is later disrupted, giving way to base-pairing between U6 and the 5’ splice site. U2 remains bound at the branchpoint sequence, and after U4 is released from the spliceosome, part of the U2 RNA is then engaged in base-pairing with the U6 RNA. Thus, the 5’ splice site and the 3’ splice site are linked through a network of RNA-RNA interactions. These interactions are further strengthened by a conserved loop in the U5 RNA via its base-pairing with the 5’ splice site on one side and with the 3’ splice site on the other. This final core structure closely resembles the
minimal catalytic core in group II introns, strongly suggesting that both types of splicing reaction use the same catalytic mechanism. Finally, attempts have been made to reconstitute the entire splicing reaction, or at least part of the process, in vitro using synthetic RNAs resembling different regions of pre-mRNA and snRNAs (Valadkhan and Manley, 2001; Valadkhan and Manley, 2003). This would provide the ultimate proof for the RNA-based catalytic mechanism.
3. Alternative Splicing: Mechanisms and Regulation
Unlike those in yeast, splice sites in higher eukaryotic cells are considerably more variable. As a result, a pre-mRNA may give rise to more than one mRNA product via alternative splicing. In some dramatic cases, a pre-mRNA can, in theory, produce thousands of mRNAs, indicating that alternative splicing has the potential to significantly enlarge the proteome (Black, 2000; Black, 2003). Alternative splicing is a widespread phenomenon in eukaryotic genomes, such that more than half of the genes in humans are alternatively spliced (Modrek and Lee, 2002; Johnson et al., 2003). Thus, alternative splicing has become the rule rather than the exception. This provides additional dimensions to the regulation of gene expression, as different mRNA isoforms may have distinct half-lives, follow different export pathways, and interact with different factors for intracellular targeting and regulated translation (Black, 2003). Due to its scale and complexity, alternative splicing has become a major challenge in post-genome research.
3.1. Multiple ways to splice alternatively
When considering independent uses of 5’ and 3’ splice sites, different classes (modes) of alternative splicing are possible (Fig. 5). Commonly found alternative splicing modes include (1) alternative 5’ choices, (2) alternative 3’ choices, (3) intron retention, (4) exon skipping, (5) mutually exclusive exons, and (6) combinatorial exons. In addition, alternative splicing may also be paired with alternative promoters or the use of alternative polyadenylation sites. Examples of these modes can be found in the UCSC Genome Browser (http://genome.ucsc.edu), which is a compilation of cloned cDNAs aligned with their corresponding genomic loci, thereby revealing marked differences in exon usage for each gene. Key questions in the study of alternative splicing include (1) how alternative splicing is regulated and (2) what the function of individual mRNA isoforms is. The understanding of splicing regulation requires the identification of cis-acting regulatory elements and trans-acting factors. Table 1 lists some well-studied splicing regulators, most of which were identified by biochemical approaches in model systems. More recently, a large-scale RNAi screen against RNA binding proteins and splicing factors revealed the involvement in alternative splicing of many factors previously regarded as constitutive splicing factors (Park et al., 2004). Thus, our knowledge of regulated splicing is quite limited, despite intensive research in the last two decades. Given the prevalence of mRNA isoforms in mammalian cells, it is a great challenge to determine which isoforms are functionally important and which ones are just noise in gene expression as a reflection of defects in the RNA processing machinery.
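To see how quickly exon skipping alone multiplies the number of possible products, consider the following Python sketch. It is a simplified illustration resting on an assumption that real genes often violate, namely that every cassette exon is included or skipped independently of the others; the function name cassette_isoforms and the four-exon gene are hypothetical, and exon names are assumed to be unique.

```python
from itertools import product


def cassette_isoforms(exons, cassette):
    """List every isoform obtained by independently including or skipping
    each cassette exon; all other exons are always included."""
    optional = [e for e in exons if e in cassette]
    isoforms = []
    for choice in product((True, False), repeat=len(optional)):
        keep = dict(zip(optional, choice))
        # Exons absent from `keep` are constitutive and always retained.
        isoforms.append(tuple(e for e in exons if keep.get(e, True)))
    return isoforms


gene = ["E1", "E2", "E3", "E4"]  # hypothetical gene with two cassette exons
for isoform in cassette_isoforms(gene, {"E2", "E3"}):
    print("-".join(isoform))
```

Under this assumption n cassette exons give 2^n isoforms, so 12 independent cassette exons would already allow 4096 distinct mRNAs from one pre-mRNA.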
3.2. Sex determination in the fly: the best understood case
The best understood alternative splicing pathway, in terms of functionality and regulatory mechanisms, is sex determination in Drosophila (Fig. 6). A series of powerful genetic studies has helped in the dissection of the pathway (Baker, 1989), while elegant biochemical experiments have contributed to the elucidation of the molecular mechanisms involved (Maniatis and Tasic, 2002). In this pathway, the sex lethal gene (Sxl) is expressed only in females. Sxl, an RNA binding protein, regulates alternative splicing of its own RNA by binding to intronic sequences surrounding an alternative exon, causing the alternative exon to be skipped. The skipped exon contains a premature stop codon; skipping it in females therefore allows full-length Sxl protein to be made, whereas its inclusion leads to a truncated Sxl protein. This pathway helps to decrease the chance of accidental expression of Sxl in males. The Sxl protein also acts on a key downstream target, the transformer gene (Tra). Sxl binds to an intronic regulatory element within Tra, thereby blocking the usage of a nearby 3’ splice site and shifting splicing to the next 3’ splice site. This leads to the expression of a full-length Tra protein (instead of a truncated one). Tra then forms a heterodimer with the Tra-2 protein, which is expressed in a non-sex-specific manner, and together they bind to an exonic regulatory element on the downstream doublesex (Dsx) pre-mRNA. This binding, together with other cellular splicing factors, is responsible for the activation of an upstream 3’ splice site, thereby giving rise to a female-specific Dsx protein. In the absence of Tra/Tra-2 binding, a downstream 3’ splice site is used, thereby giving rise to a male-specific Dsx protein. Thus both alternative products of Dsx result in functional proteins. The female Dsx protein negatively regulates genes involved in male differentiation whereas the male Dsx protein negatively regulates genes involved in
female differentiation. This pathway illustrates the importance of regulated splicing in crucial biological processes.
3.3. ESEs and ESSs and their effectors
The Drosophila sex determination pathway provides a road map to the understanding of regulated splicing in mammalian systems. Many cis-acting elements involved in splicing regulation have been identified through mutagenesis and functional studies in in vitro splicing or in transfected cells. Exonic sequences positively regulating splicing are referred to as exonic splicing enhancers (ESEs) whereas those negatively regulating splicing are called exonic splicing silencers (ESSs). Interestingly, ESEs and ESSs are not unique to alternative exons; these regulatory elements are also widely found within constitutive exons (Schaal and Maniatis, 1999). The ESE/ESS ratio, in addition to other intronic regulatory elements, may be the distinguishing characteristic between alternative and constitutive exons (Fu, 2004). In general, ESEs are recognized by SR proteins whereas ESSs are recognized by hnRNP proteins (Fig. 7). In this sense, Tra-2 is an SR protein. In fact, Tra-2 contains an RNA recognition motif and two RS domains (one at the C-terminus, like a typical SR protein, and the other at the N-terminus). Tra is also an SR-like protein because it contains an RS domain, but no RRM. The binding of SR proteins to ESEs results in the recruitment of U1 to a nearby downstream 5’ splice site and U2 binding to a nearby upstream 3’ splice site during exon definition. The recruitment of U2 is indirectly accomplished by the recruitment of U2AF at the 3’ splice site. Because both U2AF subunits also have an RS domain, it is believed that U1 and U2 recruitment is facilitated
by RS domain-mediated protein-protein interactions (Wu and Maniatis, 1993). In addition to recruiting spliceosome components, SR proteins are also able to antagonize the binding of hnRNP proteins to ESSs, which would otherwise prevent the recruitment of U1 and U2 to their target splice sites during spliceosome assembly (Zhu et al., 2001).
3.4. ISEs and ISSs and their effectors
Intron sequences also harbor splicing regulatory elements. Intronic splicing enhancers (ISEs) and silencers (ISSs) are not as well understood as ESEs and ESSs. The c-src model, where a small exon (N) is only included in neurons and is skipped in most other cell types, represents the best understood case. Extensive mutagenesis studies have revealed intronic control elements on both sides of the alternative exon (Fig. 8). Here, the definition of ISEs and ISSs becomes a little fuzzy. An ISE in one cell type may be an ISS in another. Intronic sequences downstream of the N exon are important for the inclusion of the exon in neuronal cells; these would therefore be considered neuron-specific ISEs. However, in non-neuronal cells, the same sequences serve to prevent the selection of the splice sites on both sides of the N exon. Thus, those sequences are ISSs in non-neuronal cells. Many protein factors have been shown to interact with ISEs and ISSs, with the function of the polypyrimidine tract binding protein (PTB) being the best characterized. PTB was originally characterized as a polypyrimidine tract binding protein with a strong preference for U-rich sequences, which shares some sequence binding specificity with U2AF65 (Garcia-Blanco et al., 1989). PTB is not required for splicing, and thus it is not an essential splicing factor. Because of its competitive binding with U2AF, PTB is
regarded as a negative regulator for splicing. In non-neuronal cells, PTB binds to both sides of the N exon, resulting in the formation of a multi-component complex, which shields the N exon from the splicing machinery. In neuronal cells, on the other hand, the multi-component complex has a different composition. One of the key molecules is a neuron-specific PTB homologue known as nPTB. nPTB appears sensitive to an ATP-dependent process which opens up the inhibitory complex and converts the ISS into an ISE, resulting in the inclusion of the N exon (Chan and Black, 1997; Modafferi and Black, 1999). nPTB’s regulatory roles in other neuron-specific splicing events need further investigation. The identification of tissue-specific splicing regulators has been a major goal in the splicing field. The first successful example is from the study of a disease gene known as Nova (Jensen et al., 2000a). The Nova family of RNA binding proteins is characterized by a signature KH domain, which is also present in many other RNA binding proteins (Table 1). Via in vitro selection for high affinity RNA elements, Nova was found to interact with YCAY (Y being U or C) elements in pre-mRNA (Jensen et al., 2000b). Multiple copies of this consensus sequence are frequently found in introns of many neuron-specific genes (Dredge and Darnell, 2003). Biochemical and mutagenesis studies demonstrate that the Nova family of RNA binding proteins acts through these RNA motifs to regulate neuron-specific alternative splicing events (Ule et al., 2003). However, it is important to note that binding of the Nova proteins can activate splicing in some cases and inhibit splicing in others. The mechanism of either regulatory pathway remains elusive. Nova RNA binding protein knock-out mice have provided the strongest evidence for its role in neuron-specific splicing regulation (Jensen et al.,
2000a). The phenotype of the knock-out mouse mimics the human disease and shows the mis-regulation of many neuron-specific genes. Because many alternative splicing events are altered in the knockout, it is presently unclear which genes directly contribute to the neuronal disorders in mice and humans.
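The YCAY consensus described above (Y being U or C) lends itself to a simple sequence scan. The following is a minimal sketch, not the method used in the cited studies: the function names, the window size, and the test sequences are assumptions made for illustration only.

```python
import re

# YCAY with Y = U or C. The lookahead allows overlapping motifs to be reported.
YCAY = re.compile(r"(?=([UC]CA[UC]))")

def find_ycay(rna):
    """Return the 0-based start positions of every YCAY motif in an RNA string.

    DNA input is accepted; T is converted to U before scanning.
    """
    return [m.start() for m in YCAY.finditer(rna.upper().replace("T", "U"))]

def densest_window(rna, window=30):
    """Return (start, count) for the first window holding the most motif starts.

    The window size is a hypothetical parameter, used here only to illustrate
    the idea that multiple copies of the motif tend to cluster.
    """
    sites = find_ycay(rna)
    best = (0, 0)
    for start in range(max(1, len(rna) - window + 1)):
        count = sum(start <= s < start + window for s in sites)
        if count > best[1]:
            best = (start, count)
    return best

if __name__ == "__main__":
    toy_intron = "UCAUUCAUGGGGGGGG"  # made-up sequence, not a real intron
    print(find_ycay(toy_intron))
    print(densest_window(toy_intron, window=8))
```

A scan of this kind only locates candidate motifs; whether a given cluster enhances or silences splicing cannot be read from the sequence alone, as noted in the text.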
3.5. Recursive splicing
As mentioned earlier, introns vary dramatically in length, some being 100 kb or longer. How the splicing machinery correctly identifies functional splice sites within the boundless sea of highly related intronic sequences remains unsolved. The ratio of ESEs/ESSs has been suggested to aid this process. In addition, recent evidence suggests that splice site selection may be a co-transcriptional event in vivo, such that a functional splice site may be recognized right after its emergence from the Pol II complex and then wait for the appearance of its downstream partner. Within the past several years, a new cellular mechanism, recursive splicing, has been shown to deal with splice site selection within long introns in Drosophila (Lopez, 1998). Recursive splicing may simply be viewed as multiple splicing events within the same intron, which eventually result in the removal of the long intron (Fig. 9). In this process, the 5’ splice site finds a downstream 3’ splice site to begin the initial splicing reaction. After the removal of the first piece of the intron, the resultant spliced product has a reconstituted 5’ splice site. This allows the next splicing event(s) to take place. After all splicing reactions are finished in the long intron, the end product appears to have been the result of direct splicing of the upstream and downstream exons. This mechanism not only illustrates a solution to the removal of long introns, but may also
explain some puzzling splicing products carrying one or a few nucleotide insertions between two exon sequences. It is suspected that these short nucleotide insertions may result from a recursive splicing situation where the 3’ splice site is followed by one or a few nucleotides before connecting to the downstream consensus 5’ splice site. This would create so-called “mini exons” in the length range of zero (a case where there are no nucleotides separating the upstream 3’ splice site and the downstream 5’ splice site) to a few nucleotides.
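The stepwise logic of recursive splicing can be sketched in a few lines. This is a toy model under stated assumptions: an internal site is taken to be any AG (end of a 3’ splice site) immediately followed by GU (start of a reconstituted 5’ splice site), which corresponds to the zero-nucleotide "mini exon" case described above. The function name and the sequences are hypothetical.

```python
def recursive_splice(upstream_exon, intron, downstream_exon, ratchet="AGGU"):
    """Remove an intron piece by piece, returning (mRNA, removed_pieces).

    Each internal 'AGGU' is treated as a 3' splice site (..AG) directly
    followed by a regenerated 5' splice site (GU..). This is a simplifying
    assumption for illustration, not a full splice site consensus.
    """
    pieces = []
    remaining = intron
    while True:
        idx = remaining.find(ratchet)
        if idx == -1:
            break
        cut = idx + 2                   # cut after the AG of the internal 3' splice site
        pieces.append(remaining[:cut])  # this piece is removed in one splicing step
        remaining = remaining[cut:]     # the rest starts with GU: a new 5' splice site
    pieces.append(remaining)            # last step uses the authentic 3' splice site
    return upstream_exon + downstream_exon, pieces

if __name__ == "__main__":
    intron = "GUAAAAAG" "GUCCCCAG" "GUUUUUAG"  # made-up intron with two internal sites
    mrna, pieces = recursive_splice("AUGG", intron, "CCUAA")
    print(mrna)
    print(pieces)
```

Whatever the number of intermediate steps, the final product is identical to that of a single direct splicing event, which is why recursive splicing is invisible in the mature mRNA unless a few nucleotides are left between the internal 3’ and 5’ splice sites.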
3.6. Regulated splicing in development and disease
This is a big topic, which has been recently reviewed (Cartegni et al., 2002; Faustino and Cooper, 2003). In Drosophila, the role of both isoform expression and trans-acting splicing regulators in development has been well documented, with the sex determination pathway being the best understood. Relatively little is known about the role of alternative splicing in development in mammalian systems. This may largely be due to the difficulty of conducting large-scale forward genetic studies in mammals. In a few reported cases, reverse genetic approaches (knock-out or knock-in) have determined the function of specific isoforms or specific splicing regulators. For example, inactivation of WT1 isoform expression led to kidney failure (Hammes et al., 2001) and inactivation of FGFR2 isoforms caused abnormal limb development (De Moerlooze et al., 2000; Hajihosseini et al., 2001). Furthermore, knock-out of a number of SR proteins resulted in early embryonic lethality (Jumaa et al., 1999; Wang et al., 2001; Ding et al., 2004). More recently, it was shown that conditional ablation of the prototypical SR proteins SC35 and ASF/SF2 in the heart had no effect on heart development, but caused a typical heart
disease known as dilated cardiomyopathy (Ding et al., 2004; Xu et al., 2005). Interestingly, ASF/SF2 appears to be a critical regulator in a postnatal splicing reprogramming pathway, which appears essential for heart remodeling during the juvenile to adult transition (Xu et al., 2005). These studies demonstrate the importance of regulated splicing in animal development. The examples reported thus far only mark the beginning in understanding the biology of alternative splicing in mammalian systems.
The impact of splicing defects on diseases has long been recognized. An early survey indicates that about 15% of disease-causing mutations affect exon/intron splicing signals in a wide range of cellular genes (Krawczak et al., 1992; Stenson et al., 2003). More recently, it was found that silent, mis-sense, and even non-sense mutations can contribute to disease phenotypes because of mutation-induced splicing defects (Cartegni et al., 2002). Previously, disease phenotypes associated with mis-sense and non-sense mutations were thought to be a result of impaired protein functions. It is now clear that many of these mutations result in mis-splicing and/or accelerated RNA decay, thereby mimicking the effect of null mutations. Two possible mechanisms exist for this mimicking effect. First, point mutations within exons may disrupt existing ESEs or create new ESSs, resulting in abnormal splicing. Second, non-sense mutations may trigger non-sense mediated decay (NMD), resulting in dramatic down-regulation of the affected transcript. The first mechanism explains the observation that point mutations causing Duchenne muscular dystrophy are associated with a more severe disease phenotype than deletion mutants (Pillers et al., 1999).
In contrast to cis-acting regulators, little is known about the effect of trans-acting splicing regulators in disease. The most well-known example is the survival motor
neuron (SMN) gene. Mutations in SMN cause spinal muscular atrophy (SMA) (Lefebvre et al., 1997). Biochemical studies revealed that SMN functions in snRNP recycling and therefore plays a role in mammalian pre-mRNA splicing (Pellizzoni et al., 1998). However, it is unclear why motor neurons are particularly sensitive to mutations in the SMN gene. Genetic mapping also revealed the role of several essential splicing factors (Prp3, Prp8, and Prp31) in retinitis pigmentosa (Hims et al., 2003). Mutations in these genes selectively cause aberrant splicing of a small number of genes, some of which, such as rhodopsin, may be critical for the viability of retinal cells (Yuan et al., 2005). Reverse genetics has also contributed to the elucidation of the roles of other trans-acting splicing regulators in disease. As previously mentioned, knock-out of Nova in mice was shown to induce neurological disorders similar to those seen in human patients carrying mutations in Nova (Jensen et al., 2000a). Recently, knock-out mice of a gene named Muscleblind, which encodes an RNA binding protein, gave rise to the same muscular atrophy phenotype as found in humans (Kanadia et al., 2003). These studies show the relevance of splicing regulators in human disease.
Broadly speaking, alternative splicing may be widely associated with human diseases, such as cancer, either directly contributing to a specific disease phenotype or indirectly accompanying the progression of a disease. A recent computational analysis suggests that many mRNA isoforms may be specifically induced during cellular transformation (Modrek and Lee, 2002). Therefore, understanding the mechanism and regulation of alternative splicing is fundamental to disease research.
4. Coupling of Splicing with Other Nuclear Events
RNA splicing is not an isolated event in the nucleus; rather, it is orchestrated with upstream transcriptional activities and downstream nuclear export steps. Understanding these integrated mechanisms will help to uncover regulated gene expression networks in higher eukaryotic cells. This topic has been recently reviewed (Maniatis and Reed, 2002).
4.1. Transcription-splicing integration
EM visualization of co-transcriptional processing of pre-mRNA in spread chromosomes was the first to show the temporal integration of transcription and splicing (Beyer and Osheim, 1988; Beyer and Osheim, 1991). RT-PCR analysis of mRNAs associated with dissected chromosomes or those released into the nucleoplasm was further used to prove the case (Bauren and Wieslander, 1994). Results showed that upstream introns were removed before the completion of transcription (chromosome-associated) and downstream introns could be spliced either before or after the completion of transcription (only a fraction of spliced mRNA was chromosome-associated). Furthermore, in situ hybridization using specific exon-exon junction probes revealed the co-localization of spliced RNA with nascent transcripts at the gene locus in the nucleus (Zhang et al., 1994). Together, these observations strongly support the idea that splicing is temporally paired with transcription in the nucleus, keeping in mind that some splicing events may be initiated during transcription but not completed until after the release of the transcript into the nucleoplasm.
Promoter use and transcription elongation have been shown to have some impact on splice site selection, further supporting the integration of transcription and splicing in
the nucleus. In transfected cells, it was first recognized that alternative splicing of certain genes depended on the choice of the alternative promoters used to drive the expression of the reporter gene (Cramer et al., 1999; Cramer et al., 2001). The idea (or the authors’ explanation) is that different promoters may drive gene expression through distinct mechanisms and thus use different sets of transcription factors. These factors may result in differential recruitment of splicing factors responsible for the recognition of the emerging splice sites during transcription. One can further imagine that the transcription elongation rate may play a role in splice site selection. A slow polymerase would give an emerging weak splice site more time to be recognized and paired with the upstream splice site, forming a committed splicing complex before the appearance of a stronger downstream splice site. With a fast polymerase, however, the downstream splice site emerges sooner, leaving the weaker upstream splice site little chance to be recognized before the stronger downstream splice site appears. According to this model, alternative splicing events may be modulated by the elongation rate of the Pol II complex. This was recently shown to be the case when a slower Pol II (due to a point mutation in the polymerase) was compared with wt Pol II (Kadener et al., 2002). It is important to note that the models for differential promoter loading and variations in Pol II kinetics may not be mutually exclusive. Transcription elongation rates may directly depend on the promoter used and the co-factors recruited.
Factors involved in the integration of transcription and splicing remain to be characterized. SR proteins have been found to be able to interact with the C-terminal domain (CTD) of Pol II in a phosphorylation-dependent manner (Misteli and Spector,
1999). Interestingly, phosphorylated Pol II was able to stimulate splicing in vitro, even though it is not an essential splicing factor (Hirose et al., 1999). A number of other RNA processing factors are also associated with the CTD of Pol II, including enzymes involved in capping and polyadenylation, suggesting that RNA processing events ranging from capping to intron removal to polyadenylation are largely co-transcriptional in vivo (Proudfoot, 2000; Proudfoot et al., 2002). The CTD may be a crucial docking site for many different processing factors. However, other factors functioning in transcription elongation, such as protein kinases and phosphatases, may dictate the affinity of various processing factors for the CTD. At this point, most mechanisms proposed for the integration of transcription and splicing remain largely speculative and await further experimental support.
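The elongation rate argument in this section can be made concrete with a small calculation. The sketch below is a deliberately simple model and not one taken from the cited work: it assumes that recognition of the weak site is a first-order process with a constant rate, and that the weak site has exclusive access to the splicing machinery only until the stronger downstream site has been transcribed. All parameter names and numbers are hypothetical.

```python
import math

def weak_site_commitment(distance_nt, elongation_nt_per_s, recognition_rate_per_s):
    """Probability that a weak site is committed before the strong site emerges.

    distance_nt            -- nucleotides between the weak and the strong site
    elongation_nt_per_s    -- Pol II elongation rate (assumed constant)
    recognition_rate_per_s -- assumed first-order rate of weak-site recognition
    """
    window_s = distance_nt / elongation_nt_per_s  # time before the strong site appears
    return 1.0 - math.exp(-recognition_rate_per_s * window_s)

if __name__ == "__main__":
    # Illustrative numbers only: a fast and a five-fold slower polymerase.
    for label, rate in (("fast Pol II", 50.0), ("slow Pol II", 10.0)):
        p = weak_site_commitment(1000, rate, 0.02)
        print(f"{label}: P(weak site committed) = {p:.2f}")
```

Under these assumptions the slower polymerase always favors use of the weak upstream site, which is the qualitative behavior the model in the text predicts. The same formula also shows why promoter effects and elongation effects need not be mutually exclusive: a promoter that recruits more splicing factors would raise the recognition rate, while one that alters elongation would change the time window.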
4.2. Coupling of splicing with nuclear export
It has long been observed that unprocessed pre-mRNA cannot be exported out of the nucleus, suggesting the existence of an RNA quality control mechanism. Initially, it was thought that the RNA processing machinery might be localized near the nuclear pore complex to spatially and temporally integrate splicing with RNA export. However, this is clearly not the case. Instead, it has been found that transcription factors are able to recruit splicing factors, which then recruit export factors (Reed, 2003). As shown in Fig. 10, the transcription export complex (TREX), which is part of the transcription elongation complex, appears to directly interact with specific RNA binding proteins, such as UAP56, which plays an important role in mediating U2 binding to the branchpoint sequence during spliceosome assembly. As part of the spliceosome, UAP56 then recruits
a small protein called Aly, which is a key component of the complex deposited onto every exon-exon junction (known as the EJC complex, see below) after splicing. Aly then directly interacts with TAP, a critical mediator involved in RNA export, which in turn interacts with the nuclear pore complex. Through this cascade of protein-protein interactions, transcription is apparently integrated with splicing and then with export. Other RNA binding proteins, such as the cap binding protein CBP80/20 and SR proteins, also seem to contribute to RNA export (Huang et al., 2003). A subset of SR proteins has been shown to be able to shuttle between the nucleus and the cytoplasm, suggesting a role for shuttling SR proteins during RNA export (Caceres et al., 1998). In addition, shuttling SR proteins may also play a role in translation (Sanford et al., 2004) (see below).
4.3. Integration of splicing and regulation of RNA stability
A key complex involved in various integration events is the so-called EJC complex, which consists of a group of proteins deposited onto the exon-exon junction after a splicing reaction (Le Hir et al., 2000). The EJC complex connects RNA splicing to RNA export as described above. The EJC complex also plays a role in non-sense mediated RNA decay (NMD) (Le Hir et al., 2001). This is accomplished by the recruitment of Upf proteins, which in turn recruit the decapping enzyme Dcp1. RNA decapping is one of the major pathways for regulated RNA degradation in eukaryotic cells (Hilleren and Parker, 1999).
Two positional rules have come to light in recent years regarding the influence of upstream RNA processing events on RNA stability. As illustrated in Fig. 11, the first
positional rule states that activation of the NMD pathway is triggered by a premature stop codon 50 to 55 nt upstream of an exon-exon junction. The second rule states that the EJC is deposited on exon sequences 20 to 24 nt upstream of each exon-exon junction as a result of the splicing reaction. These rules provide critical insights into the impact of RNA splicing on the regulation of RNA stability in the cytoplasm. As illustrated in Fig. 12, during or right after nuclear export of spliced mRNA to the cytoplasm, the first round of translation (pilot round) rearranges protein complexes deposited onto the mRNA. Most natural stop codons reside in the last exon; therefore, during the pilot round, the scanning ribosomes strip off all EJCs before reaching the natural stop codon. However, in the presence of a premature stop codon, the scanning ribosomes fall off the mRNA before stripping off all EJC complexes. The remaining EJC complex then recruits Upfs, which in turn recruit the decapping enzyme, leading to the down-regulation of the mRNA. It should be noted that additional RNA binding proteins in the cytoplasm might also add to the regulation of RNA stability (Chen et al., 2001; Gherzi et al., 2004). Therefore, individual RNA molecules have distinct half-lives in a given cell under a given experimental condition.
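The two positional rules translate directly into a small predictor. The sketch below is an illustration under explicit assumptions: positions are 0-based coordinates on the spliced mRNA, a junction is given as the index of the first nucleotide of the downstream exon, and the lower bound of the 50 to 55 nt range is used as a default cutoff. The function names and the example transcript are hypothetical.

```python
def ejc_positions(junctions, offset=(20, 24)):
    """Return the (start, end) interval upstream of each junction where an EJC sits.

    Implements the second rule: EJCs are deposited 20 to 24 nt upstream of
    each exon-exon junction.
    """
    near, far = offset
    return [(j - far, j - near) for j in junctions]

def predicts_nmd(stop_codon_end, junctions, threshold=50):
    """Apply the first rule: NMD is predicted if any exon-exon junction lies
    at least `threshold` nt downstream of the stop codon.

    The cutoff of 50 nt is an assumption; the text gives a range of 50 to 55 nt.
    """
    return any(j - stop_codon_end >= threshold for j in junctions)

if __name__ == "__main__":
    junctions = [300, 600, 900]             # made-up transcript with four exons
    print(ejc_positions(junctions))
    print(predicts_nmd(1100, junctions))    # stop codon in the last exon
    print(predicts_nmd(400, junctions))     # premature stop codon in the second exon
```

In this picture, a stop codon in the last exon leaves no EJC on the mRNA after the pilot round, whereas a premature stop codon leaves every EJC downstream of it in place to recruit the Upf proteins.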
4.4. Connecting nuclear processing to protein translation in the cytoplasm
Proteins associated with mRNA during nuclear export may also play a role in protein translation. This was recently found to be true with shuttling SR proteins such as ASF/SF2 (Sanford et al., 2004). Exported mRNA carrying this splicing factor appears to be more competent in translation, although the mechanism remains to be understood.
Shuttling nuclear RNA processing factors are able to move back and forth between the nucleus and the cytoplasm. This export/import cycle may be regulated by phosphorylation. As previously mentioned, SR proteins need to be phosphorylated to initiate spliceosome assembly and dephosphorylated for spliceosome resolution. For shuttling SR proteins, dephosphorylation is also essential for interaction with the export factor TAP (Huang et al., 2003; Lai and Tarn, 2004). Once shuttling SR proteins reach the cytoplasm, they are phosphorylated by a family of SR protein-specific kinases, the SRPKs (Yun and Fu, 2000). This is an essential step to enable SR proteins to interact with their nuclear import receptor to enter the nucleus (Lai et al., 2001; Yun et al., 2003). Recently, it has been proposed that SRPK-mediated phosphorylation might have a dual effect: (1) phosphorylation may facilitate the release of shuttling SR proteins from exported mRNA and (2) phosphorylation may then promote re-entry of the released SR proteins (Gilbert and Guthrie, 2004). Thus, nuclear export of mRNA may be regulated by a nucleus-localized phosphatase and a cytoplasm-localized kinase system. This is in contrast to other export pathways, which require the RanGTP gradient created by the nucleus-localized RCC1 (a Ran exchange factor that generates RanGTP) and a cytoplasm-localized RanGAP (a Ran GTPase-activating protein that catalyzes GTP hydrolysis to produce RanGDP) (Weis, 2002).
5. Genomics of RNA Splicing
Like everything else, RNA splicing has become global in the post-genome era. The term “-omics” is used to describe the whole collection of a group, such as proteomics (for proteins), kinomics (for kinases), etc. As discussed earlier, alternatively spliced
mRNA isoforms are widespread in higher eukaryotic cells. This presents new opportunities and challenges in understanding the new dimension of gene expression. In the last section of the chapter, features of what one may refer to as RNAomics are briefly depicted.
5.1. Scale of alternative splicing in eukaryotic genomes
As described earlier, alternative splicing now appears to be the rule rather than the exception. About half of the genes in human and mouse express multiple mRNA isoforms. Some of these isoforms may be regulated and thus functionally important, whereas others may reflect splicing errors or products generated during a diseased state. Individual mRNA isoforms from a given gene may encode distinct protein products, be differentially localized in cells or embryos, or have unique half-lives. Therefore, alternative splicing may contribute to the complexity and diversity of the proteome in eukaryotic cells and provide points of differential gene expression regulation. It should be stressed that the functional importance of a given mRNA isoform may not correlate with its abundance. This points to a specific problem in the field, because most studies focus on the most abundant isoform for functional dissection.
5.2. Features of alternatively spliced regions
Analysis of alternatively spliced genomic sequences reveals both expected and unexpected features. Generally speaking, splice sites involved in alternative splicing are weaker (more divergent from the consensus sequence) than constitutive splice sites. Weaker splice sites may allow the splicing machinery to regulate and control
their usage. Alternatively spliced exons are shorter than constitutive exons, suggesting that short exons may be subject to alternative splicing because of a reduced ESE frequency. Interestingly, skipped exons maintain the reading frame (length is a multiple of 3 nt) of a transcript more frequently than constitutive ones. This property may allow the compartmentalization of protein domains. Surprisingly, sequences surrounding alternative splice sites (on both the exon and the intron sides) seem to be more conserved across different species than those of constitutive exons (Sorek and Ast, 2003). This finding argues against the assumption that weak splice sites are sites which have not yet fully evolved. Instead, it suggests that these alternative splicing signals (weaker splice sites) are purposely preserved during evolution, and thus are functionally important. These sequence features form a basis for the development of ab initio computational tools to predict alternative splicing.
The evolution of alternative splicing is an interesting and open research area. It is postulated that mutually exclusive exons resulted from exon duplication events, which would explain the sequence similarities between pairs of mutually exclusive exons. A fraction of skipped exons, on the other hand, appears to have evolved from intronic sequences carrying Alu elements (“exonization” of Alu elements) (Sorek et al., 2002; Lev-Maor et al., 2003). A more recent analysis indicates that the “exonization” of multiple types of transposable elements (not just SINE elements) gives rise to skipped exons (Zheng et al., 2005).
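The reading frame property mentioned in this section is easy to check computationally: skipping an exon leaves the downstream frame intact only when the exon length is a multiple of 3 nt. The following is a minimal sketch; the function names and exon lengths are made up for illustration.

```python
def frame_shift_on_skip(exon_lengths, skipped_index):
    """Return the frame shift (0, 1, or 2 nt) caused by skipping one exon.

    A return value of 0 means the downstream reading frame is preserved.
    """
    return exon_lengths[skipped_index] % 3

def frame_preserving_fraction(exon_lengths):
    """Return the fraction of exons whose length is a multiple of 3 nt."""
    if not exon_lengths:
        return 0.0
    preserved = sum(1 for length in exon_lengths if length % 3 == 0)
    return preserved / len(exon_lengths)

if __name__ == "__main__":
    exons = [120, 87, 200]  # hypothetical exon lengths in nt
    print(frame_shift_on_skip(exons, 1))
    print(frame_preserving_fraction([87, 88, 90, 91]))
```

If exon lengths were random with respect to frame, roughly one third of exons would be expected to be frame-preserving, so a clearly higher fraction among skipped exons than among constitutive exons is the kind of signal that ab initio prediction tools can exploit.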
5.3. Development of splicing arrays
Given the abundance of alternative splicing in higher eukaryotic cells, robust isoform-sensitive microarray systems are expected to aid in the understanding of regulated splicing in development and disease. Three basic microarray platforms have been developed. One platform uses short oligonucleotides (40-mer) to detect individual exon-exon junctions (Clark et al., 2002). This strategy has been used to fabricate a high-density array, targeting exon-exon (mostly constitutive exons) junctions in ~10,000 human transcripts (Johnson et al., 2003). More recently, this strategy has been used to interrogate several thousand skipped exons in the mouse (Pan et al., 2004). One major concern with the use of exon-exon probes is the so-called “half-hybridization” phenomenon, where half of each exon-exon probe will hybridize to all competing isoforms containing a common donor or acceptor.
The second splicing array platform, the “all-exon” array, is under development at Affymetrix. In constructing this array, all potential exons were identified with the help of several gene prediction programs. This strategy requires the design of four oligonucleotides for each exon. Potential alternative portions of an exon are considered separate exons or exonic regions; therefore, four oligonucleotides are also designed for each such portion. An advantage of this “all-exon” array approach is the ability to conduct an unbiased genomic search for regulated splicing. However, an obvious disadvantage is the lack of exon-exon linkage information.
The third platform is based on a molecular barcode strategy (Yeakley et al., 2002). In this approach, a total of three oligonucleotides are used to detect two alternative isoforms. One oligonucleotide targets the common exon site (donor or acceptor) of an alternative event and is linked to a universal primer-landing site. Two
additional oligonucleotides are separately synthesized to target the alternative exonic regions. These two oligonucleotides are each linked to a unique 20-mer sequence (called a zipcode) followed by another universal primer-landing site. The zipcodes are printed on a universal array, which is later used to detect different isoforms. The splicing profile experiment begins with an RNA annealing reaction in which total RNA is mixed with pooled oligonucleotides and biotinylated oligo-dT under denaturing and annealing conditions. Annealed oligonucleotides are selected on streptavidin beads through the biotinylated oligo-dT (solid selection phase). During the solid selection phase, the beads capture all polyA mRNAs along with annealed oligonucleotides. Free oligonucleotides are washed away and T4 DNA ligase is then used to ligate adjacent oligonucleotides bridged by the mRNAs. This process converts half amplicons to full amplicons. Only ligated oligonucleotides can be amplified by PCR via the pair of universal primers. One of the primers is end-labeled with a fluorescent dye. Thus, the PCR products can be directly applied to the universal zipcode array. Each zipcode reports one mRNA isoform according to the oligonucleotide design. This approach has aided in the discovery of targets for specific splicing factors, alternative splicing events regulated in various signal transduction pathways, and tumor-specific mRNA isoform biomarkers (Li et al., 2005).
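The readout logic of the molecular barcode strategy can be summarized in a short simulation. This is a conceptual sketch only: transcripts are reduced to ordered exon names, ligation is assumed to occur whenever the common exon is directly followed by a targeted alternative exon, and annealing, capture, and amplification efficiencies are ignored. The zipcode labels, exon names, and function name are hypothetical.

```python
def zipcode_signals(transcripts, common_exon, alt_targets):
    """Count ligation events per zipcode.

    transcripts -- iterable of tuples of exon names in 5' to 3' order
    common_exon -- exon targeted by the shared oligonucleotide
    alt_targets -- dict mapping zipcode -> exon targeted by the alternative
                   oligonucleotide carrying that zipcode

    A zipcode scores only when its exon sits immediately downstream of the
    common exon, so that the two oligonucleotides are adjacent on the mRNA
    and can be joined by the ligase.
    """
    signals = {zipcode: 0 for zipcode in alt_targets}
    for exons in transcripts:
        for upstream, downstream in zip(exons, exons[1:]):
            if upstream != common_exon:
                continue
            for zipcode, alt_exon in alt_targets.items():
                if downstream == alt_exon:
                    signals[zipcode] += 1
    return signals

if __name__ == "__main__":
    # Made-up population: three transcripts include exon E2, one skips it.
    pool = [("E1", "E2", "E3")] * 3 + [("E1", "E3")]
    print(zipcode_signals(pool, "E1", {"ZIP_A": "E2", "ZIP_B": "E3"}))
```

Because each zipcode reports a specific exon-exon linkage, the ratio of the two signals reflects the isoform ratio, which is the information that the “all-exon” design described above cannot provide.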
5.4. Contributions of splicing to biology: Opportunities and challenges
Judged by the degree to which we understand the basic mechanisms and various regulatory strategies for alternative splicing, research in this post-transcriptional step of gene expression is in its infancy. For example, we know little about the biological functions of a majority of isoforms and how alternative splicing may be regulated in
development and disease. Understanding of the integration of splicing with various upstream and downstream events is still in its preliminary stages. As many are chasing the fundamental questions in this field, the mysterious world of microRNAs has recently surfaced. In light of these recent advances, this chapter may better be viewed as a call for the next generation of scientists to pursue a career in RNA research rather than a summation of what has been accomplished thus far in the world of RNA.
Table 1. RNA binding proteins involved in the regulation of alternative splicing
Figure legends
Figure 1. Consensus splicing signals. Shown are consensus splicing signals for the major class of introns. ESE: exonic splicing enhancer; ESS: exonic splicing silencer.
Figure 2. Chemical steps in pre-mRNA splicing.
Figure 3. The spliceosome assembly pathway for the removal of the major class of introns. The spliceosome assembly pathway for minor introns is slightly different: U1 is replaced by U11; U2 is replaced by U12; U4/6 is replaced by U4atac/6atac. Furthermore, U11 and U12 may jointly recognize the 5’ and 3’ splice sites during spliceosome assembly.
Figure 4. Catalytic core for RNA splicing. A. Binding of U1 to the 5’ splice site and U2 to the branchpoint via base-pairing with the conserved splicing signals. B. Intra-snRNP
base-pairing between U4 and U6. C. Deduced RNA-RNA interactions in the spliceosome. D. The catalytic core of a self-splicing group II intron. The core is structurally related to the RNA network within the spliceosome, which provides strong support for a related RNA-based chemical mechanism in pre-mRNA splicing.
Figure 5. Modes of alternative splicing.
Figure 6. The sex determination pathway in Drosophila. Fly sex phenotype is determined by the X chromosome to Autosome ratio (2:2 for female and 1:2 for male). This ratio dictates Sxl expression in early development. The female phenotype is maintained by a cascade of regulated splicing. Intronic regulatory elements regulate Sxl and Tra splicing whereas exonic regulatory elements regulate the splicing of Dsx.
Figure 7. Positive and negative regulation of splicing via exonic regulatory elements. SR proteins bind to ESEs thereby promoting the binding of U1 to the downstream 5’ splice site and recruiting U2 to the upstream 3’ splice site. HnRNP proteins antagonize the action of SR proteins by binding to ESSs in this process. Protein interactions across the exon are part of the early exon definition process. After that, protein interactions are established across the intron to initiate spliceosome assembly.
Figure 8. Positive and negative regulation of splicing via intronic regulatory elements: Example of PTB-regulated alternative splicing of c-src. PTB interacts with both up- and downstream intronic elements (ISS) in non-neuronal cells. Within neurons, nPTB may
be responsible for the reorganization of the suppression complex (an ATP-dependent process) therefore allowing the inclusion of the N exon. Other factors bound to downstream intronic elements (ISE) also play a crucial role in the inclusion of the N exon in neuronal cells.
Figure 9. Removal of long introns by recursive splicing. The 5’ splice site is spliced to a downstream 3’ splice site within the long intron. After the splicing reaction, a 5’ splice site is reconstituted and ready to pair with subsequent downstream 3’ splice site(s). Recursive splicing has been reported in the Ubx gene in Drosophila. Recursive splicing has not yet been reported in mammalian systems.
Figure 10. Integration of splicing with RNA export. The TREX complex plays a role in the co-transcriptional recruitment of UAP56 to pre-mRNA. During splicing, UAP56 recruits Aly, which then becomes a component of EJC. Aly directly interacts with TAP to initiate RNA export. TAP may also be independently recruited by dephosphorylated SR proteins associated with spliced mRNAs as described in the text.
Figure 11. The two positional rules for NMD. A. The rule for NMD. A premature stop codon 50 to 55 nt upstream of an exon-exon junction triggers NMD. B. The rule for EJC deposition. An EJC complex is deposited 20 to 24 nt upstream of an exon-exon junction. The interaction of the EJC with exonic sequences is position-dependent and sequence-independent.
Figure 12. Integration of splicing with RNA stability. EJCs, which are deposited onto exon-exon junctions after splicing, play an important role in RNA export. Right after export, ribosomes engage in the first round of translation. During this first round, scanning ribosomes remove EJCs from the mRNA. However, if a premature stop codon is present, all subsequent EJCs will remain on the mRNA and recruit the Upf proteins to initiate RNA degradation in the cytoplasm. Other sequence-specific RNA binding proteins may interact with mRNA in the cytoplasm to regulate RNA stability independently of EJCs, as described in the text.
References
Abovich, N., and Rosbash, M. (1997). Cross-intron bridging interactions in the yeast commitment complex are conserved in mammals. Cell 89, 403-412.
Ares, M., Jr., and Weiser, B. (1995). Rearrangement of snRNA structure during assembly and function of the spliceosome. Prog Nucleic Acid Res Mol Biol 50, 131-159.
Baker, B. S. (1989). Sex in flies: the splice of life. Nature 340, 521-524.
Bauren, G., and Wieslander, L. (1994). Splicing of Balbiani ring 1 gene pre-mRNA occurs simultaneously with transcription. Cell 76, 183-192.
Berget, S. M. (1995). Exon recognition in vertebrate splicing. J Biol Chem 270, 2411-2414.
Berglund, J. A., Abovich, N., and Rosbash, M. (1998). A cooperative interaction between U2AF65 and mBBP/SF1 facilitates branchpoint region recognition. Genes Dev 12, 858-867.
Beyer, A. L., and Osheim, Y. N. (1988). Splice site selection, rate of splicing, and alternative splicing on nascent transcripts. Genes Dev 2, 754-765.
Beyer, A. L., and Osheim, Y. N. (1991). Visualization of RNA transcription and processing. Semin Cell Biol 2, 131-140.
Black, D. L. (2000). Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology. Cell 103, 367-370.
Black, D. L. (2003). Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem 72, 291-336.
Burge, C. B., Padgett, R. A., and Sharp, P. A. (1998). Evolutionary fates and origins of U12-type introns. Mol Cell 2, 773-785.
Caceres, J. F., Screaton, G. R., and Krainer, A. R. (1998). A specific subset of SR proteins shuttles continuously between the nucleus and the cytoplasm. Genes Dev 12, 55-66.
Cartegni, L., Chew, S. L., and Krainer, A. R. (2002). Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nat Rev Genet 3, 285-298.
Cech, T. R. (1985). Self-splicing RNA: implications for evolution. Int Rev Cytol 93, 3-22.
Chan, R. C., and Black, D. L. (1997). The polypyrimidine tract binding protein binds upstream of neural cell-specific c-src exon N1 to repress the splicing of the intron downstream. Mol Cell Biol 17, 4667-4676.
Chen, C. Y., Gherzi, R., Ong, S. E., Chan, E. L., Raijmakers, R., Pruijn, G. J., Stoecklin, G., Moroni, C., Mann, M., and Karin, M. (2001). AU binding proteins recruit the exosome to degrade ARE-containing mRNAs. Cell 107, 451-464.
Clark, T. A., Sugnet, C. W., and Ares, M., Jr. (2002). Genomewide analysis of mRNA processing in yeast using splicing-specific microarrays. Science 296, 907-910.
Colwill, K., Feng, L. L., Yeakley, J. M., Gish, G. D., Caceres, J. F., Pawson, T., and Fu, X. D. (1996a). SRPK1 and Clk/Sty protein kinases show distinct substrate specificities for serine/arginine-rich splicing factors. J Biol Chem 271, 24569-24575.
Colwill, K., Pawson, T., Andrews, B., Prasad, J., Manley, J. L., Bell, J. C., and Duncan, P. I. (1996b). The Clk/Sty protein kinase phosphorylates SR splicing factors and regulates their intranuclear distribution. EMBO J 15, 265-275.
Cramer, P., Caceres, J. F., Cazalla, D., Kadener, S., Muro, A. F., Baralle, F. E., and Kornblihtt, A. R. (1999). Coupling of transcription with alternative splicing: RNA pol II promoters modulate SF2/ASF and 9G8 effects on an exonic splicing enhancer. Mol Cell 4, 251-258.
Cramer, P., Srebrow, A., Kadener, S., Werbajh, S., de la Mata, M., Melen, G., Nogues, G., and Kornblihtt, A. R. (2001). Coordination between transcription and pre-mRNA processing. FEBS Lett 498, 179-182.
Cullen, B. R. (2004). Transcription and processing of human microRNA precursors. Mol Cell 16, 861-865.
De Moerlooze, L., Spencer-Dene, B., Revest, J., Hajihosseini, M., Rosewell, I., and Dickson, C. (2000). An important role for the IIIb isoform of fibroblast growth factor receptor 2 (FGFR2) in mesenchymal-epithelial signalling during mouse organogenesis. Development 127, 483-492. Decatur, W. A., and Fournier, M. J. (2003). RNA-guided nucleotide modification of ribosomal and other RNAs. J Biol Chem 278, 695-698. Deutscher, M. P. (1984). Processing of tRNA in prokaryotes and eukaryotes. CRC Crit Rev Biochem 17, 45-71. Ding, J. H., Xu, X., Yang, D., Chu, P. H., Dalton, N. D., Ye, Z., Yeakley, J. M., Cheng, H., Xiao, R. P., Ross, J., et al. (2004). Dilated cardiomyopathy caused by tissue-specific ablation of SC35 in the heart. Embo J 23, 885-896. Dredge, B. K., and Darnell, R. B. (2003). Nova regulates GABA(A) receptor gamma2 alternative splicing via a distal downstream UCAU-rich intronic splicing enhancer. Mol Cell Biol 23, 4687-4700. Faustino, N. A., and Cooper, T. A. (2003). Pre-mRNA splicing and human disease. Genes Dev 17, 419-437. Fleckner, J., Zhang, M., Valcarcel, J., and Green, M. R. (1997). U2AF65 recruits a novel human DEAD box protein required for the U2 snRNP-branchpoint interaction. Genes Dev 11, 1864-1872. Fu, X. D. (1993). Specific commitment of different pre-mRNAs to splicing by single SR proteins. Nature 365, 82-85. Fu, X. D. (1995). The superfamily of arginine/serine-rich splicing factors. Rna 1, 663-680. Fu, X. D. (2004). Towards a splicing code. Cell 119, 736-738. Fu, X. D., and Maniatis, T. (1992). The 35-kDa mammalian splicing factor SC35 mediates specific interactions between U1 and U2 small nuclear ribonucleoprotein particles at the 3' splice site. Proc Natl Acad Sci U S A 89, 1725-1729. Garcia-Blanco, M. A., Jamison, S. F., and Sharp, P. A. (1989). Identification and purification of a 62,000-dalton protein that binds specifically to the polypyrimidine tract of introns. Genes Dev 3, 1874-1886. Gherzi, R., Lee, K.
Y., Briata, P., Wegmuller, D., Moroni, C., Karin, M., and Chen, C. Y. (2004). A KH domain RNA binding protein, KSRP, promotes ARE-directed mRNA turnover by recruiting the degradation machinery. Mol Cell 14, 571-583.
Gilbert, W., and Guthrie, C. (2004). The Glc7p nuclear phosphatase promotes mRNA export by facilitating association of Mex67p with mRNA. Mol Cell 13, 201-212. Gottschalk, A., Neubauer, G., Banroques, J., Mann, M., Luhrmann, R., and Fabrizio, P. (1999). Identification by mass spectrometry and functional analysis of novel proteins of the yeast [U4/U6.U5] tri-snRNP. Embo J 18, 4535-4548. Graveley, B. R. (2000). Sorting out the complexity of SR protein functions. Rna 6, 1197-1211. Gui, J. F., Lane, W. S., and Fu, X. D. (1994a). A serine kinase regulates intracellular localization of splicing factors in the cell cycle. Nature 369, 678-682. Gui, J. F., Tronchere, H., Chandler, S. D., and Fu, X. D. (1994b). Purification and characterization of a kinase specific for the serine- and arginine-rich pre-mRNA splicing factors. Proc Natl Acad Sci U S A 91, 10824-10828. Guthrie, C. (1991). Messenger RNA splicing in yeast: clues to why the spliceosome is a ribonucleoprotein. Science 253, 157-163. Guthrie, C. (1994). The spliceosome is a dynamic ribonucleoprotein machine. Harvey Lect 90, 59-80. Hajihosseini, M. K., Wilson, S., De Moerlooze, L., and Dickson, C. (2001). A splicing switch and gain-of-function mutation in FgfR2-IIIc hemizygotes causes Apert/Pfeiffer-syndrome-like phenotypes. Proc Natl Acad Sci U S A 98, 3855-3860. Hammes, A., Guo, J. K., Lutsch, G., Leheste, J. R., Landrock, D., Ziegler, U., Gubler, M. C., and Schedl, A. (2001). Two splice variants of the Wilms' tumor 1 gene have distinct functions during sex determination and nephron formation. Cell 106, 319-329. Hartmann, C., Corre-Menguy, F., Boualem, A., Jovanovic, M., and Lelandais-Briere, C. (2004). [MicroRNAs: a new class of gene expression regulators]. Med Sci (Paris) 20, 894-898. Hilleren, P., and Parker, R. (1999). Mechanisms of mRNA surveillance in eukaryotes. Annu Rev Genet 33, 229-260. Hims, M. M., Diager, S. P., and Inglehearn, C. F. (2003). Retinitis pigmentosa: genes, proteins and prospects.
Dev Ophthalmol 37, 109-125. Hirose, Y., Tacke, R., and Manley, J. L. (1999). Phosphorylated RNA polymerase II stimulates pre-mRNA splicing. Genes Dev 13, 1234-1239.
Hoffman, B. E., and Grabowski, P. J. (1992). U1 snRNP targets an essential splicing factor, U2AF65, to the 3' splice site by a network of interactions spanning the exon. Genes Dev 6, 2554-2568. Huang, Y., Gattoni, R., Stevenin, J., and Steitz, J. A. (2003). SR splicing factors serve as adapter proteins for TAP-dependent mRNA export. Mol Cell 11, 837-843. Jensen, K. B., Dredge, B. K., Stefani, G., Zhong, R., Buckanovich, R. J., Okano, H. J., Yang, Y. Y., and Darnell, R. B. (2000a). Nova-1 regulates neuron-specific alternative splicing and is essential for neuronal viability. Neuron 25, 359-371. Jensen, K. B., Musunuru, K., Lewis, H. A., Burley, S. K., and Darnell, R. B. (2000b). The tetranucleotide UCAY directs the specific recognition of RNA by the Nova K-homology 3 domain. Proc Natl Acad Sci U S A 97, 5740-5745. Johnson, J. M., Castle, J., Garrett-Engele, P., Kan, Z., Loerch, P. M., Armour, C. D., Santos, R., Schadt, E. E., Stoughton, R., and Shoemaker, D. D. (2003). Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 302, 2141-2144. Jumaa, H., Wei, G., and Nielsen, P. J. (1999). Blastocyst formation is blocked in mouse embryos lacking the splicing factor SRp20. Curr Biol 9, 899-902. Kadener, S., Fededa, J. P., Rosbash, M., and Kornblihtt, A. R. (2002). Regulation of alternative splicing by a transcriptional enhancer through RNA pol II elongation. Proc Natl Acad Sci U S A 99, 8185-8190. Kanadia, R. N., Johnstone, K. A., Mankodi, A., Lungu, C., Thornton, C. A., Esson, D., Timmers, A. M., Hauswirth, W. W., and Swanson, M. S. (2003). A muscleblind knockout model for myotonic dystrophy. Science 302, 1978-1980. Kapranov, P., Cawley, S. E., Drenkow, J., Bekiranov, S., Strausberg, R. L., Fodor, S. P., and Gingeras, T. R. (2002). Large-scale transcriptional activity in chromosomes 21 and 22. Science 296, 916-919. Keller, W., and Minvielle-Sebastia, L. (1997). A comparison of mammalian and yeast pre-mRNA 3'-end processing.
Curr Opin Cell Biol 9, 329-336. Kirsebom, L. A. (2002). RNase P RNA-mediated catalysis. Biochem Soc Trans 30, 1153-1158. Kistler, A. L., and Guthrie, C. (2001). Deletion of MUD2, the yeast homolog of U2AF65, can bypass the requirement for sub2, an essential spliceosomal ATPase. Genes Dev 15, 42-49.
Kramer, A. (1996). The structure and function of proteins involved in mammalian pre-mRNA splicing. Annu Rev Biochem 65, 367-409. Krawczak, M., Reiss, J., and Cooper, D. N. (1992). The mutational spectrum of single base-pair substitutions in mRNA splice junctions of human genes: causes and consequences. Hum Genet 90, 41-54. Lafontaine, D., and Tollervey, D. (1995). Trans-acting factors in yeast pre-rRNA and pre-snoRNA processing. Biochem Cell Biol 73, 803-812. Lai, M. C., Lin, R. I., and Tarn, W. Y. (2001). Transportin-SR2 mediates nuclear import of phosphorylated SR proteins. Proc Natl Acad Sci U S A 98, 10154-10159. Lai, M. C., and Tarn, W. Y. (2004). Hypophosphorylated ASF/SF2 binds TAP and is present in messenger ribonucleoproteins. J Biol Chem 279, 31745-31749. Le Hir, H., Gatfield, D., Izaurralde, E., and Moore, M. J. (2001). The exon-exon junction complex provides a binding platform for factors involved in mRNA export and nonsense-mediated mRNA decay. Embo J 20, 4987-4997. Le Hir, H., Moore, M. J., and Maquat, L. E. (2000). Pre-mRNA splicing alters mRNP composition: evidence for stable association of proteins at exon-exon junctions. Genes Dev 14, 1098-1108. Lee, Y., Kim, M., Han, J., Yeom, K. H., Lee, S., Baek, S. H., and Kim, V. N. (2004). MicroRNA genes are transcribed by RNA polymerase II. Embo J 23, 4051-4060. Lefebvre, S., Burlet, P., Liu, Q., Bertrandy, S., Clermont, O., Munnich, A., Dreyfuss, G., and Melki, J. (1997). Correlation between severity and SMN protein level in spinal muscular atrophy. Nat Genet 16, 265-269. Lev-Maor, G., Sorek, R., Shomron, N., and Ast, G. (2003). The birth of an alternatively spliced exon: 3' splice-site selection in Alu exons. Science 300, 1288-1291. Li, H-R., Yeakley, J.M., Nair, T.M., Kwon, Y.S., Bibikova, M., Zhou, L., Zheng, C., Downs, T., Wang-Rodriguez, J., Fu, X-D., and Fan, J-B. (2005). Profiling signature mRNA isoform by splicing array: Novel approach to cancer biomarker discovery. Submitted. Lopez, A. J. (1998).
Alternative splicing of pre-mRNA: developmental consequences and mechanisms of regulation. Annu Rev Genet 32, 279-305. Maniatis, T., and Reed, R. (2002). An extensive network of coupling among gene expression machines. Nature 416, 499-506.
Maniatis, T., and Tasic, B. (2002). Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature 418, 236-243. Martinis, S. A., Plateau, P., Cavarelli, J., and Florentz, C. (1999). Aminoacyl-tRNA synthetases: a new image for a classical family. Biochimie 81, 683-700. Mermoud, J. E., Cohen, P., and Lamond, A. I. (1992). Ser/Thr-specific protein phosphatases are required for both catalytic steps of pre-mRNA splicing. Nucleic Acids Res 20, 5263-5269. Mermoud, J. E., Cohen, P. T., and Lamond, A. I. (1994). Regulation of mammalian spliceosome assembly by a protein phosphorylation mechanism. Embo J 13, 5679-5688. Michaud, S., and Reed, R. (1993). A functional association between the 5' and 3' splice site is established in the earliest prespliceosome complex (E) in mammals. Genes Dev 7, 1008-1020. Minvielle-Sebastia, L., and Keller, W. (1999). mRNA polyadenylation and its coupling to other RNA processing reactions and to transcription. Curr Opin Cell Biol 11, 352-357. Misteli, T., Caceres, J. F., Clement, J. Q., Krainer, A. R., Wilkinson, M. F., and Spector, D. L. (1998). Serine phosphorylation of SR proteins is required for their recruitment to sites of transcription in vivo. J Cell Biol 143, 297-307. Misteli, T., and Spector, D. L. (1999). RNA polymerase II targets pre-mRNA splicing factors to transcription sites in vivo. Mol Cell 3, 697-705. Mochizuki, K., Fine, N. A., Fujisawa, T., and Gorovsky, M. A. (2002). Analysis of a piwi-related gene implicates small RNAs in genome rearrangement in tetrahymena. Cell 110, 689-699. Modafferi, E. F., and Black, D. L. (1999). Combinatorial control of a neuron-specific exon. Rna 5, 687-706. Modrek, B., and Lee, C. (2002). A genomic view of alternative splicing. Nat Genet 30, 13-19. Novina, C. D., and Sharp, P. A. (2004). The RNAi revolution. Nature 430, 161-164. Pan, Q., Shai, O., Misquitta, C., Zhang, W., Saltzman, A. L., Mohammad, N., Babak, T., Siu, H., Hughes, T. R., Morris, Q. D., et al. (2004). 
Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Mol Cell 16, 929-941.
Park, J. W., Parisky, K., Celotto, A. M., Reenan, R. A., and Graveley, B. R. (2004). Identification of alternative splicing regulators by RNA interference in Drosophila. Proc Natl Acad Sci U S A 101, 15974-15979. Patel, A. A., and Steitz, J. A. (2003). Splicing double: insights from the second spliceosome. Nat Rev Mol Cell Biol 4, 960-970. Pellizzoni, L., Kataoka, N., Charroux, B., and Dreyfuss, G. (1998). A novel function for SMN, the spinal muscular atrophy disease gene product, in pre-mRNA splicing. Cell 95, 615-624. Pillers, D. M., Fitzgerald, K. M., Duncan, N. M., Rash, S. M., White, R. A., Dwinnell, S. J., Powell, B. R., Schnur, R. E., Ray, P. N., Cibis, G. W., and Weleber, R. G. (1999). Duchenne/Becker muscular dystrophy: correlation of phenotype by electroretinography with sites of dystrophin mutations. Hum Genet 105, 2-9. Prasad, J., Colwill, K., Pawson, T., and Manley, J. L. (1999). The protein kinase Clk/Sty directly modulates SR protein activity: both hyper- and hypophosphorylation inhibit splicing. Mol Cell Biol 19, 6991-7000. Proudfoot, N. (2000). Connecting transcription to messenger RNA processing. Trends Biochem Sci 25, 290-293. Proudfoot, N. J., Furger, A., and Dye, M. J. (2002). Integrating mRNA processing with transcription. Cell 108, 501-512. Reed, R. (2003). Coupling transcription, splicing and mRNA export. Curr Opin Cell Biol 15, 326-331. Roscigno, R. F., and Garcia-Blanco, M. A. (1995). SR proteins escort the U4/U6.U5 tri-snRNP to the spliceosome. Rna 1, 692-706. Sanford, J. R., Gray, N. K., Beckmann, K., and Caceres, J. F. (2004). A novel role for shuttling SR proteins in mRNA translation. Genes Dev 18, 755-768. Schaal, T. D., and Maniatis, T. (1999). Multiple distinct splicing enhancers in the protein-coding sequences of a constitutively spliced pre-mRNA. Mol Cell Biol 19, 261-273. Schurer, H., Schiffer, S., Marchfelder, A., and Morl, M. (2001). This is the end: processing, editing and repair at the tRNA 3'-terminus.
Biol Chem 382, 1147-1156. Sharp, P. A. (1994). Split genes and RNA splicing. Cell 77, 805-815. Sharp, P. A., and Burge, C. B. (1997). Classification of introns: U2-type or U12-type. Cell 91, 875-879.
Shatkin, A. J., and Manley, J. L. (2000). The ends of the affair: capping and polyadenylation. Nat Struct Biol 7, 838-842. Shen, H., Kan, J. L., and Green, M. R. (2004). Arginine-serine-rich domains bound at splicing enhancers contact the branchpoint to promote prespliceosome assembly. Mol Cell 13, 367-376. Sorek, R., and Ast, G. (2003). Intronic sequences flanking alternatively spliced exons are conserved between human and mouse. Genome Res 13, 1631-1637. Sorek, R., Ast, G., and Graur, D. (2002). Alu-containing exons are alternatively spliced. Genome Res 12, 1060-1067. Staley, J. P., and Guthrie, C. (1998). Mechanical devices of the spliceosome: motors, clocks, springs, and things. Cell 92, 315-326. Stenson, P. D., Ball, E. V., Mort, M., Phillips, A. D., Shiel, J. A., Thomas, N. S., Abeysinghe, S., Krawczak, M., and Cooper, D. N. (2003). Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat 21, 577-581. Stevens, S. W. (2000). Analysis of low-abundance ribonucleoprotein particles from yeast by affinity chromatography and mass spectrometry microsequencing. Methods Enzymol 318, 385-398. Tarn, W. Y., and Steitz, J. A. (1997). Pre-mRNA splicing: the discovery of a new spliceosome doubles the challenge. Trends Biochem Sci 22, 132-137. Taverna, S. D., Coyne, R. S., and Allis, C. D. (2002). Methylation of histone h3 at lysine 9 targets programmed DNA elimination in tetrahymena. Cell 110, 701-711. Tazi, J., Kornstadt, U., Rossi, F., Jeanteur, P., Cathala, G., Brunel, C., and Luhrmann, R. (1993). Thiophosphorylation of U1-70K protein inhibits pre-mRNA splicing. Nature 363, 283-286. Ule, J., Jensen, K. B., Ruggiu, M., Mele, A., Ule, A., and Darnell, R. B. (2003). CLIP identifies Nova-regulated RNA networks in the brain. Science 302, 1212-1215. Valadkhan, S., and Manley, J. L. (2001). Splicing-related catalysis by protein-free snRNAs. Nature 413, 701-707. Valadkhan, S., and Manley, J. L. (2003). Characterization of the catalytic activity of U2 and U6 snRNAs. 
Rna 9, 892-904. Volpe, T., Schramke, V., Hamilton, G. L., White, S. A., Teng, G., Martienssen, R. A., and Allshire, R. C. (2003). RNA interference is required for normal centromere function in fission yeast. Chromosome Res 11, 137-146.
Volpe, T. A., Kidner, C., Hall, I. M., Teng, G., Grewal, S. I., and Martienssen, R. A. (2002). Regulation of heterochromatic silencing and histone H3 lysine-9 methylation by RNAi. Science 297, 1833-1837. Wang, H. Y., Xu, X., Ding, J. H., Bermingham, J. R., Jr., and Fu, X. D. (2001). SC35 plays a role in T cell development and alternative splicing of CD45. Mol Cell 7, 331-342. Weis, K. (2002). Nucleocytoplasmic transport: cargo trafficking across the border. Curr Opin Cell Biol 14, 328-335. Wu, J. Y., and Maniatis, T. (1993). Specific interactions between proteins implicated in splice site selection and regulated alternative splicing. Cell 75, 1061-1070. Wu, S., Romfo, C. M., Nilsen, T. W., and Green, M. R. (1999). Functional recognition of the 3' splice site AG by the splicing factor U2AF35. Nature 402, 832-835. Xiao, S. H., and Manley, J. L. (1998). Phosphorylation-dephosphorylation differentially affects activities of splicing factor ASF/SF2. Embo J 17, 6359-6367. Xu, X., Yang, D., Ding, J. H., Wang, W., Chu, P. H., Dalton, N. D., Wang, H. Y., Bermingham, J. R., Jr., Ye, Z., Liu, F., et al. (2005). ASF/SF2-regulated CaMKIIdelta alternative splicing temporally reprograms excitation-contraction coupling in cardiac muscle. Cell 120, 59-72. Yeakley, J. M., Fan, J. B., Doucet, D., Luo, L., Wickham, E., Ye, Z., Chee, M. S., and Fu, X. D. (2002). Profiling alternative splicing on fiber-optic arrays. Nat Biotechnol 20, 353-358. Yeakley, J. M., Tronchere, H., Olesen, J., Dyck, J. A., Wang, H. Y., and Fu, X. D. (1999). Phosphorylation regulates in vivo interaction and molecular targeting of serine/arginine-rich pre-mRNA splicing factors. J Cell Biol 145, 447-455. Yuan, L., Kawada, M., Havlioglu, N., Tang, H., and Wu, J. Y. (2005). Mutations in PRPF31 inhibit pre-mRNA splicing of rhodopsin gene and cause apoptosis of retinal cells. J Neurosci 25, 748-757. Yun, C. Y., and Fu, X. D. (2000). 
Conserved SR protein kinase functions in nuclear import and its action is counteracted by arginine methylation in Saccharomyces cerevisiae. J Cell Biol 150, 707-718. Yun, C. Y., Velazquez-Dones, A. L., Lyman, S. K., and Fu, X. D. (2003). Phosphorylation-dependent and -independent nuclear import of RS domain-containing splicing factors and regulators. J Biol Chem 278, 18050-18055.
Zamore, P. D., and Green, M. R. (1991). Biochemical characterization of U2 snRNP auxiliary factor: an essential pre-mRNA splicing factor with a novel intranuclear distribution. Embo J 10, 207-214. Zhang, G., Taneja, K. L., Singer, R. H., and Green, M. R. (1994). Localization of pre-mRNA splicing in mammalian nuclei. Nature 372, 809-812. Zheng, C., Fu, X-D., and Gribskov, M. (2005). Characteristics defining constitutive splicing, alternative splicing and different modes of alternative splicing. Submitted. Zhou, Z., Licklider, L. J., Gygi, S. P., and Reed, R. (2002). Comprehensive proteomic analysis of the human spliceosome. Nature 419, 182-185. Zhu, J., Mayeda, A., and Krainer, A. R. (2001). Exon identity established through differential antagonism between exonic splicing silencer-bound hnRNP A1 and enhancer-bound SR proteins. Mol Cell 8, 1351-1361.
Table 1. RNA binding proteins involved in alternative splicing

Family Name | Examples | Key Domain | Essential Splicing Factor?
SR Proteins | SC35, ASF/SF2, 9G8, hTra2-α, hTra2-β, SRp20, 30c, 40, 46, 54, SRp55, 75, 86 | RRM and RS | Yes
hnRNPs | hnRNP A/B, hnRNP F, hnRNP H, hnRNP I/PTB, nPTB, TIA-1, Elav, Fox-1, 2, 3 | RRM, some with RGG boxes | No
KH-type | KSRP, Nova-1, 2, PSI | KH | No
CELF Factors | CUG-BP1, CUG-BP2/ETR-3, NAPOR | RRM | No
MBNL | MBNL-1, 2, 3 | C3H zinc finger | No
Figure 1 (image not reproduced). Labels: exon, intron, exon; 5' splice site (A/CAG GURAGU); branchpoint (YNYURAY); polypyrimidine tract (Y10-20); 3' splice site (YAG); ESE; ESS. R: purine; Y: pyrimidine; N: any base.
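The consensus elements labelled in Figure 1 can be written as simple patterns. The following is a minimal sketch in Python, not part of the original chapter: the names `IUPAC`, `consensus_to_regex`, `FIVE_SS`, `BRANCHPOINT`, and `THREE_SS` are hypothetical, and the patterns encode only the idealized consensus from the figure legend. Real splice sites are degenerate, so an exact-match search of this kind is an illustration of the notation rather than a splice-site predictor.

```python
import re

# Degenerate base codes as given in the Figure 1 legend
# (R: purine, Y: pyrimidine, N: any base). Names here are illustrative.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "N": "[ACGU]",
}

def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus string such as 'YNYURAY' into a regex fragment."""
    return "".join(IUPAC[base] for base in consensus)

# 5' splice site: (A or C)AG at the end of the exon, GURAGU at the start of the intron.
FIVE_SS = re.compile("[AC]AG" + consensus_to_regex("GURAGU"))

# Branchpoint consensus YNYURAY.
BRANCHPOINT = re.compile(consensus_to_regex("YNYURAY"))

# 3' splice site: a run of 10 to 20 pyrimidines followed by YAG.
THREE_SS = re.compile("[CU]{10,20}" + consensus_to_regex("YAG"))

if __name__ == "__main__":
    # Toy sequence assembled from the consensus elements (not a real gene).
    toy = "GGCAGGUAAGUAUCCUACUAACUUUUUUUUUUUUCAGGAA"
    for name, pattern in [("5' ss", FIVE_SS), ("branchpoint", BRANCHPOINT), ("3' ss", THREE_SS)]:
        match = pattern.search(toy)
        print(name, match.group(0) if match else None)
```

Because the patterns demand an exact match to the consensus, they would miss most authentic sites; scoring approaches that tolerate mismatches are the usual alternative.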
Figure 2 (image not reproduced). Labels: Pre-mRNA; Step I: 5' splice site cleavage and branch formation; Lariat intermediate; Step II: 3' splice site cleavage and exon ligation; Ligated exons; Released lariat intron.
Figure 3 (image not reproduced). Labels: Exon 1; Exon 2; U1, U2, U4, U5, U6; ATP; E (Commitment Complex); A (Pre-spliceosome); B (spliceosome); C (Activated Spliceosome); mRNA.
Figure 4 (image not reproduced). Panels A to D. Labels: Exon 1; Exon 2; U1, U2, U4, U5, U6; Domain 5; Domain 6. The RNA sequences and base-pairing drawn in the panels are not recoverable from the extracted text.
Figure 5 (image not reproduced). Labels: Intron retention; Mutually exclusive exons; Alternative 5' choices; Combinatory exon selections; Alternative 3' choices; Alternative promoter and splicing; Exon inclusion/skipping; Alternative splicing & polyadenylation (pA).
Figure 6 (image not reproduced). Labels: X:A ratio 2:2 and 1:2; + Sxl, - Sxl; Sxl, Sxl off; Tra, Tra (truncated); Tra-2; Tra/Tra-2; Dsx; stop; Negative regulator of male differentiation genes; Negative regulator of female differentiation genes.
Figure 7 (image not reproduced). Labels: Interactions across exon; Interactions across intron; U1; U2; U2AF; SR; A/B; 5' ss; 3' ss; BPS; Py; ESE; ESS.
Figure 8 (image not reproduced). Labels: Non-neuronal cells (N1 skipping); Neuronal cells (N1 inclusion); exons 3, N1, 4; U1; U2; PTB; nPTB; KSRP; H; p28; p50; ISS; ISE; ATP.
Figure 9 (image not reproduced). Labels: 5'ss; 3'ss.
Figure 10 (image not reproduced). Labels: Pol II; TREX; UAP56; Spliceosome; Aly; Cap; 5'ss; 3'ss; EJC; (A)n; TAP; Nucleus; Cytoplasm.
Figure 11 (image not reproduced). A. The 50-55 nts rule for NMD: D is the distance between the stop codon and the exon-exon junction; D > 50-55 nts: NMD; D < 50-55 nts: no effect. B. The 20-24 nts rule for post-splicing marker: the EJC lies at D = 20-24 nts from the exon-exon junction.
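The two distance rules summarized in Figure 11 lend themselves to a short worked example. The sketch below is an illustration under stated assumptions, not the author's method: positions are 0-based coordinates on the spliced mRNA, `stop_end` is taken as the position just after the stop codon, distances are measured to junctions downstream of the stop, and the function names `classify_stop` and `ejc_window` are hypothetical. Distances that fall inside the 50-55 nt range are reported as "borderline" because the figure gives a range rather than a single cutoff.

```python
from typing import List, Tuple

def classify_stop(stop_end: int, junctions: List[int],
                  low: int = 50, high: int = 55) -> str:
    """Apply the 50-55 nt rule to one stop codon.

    stop_end  : position just after the stop codon (assumed convention)
    junctions : positions of exon-exon junctions on the spliced mRNA
    Returns "NMD", "no effect", or "borderline".
    """
    # Only junctions downstream of the stop codon are considered.
    distances = [j - stop_end for j in junctions if j > stop_end]
    if not distances:
        return "no effect"
    d = max(distances)  # the most distant downstream junction decides
    if d > high:
        return "NMD"
    if d < low:
        return "no effect"
    return "borderline"

def ejc_window(junction: int) -> Tuple[int, int]:
    """Return the interval 20-24 nt upstream of a junction (the 20-24 nt rule)."""
    return (junction - 24, junction - 20)

if __name__ == "__main__":
    print(classify_stop(100, [300]))   # junction 200 nt downstream of the stop
    print(classify_stop(100, [120]))   # junction 20 nt downstream of the stop
    print(ejc_window(300))
```

A stop codon in the last exon has no downstream junction and is therefore classified as "no effect", which matches the behaviour expected of a normal termination codon.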
Figure 12 (image not reproduced). Labels: Nucleus; Cytoplasm; Cap; EJC; Upf; TAP; (A)n; stop; Ribosome; Dcp1; Translation; Degradation.