6. RNA Processing
a). Steps in mRNA processing
  i). Capping
  ii). Cleavage and polyadenylation
  iii). Splicing
b). Chemistry of mRNA splicing
c). Spliceosome assembly and splice site recognition
  i). Donor and acceptor splice sites
  ii). Small nuclear RNAs
d). Mutations that disrupt splicing
e). Alternative splicing
Steps in mRNA processing (hnRNA is the precursor of mRNA)
• capping (occurs co-transcriptionally)
• cleavage and polyadenylation (forms the 3’ end)
• splicing (occurs in the nucleus prior to transport)
[Figure: a pre-mRNA (exon 1, intron 1, exon 2) passes through four stages: (1) transcription of pre-mRNA and capping at the 5’ end; (2) cleavage of the 3’ end and polyadenylation; (3) splicing to remove intron sequences; (4) transport of mature mRNA to the cytoplasm.]
Capping occurs co-transcriptionally shortly after initiation
• guanylyltransferase (nuclear) transfers G residue to 5’ end
• methyltransferases (nuclear and cytoplasmic) add methyl groups to 5’ terminal G and at two 2’ ribose positions on the next two nucleotides
  pppNpN → mGpppNmpNm
• capping involves formation of a 5’-5’ triphosphate bond
• cap function
  - protects 5’ end of mRNA (increases mRNA stability)
  - required for initiation of protein synthesis
Polyadenylation
• cleavage of the primary transcript occurs approximately 10-30 nucleotides downstream (3’) of the AAUAAA consensus site
• polyadenylation catalyzed by poly(A) polymerase
• approximately 200 adenylate residues are added
[Figure: the capped transcript (mGpppNmpNm ... AAUAAA ...) is cleaved downstream of AAUAAA and then polyadenylated, giving mGpppNmpNm ... AAUAAA ... AAAAAA 3’.]
• poly(A) is associated with poly(A) binding protein (PBP)
• function of poly(A) tail is to stabilize mRNA
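The cleavage and tailing rule above can be sketched in a few lines of Python. This is a minimal illustration, not a model of the real cleavage machinery: the function names `predicted_cleavage_windows` and `polyadenylate` and the toy transcript are hypothetical, and measuring the 10-30 nucleotide distance from the end of the AAUAAA signal is an assumption made for the sketch.

```python
def predicted_cleavage_windows(rna, signal="AAUAAA", min_gap=10, max_gap=30):
    """Return (lo, hi) ranges of cut positions expected after each signal.

    A cut position c means rna[:c] is kept. The window spans 10-30
    nucleotides past the end of the signal (an assumption of this sketch)
    and is clipped to the transcript length.
    """
    windows = []
    pos = rna.find(signal)
    while pos != -1:
        signal_end = pos + len(signal)
        lo = signal_end + min_gap
        hi = min(signal_end + max_gap, len(rna))
        if lo <= hi:
            windows.append((lo, hi))
        pos = rna.find(signal, pos + 1)
    return windows


def polyadenylate(rna, cut, tail_length=200):
    """Cleave at `cut` and append a poly(A) tail of roughly 200 residues."""
    return rna[:cut] + "A" * tail_length


# Toy transcript (hypothetical sequence, for illustration only).
transcript = "GGC" + "AAUAAA" + "C" * 40
windows = predicted_cleavage_windows(transcript)
lo, hi = windows[0]
mature = polyadenylate(transcript, lo)  # cut at the start of the window
```

Any cut position between `lo` and `hi` would satisfy the rule; the sketch picks `lo` only to keep the example deterministic.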
Chemistry of mRNA splicing
• two cleavage-ligation reactions
• transesterification reactions
  - exchange of one phosphodiester bond for another
  - not catalyzed by traditional enzymes
• branch site adenosine forms 2’, 5’ phosphodiester bond with guanosine at 5’ end of intron
• ligation of exons releases lariat RNA (intron)
[Figure: splicing of intron 1 between exon 1 and exon 2.
  Pre-mRNA: 5’ exon 1 G-p-G-U ... 2’OH-A (branch site adenosine) ... A-G-p-G exon 2 3’
  First cleavage-ligation (transesterification) reaction
  Splicing intermediate: 5’ exon 1 G-OH 3’, plus the lariat intron (U-G-5’-p-2’-A) still joined to exon 2 through A-G-p-G
  Second cleavage-ligation reaction
  Products: lariat intron 1 (U-G-5’-p-2’-A ... A-G 3’) and spliced mRNA (5’ exon 1 G-p-G exon 2 3’)]
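The two reactions above can be followed as simple string operations. The sketch below is illustrative only: the `splice` function, its index arguments, and the toy sequences are hypothetical, and a string model cannot represent the real 2’, 5’ bond geometry, so the bond is just recorded as a pair of positions.

```python
def splice(pre_mrna, intron_start, branch, intron_end):
    """Model the two transesterification steps on a pre-mRNA string.

    intron_start: index of the G in the 5' GU of the intron
    branch:       index of the branch site adenosine inside the intron
    intron_end:   index just past the AG at the 3' end of the intron
    """
    intron = pre_mrna[intron_start:intron_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron must begin with GU and end with AG")
    if not (intron_start < branch < intron_end and pre_mrna[branch] == "A"):
        raise ValueError("branch must point at an A inside the intron")

    exon1 = pre_mrna[:intron_start]
    exon2 = pre_mrna[intron_end:]

    # Reaction 1: the branch A 2'-OH attacks the 5' splice site.
    # Exon 1 is freed with a 3'-OH; the intron 5' G is now bonded
    # 2'-5' to the branch A while the intron is still attached to exon 2.
    intermediate = {
        "free_exon1": exon1,
        "lariat_exon2": intron + exon2,
        "bond_2_5": (intron_start, branch),
    }

    # Reaction 2: the exon 1 3'-OH attacks the 3' splice site.
    # The exons are ligated and the intron is released as a lariat.
    products = {"mrna": exon1 + exon2, "lariat": intron}
    return intermediate, products


# Toy sequences (hypothetical): exon 1 ends in G, exon 2 starts with G.
exon1 = "AUGG"
intron = "GUAAGUCCUAACUUUUUCAG"
exon2 = "GCCUAA"
branch = len(exon1) + 10  # index of an A inside the toy intron
intermediate, products = splice(
    exon1 + intron + exon2, len(exon1), branch, len(exon1) + len(intron)
)
```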
Recognition of splice sites
• invariant GU and AG dinucleotides at intron ends
• donor (upstream) and acceptor (downstream) splice sites are within conserved consensus sequences

  donor (5’) splice site        branch site        acceptor (3’) splice site
  G/GUAAGU .................... A ................ YYYYYNYAG/G
  (bound by U1)                 (bound by U2)

  Y = U or C (pyrimidine); N = any nucleotide
• small nuclear RNA (snRNA) U1 recognizes the donor splice site sequence (base-pairing interaction)
• U2 snRNA binds to the branch site (base-pairing interaction)
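The consensus sequences above translate directly into regular expressions. The sketch below is a deliberately strict reading of the consensus: real splice sites are degenerate and would often fail an exact match. The function `find_introns` and the toy RNA are hypothetical, and requiring at least one A between the two sites is only a stand-in for branch site recognition.

```python
import re

# Donor consensus G/GUAAGU: the intron starts right after an exonic G.
DONOR = re.compile(r"(?<=G)GUAAGU")
# Acceptor consensus YYYYYNYAG/G: Y = U or C, N = any base, exonic G follows.
ACCEPTOR = re.compile(r"[UC]{5}[ACGU][UC]AG(?=G)")


def find_introns(rna):
    """Return (start, end) index pairs of candidate introns.

    start is the index of the donor G; end is the index just past the
    acceptor AG. Every donor is paired with every downstream acceptor
    that leaves at least one A (a possible branch site) in between.
    """
    introns = []
    for donor in DONOR.finditer(rna):
        for acceptor in ACCEPTOR.finditer(rna, donor.end()):
            if "A" in rna[donor.end():acceptor.start()]:
                introns.append((donor.start(), acceptor.end()))
    return introns


# Toy RNA (hypothetical sequence).
rna = "AUGG" + "GUAAGU" + "CCCUAACCC" + "UUUUUCUAG" + "GCC"
introns = find_introns(rna)
start, end = introns[0]
spliced = rna[:start] + rna[end:]
```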
Spliceosome - assembly of the splicing apparatus
• snRNAs are associated with proteins (snRNPs or “snurps”)
• splicing snRNAs - U1, U2, U4, U5, U6
• antibodies to snRNPs are seen in the autoimmune disease systemic lupus erythematosus (SLE)

Spliceosome assembly
Step 1: binding of U1 and U2 snRNPs
Step 2: binding of U4, U5, U6
Step 3: U1 is released, then U4 is released
Step 4: U6 binds the 5’ splice site and the two splicing reactions occur, catalyzed by U2 and U6 snRNPs
[Figure: spliceosome assembly on intron 1, drawn between exon 1 (5’, G-p-G-U at the donor site) and exon 2 (A-G-p-G at the acceptor site, 3’), with the branch site 2’OH-A inside the intron. One panel per step: U1 at the donor site and U2 at the branch site (step 1); U4, U5, and U6 added (step 2); U2, U5, and U6 remaining (step 3); lariat intron (U-G-5’-p-2’-A) shown with U6 and U2, and spliced mRNA (G-p-G) shown with U5 (step 4). A legend symbol marks hnRNP proteins.]
Frequency of bases in each position of the splice sites

Donor sequences (positions are numbered relative to the exon/intron junction)

region   position   %A   %U   %C   %G   consensus
exon        -4      30   20   30   19
exon        -3      40    7   43    9
exon        -2      64   13   12   12   A
exon        -1       9   12    6   73   G
intron      +1       0    0    0  100   G
intron      +2       0  100    0    0   U
intron      +3      62    6    2   29   A
intron      +4      68   12    9   12   A
intron      +5       9    5    2   84   G
intron      +6      17   63   12    9   U
intron      +7      39   22   21   18
intron      +8      24   26   29   20

Acceptor sequences (first 15 columns: intron; last 2 columns: exon)

consensus    Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   N   Y   A   G  |   G
%A          15  10  10  15   6  15  11  19  12   3  10  25   4 100   0  |  22  17
%U          51  44  50  53  60  49  49  45  45  57  58  29  31   0   0  |   8  37
%C          19  25  31  21  24  30  33  28  36  36  28  22  65   0   0  |  18  22
%G          15  21  10  10  10   6   7   9   7   7   5  24   1   0 100  |  52  25

Polypyrimidine tract (Y = U or C; N = any nucleotide)
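One common way to use a frequency table like the donor table above is as a position weight matrix: each candidate site is scored by summing, over positions, the log-ratio of the observed base frequency to a background frequency. The sketch below is one such scoring scheme, not something the slides prescribe. The uniform 25% background, the pseudocount, and the name `donor_score` are assumptions, and the invariant U at intron position +2 is entered as 100% U, following the "invariant GU" rule.

```python
import math

BASES = "AUCG"  # column order of the table: %A, %U, %C, %G

# Donor table rows: 4 exon positions, then 8 intron positions.
DONOR_FREQ = [
    (30, 20, 30, 19),
    (40, 7, 43, 9),
    (64, 13, 12, 12),
    (9, 12, 6, 73),
    (0, 0, 0, 100),   # invariant G
    (0, 100, 0, 0),   # invariant U
    (62, 6, 2, 29),
    (68, 12, 9, 12),
    (9, 5, 2, 84),
    (17, 63, 12, 9),
    (39, 22, 21, 18),
    (24, 26, 29, 20),
]


def donor_score(site, pseudo=1.0):
    """Log2-odds score of a 12-nt candidate donor site.

    `pseudo` is added to every percentage so that bases never seen at a
    position give a large penalty instead of log(0). The background is
    assumed to be 25% for each base.
    """
    if len(site) != len(DONOR_FREQ):
        raise ValueError("site must be 12 nucleotides long")
    score = 0.0
    for base, row in zip(site, DONOR_FREQ):
        freq = row[BASES.index(base)]
        p = (freq + pseudo) / (sum(row) + 4 * pseudo)
        score += math.log2(p / 0.25)
    return score


good = donor_score("CCAGGUAAGUAC")  # matches the consensus AG/GUAAGU
bad = donor_score("CCAGCCAAGUAC")   # invariant GU replaced by CC
```

A site that matches the consensus scores well above zero, while losing the invariant GU costs several bits per position.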
Mutations that disrupt splicing
• β0-thalassemia - no β-chain synthesis
• β+-thalassemia - some β-chain synthesis

Normal splice pattern: Exon 1 - Intron 1 - Exon 2 - Intron 2 - Exon 3
• Donor site: /GU
• Acceptor site: AG/

Intron 2 acceptor site β0 mutation: no use of mutant site; use of cryptic splice site in intron 2
• mutant site: GG/
• cryptic acceptor site in intron 2: UUUCUUUCAG/G
• Translation of the retained portion of intron 2 results in premature termination of translation due to a stop codon within the intron, 15 codons from the cryptic splice site

Intron 1 β+ mutation creates a new acceptor splice site: use of both sites
• AG/: Normal acceptor site (used 10% of the time in β+ mutant)
• CCUAUUAG/U: β+ mutant site (used 90% of the time)
• CCUAUUGG U: Normal intron sequence (never used because it does not conform to a splice site)
• Translation of the retained portion of intron 1 results in termination at a stop codon in intron 1

Exon 1 β+ mutation creates a new donor splice site: use of both sites
• /GU: Normal donor site (used 60% of the time when exon 1 site is mutated)
• GGUG/GUAAGGCC: β+ mutant site (used 40% of the time)
• GGUG GUGAGGCC: Normal sequence (never used because it does not conform to a splice site)
• The GAG glutamate codon is mutated to an AAG lysine codon in Hb E
• The incorrect splicing results in a frameshift and translation terminates at a stop codon in exon 2
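The exon 1 β+ example can be checked directly against the donor consensus G/GUAAGU. The sketch below counts how many of the six bases after the junction match the consensus in the normal and mutant sequences, and locates the changed codon. Counting matches is a crude proxy for splice site strength, and the reading frame (codons starting at the first base of the 12-nt sequence) is an assumption chosen because it reproduces the GAG to AAG change stated for Hb E. The helper name `consensus_matches` is hypothetical.

```python
DONOR_CONSENSUS = "GUAAGU"  # intron side of the donor consensus G/GUAAGU


def consensus_matches(seq, junction, consensus=DONOR_CONSENSUS):
    """Count bases just 3' of `junction` that agree with the consensus."""
    window = seq[junction:junction + len(consensus)]
    return sum(a == b for a, b in zip(window, consensus))


normal = "GGUGGUGAGGCC"  # GGUG GUGAGGCC from the slide
mutant = "GGUGGUAAGGCC"  # GGUG/GUAAGGCC from the slide
junction = 4             # position of "/" in the mutant site

normal_matches = consensus_matches(normal, junction)
mutant_matches = consensus_matches(mutant, junction)

# Locate the single changed base and the codon that contains it,
# assuming codons start at index 0 of the sequence.
changed = [i for i, (a, b) in enumerate(zip(normal, mutant)) if a != b]
codon_start = (changed[0] // 3) * 3
normal_codon = normal[codon_start:codon_start + 3]
mutant_codon = mutant[codon_start:codon_start + 3]
```

The single G to A change raises the match count by one and, in the assumed frame, turns the GAG codon into AAG.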
Patterns of alternative exon usage
• one gene can produce several (or numerous) different but related protein species (isoforms)
• patterns illustrated:
  - Cassette
  - Mutually exclusive
  - Internal acceptor site
  - Alternative promoters

The Troponin T (muscle protein) pre-mRNA is alternatively spliced to give rise to 64 different isoforms of the protein
• Constitutively spliced exons (exons 1-3, 9-15, and 18)
• Mutually exclusive exons (exons 16 and 17)
• Alternatively spliced exons (exons 4-8)
• Exons 4-8 are spliced in every possible way, giving rise to 2^5 = 32 different possibilities
• Exons 16 and 17, which are mutually exclusive, double the possibilities; hence 32 × 2 = 64 isoforms
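The isoform count above can be confirmed by enumerating every exon combination. The sketch below uses the exon numbers from the slide; the function name `troponin_t_isoforms` and the representation of an isoform as a tuple of exon numbers are illustrative choices.

```python
from itertools import product

CONSTITUTIVE = [1, 2, 3, 9, 10, 11, 12, 13, 14, 15, 18]
CASSETTE = [4, 5, 6, 7, 8]     # each is independently included or skipped
MUTUALLY_EXCLUSIVE = (16, 17)  # exactly one of the two is included


def troponin_t_isoforms():
    """List every exon combination as a sorted tuple of exon numbers."""
    isoforms = []
    for included in product([False, True], repeat=len(CASSETTE)):
        optional = [exon for exon, keep in zip(CASSETTE, included) if keep]
        for choice in MUTUALLY_EXCLUSIVE:
            exons = sorted(CONSTITUTIVE + optional + [choice])
            isoforms.append(tuple(exons))
    return isoforms


isoforms = troponin_t_isoforms()
print(len(isoforms))       # 64
print(len(set(isoforms)))  # 64, so every combination is distinct
```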