2.0 DNA
Deoxyribonucleic acid (DNA) is the genetic material of all prokaryotes, eukaryotes, and many viruses. DNA is a polymer of nucleotides that forms the genome of an organism, and the genetic information contained within the genome is encoded in the sequence of the nucleotide bases in the polymer. The differences in genetic makeup between organisms lie in the length and the nucleotide sequence of their DNA.
2.1 Nucleotide Structure
Nucleotides are the monomers that are polymerized to make DNA or RNA. Each contains a nitrogenous base, a pentose sugar, and one or more phosphate groups. The nucleotides used to make DNA contain 2-deoxyribose as the pentose sugar, while the nucleotides used to make RNA contain ribose. Apart from their role in genetic material, nucleotides also serve as the energy currency of the cell because of the high-energy phosphate bonds found in nucleoside triphosphates. Nucleotides are essential components of coenzymes such as NAD+, NADP+, FAD, and coenzyme A. Nucleotides are also regulatory molecules involved in the control of a number of metabolic pathways. Humans acquire nucleotides from the diet or by de novo synthesis.
2.11 Nitrogenous Bases
The nitrogenous bases are the components of the DNA or RNA molecule that convey genetic information by the sequence in which they are polymerized. The nitrogenous bases found within DNA and RNA are classified as either purines or pyrimidines based on the structure of their heterocyclic rings. Base pairing between purines and pyrimidines plays an important role in stabilizing the secondary structure of nucleic acids, and it serves as the basis of the nucleic acid sequence recognition events vital to processes such as replication and transcription.
2.11.1 Purines
The purines are nitrogenous bases that contain two fused heterocyclic rings formed from carbon and nitrogen. The two purines found in nucleic acids are adenine and guanine, and they form base pairs with thymine and cytosine, respectively. The structure of adenine is shown below, and the numbers identify positions on the heterocyclic rings.
2.11.2 Pyrimidines
The pyrimidines are nitrogenous bases that contain one heterocyclic ring formed from carbon and nitrogen. The two pyrimidines found in DNA are thymine and cytosine, and they form base pairs with adenine and guanine, respectively. Uracil is a pyrimidine that substitutes for thymine in RNA and base pairs with adenine. The structure of cytosine is shown below, and the numbers identify positions on the heterocyclic ring.
2.11.3 A-T Base Pairing
Purines form hydrogen-bond-stabilized base pairs with pyrimidines. In DNA, thymine forms two hydrogen bonds when it base pairs with adenine. Uracil substitutes for thymine in RNA and base pairs with adenine. G-C base pairs are also found in DNA, so %A = %T and %G = %C because of base pairing between the two complementary strands of DNA.
2.11.4 G-C Base Pairing
Purines form hydrogen-bond-stabilized base pairs with pyrimidines. Cytosine pairs with guanine to form three hydrogen bonds. A-T base pairs are also found in DNA, so %A = %T and %G = %C because of base pairing between the two complementary strands of DNA. G-C base pairs are more heat-stable than A-T base pairs because of the additional hydrogen bond in a G-C base pair.
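The pairing rules for A-T and G-C can be checked with a short Python sketch. The function names and the example sequence are illustrative assumptions, not part of these notes. The code builds the base-paired partner of a strand and counts the bases across the duplex, where the counts of A and T, and of G and C, should match.

```python
from collections import Counter

# Watson-Crick pairing partners in DNA (A-T and G-C).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(strand):
    """Return the base-paired partner strand, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())

def duplex_composition(strand):
    """Count each base across both strands of the duplex."""
    strand = strand.upper()
    return Counter(strand + complement_strand(strand))

example = "ATGCGGCTA"  # hypothetical sequence, used only for illustration
counts = duplex_composition(example)
print(counts["A"] == counts["T"], counts["G"] == counts["C"])
```

A single strand on its own need not obey %A = %T; the equality only appears when both strands of the duplex are counted.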
2.12 Nucleosides
A nucleoside is formed by adding a sugar to the nitrogenous base. Ribose is the sugar found in RNA and 2-deoxyribose is the sugar in DNA. Carbon 1 of the appropriate sugar is attached to nitrogen 9 in purines or nitrogen 1 in pyrimidines to form a nucleoside. The carbon atoms in the sugar are designated with a prime (’) to avoid confusion with positions on the ring of the nitrogenous base.
2.12.1 Ribose
Ribose is the pentose sugar found in RNA. Ribose is linked to a nitrogenous base at position 1 and to phosphate groups at position 5 to form a complete nucleotide. The structure of ribose is shown below with the carbons numbered.
2.12.2 2-deoxyribose
The pentose sugar found in DNA is 2-deoxyribose. It differs from ribose by lacking a hydroxyl group at position 2 on the ring. The 2-deoxyribose is linked to a nitrogenous base at position 1 and to phosphate groups at position 5 to form a complete nucleotide. The structure of 2-deoxyribose is shown below with the carbons numbered.
2.13 Nucleotides
Linking one or more phosphate groups to the 5’ position of the nucleoside sugar produces a nucleotide. Mono-, di-, and triphosphate forms of nucleotides are commonly found in cells. The deoxynucleotide triphosphates used to make DNA are dATP, dCTP, dTTP, and dGTP. RNA is made from ATP, GTP, CTP, and UTP. The bonds between the phosphate groups are energy rich and are hydrolyzed to provide energy for cellular processes.
2.2 DNA Structure
DNA is a polynucleotide consisting of a series of 5’-3’ phosphate-deoxyribose linkages that form a backbone from which the bases extend. The 5’-3’ linkages confer directionality on the DNA strands, which are wrapped around each other to form a right-handed double helix. The DNA strands are complementary to each other and are antiparallel with respect to the sugar-phosphate backbone. The helix is 20 Å in diameter and makes a complete turn every 34 Å. The bases are stacked 3.4 Å apart and extend into the interior of the helix. Inside the helix, base pairs are formed with the complementary base on the opposite strand, and hydrophobic interactions between the stacked bases stabilize the helix. The hydrophilic surface of the helix is formed by the sugar-phosphate backbone. The double helix can be denatured by heat, by pH changes that cause the bases to ionize, and by organic solvents. During denaturation, the absorbance of the DNA solution at 260 nm increases as the DNA becomes single stranded. Because of the additional hydrogen bond in G-C base pairs, GC-rich DNA is more difficult to denature (it has a higher denaturation temperature) than AT-rich DNA. When the denaturing agent is removed, the DNA strands reanneal to form the double helix. DNA can exist in several helical forms (A, B, and Z). The B form of DNA is shown below.
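The helix dimensions and the GC effect described above lend themselves to a small worked example. The following Python sketch is illustrative only: the function names are assumptions, and the GC fraction is used as a rough proxy for relative thermal stability rather than a melting-temperature calculation.

```python
RISE_PER_BP_ANGSTROM = 3.4   # spacing between stacked bases (from the text)
PITCH_ANGSTROM = 34.0        # length of one complete helical turn (from the text)

def base_pairs_per_turn():
    """Number of base pairs in one turn of the helix."""
    return PITCH_ANGSTROM / RISE_PER_BP_ANGSTROM

def gc_fraction(seq):
    """Fraction of bases that are G or C; higher values suggest harder denaturation."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def reverse_complement(seq):
    """Antiparallel partner strand, written in the 5'->3' direction."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

print(round(base_pairs_per_turn(), 1), "base pairs per turn")
print(reverse_complement("ATGC"))
```

Because the strands are antiparallel, the partner strand is reversed before it is written 5'->3', which is why `reverse_complement` reads the input from its 3' end.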
2.21 Supercoiling
In living cells, DNA is usually constrained from free rotation by proteins and RNA that closely associate with the double helix. When forces act on the DNA to twist the molecule around its own axis, supercoils are produced to relieve the tension. One supercoil is formed each time the molecule is twisted about its axis, so the magnitude of the torsion is proportional to the number of supercoils present. Winding the DNA in the direction opposite to the clockwise turns of the helix produces negative supercoils, which reflect underwinding of the helix; this is the natural state of DNA in vivo. If the magnitude of the negative supercoiling is great enough, localized disruption of base pairing can occur. Overwinding the DNA in a clockwise direction produces positive supercoils, which arise mainly during laboratory manipulations of DNA. To maintain an appropriate magnitude of negative supercoiling in a cell, topoisomerase enzymes induce or relieve supercoils as necessary. Type 1 topoisomerases break one sugar-phosphate backbone, allow the DNA to rotate freely, and then rejoin the backbone. These enzymes relax supercoils and do not require ATP. Type 2 topoisomerases break both sugar-phosphate backbones in the helix to relieve or induce tension as needed, and they require ATP for activity.
2.3 Genome Organization
The genome contains all the genes required for the replication and metabolism of the organism. One or more origins of replication, regulatory sequences, and ‘spacer’ or ‘junk’ DNA are also components of the genome. Several prokaryotic genomes, each several million base pairs in size, have been completely sequenced, and the Human Genome Project has been launched to sequence the entire human genome. The human genome is distributed over 46 linear chromosomes that contain over 3 billion base pairs of information. Because of the enormous amount of genetic material in the human genome, specialized chromosome structures are needed to protect and organize the DNA.
2.3.1 Genotype vs. Phenotype
The variability observed within a species is due to differences in the phenotypes of individuals. Phenotype is not itself inherited; it is the sum of the interaction between the alleles in the genome (genotype) and the environment. This interaction can be controlled by a small number of well-defined genes or can be multifactorial. Multifactorial traits are the sum of several groups of genes interacting with each other and the environment to produce the observable phenotype.
2.3.11 Genes and Traits
A genetic trait is a phenotype specific for a particular genotype. The gene responsible for this genotype may affect only one trait, or it may be pleiotropic and affect many traits. Additionally, it is possible for one trait to be influenced by more than one gene.
2.31 Prokaryotic Genomes
Prokaryotes have relatively small circular genomes with single-copy, closely packed genes and simple regulatory and transcriptional elements. The DNA is complexed with RNA and protein to form a compact, folded structure that has a single origin of replication. Genes are arranged in tandem arrays called operons, which are regulated and transcribed as a unit. Transcription is initiated at promoters that have a -35 sequence (TTGACA) and a -10 sequence (TATAAT) to produce a transcript that will only rarely have introns. Translation occurs simultaneously with transcription, starting at ribosome binding sites (AGGAGG) on the mRNA that bind to a complementary sequence on the 16S rRNA in the small ribosomal subunit. Genetic diversity arises through mutation, conjugation, transformation, and transduction.
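The consensus sequences above can be searched for directly. The sketch below is a minimal Python illustration: the function names are hypothetical, and the 15-19 nucleotide spacer between the -35 and -10 elements is an assumption that is not stated in these notes. It also derives the RNA sequence complementary to the ribosome binding site, which is what the 16S rRNA would need to carry for base pairing.

```python
import re

MINUS_35 = "TTGACA"         # -35 consensus (from the text)
MINUS_10 = "TATAAT"         # -10 consensus (from the text)
SHINE_DALGARNO = "AGGAGG"   # ribosome binding site on the mRNA (from the text)

# ASSUMPTION: the two promoter elements are separated by a 15-19 nt spacer.
PROMOTER = re.compile(f"{MINUS_35}[ACGT]{{15,19}}{MINUS_10}")

def find_promoters(dna):
    """Return 0-based start positions of exact consensus promoter matches."""
    return [match.start() for match in PROMOTER.finditer(dna.upper())]

def anti_ribosome_binding_site():
    """RNA sequence (5'->3') complementary to the ribosome binding site."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(SHINE_DALGARNO))

# Hypothetical test sequence built around a 17 nt spacer.
toy_dna = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
print(find_promoters(toy_dna), anti_ribosome_binding_site())
```

Exact matching is a simplification: natural promoters usually differ from the consensus at one or more positions, so a practical search would score partial matches instead.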
2.32 Eukaryotic Genomes
Eukaryotic genomes are larger and more complex than prokaryotic genomes. To organize and protect the DNA, a highly ordered chromatin structure is employed. This structure must be unfolded to allow transcription by the three eukaryotic RNA polymerases, providing an additional level of regulation on transcription. Additionally, eukaryotic genomes contain large quantities of repetitive DNA that serves no well-defined function. The genome is distributed over one or more linear chromosomes. If one copy of each chromosome is present, the organism is haploid; with two copies the organism is diploid; and with more than two copies the organism is polyploid. Humans have a diploid genome with 23 pairs of chromosomes, including two sex chromosomes. Aneuploidy, an abnormal number of one or more individual chromosomes, leads to serious developmental defects such as Down’s syndrome in humans, but in organisms such as plants and some amphibians variation in chromosome number is common and not deleterious. Genetic diversity is produced by recombination, mutation, and sexual reproduction.
2.32.1 Chromatin
Human cells have 23 pairs of chromosomes, representing over 3 billion base pairs with a total DNA length of approximately 2 meters per cell. To compact the DNA to fit in a 5-10 µm cell and to protect the molecule from shearing, DNA is complexed with protein and organized into chromatin. The fundamental unit of chromatin is the nucleosome core, which consists of a histone octamer composed of two copies each of histones H2A, H2B, H3, and H4. This core particle serves as a spool to wind up 146 base pairs of DNA, forming a nucleosome. Histone H1 then binds to the nucleosome, and a chromatin fiber forms. These fibers form higher-order structures such as the 100 Å and 300 Å fibers that make up the Laemmli loops, which are anchored to an acidic protein scaffold to form a highly compact chromosome. The distinctive light and dark banding patterns seen when the chromosomes are stained are used to identify the individual chromosomes.
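The figure of roughly 2 meters of DNA per cell can be reproduced with simple arithmetic. The Python sketch below assumes that the 3 billion base pairs refer to one haploid set, that a diploid cell carries two such sets, and that each base pair adds 3.4 Å of length as in B-form DNA. The names and the 10 µm cell diameter used for the compaction estimate are illustrative choices.

```python
BASE_PAIRS_HAPLOID = 3.0e9   # ASSUMPTION: "over 3 billion base pairs" is per haploid set
RISE_PER_BP_M = 3.4e-10      # 3.4 angstroms per base pair, expressed in meters
SETS_PER_DIPLOID_CELL = 2
CELL_DIAMETER_M = 10e-6      # upper end of the 5-10 micrometer range in the text

def dna_length_m(base_pairs, sets=1):
    """End-to-end length of the DNA if it were fully extended."""
    return base_pairs * sets * RISE_PER_BP_M

total_length = dna_length_m(BASE_PAIRS_HAPLOID, SETS_PER_DIPLOID_CELL)
compaction = total_length / CELL_DIAMETER_M

print(f"{total_length:.2f} m of DNA per diploid cell")
print(f"about {compaction:,.0f} times longer than the cell is wide")
```

The ratio gives a sense of why several levels of folding are needed, although it compares a length with a diameter and is not a formal packing ratio.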
2.32.2 DNA Sequences
The DNA sequences in eukaryotic genomes can be divided into nonrepetitive and repetitive DNA sequences. Nonrepetitive DNA, such as structural genes, consists of unique sequences that appear only a few (<20) times in a genome. Repetitive DNA is often found thousands of times in a genome; it serves no known purpose but makes up nearly 50% of the genome in mammals. Moderately repetitive DNA occurs about 100-1000 times per genome, while highly repetitive DNA can be found 10,000 times per genome.
2.32.3 Eukaryotic Gene Organization
The genes in eukaryotes can be transcribed by RNA polymerase I, II, or III, and each has a promoter that is specific for one polymerase. These genes are often regulated singly and may have upstream and downstream control elements that give a high degree of transcriptional regulation mediated by an array of transcription factors. These transcription factors interact with the DNA at specific sequences, such as the TATA box found in RNA polymerase II promoters, to allow the polymerase to bind to the promoter. The primary transcript of a eukaryotic gene usually contains introns that must be spliced out before the mRNA can be exported from the nucleus for translation.
2.32.4 Variations in Chromosome Number
Variation in chromosome number is a relatively common occurrence in eukaryotes that results in euploidy (variable numbers of complete chromosome sets) or aneuploidy (variable numbers of one or more chromosomes). Aneuploids result from chromosomal nondisjunction in meiosis I or II, which produces a gamete with an extra or missing copy of a chromosome. Following fertilization with a gamete that has a normal number of chromosomes, an aneuploid zygote is formed. In organisms such as plants and some amphibians this is not a serious problem, but in humans it can lead to severe developmental defects or stillbirth. Down’s syndrome is a common congenital form of intellectual disability that results from the presence of three copies of chromosome 21 (trisomy 21) in the genome. The presence of extra chromosomes or the absence of a chromosome often leads to embryological abnormalities that are fatal before birth. Only in a few cases, such as Down’s syndrome or Klinefelter’s syndrome, do aneuploid embryos survive.
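The distinction between euploidy and aneuploidy can be expressed as a check on whether a chromosome count is a whole multiple of the haploid number. The following Python sketch is a simplification with a hypothetical function name: it looks only at the total count, so it cannot detect a cell that has gained one chromosome and lost another.

```python
HAPLOID_NUMBER = 23  # chromosomes in one human set (23 pairs in a diploid cell)

def classify_karyotype(chromosome_count, haploid_number=HAPLOID_NUMBER):
    """Label a chromosome count as euploid (whole sets) or aneuploid."""
    sets, extra = divmod(chromosome_count, haploid_number)
    if extra == 0 and sets >= 1:
        return f"euploid ({sets}n)"
    return "aneuploid"

for count in (23, 46, 47, 69):
    print(count, classify_karyotype(count))
```

A count of 47, as in trisomy 21, is reported as aneuploid, while 69 (three complete sets) is reported as euploid.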
2.5 DNA Replication
DNA replication is semiconservative and occurs in a 5’ to 3’ direction. The strands of the template DNA helix are separated and replicated to form two new helices, each consisting of one template strand and one newly synthesized strand. Replication is initiated at specific origin sites found within the DNA and results in the formation of two replication forks that travel away from the origin.
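Semiconservative replication can be illustrated by tracking which duplexes still contain a strand from the starting molecule. The Python sketch below is a toy model with hypothetical names; strands are represented only by the labels "original" and "new", with no sequence information.

```python
def replicate(duplexes):
    """One round of semiconservative replication.

    Each duplex is a pair of strand labels. Every parental strand serves
    as the template for one new strand, so each duplex yields two daughters.
    """
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "new"))
        daughters.append((strand_b, "new"))
    return daughters

def fraction_with_original_strand(generations):
    """Fraction of duplexes that still carry a strand of the starting molecule."""
    population = [("original", "original")]
    for _ in range(generations):
        population = replicate(population)
    carriers = sum("original" in duplex for duplex in population)
    return carriers / len(population)

for generation in range(1, 4):
    print(generation, fraction_with_original_strand(generation))
```

After the first round every duplex is a hybrid of old and new strands; from then on the two original strands persist in exactly two duplexes while the population doubles each round.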
2.51 Prokaryotic Replication
Prokaryotic replication is initiated at the OriC site by the formation of a prepriming complex. OriC is bound by 20-40 DnaA proteins that open the helix at specific regions within the origin. These open regions of the helix are bound by DnaB/DnaC hexamers that have helicase activity. As the DnaB/DnaC helicase expands the open region of the helix, single-stranded DNA binding protein binds the exposed DNA to prevent the helix from reforming while additional proteins bind to the DNA. The replication ‘bubble’ forms when DNA gyrase binds to each replication fork and primase forms the RNA primers for leading strand synthesis on each fork. DNA polymerase III then binds to the primer and replicates the DNA. As the forks enlarge, Okazaki fragments are synthesized on the lagging strand. Replication is terminated when the two expanding replication forks collide 180 degrees from OriC. The RNA primers of the Okazaki fragments are removed by DNA polymerase I, the nicks are sealed by DNA ligase, and mistakes made in replication are repaired.
2.51.1 Prokaryotic DNA Polymerases
DNA polymerases are enzymes that synthesize new DNA in a 5’-3’ direction from a template strand of DNA. These enzymes require an RNA or DNA primer to initiate the synthesis of new DNA. Three types of prokaryotic DNA polymerases have been identified, and all have 5’-3’ polymerase activity and 3’-5’ exonuclease activity. The 3’-5’ exonuclease activity is essential for proofreading the newly synthesized strand. If an improper nucleotide is incorporated into the new strand, the polymerase reverses direction and excises the improper nucleotide. The polymerase then resumes synthesizing the new strand of DNA. DNA polymerase I also possesses 5’-3’ exonuclease activity, which is essential for nick translation, the process used to remove the RNA primers from Okazaki fragments.

Polymerase       Function        Activity
Polymerase I     Repair enzyme   5'-3' polymerase; 3'-5' and 5'-3' exonuclease
Polymerase II    Repair enzyme   5'-3' polymerase; 3'-5' exonuclease
Polymerase III   Replicase       5'-3' polymerase; 3'-5' exonuclease
2.52 Eukaryotic Replication
Eukaryotic replication is initiated at multiple origin sites distributed throughout the genome by a mechanism that is thought to resemble prokaryotic initiation. Bidirectional DNA synthesis is then initiated. Histone displacement precedes the replication forks, and shortly after a fork passes, the histones rebind to the replicated DNA.
2.53 Eukaryotic DNA Polymerases
Four eukaryotic DNA polymerases have been identified, but their large size and multisubunit construction have complicated analysis.

Polymerase         Function                    Activity
Polymerase alpha   Replication and priming     5'-3' polymerase; 3'-5' exonuclease
Polymerase beta    Repair                      5'-3' polymerase; 3'-5' and 5'-3' exonuclease
Polymerase gamma   Mitochondrial replication   5'-3' polymerase; 3'-5' and 5'-3' exonuclease
Polymerase delta   Replication                 5'-3' polymerase; 3'-5' exonuclease
2.6 Mutation and Repair
Mutation is the creation of a heritable genetic change by a spontaneous or induced process. Spontaneous mutations occur naturally, often at mutational hot spots, and the rate at which they occur is often characteristic of a particular organism. Induced mutations are caused by exposure to an external agent such as ultraviolet light or a chemical mutagen. The major classes of mutation are transitions, transversions, insertions, and deletions. These mutations may be silent and cause no alteration to protein structure, or they may be classified as missense, nonsense, or frameshift mutations. Missense mutations cause an improper amino acid to be inserted into the protein during translation, with the potential to alter or eliminate the function of the mutated protein. Nonsense mutations result from the formation of a stop codon within the mRNA that causes premature termination of translation. The truncated protein that results from a nonsense mutation is often nonfunctional. Frameshift mutations alter the reading frame of the mRNA, producing an altered protein beyond the point of the frameshift. DNA repair mechanisms identify the class of mutation present and attempt to repair the defect before it is inherited by the offspring of the affected organism.
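The silent, missense, and nonsense categories can be made concrete by comparing the amino acid encoded before and after a single-codon change. The Python sketch below builds the standard genetic code for DNA codons from a compact string; the function name and the example codons are illustrative assumptions. It handles only substitutions within one codon, so frameshifts are outside its scope.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(codon, mutant_codon):
    """Classify a codon change as silent, missense, or nonsense."""
    before = CODON_TABLE[codon.upper()]
    after = CODON_TABLE[mutant_codon.upper()]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"

print(classify_substitution("GAA", "GAG"))  # both codons encode glutamate
print(classify_substitution("GAG", "GTG"))  # glutamate to valine
print(classify_substitution("TAC", "TAA"))  # tyrosine to stop
```

A change that converts a stop codon into an amino acid codon would be reported here as missense, which is a limitation of this simple three-way scheme.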
2.61 Transitions
Transitions are a type of mutation that affects a single base, converting it from one pyrimidine to another pyrimidine or from one purine to another purine. Transitions can produce silent, missense, and nonsense mutations.
Nitrous acid is a common chemical agent that oxidatively deaminates cytosine, converting it to uracil. When replicated, this uracil is replaced by a thymine to complete the conversion of a C-G basepair to a T-A basepair. Nitrous acid is also capable of deaminating adenine to change an A-T basepair to a G-C basepair.
Cytosine undergoes spontaneous deamination to uracil. The resulting uracil is recognized as abnormal and repaired, unless the cytosine was methylated at position 5 on the pyrimidine ring. Deaminated 5-methylcytosine is thymine, so it is not identified as abnormal and is not repaired. The 5-methylcytosine modification is generated by a cellular enzyme that methylates cytosine residues at specific sites within the DNA sequence. One consequence of the altered base is the creation of mutational hot spots where C-G base pairs spontaneously convert to T-A base pairs.
2.62 Transversions
Transversions are mutations that affect a single base, converting it from a purine to a pyrimidine or from a pyrimidine to a purine. Transversions can produce silent, missense, and nonsense mutations.
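The difference between a transition and a transversion depends only on whether the original and mutant bases belong to the same chemical class. A minimal Python sketch, with a hypothetical function name, is:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
DNA_BASES = PURINES | PYRIMIDINES

def classify_base_change(original, mutant):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    original, mutant = original.upper(), mutant.upper()
    if original not in DNA_BASES or mutant not in DNA_BASES:
        raise ValueError("bases must be A, C, G, or T")
    if original == mutant:
        raise ValueError("no substitution: both bases are the same")
    pair = {original, mutant}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(classify_base_change("C", "T"))  # pyrimidine to pyrimidine
print(classify_base_change("A", "C"))  # purine to pyrimidine
```

Of the twelve possible single-base substitutions, four are transitions and eight are transversions.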
2.63 Insertions and Deletions
Insertions and deletions are mutations that insert or remove a stretch of one or more nucleotides within a DNA sequence. Mobile genetic elements such as transposons or insertion sequences cause insertions and deletions. Insertions and deletions are noted for their ability to cause frameshift mutations.
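Whether an insertion or deletion shifts the reading frame depends on its length: codons are three nucleotides long, so only lengths that are not a multiple of three change the frame. The Python sketch below uses hypothetical names and a made-up sequence to show the effect on codon boundaries.

```python
def causes_frameshift(indel_length):
    """True when an insertion or deletion of this length changes the reading frame."""
    return indel_length % 3 != 0

def codons(seq):
    """Split a sequence into complete codons, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

wild_type = "ATGGCCTAA"                    # hypothetical coding sequence
mutant = wild_type[:3] + "A" + wild_type[3:]  # one base inserted after the first codon

print(codons(wild_type))
print(codons(mutant))
```

Every codon downstream of the inserted base is read differently in the mutant, which is why the protein is altered beyond the point of the frameshift.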
2.64 Photoreactivation
In prokaryotes, pyrimidine dimers are recognized by DNA photolyase. This enzyme binds to the pyrimidine dimer and, in a light-dependent process, breaks the covalent links formed by UV light between the adjacent bases. The DNA photolyase is then released from the DNA after directly repairing the potential mutation.
2.65 Excision Repair
In prokaryotes, altered base pairs, such as pyrimidine dimers, are recognized by the UvrA/B/C complex. This complex cleaves the sugar-phosphate backbone 7 nucleotides 5’ to the lesion and 3 nucleotides 3’ to the lesion. This short stretch of DNA is removed by the helicase activity of UvrD to allow DNA polymerase I to synthesize new DNA to replace the excised section. DNA ligase then reforms the sugar-phosphate backbone. A similar repair system is found in eukaryotes, and when it is defective in humans, it leads to a serious disease called xeroderma pigmentosum.
2.66 AP Repair
When bases are removed from DNA by ionizing radiation or a DNA glycosylase, an AP endonuclease cleaves the sugar-phosphate backbone near the apurinic or apyrimidinic site, and an exonuclease removes a short stretch of DNA containing the lesion. The resulting gap is then filled in by DNA polymerase I, and the sugar-phosphate backbone is reformed by DNA ligase.
2.67 Mismatch Repair
The mismatch repair apparatus follows behind replication forks to correct errors in replication. Because the template strand of DNA is methylated at the A residues in GATC sequences and the new DNA is not, it is possible to recognize which strand of DNA should be corrected. Mismatches are recognized by MutS and MutL, which bind to the mismatch. The sugar-phosphate backbone of the new strand of DNA is cleaved on either side of the mismatch by MutH. The helicase activity of MutU then removes this stretch of DNA so it can be replaced by DNA polymerase I and DNA ligase.
2.68 Removal of Uracil from DNA
When cytosine undergoes spontaneous deamination to uracil, the altered base is recognized by uracil-DNA glycosylase. This enzyme cleaves the sugar-nitrogen bond of the uracil to release the base from the sugar-phosphate backbone. The resulting apyrimidinic site is then repaired by the AP repair system to fully correct the lesion.
2.8 Genetic Exchange
Genetic exchange is an important means of establishing and maintaining genetic diversity within a population. It has great significance in medicine because it is a means by which microorganisms can become resistant to antibiotics, and it is a useful experimental tool for investigating gene expression.
2.81 Transformation
Transformation is a process in which an organism takes up extracellular DNA in such a way that the DNA can become incorporated into the genome or be maintained as extrachromosomal DNA. Some cells become receptive to transformation during specific phases of their life cycle, while others are not transformable without laboratory manipulation. To make cells competent for transformation, they can be treated with salts such as CaCl2 or exposed to high voltages to produce pores in their membranes. After the DNA gains access to the cell, restriction endonucleases may attack it. If the DNA survives, it can undergo recombination with the genome or be maintained extrachromosomally.
2.82 Transduction
Transduction is the process in which prokaryotes, and possibly eukaryotes, acquire new genetic material from viruses. If a virus accidentally packages a portion of the host genome, that DNA can be carried to other cells by the virus. When this virus-carried DNA enters a new cell, it can undergo recombination with the new host’s genome.
2.83 Conjugation
In some prokaryotes, a plasmid containing genes for a pilus confers the ability to transfer genetic material to other bacteria that do not carry the plasmid. Plasmid-containing bacteria, designated F+, use the pilus to attach to bacteria that lack the plasmid (F-) and transfer the plasmid. This produces two F+ bacteria, and can also result in the transfer of genomic DNA in some cases. Occasionally, the F plasmid will undergo recombination with the host genome and become integrated, producing an Hfr bacterium. When an Hfr bacterium transfers F plasmid DNA via the pilus, some genomic DNA may be transferred as well. This transferred genomic DNA can then undergo recombination with the DNA of the new host.
2.84 Bacteriophages
Bacteriophages are viruses that infect bacteria. These viruses can transfer genetic material via transduction or can alter the genotype of the host with the genetic material of the virus itself. Because of their ability to carry genetic information, bacteriophages are used as vectors in a manner similar to plasmids. Bacteriophages are studied as model systems for prokaryotic gene regulation and play an important role in infectious disease. In some cases, such as toxic shock syndrome, the infection of a benign bacterium with a phage can make it a virulent pathogen.
2.85 Sexual Reproduction
When the parental haploid gametes fuse during fertilization, they form a diploid organism that contains half of the genetic makeup of each parent. This produces a situation where an organism may have two different copies (one from each parent) of a single gene. These different copies are alleles and can be recessive, dominant, or codominant. Recessive alleles are ‘silent’ unless the organism has two recessive alleles, in which case the recessive phenotype is shown. Usually recessive alleles are defective genes that do not produce their product, but one functional copy can provide enough gene product for the organism to live. Dominant alleles always show their phenotype; a good example of a dominant allele is a fully functional gene. Codominant alleles are always shown in the phenotype, but the other allele is shown as well. The classical example of codominance is the ABO blood groups. The assortment of chromosomes during sexual reproduction provides much of the genetic diversity found in eukaryotes. Defective alleles can be compensated for to allow more offspring to survive, and genes from other populations can be incorporated into the gene pool with time.
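The inheritance of alleles from two parents can be enumerated as a Punnett square. The Python sketch below uses hypothetical function names and one-letter allele symbols. The ABO rule it encodes (A and B codominant, O recessive to both) is standard genetics but goes slightly beyond what these notes state.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Genotype counts among offspring for one gene (a Punnett square).

    Each parent is a two-letter genotype such as "Aa"; one allele from
    each parent is combined in every possible way.
    """
    offspring = ("".join(sorted(pair)) for pair in product(parent1, parent2))
    return Counter(offspring)

def abo_phenotype(genotype):
    """Blood group for an ABO genotype: A and B are codominant, O is recessive."""
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"
    if "A" in alleles:
        return "A"
    if "B" in alleles:
        return "B"
    return "O"

print(cross("Aa", "Aa"))
print(sorted(abo_phenotype(g) for g in cross("AO", "BO")))
```

A cross between two heterozygotes gives the familiar 1:2:1 genotype ratio, and an AO by BO cross can produce all four ABO blood groups.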
2.86 Transposons
Transposons are short stretches of DNA that have flanking inverted repeat regions. These transposons can undergo recombination to be incorporated into a host genome or plasmid. The DNA carried between the flanking sequences must contain a transposase gene, which is required to integrate the transposon into DNA, and may carry other genes. Genes carried by transposons often confer antibiotic resistance on bacteria, and this represents a major public health concern.
2.87 Recombination
Recombination is a reaction between homologous regions of DNA that results in physical rearrangement of the DNA molecule. This reaction produces a phenomenon known as crossing over, which is the exchange of parts of one chromosome for homologous parts of another chromosome. Single-stranded breaks in the DNA occur, and the area of homology expands until a chi sequence is reached. When this occurs, the DNA undergoes a conformational change that results in the exchange of part of one chromosome for the comparable part of another chromosome. Recombination occurs at a characteristic frequency that can be used to produce a genetic map. If two genes are separated by one centiMorgan (cM), they have a 1% chance of being separated by recombination during meiosis. As the physical distance between two genes on a chromosome increases, the chance of separation by recombination increases.
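The relationship between recombination frequency and map distance can be shown with a short calculation. The Python sketch below uses hypothetical names and made-up offspring counts, and it assumes the simple proportional rule stated above (1 cM corresponds to 1% recombination), which holds best for genes that are close together.

```python
def map_distance_cm(recombinant_offspring, total_offspring):
    """Map distance in centiMorgans from offspring counts (1 cM = 1% recombination)."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinant_offspring / total_offspring

def chance_of_separation(distance_cm):
    """Probability that two genes are separated in one meiosis, for short distances."""
    return distance_cm / 100.0

# Hypothetical cross: 12 recombinant offspring out of 400 scored.
distance = map_distance_cm(12, 400)
print(distance, "cM")
```

For widely separated genes, multiple crossovers make the observed recombination frequency underestimate the true distance, so the proportional rule should be treated as an approximation.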