Chapter 14:

Be able to discuss the details of the human genome in relation to chromosomes.
The genome is the complete set of DNA within a single cell of an organism. Chromosomes (made up of DNA and protein) are structures that each contain a chunk of the genome (some of the organism’s genes). Chromosomes help keep the genetic information neat, organized, and compact.

What are some of the advantages and disadvantages to working with aDNA?
Advantages:
-mtDNA is safer to work with since it is smaller
-nested primer technique
-usable up to a limit of about 1,000,000 years
Disadvantages:
-fragmented & degraded
-postmortem modifications (e.g. C→T, G→A)

Are Neanderthals classified as Homo sapiens?
Yes (as the subspecies Homo sapiens neanderthalensis, although some authors treat them as a separate species, Homo neanderthalensis)

An enormous amount of funding has been spent on GWAS. Has it been worth it?
GWAS = genome-wide association studies. GWAS have found patterns of variation that will be beneficial for constructing personalized medicine. GWAS allow comparison between wt and mutants to try to pinpoint differences.

Why would anyone be interested in the discipline of bioinformatics?
Bioinformatics - analysis of entire genomes, transcripts and products. Used to identify evolutionary relationships among species and functions of genes.

As a genomics researcher, would you be better served working with genomic DNA, cDNA or ESTs (expressed sequence tags) for a functional genomics project? What if you were working on an evolutionary genomics project?
-ORF detection using gDNA (coding region between typical START and STOP codons):
→size approximates a typical gene (7-8,000 bp)
→identifiable reading frame (1 of 3 possibilities per strand)
→expected promoter elements are near START (TATA, CCAAT, GC box, umCpG islands)
-using cDNA (complementary DNA):
→reverse transcribe mRNA transcripts to cDNA
→probe gDNA to verify reading frame for comparison
→helps identify introns and elucidate alternate splicing of exons; ID by known 5’ and 3’ intron splice motifs
-using ESTs (expressed sequence tags):
→shorter sequences of ss 5’ or 3’ cDNA, read in one direction, used to locate a gene
→can determine boundaries of a transcript when used as probes
→being cDNA, ESTs sometimes span introns
cDNAs and ESTs can be used to reveal exons or gene ends in genome searches.
Functional genomics project: ESTs, because they are short sequences of cDNA that represent expressed coding sequence and are cheap to use.
Evolutionary genomics project: gDNA, because it is the whole genome.

Be able to explain a very simplified version of the steps necessary to sequence a gene, gene region, intergenic spacer, or intron.
Sequence a gene:
1) Isolate genomic DNA (liquid N2, crush, DNA extraction to make a pellet, characterize product size with DNA ladders)
2) Sequence region of interest (design primers for forward and reverse strands; why both directions? Checks for errors)
3) Assemble contigs; WHY? Confirmation that returned reads are correct
4) Alignment; WHY? Allows you to make comparisons to locate problematic regions like indels (frameshift in ORF), reversions (different products), nonsense (shortened products). These result in LOF, GOF, and disease.
5) Construct a cladogram; WHY? Trace character evolution; see where the problem came from and perhaps how other organisms deal with it.

How would your approach differ if you were interested in creating an entire genome? Understand the very basics of Sanger sequencing versus pyrosequencing. How does a beaded approach work?
For an entire genome: shotgun sequencing.
Sanger sequencing: DNA synthesis is stopped by the incorporation of a modified nucleotide, a dideoxynucleotide (ddNTP). To the primed DNA template we add: DNA pol, the 4 dNTPs (dATP, dTTP, dCTP and dGTP) and 1 chain-terminating ddNTP per reaction (ddATP/ddTTP/ddCTP/ddGTP).
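The chain-termination idea behind Sanger sequencing can be sketched in a few lines of Python. This is a toy model, not a description of any real instrument: it assumes four separate reactions (one ddNTP each) and that every possible terminated fragment is produced. The function names (new_strand, sanger_fragments, read_gel) and the template are made up for illustration.

```python
# Toy model of Sanger chain termination (illustrative assumptions only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def new_strand(template_3_to_5):
    """Strand the polymerase builds (5'->3') along a template written 3'->5'."""
    return template_3_to_5.translate(COMPLEMENT)


def sanger_fragments(strand):
    """For each ddNTP reaction, list the fragment lengths it can produce.

    A ddNTP stops synthesis wherever its base is incorporated, so the
    ddA reaction yields one fragment ending at every A, and so on.
    """
    lengths = {base: [] for base in "ACGT"}
    for position, base in enumerate(strand, start=1):
        lengths[base].append(position)
    return lengths


def read_gel(lengths):
    """Recover the sequence by ordering fragments from shortest to longest."""
    ordered = sorted(
        (length, base) for base, sizes in lengths.items() for length in sizes
    )
    return "".join(base for _, base in ordered)


template = "TACGGT"                   # hypothetical template, written 3'->5'
strand = new_strand(template)         # "ATGCCA"
fragments = sanger_fragments(strand)  # A: [1, 6], C: [4, 5], G: [3], T: [2]
print(read_gel(fragments))            # ATGCCA
```

Reading the fragments from shortest to longest gives the newly synthesized strand, which is why the method depends on separating fragments that differ by a single nucleotide.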
Pyrosequencing:
-magnetic bead with primer attached sits in a water droplet in oil; requires 4 enzymes = polymerase, sulfurylase, luciferase and apyrase
-amplify DNA in H2O droplets in oil solution; each droplet contains DNA template attached to a primer-coated bead
-luciferase generates light to detect nucleotide additions
-faster and cheaper; enormous quantity of product but each read is short
Beaded methods: (-) charged DNA attracted to (+) bead

What is the role of a primer? Name 3 things to consider when designing a primer.
Primer – short chain of nucleotides; based on the sequence of the adjacent vector DNA, it guides the sequencing reaction into the insert
-melting temperature (~55-72°C)
-annealing temperature (~5°C below the melting temperature)
-G-C content of 40-60%
-identify primers incapable of forming 2° structures
-no self-complementarity for > 4 bp (because it will form a loop)
-no complementarity to other primers for > 4 bp
-no long runs of a single base, e.g. AAAAAAAA…
-~18-30 bp in length

Indels are frequent in eukaryotic genomes; how can one determine whether an indel event is an insertion or a deletion event? How does secondary structure affect insertions, deletions, and inversions during transcription?
Compare sequences to closely related organisms to see what they have in that region. Is a piece missing or is something added? That explains the indel. Look at the mRNA loop (2° structure).

When annotating genomic DNA, what are some key features that may indicate the presence of a gene? (You should be able to name at least five things.)
1) TATA box
2) Start and stop codons
3) CCAAT box
4) GC richness
5) umCpG
6) Size of gene in ORF

If ESTs are quicker and cheaper to produce than contigs, why isn’t all genomic work based on ESTs?
ESTs are based on cDNA, so they show a gene’s location but not its introns. Introns may have a role in RNA interference or alternative transcripts (splicing).

In the world of genomics, what is a query? Why would anyone query a sequence?
Query – a nucleotide or protein sequence submitted to a database search to generate alignments against other sequences. Used to identify similar functions of genes across species, or relatedness.
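Several of the primer design criteria listed above (length, G-C content, runs of a single base) are easy to check mechanically. The Python sketch below is a rough screen, not a substitute for primer design software. The melting temperature uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is not from these notes and is only a crude estimate. The run limit of 4 and all function names are likewise assumptions chosen for illustration.

```python
import re


def gc_fraction(primer):
    """Fraction of bases that are G or C."""
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Crude melting temperature estimate in °C (Wallace rule of thumb)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def longest_run(primer):
    """Length of the longest stretch of one repeated base."""
    return max(len(m.group(0)) for m in re.finditer(r"(.)\1*", primer))


def check_primer(primer, max_run=4):
    """Screen a non-empty primer against the criteria from the notes."""
    primer = primer.upper()
    return {
        "length_18_30": 18 <= len(primer) <= 30,
        "gc_40_60": 0.40 <= gc_fraction(primer) <= 0.60,
        "tm_55_72": 55 <= wallace_tm(primer) <= 72,
        "no_long_run": longest_run(primer) <= max_run,  # max_run is an assumption
    }


# Hypothetical primer used only to exercise the checks.
print(check_primer("ATGCGTACGTTAGCCTAGGA"))
```

The sketch does not test self-complementarity or complementarity between primers, which need a comparison against reverse complements.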
What is the ‘25th’ chromosome in humans and why is it considered so unique?
22 autosomes, X chromosome, Y chromosome, mt chromosome (and, in plants, the chloroplast chromosome, cpDNA). The 25th is the mt chromosome: it comes from the mother and carries genetic information for ancestral lineage, but only through the mother’s side.

What is being referred to when a researcher says a ‘coding gene’?
A coding gene is a gene that is expressed as a protein product.

Does ‘junk’ DNA exist? Are introns truthfully ‘junk’?
Yes, junk DNA exists. However, the idea that “junk DNA” doesn’t do anything is untenable (not supported), because a lot of it is transcribed into RNA products or contains promoters, enhancers and other regulatory elements. True junk DNA consists of pseudogenes and retroposons (RNA copied back into DNA), and even some of these will become functional, create new functions in the future, or contain RNAi segments.

The operon system used to be considered unique to prokaryotes; why is this potentially considered inappropriate now?
Operon system in prok: multiple genes sharing one promoter. This is now also known in some euk., like the nematode C. elegans and the fruit fly Drosophila melanogaster.

What is a multigenic transcript?
Transcript of multiple genes produced by multiple promoters (euk). Polycistronic in prok.

What is the implication of a promoter being bidirectional?
A bidirectional promoter can drive transcription in either direction. The direction depends on where molecules bind to it, which allows expression of whichever genes are needed.

How can a mitochondrion exist with so few genes? Why does it retain genes for tRNA and rRNA? Why are introns lacking in mtDNA?
Most of its genes moved into the nucleus by horizontal gene transfer, so the nucleus maintains control of the mt. It needs its own tRNA because mt ribosomes are small, and tRNAs from the nucleus won’t fit in them for making mt protein. It needs its own rRNA for the large and small subunits to make those ribosomes.

If 12p31 contains umCpG islands, would you expect this position to be telomeric, subtelomeric or centromeric?
If CpG islands are beneficial in transcriptionally active regions, why aren’t they more common throughout the genome?
Subtelomeric – just below the telomeres. Telomeres and centromeres are heterochromatin and noncoding.
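Whether a stretch of sequence looks like a CpG island can be estimated from its base composition. The Python sketch below computes G-C content and the observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G). The default thresholds (at least 200 bp, G-C content above 50%, ratio above 0.6) are a commonly used convention and are not taken from these notes. The function names are hypothetical, and sequence alone cannot show whether a CpG is methylated.

```python
def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio; 0.0 if the sequence lacks C or G."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the conventional thresholds (assumed, not from the notes)."""
    return (
        len(seq) >= min_len
        and gc_content(seq) > min_gc
        and cpg_obs_exp(seq) > min_ratio
    )


# Artificial sequences used only to exercise the functions.
print(looks_like_cpg_island("CG" * 100))  # True
print(looks_like_cpg_island("AT" * 100))  # False
```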
Why are processed pseudogenes more likely to become inactivated than nonprocessed pseudogenes? If a pseudogene is inactive, why doesn’t the host cell eventually purge it since it seems to no longer have a protein-coding function?
Pseudogenes – ORFs or partial ORFs that are nonfunctional or inactive due to the manner of their origin or to mutations; inactivated by LOF (chop off the middle or make nonsense mutations)
May have a function for the host cell (RNAi); evolutionarily beneficial due to abundance & homology
→antisense pseudogene may regulate parent gene
→some produce siRNA
→some take on a new function (neofunctionalization) via exaptation (evolved for one function and now has a different function):
-XIST = silences one of a female’s two Xs
-imprinting = silences certain parental alleles
-non-processed =
→copying took place at the genomic DNA level
→intronic, exonic & occasionally upstream promoters
→rarer; ~4000 in humans
-processed =
→reverse transcription of mRNA into cDNA; exonic
→~8000 in humans (why nearly double the number?)

Explain the pros and cons to an RNA world.
RNA World Hypothesis:
-RNA was the original genomic molecule
-autocatalytic in some instances, e.g. ribozymes (some RNases, such as RNase P, are ribozymes)
-lots of housekeeping roles for RNA
→key to protein synthesis: tRNA, mRNA, rRNA, snRNA, snoRNA
Con: unstable
With DNA:
-greater stability; lack of 2’-OH (deoxy) makes it less prone to hydrolytic cleavage; methylated thymine
-double-stranded nature offers greater fidelity during replication

Why are linear chromosomes pervasive in eukaryotes and not so in prokaryotes?
Greater stability

Know the most basic difference between tRNA, rRNA, mRNA, siRNA, miRNA, piRNA, lncRNA, snRNA and RNase.
tRNA – transporters that move amino acids to the ribosome, which pieces them together to make a protein
rRNA – makes up the large and small ribosomal subunits and also has the catalytic power to piece the amino acids together
mRNA – template to produce a protein
siRNA – 20-25 bp, post-transcriptional gene silencing, RNAi: dsRNA is separated into ss (guide) that binds to mRNA and chops it up
miRNA – ~22 bp, post-transcriptional gene silencing and modification, RNAi: dsRNA is separated into ss that imperfectly binds to mRNA and blocks translation; some miRNAs act as TFs that upregulate transcription
piRNA – RNAi pathway typically found in germ cells (which produce gametes) that disrupts transposons, thereby silencing their potentially harmful effects
lncRNA – regulate gene transcription; X inactivation
snRNA – ribozymes (RNAs that act like enzymes) that form the spliceosome machinery, which cuts out introns and pieces together exons in eukaryotes during processing of the primary transcript
RNase – ribonuclease; a type of nuclease that catalyzes the degradation of RNA into smaller components

In regard to genes, if a homolog can become an ortholog or a paralog, can a paralog ever become an ortholog?
Evolutionary history of a group = phylogeny
Closely related genes (DNA sequence and amino acid sequence of the proteins they encode) = homologs
Genes at the same genetic loci (location) in different species; inherited through a common ancestor = orthologs
Homologous genes at different genetic loci (location) in the same organism, arising through duplication of genes within a genome = paralogs
Yes, through mutations or transposons

How do people use phylogenetic cladograms in research? Why are ‘old-school’ phylogeneticists obsessed with parsimony? (We didn’t discuss this in class but most modern phylogeneticists resolve cladograms using maximum likelihood and Bayesian modeling now.)
Cladograms – to show the relationship between species.
Parsimony – using a common ground to distinguish relationships between species; the easiest explanation (the one requiring the fewest changes) is preferred.

How has molecular systematics advanced our understanding of evolutionary relationships between organisms?
Molecular systematics is considered unbiased. It gives the ability to study and distinguish differences among individuals and species over time, and to narrow down a common ancestor among them.
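The parsimony idea mentioned above, preferring the tree that needs the fewest changes, can be made concrete with a small Python sketch of Fitch's counting method for one character on a rooted binary tree. The trees and character states below are invented for illustration, and real analyses score many characters across many candidate trees.

```python
def fitch(node):
    """Return (possible ancestral states, minimum number of changes).

    A leaf is a single-character state such as "A"; an internal node is a
    pair (left, right) of subtrees.
    """
    if isinstance(node, str):
        return {node}, 0
    left, right = node
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)
    shared = left_states & right_states
    if shared:
        # The children can agree on a state, so no extra change is needed here.
        return shared, left_changes + right_changes
    # The children disagree, so one change must have happened on a branch below.
    return left_states | right_states, left_changes + right_changes + 1


# Four hypothetical taxa with states A, A, G, G, grouped two different ways.
tree_1 = (("A", "A"), ("G", "G"))
tree_2 = (("A", "G"), ("A", "G"))
print(fitch(tree_1)[1])  # 1
print(fitch(tree_2)[1])  # 2
```

Under parsimony, tree_1 is preferred because it explains the same observations with one change instead of two.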
Is a mouse model appropriate in medical research? Why not use chimpanzees like researchers did for decades during the mid-to-late 1900s?
Mice are easy to keep, have a short life span, reproduce rapidly, and their genome is known. They are also genetically similar to humans. Cell culture is best.

Explain the concept of synteny.
Synteny – the order of genes on a chromosome for one species is the same for another species

If your research is able to compile a genome, why would anyone else have to continue on with your findings and produce transcriptomes, proteomes, interactomes, or epigenomes?
Not all genes in the genome are expressed, so a transcriptome is needed. Post-transcriptional or co-transcriptional modifications mean a proteome is needed. Because of the complexity of signaling pathways, few proteins work in isolation, so interactomes are needed.

Who cares if you can introduce GFP into a tissue sample or an organism? Why would anyone be interested in doing that in the first place?
For identifying the location of the target of interest.

What is targeted mutagenesis? If one can do this, why the need for crossing experiments and phenotypic scoring?
Targeted mutagenesis – specific chemicals that cause mutations in chosen areas. Crossing and scoring are still needed for testing and for identifying carriers of dominant and recessive conditions.

Is ChIP-chip technology some kind of way that Martha Stewart figured out for getting more chocolate chips into a cookie?
ChIP is a technique for isolating the DNA and its associated proteins in a specific region of chromatin so that both can be analyzed together. ChIP-chip is used to identify multiple binding sites in a sequenced genome. Proteins bound to many genomic regions are immunoprecipitated. After cross-linking is reversed, the DNA fragments are labeled and used to probe microarray chips that contain the entire genomic sequence of the species under study.
Microarray chips can detect differences in gene expression.
-Create a target cDNA sample
-Extract transcripts from sample at a certain time and make cDNA probes
-Probes hybridize (bind) to original cDNA sample to show what genes were being expressed