NEUROGENOMICS: APPLICATIONS AND ANALYSIS

Diego A. Forero, MD, PhD (c) 1,2,3,4,5

1 Applied Molecular Genomics Group, Department of Molecular Genetics, Flanders Institute for Biotechnology (VIB); 2 University of Antwerp, Antwerp, Belgium; 3 Laboratory of Developmental Genetics, VIB; 4 Catholic University of Leuven, Leuven, Belgium; 5 Grupo de Neurociencias, Universidad Nacional de Colombia, Bogotá, Colombia.
Email: [email protected]
http://users.skynet.be/dforero/index.htm

I have consolidated a set of exercises in which you can apply different in silico approaches to common research problems in genetics and genomics. The application of these tools is expected to enhance the design and analysis of neurogenomics experiments in terms of scope, precision and speed. All the bioinformatics tools required to solve these exercises are listed on my website: http://users.skynet.be/dforero/df9.htm

1. Identify the number of haplotype blocks found in the following human genes:
-CREM gene in European population
-GABRA6 gene in African population
-BDNF gene in Asian population
-LMNA gene in African population
-PRNP gene in European population

2. Identify the tagging SNPs for the following human genes:
-GRIA2 gene in European population
-PDE4B gene in African population
-HTR2C gene in Asian population
-KCNA2 gene in African population
-RIMS3 gene in European population

3. Find the top 10 candidate targets for each of the following human microRNAs:
-hsa-mir-132
-hsa-mir-134
-hsa-mir-7
-hsa-mir-135b
-hsa-let-7a

4. Identify the predicted secondary structures of the following human miRNAs:
-hsa-mir-132
-hsa-mir-134
-hsa-mir-7
-hsa-mir-135b
-hsa-let-7a

5. Retrieve the tissue with the highest expression in humans for each of these genes:
-APOE
-CREM
-BDNF
-PRNP
-BACE1

6. Retrieve the dbSNP identifiers for the following human variations:
-GCTGTAGGCCAGACCCTGGCA(A/C)GATCTGGGTGGATAATC
-AAATGAGGACTTCTGACCTC(A/G)AACGCTGCCCTTGTTCTT
-GCAGCCGGACAAACTTGCCCTCCTC(A/G)CCACCTCCTCCAC
-ACTATTAATGATAATACT(A/G)TCTCTCATTTATTGAGCATT
-CTGACACTTTCGAACAC(A/G)TGATAGAAGAGCTGTTGGATG

7. Identify the top 10 candidate genes for Alzheimer disease and the top 10 candidate genes for Parkinson disease (based on meta-analyses of published association studies).

8. Retrieve the list of known genes located in the following human genomic regions:
-9q34.3
-21q21.3
-17p13.1
-11q23.3
-1q23.2

9. Identify the repeat sequences that are present in the following human genomic regions:
-chr17:8279904-8312206
-chr2:86247142-86276108
-chr6:16846682-16869700
-chr1:40858939-40903911
-chr6:163755665-163914884

10. Identify the effects on transcription factor binding sites of the following SNPs:
-rs34706444
-rs12028379
-rs5774713
-rs12239355
-rs17129477

11. Identify the vector sequences that are present in the following DNA fragments:
-acacctttgaggtgaaagagtattcagtgaatatgatggtcatgatgatgtcaccttggatttaaggcattttcttaagatgtgtaaagtatgttcctttagccgccaccgcggtggagctcccagcttttgttcccttta
-tatctgggctttagtttctccatcattacaatgaagagatgtgctatccttttccaccctgttctaaaattgtgtaacttttttttttcttttttgagacatgcacgagtgggttacatcgaactggatctcaacagcggt
-gtagtcaggattctgctgacctgcttacagggcactaaatacctgaggaggcaggagcttgggggaaagctgagaggtatctatccccatctacctactgatggagttccgcgttacataacttacggtaaatggcccgcc

12. Identify the top candidate variations in the following human DNA sequence traces: You will use a file with the chromatograms of 96 subjects sequenced for a 500 bp region.

13. Retrieve the genomic lengths, protein lengths, chromosomal positions and number of exons for the following genes:
-PLXNA2
-NRG1
-MTHFR
-DTNBP1
-SLC6A4


14. Identify the homologues in mouse and Drosophila of the following human genes:
-SV2A
-PDE4B
-DRD1
-SYT1
-RGS4

15. Design overlapping PCR primers to sequence the following human genomic regions:
-chr1:40,879,177-40,883,673
-chrX:77,256,575-77,258,830
-chr8:26,530,136-26,532,811
-chr4:122,960,094-122,962,212
-chr5:161,054,462-161,056,347

16. Identify the differential GO and KEGG terms in the following two lists of human genes:
List 1. GPR51, GRIA2, KIF5C, MBP, MEF2C, NAP1L3, NCDN, NDRG4, NEFL, NRGN, NTRK2, OLFM1
List 2. AKAP6, BRF1, CCNA2, DST, MACF1, NBEA, RAB11A, RANBP5, SEC8L1, SYNE1, ZFYVE20, ZNF490

17. Identify the proteins encoded by the following RNA sequences:
-atggaaaaccccagcccggccgccgccctgggcaaggccctctgcgctctcctcctggccactctcggcgccgccggccagcctcttgggggagagtccatctgttccgccagagccccggccaaatacagcatcaccttcacg
-atggagctggaccaccggaccagcggcgggctccacgcctaccccgggccgcggggcgggcaggtggccaagcccaacgtgatcctgcagatcgggaagtgccgggccgagatgctggagcacgtgcggcggacgcaccggcac
-atgggcttgttagagtgctgtgcaagatgtctggtaggggccccctttgcttccctggtggccactggattgtgtttctttggggtggcactgttctgtggctgtggacatgaagccctcactggcaca
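A natural first step for exercise 17 is to translate each sequence into its amino acid sequence, which can then be submitted to a protein similarity search. The following is a minimal sketch, assuming the standard genetic code and reading frame 1. The function name `translate` and the short test fragment (the first 30 nucleotides of the first sequence) are choices made for illustration, not part of the original exercise.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA or RNA sequence in frame 1, stopping at the first stop codon."""
    # Remove whitespace left over from line wrapping and treat U as T.
    cleaned = "".join(seq.split()).upper().replace("U", "T")
    protein = []
    for i in range(0, len(cleaned) - 2, 3):
        aa = CODON_TABLE.get(cleaned[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nucleotides of the first sequence in exercise 17.
fragment = "atggaaaaccccagcccggccgccgccctg"
print(translate(fragment))
```

The translated peptide only suggests a candidate; identifying the actual protein still requires a database search with the full sequence.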

18. Identify the cDNAs encoding the following protein sequences:
-LCADARMYGVLPWNAFPGKVCGSNLLSICKTAEFQMTFHLFIAAFVGAAATLVSLLTFMIAATYNFAVLKLMGRGTKF
-EMMDLQHGSLFLRTPKIVSGKDYNVTANSKLVIITAGARQQEGESRLNLVQRNVNIFKFIIPNVVKYSPNCKLLIVSN
-MVDMMDLPRSRINAGMLAQFIDKPVCFVGRLEKIHPTGKMFILSDGEGKNGTIELMEPLDEEISGIVEVVGRVTAKAT

19. Find the hierarchical clustering of the following list of genes:

        Tiss1     Tiss2     Tiss3     Tiss4     Tiss5     Tiss6     Tiss7     Tiss8
Gene 1  0,052905  0,058392  0,06977   0,056961  0,074954  0,061005  0,050068  0,059917
Gene 2  0,0336    0,095512  0,061694  0,036708  0,050386  0,042539  0,030157  0,056136
Gene 3  0,021603  0,024434  0,021238  0,018759  0,01518   0,015751  0,012132  0,027813
Gene 4  0,01405   0,018037  0,008364  0,010938  0,017524  0,006858  0,005407  0,016314
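One way to approach exercise 19 without dedicated software is a small agglomerative clustering routine. The sketch below assumes that the table holds one row per gene and one column per tissue (Tiss1 to Tiss8), with decimal commas converted to decimal points. Euclidean distance and average linkage are choices made here for illustration, since the exercise does not prescribe a distance or linkage method.

```python
from itertools import combinations
from math import dist

# Expression values per gene across Tiss1..Tiss8 (assumed table layout).
expression = {
    "Gene 1": [0.052905, 0.058392, 0.06977, 0.056961, 0.074954, 0.061005, 0.050068, 0.059917],
    "Gene 2": [0.0336, 0.095512, 0.061694, 0.036708, 0.050386, 0.042539, 0.030157, 0.056136],
    "Gene 3": [0.021603, 0.024434, 0.021238, 0.018759, 0.01518, 0.015751, 0.012132, 0.027813],
    "Gene 4": [0.01405, 0.018037, 0.008364, 0.010938, 0.017524, 0.006858, 0.005407, 0.016314],
}


def average_linkage(a, b):
    """Mean Euclidean distance between all pairs of genes from clusters a and b."""
    total = sum(dist(expression[x], expression[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def cluster(names):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_linkage(*pair))
        merges.append((a, b, average_linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = cluster(list(expression))
for left, right, height in merges:
    print(f"{' + '.join(left)}  <->  {' + '.join(right)}  (distance {height:.4f})")
```

The merge history is equivalent to a dendrogram read from the leaves upward. A different distance (for example, one based on correlation) may group the genes differently.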

20. Find the genes that have their highest expression in prefrontal cortex (200-fold enrichment in comparison with other tissues); repeat for amygdala.

21. Identify the transcripts that are targeted by the following Affymetrix probes:
-204312_x_at
-207630_s_at
-210400_at
-212581_x_at
-201891_s_at

22. Identify the haplotypes present in the following dataset, including their frequencies, and calculate the LD values between SNPs.

        S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13
subj1   CT  AG  AC  TT  AC  CT  GG  CC  CT  AG  CT  AA  AG
subj2   TT  GG  AA  TT  CC  CC  GG  CC  CT  AG  CT  AG  AG
subj3   CT  AG  AA  TT  CC  CC  GG  CC  CC  GG  TT  AA  AG
subj4   CT  AG  AA  CT  CC  CC  AG  CT  CT  GG  CT  AG  AG
subj5   CT  AG  AA  TT  CC  CC  GG  CC  CT  AG  CT  AG  AG
subj6   CT  AG  AA  TT  CC  CC  GG  CC  CC  GG  TT  AA  AA
subj7   CT  AG  AA  TT  CC  CC  GG  CT  CT  GG  CT  AG  AG
subj8   CC  AG  AC  TT  CC  CC  GG  TT  TT  GG  CC  GG  GG
subj9   TT  GG  CC  TT  AA  TT  GG  CT  CT  GG  CT  AG  AG
subj10  TT  GG  AA  TT  CC  CC  GG  CC  CT  AG  CT  AG  AG
subj11  CT  AG  AA  TT  CC  CC  GG  CC  CT  AG  CT  AA  AG
subj12  CT  GG  AA  TT  CC  CC  GG  TT  TT  GG  CC  GG  GG
subj13  CT  AG  AA  TT  CC  CC  GG  CC  CC  GG  TT  AA  AG
subj14  TT  GG  AA  CT  CC  CC  AG  CC  CT  AG  CT  AG  AG
subj15  CT  AG  AA  TT  CC  CC  GG  CC  CC  GG  TT  AA  AG
subj16  CT  AG  AA  TT  CC  CC  GG  CC  CT  AG  CT  AG  AG
subj17  CT  GG  AC  TT  AC  CT  GG  TT  TT  GG  CC  GG  GG
subj18  CC  AA  AA  TT  CC  CC  GG  CC  CT  AG  CT  AG  AG
subj19  CT  AG  AA  TT  CC  CC  GG  CC  CC  GG  TT  AA  AA
subj20  CC  AA  AC  TT  CC  CT  GG  CC  CC  GG  TT  AA  AA
subj21  CT  AG  AA  TT  CC  CC  GG  CC  CC  GG  TT  AA  AG
subj22  CC  AA  CC  TT  AA  TT  GG  CC  CC  GG  TT  AA  AG
subj23  TT  GG  AA  TT  CC  CC  GG  CC  CC  GG  TT  AA  AG
subj24  TT  GG  AA  TT  CC  CC  GG  CC  CC  GG  TT  AA  AA
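Before running a haplotype program on the genotype dataset of exercise 22, it can help to check allele frequencies and a rough pairwise LD estimate by hand. The sketch below uses only the S1 and S2 genotypes of the 24 subjects and computes a genotype-based r² (the squared correlation of allele dosages). This is an approximation that needs no phased haplotypes, and it is not the haplotype-based D' or r² that dedicated LD software reports. The function names are illustrative.

```python
from statistics import mean

# Genotypes of subj1..subj24 at SNPs S1 and S2, copied from the dataset.
s1 = "CT TT CT CT CT CT CT CC TT TT CT CT CT TT CT CT CT CC CT CC CT CC TT TT".split()
s2 = "AG GG AG AG AG AG AG AG GG GG AG GG AG GG AG AG GG AA AG AA AG AA GG GG".split()


def allele_dosage(calls, allele):
    """Number of copies (0, 1 or 2) of the given allele in each genotype call."""
    return [call.count(allele) for call in calls]


def allele_frequency(calls, allele):
    """Frequency of the allele among all chromosomes in the sample."""
    return sum(allele_dosage(calls, allele)) / (2 * len(calls))


def genotype_r2(x, y):
    """Squared Pearson correlation between two vectors of allele dosages."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov * cov / (var_x * var_y)


print(f"S1 frequency of C: {allele_frequency(s1, 'C'):.3f}")
print(f"S2 frequency of A: {allele_frequency(s2, 'A'):.3f}")
r2 = genotype_r2(allele_dosage(s1, "C"), allele_dosage(s2, "A"))
print(f"Genotype-based r2 between S1 and S2: {r2:.3f}")
```

The same functions can be applied to any other pair of columns. Haplotype frequencies themselves require a phasing method (for example, an EM algorithm), which is outside the scope of this sketch.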

23. Identify the predicted functional effects of each of the following nsSNPs:
-rs28931579
-rs769452
-rs28931577
-rs11542040
-rs11542035

24. Retrieve the genomic sequence for all the exons (including 50 bp of flanking sequence) of the following genes:
-RGS4
-RIMS3
-RTN1
-SLC1A3
-SNAP25

25. Identify the interacting partners for each of the following genes:
-MEF2C
-NAP1L3
-NCDN
-NDRG4
-NEFL

26. Identify which of the following P values pass a False Discovery Rate of 0.05:
0,650106935, 0,308093469, 0,463145394, 0,19572116, 0,112681844, 0,493084372, 0,043017213, 0,515230709, 0,098477813, 0,276669253, 0,4536028, 0,927263525, 0,000763073, 0,391324056, 0,381511095, 0,003431856, 0,206671413, 0,354702281, 0,25477432
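Exercise 26 can be checked by hand with a few lines of code. The sketch below assumes the Benjamini-Hochberg step-up procedure, which the exercise does not name explicitly, and rewrites the 19 P values with decimal points instead of decimal commas. The function name `benjamini_hochberg` is illustrative.

```python
# P values from exercise 26, with decimal commas converted to decimal points.
p_values = [
    0.650106935, 0.308093469, 0.463145394, 0.19572116, 0.112681844,
    0.493084372, 0.043017213, 0.515230709, 0.098477813, 0.276669253,
    0.4536028, 0.927263525, 0.000763073, 0.391324056, 0.381511095,
    0.003431856, 0.206671413, 0.354702281, 0.25477432,
]


def benjamini_hochberg(pvals, q=0.05):
    """Return the P values declared significant at FDR level q (input order kept)."""
    m = len(pvals)
    ranked = sorted(pvals)
    # Find the largest rank k such that p(k) <= k * q / m.
    cutoff_rank = 0
    for rank, p in enumerate(ranked, start=1):
        if p <= rank * q / m:
            cutoff_rank = rank
    if cutoff_rank == 0:
        return []
    threshold = ranked[cutoff_rank - 1]
    return [p for p in pvals if p <= threshold]


significant = benjamini_hochberg(p_values)
print(significant)
```

Note that 0,043017213 falls below 0.05 as a nominal P value but does not pass this procedure, because its rank-specific threshold (3 × 0.05 / 19) is much smaller.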


27. Identify the top 10 down-regulated genes in post-mortem schizophrenia brains; repeat for bipolar disorder.

28. Design PCR primers that allow the cloning of the following fragments:
-chrX:77256575-77256975; EcoRI and HindIII
-chr8:26530136-26530636; HindIII and XbaI
-chr6:16846682-16846982; EcoRI and XbaI
-chr1:40858939-40859339; HindIII and EcoRI

29. Identify the genomic regions that are amplified using the following PCR primer pairs:
-F-ATGGAGTGGCTAGAAGAGTCAG R-TGGATCATTTGCGATTTCCAGTT
-F-AGGGCTTCCTTATGTCCTCCA R-TACCCACGTACCATTAGGAGC
-F-AAAAGCAGGAGTGTGATGACG R-CGATCCCAAGTGTGTTACTGG

31. Identify the maximum LOD score simulated for the following pedigree:

32. Identify the nucleotide that is conserved in mouse and rat for the following SNPs:
-rs9817739
-rs1937690
-rs7973772
-rs278151
-rs10128858

33. Design primers to genotype the following SNPs by AS-PCR:
-rs974849
-rs246835
-rs12768718
-rs10185953
-rs5753220

34. Design primers to genotype the following SNPs by PCR-RFLP:
-rs16949418
-rs4979416
-rs4852259
-rs11593916
-rs10488140

35. Identify the number of citations for the papers with the following PMIDs:
-17173049
-16862116
-8895455
-818641
-17571346

36. Identify the predicted network of interactions for the following genes: CAMK2B, DNER, DNM1, EEF1A2, ELAVL4, GFAP

37. Identify the best predicted drug compound that can modulate the activity of the following genes:
-CAMK2B
-NTRK2
-VDAC1
-CCNA2
-PDE4B

38. Design PCR primers to differentiate between cDNA and genomic DNA for the following genes:
-TF
-TU3A
-TUBB4
-UCHL1
-VSNL1

39. Identify the most suitable journal to publish a hypothetical paper with the following abstract:
Human memory is a polygenic trait. We performed a genome-wide screen to identify memory-related gene variants. A genomic locus encoding the brain protein KIBRA was significantly associated with memory performance in three independent, cognitively normal cohorts from Switzerland and the United States. Gene expression studies showed that KIBRA was expressed in memory-related brain structures. Functional magnetic resonance imaging detected KIBRA allele-dependent differences in hippocampal activations during memory retrieval. Evidence from these experiments suggests a role for KIBRA in human memory.

40. Identify the significant SNPs in a genome-wide association study and identify possible runs of homozygosity in the same dataset. You will download a publicly available dataset with results from about 500.000 SNPs.

41. Identify SNPs that are located in conserved transcription factor binding sites in chromosome 1; retrieve SNPs that are located in microRNA binding sites in chromosome 2.

42. Identify the Ensembl IDs for the genes listed in exercise 36.

DF, 03-2008

If you use these exercises for teaching purposes, please cite the original source; if you have comments or suggestions, please do not hesitate to contact me by email.

