Blast

BIO4320 Lecture Materials, Prepared by Dr. Hon-Ming Lam

Basic Principles of BLAST Analysis

Additional information can be obtained from the information pages at www.ncbi.nlm.nih.gov/Blast


Analyzing the Sequenced Genes

• Structure prediction
  – Secondary structure of DNA and RNA
  – Possible 3-D structure of proteins
• Identity of the encoded gene/gene product
  – Prediction of general physical properties (e.g. M.W., pI; may be important for proteomic analysis)
  – Database (e.g. GenBank) search based on sequence homology
• Possible function of the encoded gene product
  – Search for signature domains or functional motifs using consensus patterns (based on statistics)
• Possible location of the encoded gene product
  – Prediction of subcellular localization by consensus patterns
• Prediction of evolutionary relationship
  – Multiple alignment, clustering, etc.
• Gene prediction from genomic sequences
  – Prediction for coding regions and location of introns
  – Prediction for promoter regions
• Prediction of regulatory sites
  – Prediction of consensus cis-acting regulatory elements


• blastn: good for high-scoring searches; not suited to comparing distantly related sequences
• blastp: uses a substitution matrix to find distant relationships; can use SEG to filter low-complexity regions
• blastx: use for new DNA sequences and for analysis of ESTs
• tblastn: searches for coding regions that are not defined in the database
• tblastx: use for analysis of ESTs
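The choice among these programs follows from the type of the query and the type of the database. A minimal sketch of that lookup is given here; the function name `choose_program` and the `translate_both` flag are hypothetical names introduced for illustration, while the query/database pairing is the standard one for these five programs.

```python
# (query type, database type) -> BLAST program
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",    # query is translated in six frames
    ("protein", "nucleotide"): "tblastn",   # database is translated in six frames
}


def choose_program(query_type, db_type, translate_both=False):
    """Return the BLAST program name for a query/database type pair.

    translate_both=True selects tblastx, which translates both a
    nucleotide query and a nucleotide database before comparing them.
    """
    if translate_both:
        if (query_type, db_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    return PROGRAMS[(query_type, db_type)]


# An EST (nucleotide) searched against a protein database:
print(choose_program("nucleotide", "protein"))
```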


BLAST Search

• www.ncbi.nlm.nih.gov/Blast
• Basic Local Alignment Search Tool
• Uses a heuristic algorithm that seeks local (instead of global) alignments; able to detect relationships among sequences that share similarity only in isolated regions
• The initial search is done for a word of length “W” that scores at least “T” when compared to the query using a substitution matrix
• Word hits are then extended in either direction in an attempt to generate an alignment with a score exceeding the threshold “S”
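The word-hit and extension steps can be sketched in a few lines of Python. This is a toy illustration, not the NCBI implementation, and it rests on several assumptions: a simple match/mismatch score stands in for a real substitution matrix, word hits are found by brute-force comparison rather than a precomputed word index, extension is ungapped, and the `x_drop` cutoff that stops an extension is a parameter not mentioned in the slides. All function names are hypothetical.

```python
def score_pair(a, b, match=5, mismatch=-4):
    """Toy substitution score (assumption: stands in for a real matrix)."""
    return match if a == b else mismatch


def word_hits(query, subject, W, T):
    """Yield (i, j) where the words query[i:i+W] and subject[j:j+W] score >= T."""
    for i in range(len(query) - W + 1):
        for j in range(len(subject) - W + 1):
            s = sum(score_pair(query[i + k], subject[j + k]) for k in range(W))
            if s >= T:
                yield i, j


def extend(query, subject, i, j, W, x_drop=10):
    """Ungapped extension of a word hit in both directions.

    Stops in each direction once the running score falls more than
    x_drop below the best score seen. Returns (best, q_start, q_end).
    """
    best = sum(score_pair(query[i + k], subject[j + k]) for k in range(W))
    start, end = i, i + W

    # Extend to the right of the word.
    run, qi, sj = best, i + W, j + W
    while qi < len(query) and sj < len(subject) and best - run <= x_drop:
        run += score_pair(query[qi], subject[sj])
        qi += 1
        sj += 1
        if run > best:
            best, end = run, qi

    # Extend to the left of the word, starting from the best score so far.
    run, qi, sj = best, i - 1, j - 1
    while qi >= 0 and sj >= 0 and best - run <= x_drop:
        run += score_pair(query[qi], subject[sj])
        if run > best:
            best, start = run, qi
        qi -= 1
        sj -= 1

    return best, start, end


def search(query, subject, W=3, T=15, S=25):
    """Return extended hits scoring at least S, best first.

    Each hit is (score, query_start, query_end, diagonal), where the
    diagonal is subject position minus query position.
    """
    hits = set()
    for i, j in word_hits(query, subject, W, T):
        score, start, end = extend(query, subject, i, j, W)
        if score >= S:
            hits.add((score, start, end, j - i))
    return sorted(hits, reverse=True)


for hit in search("ACGTACGTGGA", "TTTACGTACGTGGATTT"):
    print(hit)
```

With the toy scores, a word of length 3 reaches T = 15 only on an exact match; lowering T admits near-matching words and so raises sensitivity at the cost of more extensions to evaluate.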


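Word Size = Word Length = 11

Expect = The statistical significance threshold for reporting matches against database sequences; the default value is 10, meaning that 10 matches are expected to be found merely by chance.

$\text{Expect} = K m n\, e^{-\lambda S}$ [S: raw alignment score; m: effective length of the query; n: total length of the database; λ and K: normalizing parameters]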


Bit Score

The value S’ is derived from the raw alignment score S in which the statistical properties of the scoring system used have been taken into account. Because bit scores have been normalized with respect to the scoring system, they can be used to compare alignment scores from different searches.

$S' = (\lambda S - \ln K) / \ln 2$ [λ and K are normalizing parameters]

E Value

Expectation value. The number of different alignments with scores equivalent to or better than S’ that are expected to occur in a database search by chance. The lower the E value, the more significant the score.

$E = m n\, 2^{-S'}$ [m: effective length of the query; n: total number of bases of the database]
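As a worked illustration of the two formulas, the sketch below converts a raw score to a bit score and then to an E value. Substituting the bit-score definition into E = mn·2^(−S’) gives the raw-score form K·m·n·e^(−λS), and the code checks that both routes agree. The numeric values of λ, K, m and n are illustrative assumptions only; real values depend on the scoring system and the database searched.

```python
import math


def bit_score(raw_score, lam, K):
    """S' = (lambda * S - ln K) / ln 2"""
    return (lam * raw_score - math.log(K)) / math.log(2)


def e_value(bits, m, n):
    """E = m * n * 2^(-S')"""
    return m * n * 2.0 ** (-bits)


def e_value_from_raw(raw_score, lam, K, m, n):
    """Equivalent raw-score form: E = K * m * n * e^(-lambda * S)"""
    return K * m * n * math.exp(-lam * raw_score)


# Illustrative parameters (assumptions, not taken from the lecture).
lam, K = 0.267, 0.041
m, n = 250, 50_000_000      # query length, database length
S = 120                     # raw alignment score

bits = bit_score(S, lam, K)
print(f"bit score = {bits:.1f}")
print(f"E (from bit score) = {e_value(bits, m, n):.3g}")
print(f"E (from raw score) = {e_value_from_raw(S, lam, K, m, n):.3g}")
```

Because E scales with m·n, the same bit score is less significant against a larger database; every additional bit halves E.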


CDD Search

Compares protein sequences to the Conserved Domain Database. The CDD is a database containing a collection of functional and/or structural domains derived from two popular collections, SMART and Pfam, plus contributions from colleagues at NCBI.


PSI-BLAST

Position-specific iterative BLAST refers to a feature of BLAST 2.0 in which a profile (or position-specific scoring matrix, PSSM) is constructed automatically from a multiple alignment of the highest-scoring hits in an initial BLAST search. The PSSM is generated by calculating position-specific scores for each position in the alignment. Highly conserved positions receive high scores and weakly conserved positions receive scores near zero. The profile is used to perform a second (etc.) BLAST search, and the results of each "iteration" are used to refine the profile. This iterative searching strategy results in increased sensitivity.


PSSM

Position-specific scoring matrix. Based on a profile: a table that lists the frequencies of each amino acid at each position of a protein sequence, calculated from multiple alignments of sequences containing a domain of interest. The PSSM gives the log-odds score for finding a particular matching amino acid in a target sequence.
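The step from a frequency profile to log-odds scores can be sketched as follows. This is a simplified illustration, not the procedure PSI-BLAST itself uses: it assumes a uniform background frequency for the 20 amino acids, a flat pseudocount to avoid taking the log of zero, and no sequence weighting. The function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, background=None, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length aligned sequences.

    Returns one dict per alignment column mapping amino acid -> log2
    of (observed frequency / background frequency).
    """
    if background is None:
        # Assumption: uniform background; real tools use observed frequencies.
        background = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    pssm = []
    for pos in range(len(alignment[0])):
        # Gap characters such as '-' are skipped.
        counts = Counter(seq[pos] for seq in alignment if seq[pos] in background)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background[aa])
        pssm.append(column)
    return pssm


def score_sequence(pssm, segment):
    """Sum the position-specific scores of a segment as long as the PSSM."""
    return sum(column[aa] for column, aa in zip(pssm, segment))


alignment = ["ACDW", "ACEW", "ACDW", "GCDW"]
pssm = build_pssm(alignment)
print(round(score_sequence(pssm, "ACDW"), 2))
print(round(score_sequence(pssm, "WDCA"), 2))
```

Fully conserved columns (C and W here) give the highest scores for their residue, while residues never seen at a position score below zero, which is what lets the profile find distant family members that a single query sequence would miss.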
