Das

  • October 2019
DAS

The Distributed Annotation System (DAS) defines a communication protocol for exchanging biological sequence annotations. It is motivated by the idea that such annotations should not be held in a single centralized database but should instead be spread over multiple sites. Data distribution, performed by DAS servers, is separated from visualization, which is done by DAS clients.

DAS is a client-server system in which a single client integrates information from multiple servers: one machine gathers sequence annotation from multiple distant web sites, collates it, and displays it to the user in a single view. Little coordination is needed among the various information providers.

DAS is heavily used in the genome bioinformatics community, and in recent years it has also gained acceptance in the protein sequence and structure communities. [References 1 and 2]

DAS Clients:

1. Ensembl
2. UC Santa Cruz Genome Browser

Ensembl: Ensembl is a bioinformatics research project aiming to "develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes". It is run as a collaboration between the Wellcome Trust Sanger Institute and the European Bioinformatics Institute, an outstation of the European Molecular Biology Laboratory. [References 3]

UC Santa Cruz Genome Browser: The University of California, Santa Cruz (UCSC) Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales, from the single-nucleotide level up to a full chromosome, and includes assembly data, genes and gene predictions, mRNA and EST alignments, and comparative genomics, regulation, expression and variation data.
The database is optimized for fast interactive performance, with web tools that provide powerful visualization and querying capabilities for mining the data. [References 4]

Annotation server: A DAS system consists of a reference sequence server and one or more annotation servers. Annotation servers are specialized for returning lists of annotations across a given region of the genome. Each annotation is anchored to the genome map by a start and stop position relative to one of the reference subsequences. Each annotation has an ID that is unique to the server and a structured description of its nature and attributes. Annotations may also be associated with Web URLs that provide additional human-readable information about the annotation.

Annotations have types, methods and categories. The annotation type is selected from a list of types that have biological significance and that correspond roughly to EMBL/GenBank feature table tags. Examples of annotation types include "exon", "intron", "CDS" and "splice3". The annotation method describes how the annotated feature was discovered, and may include a reference to a software program. The annotation category is a broad functional category that can be used to filter, group and sort annotations. "Homology", "variation" and "transcribed" are all valid categories. Because every annotation still carries a category, researchers can add new annotation types when the existing list is inadequate without losing all semantic value.

Larger annotation servers are expected to provide pointers to human-readable data that describe their types, methods and categories in more detail. Another optional feature of annotation servers is the ability to give clients hints on how the annotations should be rendered visually. This is done by returning an XML "stylesheet".

Although servers are conceptually divided into reference servers and annotation servers, there is in fact no key difference between them. A single server can provide both reference sequence information and annotation information. The main functional difference is that the reference sequence server is required to serve the DNA itself, while annotation servers have no such requirement. [References 1]
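The annotation model described above (ID, type, method, category, start and stop positions, optional link) and the client's job of collating several servers into one view can be illustrated with a small Python sketch. It is not taken from the DAS specification: the two XML responses are invented sample data, their element names are only modeled loosely on the feature listing format described in Reference 1 and should be checked against it, and the names Annotation, parse_features, collate and by_category are hypothetical helpers introduced for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Invented sample responses from two hypothetical annotation servers.
# The element layout is an assumption; verify it against the DAS spec.
SERVER_A = """<DASGFF><GFF version="1.0" href="http://example.org/das/a/features">
  <SEGMENT id="chr1" start="1000" stop="2000">
    <FEATURE id="exon-1" label="exon 1">
      <TYPE id="exon" category="transcribed">exon</TYPE>
      <METHOD id="genefinder">genefinder</METHOD>
      <START>1100</START><END>1250</END>
      <LINK href="http://example.org/exon-1">details</LINK>
    </FEATURE>
  </SEGMENT>
</GFF></DASGFF>"""

SERVER_B = """<DASGFF><GFF version="1.0" href="http://example.org/das/b/features">
  <SEGMENT id="chr1" start="1000" stop="2000">
    <FEATURE id="snp-7" label="snp 7">
      <TYPE id="snp" category="variation">snp</TYPE>
      <START>1180</START><END>1180</END>
    </FEATURE>
    <FEATURE id="hit-3" label="hit 3">
      <TYPE id="similarity" category="homology">similarity</TYPE>
      <METHOD id="blast">blast</METHOD>
      <START>1050</START><END>1400</END>
    </FEATURE>
  </SEGMENT>
</GFF></DASGFF>"""


@dataclass
class Annotation:
    source: str            # which server the annotation came from
    segment: str           # reference subsequence it is anchored to
    feature_id: str        # ID, unique within its server only
    type: str
    category: str
    method: Optional[str]  # how the feature was discovered, if stated
    start: int
    stop: int
    link: Optional[str]    # URL with human-readable details, if any


def parse_features(xml_text: str, source: str) -> List[Annotation]:
    """Turn one server's feature listing into Annotation records."""
    result = []
    for segment in ET.fromstring(xml_text).iter("SEGMENT"):
        for feature in segment.iter("FEATURE"):
            type_el = feature.find("TYPE")
            method_el = feature.find("METHOD")
            link_el = feature.find("LINK")
            result.append(Annotation(
                source=source,
                segment=segment.get("id"),
                feature_id=feature.get("id"),
                type=type_el.get("id"),
                category=type_el.get("category"),
                method=method_el.get("id") if method_el is not None else None,
                start=int(feature.findtext("START")),
                stop=int(feature.findtext("END")),
                link=link_el.get("href") if link_el is not None else None,
            ))
    return result


def collate(*per_server: List[Annotation]) -> List[Annotation]:
    """Merge annotations from several servers into one view, ordered by position."""
    merged = [a for annotations in per_server for a in annotations]
    return sorted(merged, key=lambda a: (a.segment, a.start, a.stop))


def by_category(annotations: List[Annotation], category: str) -> List[Annotation]:
    """Filter on the broad functional category, e.g. "variation"."""
    return [a for a in annotations if a.category == category]


merged = collate(parse_features(SERVER_A, "A"), parse_features(SERVER_B, "B"))
for a in merged:
    print(a.segment, a.start, a.stop, a.type, a.category, a.source)
```

Because IDs are unique only within a server, the sketch keeps the source name next to each feature ID. A real client would fetch the listings over HTTP and would also need to handle servers that are slow or unavailable.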

XML: The Extensible Markup Language (XML) is a general-purpose markup language. Its primary purpose is to facilitate the sharing of data across different information systems, particularly via the Internet. It is a simplified subset of the Standard Generalized Markup Language (SGML) and is designed to be relatively human-legible. By adding semantic constraints, application languages can be implemented in XML. These include XHTML, RSS, MathML, GraphML, Scalable Vector Graphics, MusicXML, and thousands of others. Moreover, XML is sometimes used as the specification language for such application languages.

Related Projects:

1. BioPerl
2. BioJava

These projects serve as background for human genome projects.

References:

1. http://biodas.org/documents/spec.html
2. http://stein.cshl.org/das/
3. http://www.ensembl.org/info/data/external_data/index.html
4. http://nar.oxfordjournals.org/cgi/content/abstract/gkl928
