Structured Data In Google

  • May 2020
Structured data (rich snippets)

  • Marking up structured data
  • About microformats
  • About RDFa
  • Reviews
  • People
  • Products
  • Businesses and organizations

Marking up structured data

About structured data

Structured data makes the web a better place. It also helps Google better understand and present your page in search results. For example, we can recognize the following kinds of information on pages containing reviews, and may make them available in search results:

  • The writer of the review
  • The date the review was written
  • The rating (for example, 4/5)
  • For items with multiple user reviews, the number of reviews and average rating

Google's first use of this data will be in search results snippets for two kinds of objects: Reviews and People. Providing more detail in search results helps users to understand the value of your pages. When users get more information showing how your page is relevant to their search, they're more likely to click through to see the full page. (Including this information will not necessarily affect your search results. As always, Google will use its own algorithms and policies to determine what information to show and when to show information based on user needs.)

This structured data can also be used by Custom Search engines on your site, and gives you much more control over the behavior of your Custom Search engine. (Note: We currently support this structured content in English only.)

At Google, we believe in openness, so we are using two open standards to allow you to annotate structured data on your site: microformats and RDFa. Both standards allow markup of information on your pages. To ensure that Google understands your markup, we encourage you to follow the format of our examples. You don't need any prior knowledge of microformats or RDFa to use these standards, just a basic knowledge of XHTML.

How microformats and RDFa work

Imagine that you have a review of a restaurant on your page. In your HTML, you show the name of the restaurant, the address and phone number, the number of users who have provided reviews, and the average rating. People can read and understand this information, but to a computer it is nothing but strings of unstructured text.

With microformats or RDFa, you can label each piece of text to make it clear that it represents a certain type of data: for example, a restaurant name, an address, or a rating. This is done by providing additional HTML tags and attributes that computers understand. These don't affect the appearance of your pages, but Google and any other services that look at the HTML can use them to better understand your information, and display it in useful ways, for example in search results.

You can use whichever standard you prefer (microformats or RDFa), and you don't need to understand one in order to use the other.

If you are marking up structured data on your web pages, you can let us know. While we won't be able to individually reply to everyone who fills out this form and we can't make guarantees about how we may use data from any particular site, we may be in touch to learn more about your data.

For more information, see the following articles:

  • About microformats
  • About RDFa

For specific vocabulary and examples, see:

  • Reviews
  • People
  • Products
  • Businesses and organizations

About microformats

Marking up data using microformats

Microformats are simple conventions used on web pages to represent commonly published things such as reviews, people, products, and businesses. Generally, microformats consist of <span> and <div> elements and a class property, along with a brief and descriptive property name (such as dtreviewed or rating, to represent the date a business was reviewed and its rating, respectively).

Here is a short HTML block showing basic contact information for a person.

HTML:

  <div>
    <strong>Bob Smith</strong>
    Senior editor at ACME Reviews
    200 Main St
    Desertville, AZ 12345
  </div>

Rendered HTML in browser:

  Bob Smith Senior editor at ACME Reviews 200 Main St Desertville, AZ 12345

This is the HTML marked up with additional microformats:

  <div class="vcard">
    <strong class="fn">Bob Smith</strong>
    <span class="title">Senior editor</span> at <span class="org">ACME Reviews</span>
    <span class="adr">
      <span class="street-address">200 Main St</span>
      <span class="locality">Desertville</span>, <span class="region">AZ</span>
      <span class="postal-code">12345</span>
    </span>
  </div>

To understand microformats, think about two concepts: entities (for example, a person) and properties of those entities (for example, name, job, company, and address). Entities are called "microformats". The microformat used to describe people is called hCard and is referred to in HTML as vcard.

In general, microformats use the class attribute in HTML tags to assign labels to specific types of data. In the first line of the example above, we create a div tag (any other tag, such as span, would work equally well) and give it a class attribute with the value vcard. The vcard class indicates that the enclosed HTML describes a person. (hCard can also be used to describe places or organizations.)

To use microformats, all you need to do is wrap your content in the appropriate labels using the class attribute. The labels change depending on the kind of data you are marking up.
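To make the idea of class-based labels concrete, here is a minimal sketch of how a program might read hCard-style labels back out of a page. It uses only Python's standard html.parser module. The names ClassPropertyParser, extract_properties, HCARD_PROPERTIES, and SAMPLE are hypothetical and chosen for illustration; this is not how Google or any particular microformats parser is implemented, and it assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Illustrative subset of hCard property names; not an exhaustive list.
HCARD_PROPERTIES = {
    "fn", "title", "org", "adr",
    "street-address", "locality", "region", "postal-code",
}

# Elements that never have a closing tag, so they must not be tracked as open.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class ClassPropertyParser(HTMLParser):
    """Collect the text inside elements whose class names a known property."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # per open tag: the property names it carries
        self.properties = {}     # property name -> accumulated text

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self.open_elements.append([c for c in classes if c in HCARD_PROPERTIES])

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_elements:
            self.open_elements.pop()

    def handle_data(self, data):
        # Text counts toward every labelled element that is currently open,
        # so a container such as adr also receives its sub-properties' text.
        for names in self.open_elements:
            for name in names:
                self.properties[name] = self.properties.get(name, "") + data


def extract_properties(html):
    """Return a dict of property name -> whitespace-normalised text."""
    parser = ClassPropertyParser()
    parser.feed(html)
    parser.close()
    return {k: " ".join(v.split()) for k, v in parser.properties.items()}


SAMPLE = """
<div class="vcard">
  <strong class="fn">Bob Smith</strong>
  <span class="title">Senior editor</span> at <span class="org">ACME Reviews</span>
  <span class="adr">
    <span class="street-address">200 Main St</span>
    <span class="locality">Desertville</span>, <span class="region">AZ</span>
    <span class="postal-code">12345</span>
  </span>
</div>
"""

if __name__ == "__main__":
    for name, text in extract_properties(SAMPLE).items():
        print(f"{name}: {text}")
```

Because the labels live in the class attribute, the same markup can still be styled with CSS; the sketch simply ignores any class token it does not recognize.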

Inside the HTML block labeled vcard, you can label properties about the person, such as photo (a photo of the person), fn (the person's name), title (the person's job title), org (the person's company or organization), and adr (the person's address). Properties such as adr can contain sub-properties such as street-address, locality (city), region (state or province), and postal-code.

The class attribute is also used in many web pages for other reasons, such as CSS rendering. Microformats classes and non-microformats classes can generally co-exist happily, but be careful to avoid naming conflicts.

Nested microformats

Let's say that you now have the following example.

HTML:

  <div>
    <strong>Blast 'Em Up Review</strong>
    by Bob Smith, Senior Editor at ACME Reviews
    This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens.
    4.5
  </div>

Rendered HTML in browser:

  Blast 'Em Up Review by Bob Smith, Senior Editor at ACME Reviews This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens. 4.5

In this case, one entity (a person, Bob Smith) is nested inside a review. Reviews are described by the hReview microformat, written as class="hreview". People are described by the hCard format, written as class="vcard". Here is how the code block above looks with microformats markup:

  <div class="hreview">
    <span class="item"><strong>Blast 'Em Up Review</strong></span>
    by <span class="reviewer vcard">
      <span class="fn">Bob Smith</span>, <span class="title">Senior Editor</span> at <span class="org">ACME Reviews</span>
    </span>
    <span class="description">This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens.</span>
    <span class="rating">4.5</span>
  </div>

Since this is a review, we use the class="hreview" label to wrap the entire HTML block. This example introduces a few new properties, such as item (the item being reviewed; item can include the sub-property fn, which represents the name of the item), reviewer, description, and rating.

We combine review and people data by adding the following line:

  <span class="reviewer vcard">

By putting reviewer and vcard in the same class attribute, separated by a space, we are defining the person (Bob Smith) as the writer of the review.

To learn more about microformats, visit microformats.org.

For specific vocabulary and examples, see:

  • Reviews
  • People
  • Products
  • Businesses and organizations
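The nesting rule described in this article rests on one detail of HTML: a class attribute holds a space-separated list of tokens, so a single element can be both a property of the review (reviewer) and an entity in its own right (vcard). The sketch below illustrates that with Python's standard html.parser. The names ClassTokenParser, class_tokens, and is_nested_entity are hypothetical helpers made up for this illustration.

```python
from html.parser import HTMLParser


class ClassTokenParser(HTMLParser):
    """Record the class tokens of every start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get("class")
        if value:
            # "reviewer vcard" becomes ["reviewer", "vcard"]
            self.tokens.append(value.split())


def class_tokens(html):
    """Return a list of token lists, one per element that has a class."""
    parser = ClassTokenParser()
    parser.feed(html)
    parser.close()
    return parser.tokens


def is_nested_entity(tokens, prop, entity):
    """True when one element carries both the property and the entity label."""
    return prop in tokens and entity in tokens


if __name__ == "__main__":
    markup = (
        '<div class="hreview">'
        '<span class="reviewer vcard"><span class="fn">Bob Smith</span></span>'
        "</div>"
    )
    for tokens in class_tokens(markup):
        print(tokens, is_nested_entity(tokens, "reviewer", "vcard"))
```

Only the element carrying both tokens links the person to the review; the inner fn element is an ordinary hCard property.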

About RDFa

Marking up content with RDFa

The following block of HTML shows a review of a video game.

HTML:

  <div>
    <strong>Blast 'Em Up Review</strong>
    by Bob Smith
    March 20, 2009
    This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens.
    4.5 out of 5 stars
  </div>

Rendered HTML in browser:

  Blast 'Em Up Review by Bob Smith March 20, 2009 This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens. 4.5 out of 5 stars

To understand how to use RDFa, think about two concepts: entities (for example, a review) and properties of those entities (for example, the author of the review, the date of the review, the review itself, and the rating). This is the HTML with RDFa markup:

  <div xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:Review">
    <strong><span property="v:itemreviewed">Blast 'Em Up Review</span></strong>
    by <span property="v:reviewer">Bob Smith</span>
    <span property="v:dtreviewed">March 20, 2009</span>
    <span property="v:description">This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens.</span>
    <span property="v:rating">4.5</span> out of 5 stars
  </div>

This example contains three important properties:

  • xmlns: Occurs in the first line, and specifies the namespace where the vocabulary (a list of entities and their components) is specified. You can use the xmlns:v="http://rdf.data-vocabulary.org/" namespace declaration any time you are marking up pages for people, review, product, or place data. Be sure to use a trailing slash (xmlns:v="http://rdf.data-vocabulary.org/").
  • typeof: Occurs in the first line of this HTML block, and defines entities. Since this example contains a review, the entity is of type review.
  • property: Used to label the properties of an entity. In the example, there are many properties of the review that are labeled: the reviewer, date of the review (dtreviewed), the review itself (description), and the rating (rating).

These three properties can be used in any HTML tags that open and close (div and span are two common choices).

To mark up content using RDFa:

  1. Begin with a namespace declaration using xmlns.
  2. Specify the type of content that is being marked up using typeof.
  3. Label the properties using property.

Relationships between entities in RDFa

In the example below, we describe two entities: a review and a person.

HTML:

  <div>
    <strong>Blast 'Em Up Review</strong>
    by Bob Smith, Senior Editor at ACME Reviews
    This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens.
    4.5 out of 5 stars
  </div>

Rendered HTML in browser:

  Blast 'Em Up Review by Bob Smith, Senior Editor at ACME Reviews This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens. 4.5 out of 5 stars

In this example, the relationship between the two entities is that the person is the reviewer who created the review. Both the review and the person have their own set of properties. The properties of the person are their name (Bob Smith), job title (Senior Editor), and company (ACME Reviews). The properties of the review are the reviewer (an entity), the review itself, and the rating (4.5).

To convey the relationship between the review and the person, we use the rel property. Here is how this example looks with RDFa markup:

  <div xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:Review">
    <strong><span property="v:itemreviewed">Blast 'Em Up Review</span></strong>
    by <span rel="v:reviewer">
      <span typeof="v:person">
        <span property="v:name">Bob Smith</span>, <span property="v:title">Senior Editor</span> at <span property="v:org">ACME Reviews</span>
      </span>
    </span>
    <span property="v:description">This is a great game. I enjoyed it from the opening battle to the final showdown with the evil aliens.</span>
    <span property="v:rating">4.5</span> out of 5 stars
  </div>

The following two lines define the relationship between the two entities:

  by <span rel="v:reviewer">
    <span typeof="v:person">

Here, by using rel instead of property, we define a relationship between the review and the person, namely that the writer of the review (the "reviewer") is an entity, with its own properties such as name, title, and org.

"rel" without "typeof"

The final concept to understand in order to mark up your content with RDFa is that rel can exist without an explicitly labeled typeof. In these cases, the entity is implicitly defined.

HTML:

  <div>
    <strong>Bob Smith</strong>
    Senior editor at ACME Reviews
    200 Main St
    Desertville, AZ 12345
  </div>

Rendered HTML in browser:

  Bob Smith Senior editor at ACME Reviews 200 Main St Desertville, AZ 12345

Here is the HTML with RDFa markup:

  <div xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:person">
    <span rel="v:photo"><img src="..." /></span>
    <strong><span property="v:name">Bob Smith</span></strong>
    <span property="v:title">Senior Editor</span> at <span property="v:org">ACME Reviews</span>
    <span rel="v:address">
      <span property="v:street-address">200 Main St</span>
      <span property="v:locality">Desertville</span>
      <span property="v:region">AZ</span>
      <span property="v:postal-code">12345</span>
    </span>
  </div>

In this example there are two implicitly defined entities: the person's photo and their address. Since the address property always relates to an entity of type address, there is no need to explicitly include a line with typeof="v:address". Similarly, a photo always relates to a URL pointing to an image, so there is no need to explicitly define a typeof property.

This article gives an overview of how to start marking your content with RDFa. For more information, see the official RDFa primer.

For specific vocabulary and examples, see:

  • Reviews
  • People
  • Products
  • Businesses and organizations
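The three steps for RDFa markup described in this article (declare the namespace with xmlns, state the entity type with typeof, label each value with property) can be sketched as a small generator. This is only an illustration under stated assumptions: rdfa_block and DATA_VOCABULARY_NS are hypothetical names, the function handles flat properties only (no rel relationships), and it uses nothing beyond Python's standard library.

```python
from html import escape

# Namespace URL quoted in the article; note the trailing slash.
DATA_VOCABULARY_NS = "http://rdf.data-vocabulary.org/"


def rdfa_block(type_name, properties, prefix="v"):
    """Build a flat RDFa block: one wrapping div plus one span per property."""
    # Steps 1 and 2: namespace declaration and entity type on the first line.
    lines = [f'<div xmlns:{prefix}="{DATA_VOCABULARY_NS}" typeof="{prefix}:{type_name}">']
    # Step 3: label each value with a prefixed property name.
    for name, value in properties.items():
        text = escape(str(value), quote=False)  # escape &, < and > in text content
        lines.append(f'  <span property="{prefix}:{name}">{text}</span>')
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(rdfa_block("Review", {"reviewer": "Bob Smith", "rating": 4.5}))
```

A relationship such as the reviewer being a person with their own properties would need a nested element with rel, which this sketch deliberately leaves out.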

Reviews

About review data

When review information is marked up in the body of a web page, Google can identify it and may make it available in search results pages. Review information such as ratings and descriptions can help users to better identify pages with good content.

Properties

Google recognizes the following Review properties. Where the RDFa Review and microformats hReview property names differ, the hReview property appears in parentheses.

  Property              Description
  itemReviewed (item)   The item being reviewed.
  name (fn)             The name of the item being reviewed. Child of item.
  rating                A numerical quality rating for the item (for example, 4) based on a scale of 1-5. You can optionally specify worst (default: 1) or best (default: 5).
  reviewer              The author of the review.
  dtreviewed            The date that the item was reviewed.
  description           The body of the review.

Note: For simplicity, examples are shown with only the information that is being tagged. However, in realistic pages, these tags will be spread throughout the web page, mixed with unmarked text.

Example: Microformat

  <div class="hreview">
    <span class="item">
      <span class="fn">L'Amourita Pizza</span>
    </span>
    <span class="rating">3.5</span>
    <span class="reviewer">Ulysses Grant</span>
    <span class="dtreviewed">2009-01-06</span>
    <span class="summary">"Delicious, tasty pizza in Eastlake."</span>
  </div>

The structured information is conveyed by the class properties (such as class="rating" and class="reviewer") and the values (such as 3.5 and Ulysses Grant). You can change the tags such as span and div to suit your formatting needs.

Note: Sometimes the rating is included directly in the HTML, but in other cases webmasters use an image to indicate a rating (for example, an image showing four stars out of five). If your site uses an image to indicate a rating, you should add class="rating" to the image tag. Google will extract the rating from the alt text. For example:

  <img class="rating" src="..." alt="4 Star Rating: Recommended" />

Example: RDFa

  <span xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:Review">
    <span property="v:itemReviewed">Komala Vilas</span>
    <span property="v:reviewer">Meenakshi Ammal</span>
    <span property="v:rating">3.7</span>
    <span property="v:dtreviewed">1st April 2005</span>
    <span property="v:summary">Best south Indian vegetarian food in South Bay</span>
  </span>

You can use the additional expressiveness of RDFa to provide more information about the subject of your review. Google does not currently use the about property in search results, but it may be used in the future. For example:

  <span xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:Review">
    <span rel="v:itemReviewed">
      <span about="http://komalavilas.com" property="v:name" typeof="v:Restaurant">Komala Vilas</span>
    </span>
    <span rel="v:reviewer">
      <span about="http://rdf.freebase.com/ns/en.s_meenakshi_ammal" property="v:name">Meenakshi Ammal</span>
    </span>
    <span property="v:rating">3.7</span>
    <span property="v:date">1st April 2005</span>
    <span property="v:summary">Best south Indian vegetarian food in the bay area</span>
  </span>

If the object you're referring to does not have an obvious URL to include, you could use the URL of pages on Wikipedia or similar web sources.

Aggregated reviews

Google also recognizes markup about aggregated reviews (for example, the total number of reviews, or the average rating). For example, a restaurant may have 45 reviews, with an average rating of 4.5. Aggregating reviews allows you to convey this information.

Google recognizes the following Review-aggregate properties. Where the RDFa Review-aggregate and microformats hReview-aggregate property names differ, the hReview-aggregate property appears in parentheses.

  Property              Description
  itemReviewed (item)   The item being reviewed.
  name (fn)             The name of the item being reviewed. Child of item.
  rating                Container for rating information.
  average               The average rating of all reviews. Child of rating.
  count                 The total number of reviews for the object.

Example: Microformats

  <div class="hreview-aggregate">
    <span class="item">
      <span class="fn">L'Amourita Pizza</span>
    </span>
    <span class="rating">
      <span class="average">4.4</span>
    </span>
    <span class="count">1,313</span>
  </div>

Example: RDFa

  <span xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:Review-aggregate">
    <span property="v:itemReviewed">Komala Vilas</span>
    <span rel="v:rating">
      <span property="v:average">3.6</span>
    </span>
    <span property="v:count">20</span>
  </span>
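
If your site stores individual ratings, the average and count used in aggregate markup can be computed when the page is generated. The following is a minimal sketch, assuming the ratings are available as a plain list of numbers. The function name hreview_aggregate is hypothetical, and rounding the average to one decimal place is a choice made for this illustration, not a documented requirement.

```python
from html import escape


def hreview_aggregate(item_name, ratings):
    """Build hReview-aggregate markup for a non-empty list of numeric ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    average = sum(ratings) / len(ratings)
    name = escape(item_name, quote=False)  # escape &, < and > in text content
    return "\n".join([
        '<div class="hreview-aggregate">',
        f'  <span class="item"><span class="fn">{name}</span></span>',
        # average is a child of rating, as in the property table
        f'  <span class="rating"><span class="average">{average:.1f}</span></span>',
        f'  <span class="count">{len(ratings)}</span>',
        "</div>",
    ])


if __name__ == "__main__":
    print(hreview_aggregate("L'Amourita Pizza", [5, 4, 4, 5, 4]))
```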

updated 5/13/2009


People

About contact information

Much of the personal information on the web falls into one of two categories: contact information (for instance, address and work information), and social networking information.

Properties

Google recognizes the following contact properties (derived from hCard), and may include their content in search results. Where the RDFa Contact and microformats hCard property names differ, the hCard property name appears in parentheses.

  Property            Description
  name (fn)           Name
  nickname            Nickname
  url                 Link to a web page, such as the person's home page.
  affiliation (org)   The name of an organization with which the person is associated (for example, an employer). If fn and org have the exact same value, Google will interpret the information as referring to a business or organization, not a person.
  address (adr)       The location of the person.
  street-address      The street address. Child of address.
  locality            The city. Child of address.
  region              The geographic region (such as state, province, or county). Child of address.
  postal-code         The postal code. Child of address.
  country-name        The country. Child of address.
  photo               An image link.
  title               The person's title (for example, Financial Manager).
  role                The person's role (for example, Accountant).
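
The rule about fn and org in the table above can be expressed as a small check that you might run over your own data before publishing it. This is a sketch only: hcard_kind is a hypothetical helper, it treats "exact same value" as strict string equality, and Google's actual interpretation logic is not documented here and may differ.

```python
def hcard_kind(properties):
    """Guess whether hCard properties describe a person or an organization.

    Follows the stated rule: when fn and org have exactly the same value,
    the card is read as a business or organization, otherwise as a person.
    """
    fn = properties.get("fn")
    org = properties.get("org")
    if fn is not None and fn == org:
        return "organization"
    return "person"


if __name__ == "__main__":
    print(hcard_kind({"fn": "John Smith", "org": "ACME"}))
    print(hcard_kind({"fn": "L'Amourita Pizza", "org": "L'Amourita Pizza"}))
```

A check like this can catch the case where a person's card accidentally repeats their name in org and would be read as a business.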

Social networking information

Google also recognizes the XFN friend, contact, and acquaintance properties, which are used to identify social relationships.

Note: The examples below describe John "Smithy" Smith, an engineer who is friends with Darryl. For simplicity, examples are shown with only the information that is being tagged. However, in realistic pages, these tags will be spread throughout the web page, mixed with unmarked text.

Example: Microformat/XFN

  <div class="vcard">
    <span class="fn">John Smith</span>
    <span class="nickname">Smithy</span>
    <span class="url">http://www.example.com</span>
    <span class="org">ACME</span>
    <span class="adr">
      <span class="locality">Albuquerque</span>
    </span>
    <span class="title">Engineer</span>
    <a href="..." rel="friend">Darryl</a>
  </div>

Note: We'll also accept class="friend". The friend property is imported from XFN.

Example: RDFa

  <div xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:Person">
    <span property="v:name">John Smith</span>
    <span property="v:nickname">Smithy</span>
    <span property="v:url">http://www.example.com</span>
    <span property="v:affiliation">ACME</span>
    <span rel="v:address">
      <span property="v:locality">Albuquerque</span>
    </span>
    <span property="v:title">Engineer</span>
    <span rel="v:friend">
      <span property="v:name">Darryl</span>
    </span>
  </div>

For the structured data geeks out there, you can also use RDFa's ability to express relationships using the about property. Google does not currently use the about property in search results, but it may be used in the future.

updated 5/13/2009

Products

About product information

Google is experimenting with markup for product data, and currently we recognize product data included in reviews.

Properties

Google recognizes the following Product properties, and may include their content in search results. Where the RDFa Product and microformats hProduct property names differ, the hProduct property name appears in parentheses.

  Property      Description
  brand         The brand of the product (for example, ACME).
  category      The product category (for example, "Books - Fiction", "Heavy Objects", or "Cars").
  description   Product description.
  name (fn)     Product name.
  price         Floating point number. Can use currency format.
  photo         URL of product photo.
  url           URL of product page.

Note: The examples below describe an ACME-brand anvil. For simplicity, examples are shown with only the information that is being tagged. However, in realistic pages, these tags will be spread throughout the web page, mixed with unmarked text.

Example: Microformats

  <div class="hproduct">
    <span class="brand">ACME</span>
    <span class="category">Heavy objects</span>
    <span class="fn">Large all-purpose anvil</span>
    <span class="description">If you need an object to drop from a height, the classic A23859 anvil from ACME is the way to go.</span>
    <span class="url">http://anvil.example.com</span>
  </div>

Example: RDFa

  <div xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:Product">
    <span property="v:brand">ACME</span>
    <span property="v:category">Heavy objects</span>
    <span property="v:name">Large all-purpose anvil</span>
    <span property="v:description">If you need an object to drop from a height, the classic A23859 anvil from ACME is the way to go.</span>
  </div>

Google does not currently use the about property in search results, but it may be used in the future.

Businesses and organizations

About business and organization data

Google is experimenting with markup for business and location data, and currently we recognize business data included in reviews.

Properties

Google recognizes the following Organization properties, and may include their content in search results. Where the RDFa Organization and microformats hCard property names differ, the hCard property name appears in parentheses.

  Property          Description
  name (org/name)   The name of the business. If you use microformats, you should use both org and name, and ensure that these have the same value.
  url               Link to a web page.
  address (adr)     The location of the business.
  street-address    The street address. Child of address.
  locality          The city. Child of address.
  region            The geographic region. Child of address.
  postal-code       The postal code. Child of address.
  country-name      The country. Child of address.
  tel               The telephone number.

Note: The examples below describe a restaurant. For simplicity, examples are shown with only the information that is being tagged. However, in realistic pages, these tags will be spread throughout the web page, mixed with unmarked text.

Example: Microformats

  <div class="vcard">
    <span class="fn org">L'Amourita Pizza</span>
    <span class="tel">(206) 555-7242</span>
    <span class="adr">
      <span class="street-address">2040 Any Street</span>
      <span class="locality">Springfield</span>
      <span class="region">WA</span>
      <span class="postal-code">98102</span>
    </span>
  </div>

Example: RDFa

  <div xmlns:v="http://rdf.data-vocabulary.org/" typeof="v:Organization">
    <span property="v:name">L'Amourita Pizza</span>
    <span property="v:tel">(206) 555-7242</span>
    <span rel="v:address">
      <span property="v:street-address">2040 Any Street</span>
      <span property="v:locality">Springfield</span>
      <span property="v:region">WA</span>
      <span property="v:postal-code">98102</span>
    </span>
  </div>


updated 5/13/2009

RDFa Primer

Bridging the Human and Data Webs

W3C Working Group Note 14 October 2008

This version:
  http://www.w3.org/TR/2008/NOTE-xhtml-rdfa-primer-20081014/
Latest version:
  http://www.w3.org/TR/xhtml-rdfa-primer/
Previous version:
  http://www.w3.org/TR/2008/WD-xhtml-rdfa-primer-20080620/
Editors:
  Ben Adida, Creative Commons
  Mark Birbeck, webBackplane

Copyright © 2008 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.

Abstract

Today's web is built predominantly for human consumption. Even as machine-readable data begins to appear on the web, it is typically distributed in a separate file, with a separate format, and very limited correspondence between the human and machine versions. As a result, web browsers can provide only minimal assistance to humans in parsing and processing web data: browsers only see presentation information. We introduce RDFa, which provides a set of XHTML attributes to augment visual data with machine-readable hints. We show how to express simple and more complex datasets using RDFa, and in particular how to turn the existing human-visible text and links into machine-readable data without repeating content.

This document provides only a Primer to RDFa. The normative specification of RDFa can be found in [RDFA-SYNTAX].

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is a Working Group Note produced jointly by the W3C Semantic Web Deployment Working Group [SWD-WG] and the W3C XHTML2 Working Group [XHTML2-WG]. This work is part of both the W3C Semantic Web Activity and the HTML Activity. The transition of this document to Working Group Note occurs simultaneously with the transition of the RDFa syntax specification to W3C Recommendation.

This version of the RDFa Primer contains small editorial changes to the previous version as well as a short additional section (4.1) providing pointers to those wishing to create new relationship vocabularies. The changes are detailed in a differences document.

The Working Groups have received suggestions that this document be expanded, and the Groups may add to it in the future but are not committing to do so.

Comments on this Working Group Note are welcome and may be sent to [email protected]; please include the text "comment" in the subject line. All messages received at this address are viewable in a public archive.

Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the XHTML 2 group and another public list of any patent disclosures made in connection with the deliverables of the Semantic Web Deployment Working Group; those pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

  1 Introduction
      1.1 HTML vs. XHTML
  2 Adding Flavor to XHTML
      2.1 Licensing your Work
      2.2 Labeling the Title and Author
      2.3 Multiple Items per Page
  3 Going Deeper
      3.1 Contact Information
      3.2 Social Network
  4 You Said Something about RDF?
      4.1 Custom Vocabularies
  5 Find Out More
  6 Acknowledgments
  7 Bibliography

1 Introduction

The web is a rich, distributed repository of interconnected information organized primarily for human consumption. On a typical web page, an XHTML author might specify a headline, then a smaller sub-headline, a block of italicized text, a few paragraphs of average-size text, and, finally, a few single-word links. Web browsers will follow these presentation instructions faithfully. However, only the human mind understands that the headline is, in fact, the blog post title, the sub-headline indicates the author, the italicized text is the article's publication date, and the single-word links are categorization labels. The gap between what programs and humans understand is large.

[Figure: On the left, what browsers see. On the right, what humans see. Can we bridge the gap so browsers see more of what we see?]

What if the browser received information on the meaning of a web page's visual elements? A dinner party announced on a blog could be easily copied to the user's calendar, an author's complete contact information to the user's address book. Users could automatically recall previously browsed articles according to categorization labels (often called tags). A photo copied and pasted from a web site to a school report would carry with it a link back to the photographer, giving her proper credit. When web data meant for humans is augmented with hints meant for computer programs, these programs become significantly more helpful, because they begin to understand the data's structure.

RDFa allows XHTML authors to do just that. Using a few simple XHTML attributes, authors can mark up human-readable data with machine-readable indicators for browsers and other programs to interpret. A web page can include markup for items as simple as the title of an article, or as complex as a user's complete social network.

RDFa benefits from the extensive power of RDF [RDF], the W3C's standard for interoperable machine-readable data. However, readers of this document are not expected to understand RDF. Readers are expected to understand at least a basic level of XHTML.

1.1 HTML vs. XHTML

To date, because XHTML is extensible while HTML is not, RDFa has only been specified for XHTML 1.1. Web publishers are welcome to use RDFa markup inside HTML4: the design of RDFa anticipates this use case, and most RDFa parsers will recognize RDFa attributes in any version of HTML. The authors know of no deployed Web browser that will fail to present an HTML document as intended after adding RDFa markup to the document. However, publishers should be aware that RDFa will not validate in HTML4 at this time. RDFa attributes validate in XHTML, using the XHTML1.1+RDFa DTD.

2 Adding Flavor to XHTML

Consider Alice, a blogger who publishes a mix of professional and personal articles at http://example.com/alice. We will construct markup examples to illustrate how Alice can use RDFa. The complete markup of these examples can be viewed independently.

2.1 Licensing your Work

In her blog's footer, Alice declares her content to be freely reusable, as long as she receives due credit when her articles are cited. The XHTML includes a link to a Creative Commons [CC] license:

  ...
  All content on this site is licensed under
  <a href="http://creativecommons.org/licenses/by/3.0/">
    a Creative Commons License
  </a>.

A human clearly understands this sentence, in particular the meaning of the link with respect to the current document: it indicates the document's license, the conditions under which the page's contents are distributed. Unfortunately, when Bob visits Alice's blog, his browser sees only a plain link that could just as well point to one of Alice's friends or to her resume. For Bob's browser to understand that this link actually points to the document's licensing terms, Alice needs to add some flavor, some indication of what kind of link this is.

She can add this flavor using the rel attribute (which we'll write as @rel so as not to repeat the word "attribute" too often), which defines the relationship between the current page and the linked page. The value of the attribute is license, an XHTML keyword reserved for just this purpose:

  ...
  All content on this site is licensed under
  <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">
    a Creative Commons License
  </a>.

With this small update, Bob's browser will now understand that this link has a flavor: it indicates the blog's license.

[Figure: A link with flavor: the link indicates the web page's license. We can represent web pages as nodes, the link as an arrow connecting those nodes, and the link's flavor as the label on that arrow.]
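
As a rough illustration of what "Bob's browser" could do with that flavor, the sketch below scans a page for links whose @rel includes license. It relies only on Python's standard html.parser. RelLinkParser and license_urls are hypothetical names, and the example.org URLs in the usage lines are placeholders, not addresses from the Primer.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect (rel token, href) pairs from a and link elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # @rel is a space-separated list, so one link can carry several flavors.
        for token in (attributes.get("rel") or "").split():
            self.links.append((token.lower(), href))


def license_urls(html):
    """Return the targets of all links flavored with rel="license"."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    return [href for rel, href in parser.links if rel == "license"]


if __name__ == "__main__":
    page = (
        'All content on this site is licensed under '
        '<a rel="license" href="http://example.org/licenses/by/">a license</a>. '
        'See also <a href="http://example.org/resume">my resume</a>.'
    )
    print(license_urls(page))
```

The plain resume link is ignored because it has no @rel, which is exactly the ambiguity the flavor removes.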

2.2 Labeling the Title and Author Alice is happy that adding XHTML flavor lets Bob find the copyright license on her work quite easily. But what about the article title and author name? Here, instead of marking up a link, Alice wants to augment existing text within the page. The title is a headline, and her name a sub-headline:

The trouble with Bob



Alice

...
To indicate that h2 represents the title of the page, and h3 the author, Alice uses @property, an attribute introduced by RDFa for the specific purpose of marking up existing text in an XHTML page.

The trouble with Bob

Alice

...
Why use dc:creator and dc:title, instead of simply creator and title? As it turns out, XHTML does not have reserved keywords for those two concepts. Alice could boldly choose to write property="title", but how does a program reading this know whether "title" here refers to the title of a work, a job title, or the deed for some real-estate property? And, if every web publisher laid claim to their own short keywords, the world of available properties would become quite messy, a bit like saving every file on a computer's desktop without any directory structure to organize them.

To enforce a modicum of organization, RDFa does not recognize property="title". Instead, Alice must indicate a directory somewhere on the web, using simply a URL, from where to import the specific creator and title concepts she means to express. Fortunately, the Dublin Core [DC] community has already defined a vocabulary of useful concepts for describing documents, including both creator and title, where title indeed means the title of a work. So, Alice:

1. imports the Dublin Core vocabulary using xmlns:dc="http://purl.org/dc/elements/1.1/", which associates the prefix dc with the URL http://purl.org/dc/elements/1.1/, and
2. uses dc:creator and dc:title. These are short-hands for the full URLs http://purl.org/dc/elements/1.1/creator and http://purl.org/dc/elements/1.1/title.

In RDFa, all property names are, in fact, URLs.

Figure. Literal Properties: RDFa lets Alice connect not just one URL to another (for example, her blog entry URL to the Creative Commons license URL) but also one URL to a string such as "The Trouble with Bob". All arrows are labeled with the corresponding property name, which is also a URL.
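The prefix mechanism described in this section can be sketched in a few lines. The function name expand_curie and the PREFIXES dictionary are hypothetical; the sketch only illustrates how a short-hand such as dc:title maps to a full URL, and how an unprefixed name such as title is rejected.

```python
# Prefix mappings, as they would be declared with xmlns:dc="..." on the page.
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}


def expand_curie(curie, prefixes):
    """Expand a prefixed name such as "dc:title" into a full URL."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        # A bare keyword like "title" has no declared vocabulary behind it.
        raise ValueError(f"no declared prefix for {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("dc:title", PREFIXES))
print(expand_curie("dc:creator", PREFIXES))
```

Calling expand_curie("title", PREFIXES) raises ValueError in this sketch, mirroring the point that property="title" on its own is not recognized.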

2.3 Multiple Items per Page

Alice's blog contains, of course, multiple entries. Sometimes, Alice's sister Eve guest blogs, too. The front page of the blog lists the 10 most recent entries, each with its own title, author, and introductory paragraph. How, then, should Alice mark up the title of each of these entries individually even though they all appear within the same web page? RDFa provides @about, an attribute for specifying the exact URL to which the contained RDFa markup applies:

The trouble with Bob

Alice

...

Jo's Barbecue

Eve

...
...


We can represent this, once again, as a diagram connecting URLs to properties:

Figure. Multiple Items per Page: each blog entry is represented by its own node, with properties attached to each. Here we've used the shorthands to label the arrows, in order to save space and clarify the diagram. The actual labels are always the full URLs.
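As a rough illustration of how @about scopes the properties nested inside it, the sketch below walks some markup and records one (subject, property, text) entry per labeled element. The FRONT_PAGE markup is an assumed reconstruction, the URL of the second entry is hypothetical, and AboutScopedParser is a toy built on Python's html.parser: it handles only @about and @property, not the full RDFa processing rules.

```python
from html.parser import HTMLParser

# Assumed reconstruction of the front page; the second entry's URL is hypothetical.
FRONT_PAGE = """
<div about="/alice/posts/trouble_with_bob">
  <h2 property="dc:title">The trouble with Bob</h2>
  <h3 property="dc:creator">Alice</h3>
</div>
<div about="/alice/posts/jos_barbecue">
  <h2 property="dc:title">Jo's Barbecue</h2>
  <h3 property="dc:creator">Eve</h3>
</div>
"""


class AboutScopedParser(HTMLParser):
    """Toy extractor: pairs each @property with the nearest enclosing @about."""

    def __init__(self):
        super().__init__()
        self.subjects = [""]   # stack of subjects; "" stands for the page itself
        self.pending = None    # property name waiting for its text content
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # An element without @about inherits the subject of its parent.
        self.subjects.append(attributes.get("about", self.subjects[-1]))
        self.pending = attributes.get("property")

    def handle_data(self, data):
        text = data.strip()
        if self.pending and text:
            self.triples.append((self.subjects[-1], self.pending, text))
            self.pending = None

    def handle_endtag(self, tag):
        self.subjects.pop()
        self.pending = None


parser = AboutScopedParser()
parser.feed(FRONT_PAGE)
for triple in parser.triples:
    print(triple)
```

Each entry ends up with its own title and creator even though both live on the same page, which is the purpose of @about.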

Alice can use the same technique to give her friend Bob proper credit when she posts one of his photos:

The trouble with Bob

The trouble with Bob is that he takes much better photos than I do:
<span property="dc:title">Beautiful Sunset</span> by <span property="dc:creator">Bob</span>.

Notice how the innermost @about value, http://example.com/bob/photos/sunset.jpg, "overrides" the outer value /alice/posts/trouble_with_bob for all markup inside the innermost div. And, once again, as a diagram that abstractly represents the underlying data of this new portion of markup:

Figure. Describing a Photo
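The "innermost @about wins" rule can be stated as a small function. This is a simplified sketch: effective_subject is a hypothetical helper, it takes the @about values of an element's ancestors as a plain list, and it ignores the other attributes that can set a subject in full RDFa.

```python
def effective_subject(enclosing_about_values, page_url):
    """Return the subject that applies to markup nested inside some elements.

    enclosing_about_values lists the @about value of each enclosing element,
    from outermost to innermost, with None where an element has no @about.
    """
    for about in reversed(enclosing_about_values):
        if about is not None:
            return about
    # With no @about anywhere, properties describe the page itself.
    return page_url


PAGE = "http://example.com/alice"
POST = "/alice/posts/trouble_with_bob"
PHOTO = "http://example.com/bob/photos/sunset.jpg"

# The h2 sits directly inside the outer div.
print(effective_subject([POST, None], PAGE))
# The spans sit inside the inner div, which carries its own @about.
print(effective_subject([POST, PHOTO, None], PAGE))
```

The first call yields the blog post and the second yields the photo, matching the way the title "Beautiful Sunset" attaches to the photo rather than to Alice's entry.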

3 Going Deeper

In addition, Alice wants to make information about herself (email address, phone number, etc.) easily available to her friends' contact management software. This time, instead of describing the properties of a web page, she's going to describe the properties of a person: herself. To do this, she adds deeper structure, so that she can connect multiple items that themselves have properties.

3.1 Contact Information

Alice already has contact information displayed on her blog.

Alice Birpemswick

Email: [email protected]

Phone: +1 617.555.7332



The Dublin Core vocabulary does not provide property names for describing contact information, but the Friend-of-a-Friend [FOAF] vocabulary does. In RDFa, it is common and easy to combine different vocabularies in a single page. Alice imports the FOAF vocabulary and declares a foaf:Person. For this purpose, Alice uses @typeof, an RDFa attribute that is specifically meant to declare a new data item with a certain type:
...

Then, Alice can indicate which content on the page represents her full name, email address, and phone number:

Alice Birpemswick

Email: [email protected]

Phone: +1 617.555.7332

Note how Alice didn't specify @about like she did when adding blog entry metadata. What is she associating these properties with, then? In fact, the @typeof on the enclosing div implicitly sets the subject of the properties marked up within that div. The name, email address, and phone number are associated with a new node of type foaf:Person. This node has no URL to identify it, so it is called a blank node.

Figure. A Blank Node: blank nodes are not identified by URL. Instead, many of them have a @typeof attribute that identifies the type of data they represent.

This approach (providing no name but adding a type) is particularly useful when listing a number of items on a page, e.g. calendar events, authors on an article, friends on a social network, etc.
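One way to picture a blank node is as a subject that receives a locally generated label instead of a URL. The sketch below is illustrative only: new_blank_node and describe_person are hypothetical helpers, the "_:b0" style labels are an arbitrary choice, the tel: form of the phone number is an assumption, and the email address is left out for brevity.

```python
from itertools import count

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

_next_id = count()


def new_blank_node():
    """Return a fresh label for a node that has no URL of its own."""
    return f"_:b{next(_next_id)}"


def describe_person(name, phone):
    """Build the triples that a typeof="foaf:Person" block would produce."""
    node = new_blank_node()
    return [
        (node, RDF_TYPE, FOAF + "Person"),
        (node, FOAF + "name", name),
        (node, FOAF + "phone", phone),
    ]


# The tel: URL is an assumed rendering of the phone number shown on the page.
alice = describe_person("Alice Birpemswick", "tel:+1-617-555-7332")
for triple in alice:
    print(triple)
```

Every call to describe_person produces a new label, so several people listed on one page stay distinct without anyone having to invent URLs for them.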

3.2 Social Network

Next, Alice wants to add information about her friends, including at least their names and homepages. Her plain XHTML is:

First, Alice indicates that all of these friends are of type foaf:Person.

Beyond declaring the type of data we're dealing with, each @typeof creates a new blank node with its own distinct properties, all without having to provide URL identifiers. Thus, Alice can easily indicate each friend's homepage:

And, of course, each friend's name:

Using @property, Alice specifies that the linked text ("Bob", "Eve", and "Manu") are, in fact, her friends' names. With @rel, she indicates that the clickable links are her friends' homepages. Alice is ecstatic that, with so little additional markup, she's able to fully express both a pleasant human-readable page and a machine-readable dataset.

Alice is tired of repeatedly entering information about her friends on each new social networking site. With RDFa, she can indicate her friendships on her own web page, and let social networking applications read it automatically. So far, Alice has listed three individuals but has not specified her relationship with them; they might be her friends, or they might be her favorite 17th century poets. To indicate that she, in fact, knows them, she uses the FOAF property foaf:knows:

Using rel="foaf:knows" once is enough to connect Bob, Eve, and Manu to Alice. This is achieved thanks to the RDFa concept of chaining: because the top-level @rel is without a corresponding @href, it connects to any contained node, in this case the three nodes defined by @typeof. (The @about="#me" is a FOAF/RDF convention: the URL that represents the person Alice is http://example.com/alice#me. It should not be confused with Alice's homepage, http://example.com/alice. You are what you eat, but you are far more than just your homepage.)

Figure. Alice's Social Network
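The effect of chaining can be sketched as data: a single foaf:knows relationship declared once is applied to every contained node. The chain function and the blank-node labels "_:bob", "_:eve" and "_:manu" are hypothetical names chosen for this illustration, and homepages are omitted.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def chain(subject, predicate, contained_nodes):
    """Connect one subject to every contained node with the same predicate."""
    return [(subject, predicate, node) for node in contained_nodes]


# Hypothetical labels for the three blank nodes created by @typeof.
friends = {"_:bob": "Bob", "_:eve": "Eve", "_:manu": "Manu"}

# One rel="foaf:knows" on the enclosing element yields three triples.
triples = chain("http://example.com/alice#me", FOAF + "knows", friends)

# Each contained node keeps its own type and name.
for node, name in friends.items():
    triples.append((node, RDF_TYPE, FOAF + "Person"))
    triples.append((node, FOAF + "name", name))

for triple in triples:
    print(triple)
```

Note that the subject is http://example.com/alice#me, the URL standing for Alice the person, not the URL of her homepage.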

4 You Said Something about RDF?

RDF, the Resource Description Framework, is exactly the abstract data representation we've drawn out as graphs in the above examples. Each arrow in the graph is represented as a subject-predicate-object triple: the subject is the node at the start of the arrow, the predicate is the arrow itself, and the object is the node or literal at the end of the arrow. An RDF dataset is often called an "RDF graph", and it is typically stored in what is often called a "Triple Store." Consider the first example graph:

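The two RDF triples for this graph are written, using the Notation3 syntax [N3], as follows:

<http://example.com/alice/posts/trouble_with_bob>
    <http://purl.org/dc/elements/1.1/title> "The Trouble with Bob";
    <http://purl.org/dc/elements/1.1/creator> "Alice" .

Also, the TYPE arrows we drew are no different from other arrows, only their label is actually a core RDF property, rdf:type, where the rdf namespace is <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. The contact information example from above should thus be diagrammed as: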

The point of RDF is to provide a universal language for expressing data. A unit of data can have any number of fields, and field names are URLs which can be reused by any publisher, much like any web publisher can link to any web page, even ones they did not create themselves. Given data, in the form of RDF triples, collected from various locations, and using the RDF query language SPARQL [SPARQL], one can search for "friends of Alice's who created items whose title contains the word 'Bob'," whether those items are blog posts, videos, calendar events, or other data types we haven't thought of yet. RDF is an abstract, machine-readable data representation meant to maximize the reuse of vocabularies. RDFa is a way to express RDF data within XHTML, by reusing the existing human-readable data.
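The kind of question mentioned above can be sketched without a real SPARQL engine by filtering a list of triples in plain Python. Everything here is illustrative: the TRIPLES data mixes items from the earlier examples with a hypothetical post by Eve, friends are matched to creators by name (an approximation that works only because dc:creator holds a plain string in these examples), and friends_who_created is a hypothetical helper, not a SPARQL implementation.

```python
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"
ALICE = "http://example.com/alice#me"

# Triples as they might be collected from several pages (partly hypothetical).
TRIPLES = [
    (ALICE, FOAF + "knows", "_:bob"),
    (ALICE, FOAF + "knows", "_:eve"),
    ("_:bob", FOAF + "name", "Bob"),
    ("_:eve", FOAF + "name", "Eve"),
    ("http://example.com/alice/posts/trouble_with_bob", DC + "creator", "Alice"),
    ("http://example.com/alice/posts/trouble_with_bob", DC + "title", "The trouble with Bob"),
    ("http://example.com/bob/photos/sunset.jpg", DC + "creator", "Bob"),
    ("http://example.com/bob/photos/sunset.jpg", DC + "title", "Beautiful Sunset"),
    # Hypothetical item, added so that the query has something to find.
    ("http://example.com/eve/posts/bob_visits", DC + "creator", "Eve"),
    ("http://example.com/eve/posts/bob_visits", DC + "title", "Bob comes to visit"),
]


def objects(triples, subject, predicate):
    """Return every object attached to subject by predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def friends_who_created(triples, person, word):
    """Find (friend name, item) pairs where the item's title contains word."""
    friend_names = {
        name
        for friend in objects(triples, person, FOAF + "knows")
        for name in objects(triples, friend, FOAF + "name")
    }
    results = []
    for item, predicate, creator in triples:
        if predicate != DC + "creator" or creator not in friend_names:
            continue
        if any(word in title for title in objects(triples, item, DC + "title")):
            results.append((creator, item))
    return results


print(friends_who_created(TRIPLES, ALICE, "Bob"))
```

The query never asks what kind of thing each item is: a blog post and a photo are treated alike because both use the same dc:title and dc:creator property URLs.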

4.1 Custom Vocabularies

As Alice marks up her page with RDFa, she may discover the need to express data, e.g. her favorite photos, that is not covered by existing vocabularies like Dublin Core or FOAF. Since RDFa is simply a representation of RDF, the RDF schema mechanism that enables RDF extensibility is the same that enables RDFa extensibility. Once an RDF vocabulary is created, it can be used in RDFa markup just like existing vocabularies.

The instructions on how to create an RDF schema are available in Section 5 of the RDF Primer [RDF-SCHEMA-PRIMER]. At a high level, the creation of an RDF schema for RDFa involves:

1. Selecting a URL where the vocabulary will reside, e.g. http://example.com/photos/vocab#.
2. Distributing an RDF document, at that URL, which defines the classes and properties that make up the vocabulary. For example, Alice may want to define classes Photo and Camera, as well as the property takenWith that relates a photo to the camera with which it was taken.
3. Using the vocabulary in XHTML+RDFa with the usual prefix declaration mechanism, e.g. xmlns:photo="http://example.com/photos/vocab#", and typeof="photo:Camera".

It is worth noting that anyone who can publish a document on the Web can publish an RDF vocabulary and thus define new data fields they may wish to express. RDF and RDFa allow fully distributed extensibility of vocabularies.
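The steps above can be sketched as the set of triples that Alice's vocabulary document might contain. This is an assumption-laden illustration: define_vocabulary and to_line are hypothetical helpers, the choice to declare a domain and range for takenWith goes beyond what the text requires, and the output is a simple one-triple-per-line rendering rather than a complete RDF serialization.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
PHOTO = "http://example.com/photos/vocab#"  # step 1: where the vocabulary lives


def define_vocabulary():
    """Step 2: the classes and property that make up the vocabulary."""
    return [
        (PHOTO + "Photo", RDF + "type", RDFS + "Class"),
        (PHOTO + "Camera", RDF + "type", RDFS + "Class"),
        (PHOTO + "takenWith", RDF + "type", RDF + "Property"),
        # Assumed refinement: takenWith relates a Photo to a Camera.
        (PHOTO + "takenWith", RDFS + "domain", PHOTO + "Photo"),
        (PHOTO + "takenWith", RDFS + "range", PHOTO + "Camera"),
    ]


def to_line(triple):
    """Render a triple of URLs as one line, each URL in angle brackets."""
    subject, predicate, obj = triple
    return f"<{subject}> <{predicate}> <{obj}> ."


lines = [to_line(triple) for triple in define_vocabulary()]
print("\n".join(lines))
```

Step 3 then needs no extra machinery: with the prefix photo bound to http://example.com/photos/vocab#, the short-hand photo:Camera expands to the same URL that appears in these triples.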

5 Find Out More

More examples, links to tools, and information on how to get involved can be found on the RDFa Wiki.

6 Acknowledgments

This document is the work of the RDF-in-HTML Task Force, including (in alphabetical order) Ben Adida, Mark Birbeck, Jeremy Carroll, Michael Hausenblas, Shane McCarron, Steven Pemberton, Manu Sporny, Ralph Swick, and Elias Torres. This work would not have been possible without the help of the Semantic Web Deployment Working Group and its previous incarnation, the Semantic Web Best Practices and Deployment Working Group, in particular chairs Tom Baker and Guus Schreiber (and prior chair David Wood), the XHTML2 Working Group, Eric Miller, previous head of the Semantic Web Activity, and Ivan Herman, current head of the Semantic Web Activity. Earlier versions of this document were officially reviewed by Gary Ng and David Booth, and more recent versions by Diego Berrueta and Ed Summers, all of whom provided insightful comments that significantly improved the work. Bob DuCharme also reviewed the work and provided useful commentary.

7 Bibliography

[RDFA-SYNTAX] RDFa in XHTML: Syntax and Processing (See http://www.w3.org/TR/rdfa-syntax.)
[CC] Creative Commons (See http://creativecommons.org.)
[DC] Dublin Core Metadata Initiative (See http://dublincore.org.)
[FOAF] The Friend of a Friend (FOAF) Project (See http://www.foaf-project.org/.)
[N3] Notation 3 (See http://www.w3.org/TeamSubmission/n3/.)
[RDF] Resource Description Framework (See http://www.w3.org/RDF/.)
[RDFHTML] RDF-in-HTML Task Force (See http://www.w3.org/2001/sw/BestPractices/HTML/.)
[RDF-SCHEMA-PRIMER] RDF Primer - Section 5 on RDF Schema (See http://www.w3.org/TR/2004/REC-rdf-primer-20040210/#rdfschema.)
[SWD-WG] Semantic Web Deployment Working Group (See http://www.w3.org/2006/07/SWD/.)
[SWBPD-WG] Semantic Web Best Practices and Deployment Working Group (See http://www.w3.org/2001/sw/BestPractices/.)
[XHTML2-WG] XHTML2 Working Group (See http://www.w3.org/MarkUp/.)

Changes

The previous version of this document was a significant rewrite for clarity and simplicity. This version includes only a small handful of updates:

• some typos fixed.
• changed "HTML" to "XHTML" and added Section 1.1 explaining the situation.
• added section 4.1 on custom vocabularies.
