HTML5 Microdata: Why You Should Use It

By Rob Gravelle

You've got a new visitor to impress with your web pages: the web bot. You've no doubt heard of them, those spider apps that follow links from one HTML document to another, collecting information for search engines, companies, and God only knows who else. In the past, web bots had to make do with your headings and body text to work out what your page is about. Now, you can also annotate your content with specific machine-readable labels in order to take away as much ambiguity as possible for your non-human visitors. While Microdata is a standard that is in its infancy, Google has already started to use it as another way of providing rich search results back to the user. The moral of this story is that you might want to get ahead of the curve on this one and start including Microdata in your pages. This article provides an overview of how to do that.

Basic Syntax

In a nutshell, Microdata consists of a group of name-value pairs, whereby the groups are called items, and each name-value pair is a property. Items are defined using the following five attributes: itemscope, itemtype, itemid, itemprop, and itemref. To better understand how each of these attributes is used, let's take a look at an actual HTML snippet and analyze what each attribute is doing:

<div itemscope itemtype="http://data-vocabulary.org/Person">
  <img src="rob.jpg" itemprop="photo">
  My name is <span itemprop="name">Rob Gravelle</span>, and I am a <span itemprop="title">freelance writer</span> for
  <a href="http://net.tutsplus.com" itemprop="affiliation">HTMLGoodies.com+</a>.
  I live in
  <span itemprop="address" itemscope
        itemtype="http://data-vocabulary.org/Address">
    <span itemprop="locality">Ottawa</span>,
    <span itemprop="country-name">Canada</span>
  </span>
</div>

It's easy enough for us to understand that the above blurb is a description of me, which includes my name, a picture, and my general location. Not so for a machine; it requires the Microdata to provide meaning to the text. The itemscope and itemtype attributes work in tandem to specify exactly what kind of entity is being described: itemscope marks the element as an item, and itemtype names its type. The itemtype contains a URL string that points to the specification on the data-vocabulary.org site. You can actually paste the link into your web browser's address box and it will display a page that describes what that itemtype refers to along with the type's defined property names. For instance, the http://data-vocabulary.org/Person page above describes the Person itemtype as follows:

An item with the item type http://data-vocabulary.org/Person represents a person.
Just as I thought! (OK, that was an easy one)

All content within the enclosing DIV tag (the one with the itemscope attribute) is expected to describe a person and, hence, to use property names from the http://data-vocabulary.org/Person specification.

One of the ramifications of Microdata being expressed as attributes is that there are no dedicated Microdata tags. As a result, we must include the attributes in our standard HTML tags. The web bots that read the Microdata don't care what kind of tags carry the Microdata attributes, as long as an item's properties sit inside the tag that declares the item, as above. The tags you choose should be those that make your content meaningful to the human visitors. Only once you're happy with those should you incorporate the Microdata attributes. That's the "humans first, machines second" philosophy.

An itemprop can itself represent an entity with its own set of itemprops, as seen above with the address.
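
To see why this nesting matters to a machine, here is a minimal Python sketch of how a bot might turn such markup into nested name-value pairs. It is an illustration only: it is not the full extraction algorithm from the Microdata specification, it says nothing about how any particular search engine works, and it ignores itemid and itemref. The names MicrodataSketch, extract_items, and SAMPLE are made up for this example; only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Elements whose property value comes from an attribute rather than their text.
VALUE_ATTRS = {"meta": "content", "img": "src", "a": "href",
               "link": "href", "time": "datetime"}


class MicrodataSketch(HTMLParser):
    """Hypothetical, simplified reader for itemscope/itemtype/itemprop."""

    def __init__(self):
        super().__init__()
        self.items = []   # top-level items found so far
        self.stack = []   # one frame per open (non-void) element

    def _owner(self):
        # The innermost open element with itemscope owns any new property.
        for frame in reversed(self.stack):
            if frame["item"] is not None:
                return frame["item"]
        return None

    def _add(self, owner, names, value):
        # Properties found outside of any item are silently ignored.
        if owner is not None:
            for name in names:
                owner["properties"].setdefault(name, []).append(value)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        owner = self._owner()
        names = (attrs.get("itemprop") or "").split()
        frame = {"tag": tag, "item": None, "owner": owner,
                 "names": [], "text": None}
        if "itemscope" in attrs:
            # A new item: either the value of a property, or a top-level item.
            item = {"type": attrs.get("itemtype"), "properties": {}}
            frame["item"] = item
            if names and owner is not None:
                self._add(owner, names, item)
            else:
                self.items.append(item)
        elif names:
            if tag in VALUE_ATTRS and VALUE_ATTRS[tag] in attrs:
                self._add(owner, names, attrs[VALUE_ATTRS[tag]])
            else:
                # Text-valued property: collect text until the closing tag.
                frame["names"], frame["text"] = names, []
        if tag not in VOID_TAGS:
            self.stack.append(frame)

    def handle_data(self, data):
        for frame in self.stack:
            if frame["text"] is not None:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not any(f["tag"] == tag for f in self.stack):
            return
        while self.stack:
            frame = self.stack.pop()
            if frame["text"] is not None:
                text = "".join(frame["text"]).strip()
                self._add(frame["owner"], frame["names"], text)
            if frame["tag"] == tag:
                break


def extract_items(html):
    parser = MicrodataSketch()
    parser.feed(html)
    parser.close()
    return parser.items


SAMPLE = """
<div itemscope itemtype="http://data-vocabulary.org/Person">
  <img src="rob.jpg" itemprop="photo">
  My name is <span itemprop="name">Rob Gravelle</span>. I live in
  <span itemprop="address" itemscope
        itemtype="http://data-vocabulary.org/Address">
    <span itemprop="locality">Ottawa</span>,
    <span itemprop="country-name">Canada</span>
  </span>
</div>
"""

if __name__ == "__main__":
    from pprint import pprint
    pprint(extract_items(SAMPLE))
```

Calling extract_items(SAMPLE) yields a single Person item whose address property is itself an item with its own locality and country-name properties, mirroring the nesting in the markup. Each property maps to a list because Microdata allows a property name to appear more than once within an item.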

Time and Meta Tags

Dates and times can be difficult for people to understand, let alone machines. Formats like "dd/mm/yyyy" and "mm/dd/yyyy" make telling days and months apart a guessing game. To make dates unambiguous to web bots, you can use the time tag with its datetime attribute, writing the date in the ISO 8601 YYYY-MM-DD format. You can also include a time portion by appending a capital "T" followed by the time as hh:mm or hh:mm:ss, such as 2011-05-08T19:30 for May 8, 2011 at 7:30pm.
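
If your pages are generated by a script, the ISO 8601 string does not have to be typed by hand. The following Python sketch builds a time tag from a date or datetime value using only the standard library. The helper name time_tag and its parameters are hypothetical, chosen for this example, and the sketch assumes values without time zone information.

```python
from datetime import date, datetime
from html import escape


def time_tag(value, label, itemprop=None):
    """Return a <time> element whose datetime attribute is ISO 8601."""
    if isinstance(value, datetime):
        # Date plus time, trimmed to minutes: YYYY-MM-DDThh:mm
        machine = value.isoformat(timespec="minutes")
    else:
        # Date only: YYYY-MM-DD
        machine = value.isoformat()
    prop = f' itemprop="{escape(itemprop)}"' if itemprop else ""
    return f'<time{prop} datetime="{machine}">{escape(label)}</time>'


if __name__ == "__main__":
    print(time_tag(date(2012, 8, 23), "August 23, 2012", "datePublished"))
    print(time_tag(datetime(2011, 5, 8, 19, 30), "May 8, 7:30pm"))
```

The human-readable label goes between the tags, while the machine-readable value goes in the datetime attribute, so each audience gets the format it reads best.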

The Meta tag is used to convey information that cannot otherwise be marked up, such as an image, or that doesn't appear in the page itself.

Let's see the Time and Meta tags in context. Here's some HTML describing a Stock Strategy Analysis web application that I wrote to help me evaluate the effectiveness of investment protocols such as Dollar Cost Averaging and Value Averaging:

<div id="inputForm" itemscope itemtype="http://schema.org/SoftwareApplication">
  <meta itemprop="name" content="Stock Strategies Simulator" />
  <meta itemprop="softwareVersion" content="Beta Release 3" />
  <time itemprop="datePublished" datetime="2012-08-23"></time>
  <time itemprop="dateModified" datetime="2012-10-05"></time>
  <meta itemprop="softwareApplicationCategory" content="BrowserApplication" />
  <meta itemprop="softwareApplicationSubCategory" content="FinanceApplication" />
  <meta itemprop="author" content="Rob Gravelle" />
  <meta itemprop="url"
        content="http://www.gravelleconsulting.com/stock_strategies_simulator_beta_release.html" />
  <meta itemprop="image"
        content="http://www.financialwebring.org/forum/download/file.php?id=1886&mode=view" />
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    <meta itemprop="price" content="0.00" />
    <meta itemprop="priceCurrency" content="USD" />
  </div>
  <form id="frmSimulationOpts" method="post" action="#">
    <!-- form fields omitted -->
  </form>
</div>


Today we got a taste of how to use Microdata, and, more importantly, why we should start doing so now. There are many, many more uses - and corresponding itemtypes - for Microdata than we saw here today. For more information on the subject, visit the Microdata specifications on the whatwg.org site.

If you enjoyed this article, please contribute to Rob's rock star aspirations by purchasing one of Rob's cover or original songs from iTunes.com for only 99 cents each.

Rob Gravelle resides in Ottawa, Canada, and is the founder of GravelleConsulting.com. Rob has built systems for intelligence-related organizations such as Canada Border Services and CSIS, as well as for numerous commercial businesses. Email Rob to receive a free estimate on your software project. Should you hire Rob and his firm, you'll receive 15% off for mentioning that you heard about it here!

In his spare time, Rob has become an accomplished guitar player, and has released several CDs. His former band, Ivory Knight, was rated as one of Canada's top hard rock and metal groups by Brave Words magazine (issue #92).
