Behind the Scenes with XHTML
This article looks at the head portion of the Web page. This is the portion of the document that the user agent (i.e., the browser) will read first, so it's important that it doesn't stumble here. Remember, our goal is to develop standards-based Web pages (here, I make the assumption that the reader has a working knowledge of HTML).
Let's start at the top of a valid XHTML document and work our way down. For this part of the discussion, I'll be referring to the code below.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Document Title Goes Here</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" type="text/css" media="screen" href="/style/core-style.css" />
</head>
The first line shown above is the XML declaration. This line defines the version of XML you're using as well as the character encoding. It's recommended by the World Wide Web Consortium (W3C) but is not required. If you're not using XML, it's not necessary. In fact, it can cause problems with some of the older browsers. Most of them will choke if they encounter a page that begins with this declaration. If you don't include the XML declaration, you'll need to include the content meta tag (shown above, after the title tag). Do not include both.
If the XML declaration is used, it must be the first line on the page. If the content meta tag is used, it must be placed within the head of the document.
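As a rough illustration of the two alternatives described above, the start of the document might look like one of the following. This is only a sketch: the UTF-8 encoding is taken from the article's own listing, and the elided parts are marked with comments.

With the XML declaration as the very first line:

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- html, head, and body follow; no content meta tag is used in this variant -->
```

Without the XML declaration, using the content meta tag inside the head:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Document Title Goes Here</title>
<!-- Declares the MIME type and character encoding in place of the XML declaration -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<!-- body and closing html tag follow -->
```

The second variant is the one less likely to upset the older browsers mentioned above, since nothing precedes the DOCTYPE.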
The content meta tag is divided into two parts. The first part (text/html) tells the browser the MIME type. MIME is short for "Multipurpose Internet Mail Extensions". It was originally used in formatting e-mail but is now also used on the Web to declare the type of content being served to the browser. The W3C actually recommends application/xhtml+xml as the MIME type for an XHTML document. However, there are problems with using it. An example is Internet Explorer (up to version 6 for both Windows and Mac), which doesn't recognize it, nor do many other browsers. Using text/html should make your page acceptable to IE and is "allowable" by the W3C. The second half of the meta statement (charset=UTF-8) identifies the character encoding of the document, so the browser knows how to decode it.
Note that the meta tag ends with a " />". This is because, in XHTML, all tags must be closed, except for the DOCTYPE statement. The meta tag is an empty element tag. This means the tag itself is the content or a placeholder for the content. Other empty element tags include <img /> and <br />. Since the tag has no additional content, it doesn't have an end tag and must be closed within itself. If you leave a space before the slash, older browsers won't get confused.
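A few examples may make the closing rule concrete. This is a minimal sketch, and the image path and text are hypothetical placeholders rather than anything from the original page.

```html
<!-- Empty elements close themselves; note the space before the slash -->
<br />
<img src="/images/logo.gif" alt="Site logo" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

<!-- Elements that wrap content still use a separate end tag -->
<p>This paragraph has an explicit end tag.</p>
```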
Document Type Definition
If the XML declaration isn't used, the first line in the document must be the DOCTYPE declaration, which names the Document Type Definition (DTD) the page follows. This statement is used to "set out the rules and regulations for using HTML in a succinct and definitive manner" (W3C). Failure to use a full DOCTYPE could send your visitor's browser into "quirks" mode, causing it to behave like a version 4 browser (interestingly, a large number of Web pages do not use the DOCTYPE statement; many of them are Web development sites). There are three doctypes available for XHTML: strict, transitional, and frameset. Be careful, as these declarations are case-sensitive.
The strict DTD is used for documents containing only clean, pure structural mark-up. In these documents, all the mark-up associated with the layout comes from Cascading Style Sheets (CSS).
The transitional DTD is used when your visitors may have older browsers which can't understand CSS too well. You can use many of HTML's presentational features with this DTD.
Finally, use the frameset DTD when you want to use HTML to partition the browser window into two or more frames.
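For reference, the three XHTML 1.0 DOCTYPE declarations are listed below as published by the W3C. Only one of them belongs in any given document; they are grouped here purely for comparison.

```html
<!-- Strict: structural mark-up only, presentation handled by CSS -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- Transitional: allows HTML's presentational elements and attributes -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!-- Frameset: for documents that divide the window into frames -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
```

Copying a declaration verbatim is the safest approach, since a change in letter case or a missing URL is enough to alter how a browser treats the page.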
The DTD files referenced above are plain text files. You can enter the URL and download them. There is nothing earth-shattering in the files but you'll be able to see what the browser is reading.
The XML Namespace
The next line in our document is the XML namespace. This statement identifies the primary namespace used throughout the document. An XML namespace "is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names" (W3C). The html tag is included at the beginning, effectively combining the two tags. In addition, the language attribute is also included, in both its XML (xml:lang="en") and HTML (lang="en") forms.
The rest of the header shown is basic HTML code. The <head> tag opens the header and must be closed before the body tag. The <title> tag follows the opening header tag. Next, the meta tags and the link to the style sheet, if necessary, are included. Be sure to close the meta and style sheet link tags with " />". Remember, in XHTML, all tags and attributes must be lower case and all tags, except the DOCTYPE, must be closed.
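Putting the pieces together, a complete page skeleton could look something like the sketch below. It assumes the transitional DOCTYPE and the content meta tag rather than the XML declaration, which is one reasonable combination and not the only valid one. The body content is a hypothetical placeholder.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Document Title Goes Here</title>
  <!-- Content meta tag used in place of the XML declaration -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <link rel="stylesheet" type="text/css" media="screen" href="/style/core-style.css" />
</head>
<body>
  <!-- Placeholder content; every element is lower case and explicitly closed -->
  <p>Page content goes here.</p>
</body>
</html>
```

Running a page like this through the W3C validator is a quick way to confirm that the head is being read as intended.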
The article "XHTML 1.0: Where XML and HTML meet" provides a further, in-depth study of transitioning to XHTML.
- Content-Negotiation Techniques
- Serving XHTML with the Right MIME Type
- XHTML's Dirty Little Secret
- HTML and XHTML Frequently Answered Questions