Tuesday, April 16, 2024

XHTML – An Introduction


Perhaps you’ve heard of XHTML
& wondered what it’s all about.
Read on!


 


You probably know what a markup language is:  you have a bunch of text and you
need a way to say that this piece should be bold, this piece should be italic,
this should be a heading, and so on.  A markup language is a computer
language (a set of codes) that allows you to specify which pieces should be
which way.


Early on in the game came the WPMLs, the Word Processing Markup Languages.
Trouble was, they were all different, and so meant nothing to each other.


Then along came HTML and the birth of what has become the World Wide Web. 
HTML is a standardized markup language that allows documents all over the world
to be defined using the same set of rules, or codes, such that they can all be
read using the same tool (a "Web Browser" such as Internet Explorer, Netscape or
Opera).


Everything in the World Wide Web was good.  The Internet and its Web had
been born, flourished and grown into by far the largest repository of knowledge
and information in the history of the world.


Only trouble was — folks realized there may be a better way (thank goodness for
creativity!)


In the late nineties a specification was created for a new markup language known
as XHTML.  The language is defined in terms of XML (the eXtensible Markup
Language) and its specification is maintained by the World Wide Web Consortium
(W3C, www.w3c.org).  To put it in their words:


"The Extensible HyperText Markup Language (XHTML) is a family of current and
future document types and modules that reproduce, subset, and extend HTML,
reformulated in XML. XHTML family document types are all XML-based, and
ultimately are designed to work in conjunction with XML-based user agents. XHTML
is the successor of HTML, and a series of specifications has been developed for
XHTML."


Now that must have cleared it up for you!  Just in case it didn’t quite
make it, let me expound.


(By the way; I’ve read the following paragraph a few times.  It’s horribly
complicated, but if you read it a few times it starts to become clearer.  I
just wish there was a way to say exactly the same thing without sounding so
technical.  Trust me though; if you read it a few times, it will become
clear!  The information it contains is vital.)


XML (the eXtensible Markup Language) is a powerful and rigorous set of
specifications that is a meta-language, in that it is a language for defining a
markup language.  To explain: HTML is a markup language (the HyperText
Markup Language) specified in SGML.  SGML (the Standard Generalized Markup
Language) is the international standard meta-language for markup languages. 
SGML is a huge and complicated set of rules (specifications) for defining the
elements of a Document Type Definition (DTD).   By defining the
"elements" of a DTD we create a language for marking up a document of the type
defined by the DTD.  Put another way, by creating a DTD for an HTML
document, we define the language "HTML".  The "elements" we defined are all
the tags we are familiar with using.  For example, something, somewhere,
has to define <b> </b> as meaning the beginning and end of something we wish to
have displayed (or printed) as Bold.  The DTD contains these definitions.


The DTD for HTML is based on SGML.  XML is a simplified form, or subset, of
SGML.  XHTML is based on XML.


That doesn’t mean that XHTML is just a simplified form of HTML!  In fact,
though not more complicated than HTML, it does have more hard and fast syntax
rules, but at the same time it allows for a lot more flexibility.  The key
is in the X — eXtensible.
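
To get a feel for what "hard and fast syntax rules" means in practice, here is a small sketch in Python.  It hands two snippets to the XML parser in Python's standard library (xml.etree.ElementTree).  The helper name is_well_formed and both sample strings are made up for this illustration, and the check only covers XML well-formedness (proper nesting, every element closed), not full validation against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Old-style HTML that many browsers tolerate: the <b> and <i> elements
# overlap, and <br> is never closed.
loose_html = "<p>Some <b>bold and <i>italic</b> text</i><br></p>"

# Similar content written to XHTML rules: elements are properly nested
# and every element is closed, including the empty <br />.
strict_xhtml = "<p>Some <b>bold and <i>italic</i></b> text<br /></p>"

print(is_well_formed(loose_html))    # False
print(is_well_formed(strict_xhtml))  # True
```

A forgiving web browser would probably display both snippets.  An XML-based tool rejects the first one outright, which is exactly the kind of strictness XHTML inherits from XML.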


You’ll notice in the fancy W3C wording above that they talk about "a family of
current and future document types and modules".  "Future types and modules"
includes one of the far reaching goals of XHTML, namely to create web pages that
are "understood" by computers as well as people.


"Huh?  They’re already understood by computers" you’re probably thinking. 
Please allow me to distinguish between "understood" and "interpreted".


A computer can interpret a web page inasmuch as it can read the markup and
display the page accordingly.  It does not, however, understand the page the same
way you do.  The better search engines concern themselves with attempts to
interpret the actual meaning of web pages, but imagine what could be done if the
real meaning of those pages was contained in their code.  Imagine the value
of being able to tell your computer to visit every car dealership within fifty
miles (or a hundred kilometers — we have such a long way to go!) of your home
and find the lowest price for a particular model of car with a particular set of
options.  This sort of thing is one of the long term goals of XHTML.
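
Here is a rough sketch of how that car hunt might look if the meaning really were in the code.  Everything about the "car" vocabulary below (the namespace URL http://example.com/ns/car and the listing, model and price elements) is invented for this example and is not part of any XHTML specification.  The page text and the lowest_price function are hypothetical too.  The only real pieces are the XHTML namespace and Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: this namespace and its element names are
# assumptions made for illustration only.
NS = {"car": "http://example.com/ns/car"}

# An XHTML page that mixes ordinary markup with the made-up car elements.
page = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:car="http://example.com/ns/car">
  <head><title>Dealer inventory</title></head>
  <body>
    <p><car:listing><car:model>Roadster</car:model> now only
       $<car:price>21500</car:price></car:listing></p>
    <p><car:listing><car:model>Roadster</car:model> on sale for
       $<car:price>19900</car:price></car:listing></p>
    <p><car:listing><car:model>Wagon</car:model> from
       $<car:price>17250</car:price></car:listing></p>
  </body>
</html>
"""


def lowest_price(xhtml: str, model: str):
    """Return the lowest listed price for a model, or None if not found."""
    root = ET.fromstring(xhtml)
    prices = [
        int(listing.findtext("car:price", namespaces=NS))
        for listing in root.iterfind(".//car:listing", NS)
        if listing.findtext("car:model", namespaces=NS) == model
    ]
    return min(prices) if prices else None


print(lowest_price(page, "Roadster"))  # 19900
```

A person reading the page sees ordinary sentences about cars.  The program never has to guess which number is a price, because the markup says so.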


Another problem is that when a web page is created, the creator really has no
idea what type of device it is going to be displayed on.  Until recently,
we could comfortably assume that a standard web browser on
a PC with a resolution of 800×600 or better would be the rule.  In today’s
world, cell phones, cars, even refrigerators have browsers built right in. Add
to that Braille and speech synthesis devices designed for those who don’t see or
hear things the same way as the rest of us and you have no real way of knowing
how your web page is going to be interpreted.  Wouldn’t it be nice if you
could still know how it would be understood?


If you design your web pages using the XHTML standard you have the best chance
of knowing that your pages will be understood by any device or any available
interpretive (or understanding) "browser program".


By the way, most WYSIWYG page generation programs produce moderate to acceptable
HTML code.  If you want to create good XHTML, you’re going to have to learn
how.  Watch this site!


 

 
