HTML or XHTML? Fact From Fiction

By Stephen Philbin

https://www.htmlgoodies.com/beyond/xml/article.php/3669451/HTML-or-XHTML-Fact-From-Fiction.htm



On an almost daily basis I see the same question asked: Should I be using HTML or XHTML? On an almost equally frequent basis I also see people imply that one (usually XHTML) should always be used in favour of the other. Such suggestions, much like most of the reasons given for them, are complete nonsense.

There's absolutely no reason why we can't use one or the other, as we see fit, on a case-by-case basis. So to all those out there that are frantically trying to convert their entire life's work from HTML to XHTML because HTML has been replaced by XHTML, and to those that are refusing to use XHTML because browsers don't properly support it yet, stop, put the IDE down and step away from the myths.

It seems as though most of the pro-XHTML myths come from developers that are keen to be seen to be keeping pace with (what one could be forgiven for mistaking as the rapid) progression of web development. A perfectly sensible thing to do; if you stop learning you'll just get left behind. It's just that few people seem to be willing to admit that they're experimenting and trying new stuff out, and so they just seem to parrot or make up weird and wild reasons for using XHTML. On the other hand most of the pro-HTML rhetoric seems to come largely from fuddy-duddy stick-in-the-mud types, or people that have taken the time to learn about XML and XHTML, but somehow think they might be keeping some kind of competitive or intellectual edge by discouraging others from doing the same (yes there are people that petty).

Given the prevalence of pro-XHTML mythology in today's world of web development we'll start out by taking a look at some of the pro-XHTML myths. Some might sound quite feasible, but the majority are just so absurd that I can't help but wonder if some of the people saying these things might be a few pennies short of a full quid. See if any of them sound familiar.

It's cleaner and more precise.
This one's a bit of a weird one. I'm guessing it's probably just a misinterpretation of the well-formedness constraints inherited from XML, but essentially it's just nonsense.
It's more standards-compliant.
An absolutely bonkers assertion. XHTML follows the XHTML Recommendations, and HTML follows the HTML Recommendations. They each follow their own rock-solid standards.
It's more accessible.
Certainly not true. Try to use XHTML for anything other than just pointlessly serving it as broken HTML and you could quite easily hit some pretty serious accessibility issues.
It has replaced HTML.
HTML is going to be around for a long time to come yet. The fact that the W3C has reopened the W3C HTML Working Group, it could be argued, reflects the failure of XHTML to be adopted as anything more than a buzz-word with syntax errors and that we are far from ready to break away from HTML. At the rate most web browsers are progressing, HTML and XHTML served as broken HTML are probably going to continue to be the norm right up until the earth is destroyed to make way for a new hyperspace bypass.
HTML is not well-formed.
Indeed HTML doesn't meet the well-formed constraints as defined in the XML Recommendation, but then it doesn't need to. HTML is not an application of XML; it's an application of SGML. HTML is perfectly capable of defining an unambiguous document structure (a key purpose of the well-formedness constraints expressed in the XML Recommendation). HTML allows you to omit more, but it doesn't require that you omit as much markup as possible. My personal preference is to explicitly write in as much of the optional markup as my feeble little memory will allow, but in the end, it all comes down to you — the author.
It's better.
A true classic. No reasoning, no justification, just — It's better. You might as well be asking a three-year-old why Pikachu is better than Bulbasaur.
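
To make the well-formedness point a little more concrete, here's a minimal sketch in Python. It assumes nothing more than the standard library; the two list fragments and the helper names (is_well_formed, start_tags) are made up purely for illustration. Note that html.parser is only a tokenizer rather than a full HTML parser, so this merely shows that omitted end tags upset an XML parser while an HTML-oriented tool sees the same elements either way.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical fragments: the same list, with and without optional end tags.
HTML_FRAGMENT = "<ul><li>one<li>two</ul>"
XHTML_FRAGMENT = "<ul><li>one</li><li>two</li></ul>"


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Records the name of every start tag the HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(markup):
    """Return the start tags found in the markup, in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    for name, fragment in (("HTML", HTML_FRAGMENT), ("XHTML", XHTML_FRAGMENT)):
        print(name, "well-formed XML:", is_well_formed(fragment),
              "start tags:", start_tags(fragment))
```

The HTML fragment is not well-formed XML, yet both fragments describe the same three elements. Which style you write is, as I said, down to you.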

The list really does go on and on. The internet is awash with nonsense about why we shouldn't use HTML any more.

The reasons I've heard for preferring HTML, on the other hand, are far fewer in number, but they do tend to carry a little more plausibility. Though that's not to say we should blindly accept them any more readily than we should accept the pro-XHTML nonsense. The key to making the right choice is understanding a few basic concepts and using this understanding to weigh the beneficial and problematic effects your choice might have on whoever or whatever is accessing your document.

The first point to recognize is that XHTML and XML have some very useful features and that they can be used to great effect, but the thing is if you're not using these features then there's probably not really much point in using XHTML. There's also the consideration that if you do serve up XHTML as HTML (Content-Type: text/html), which is the case in 99% of the XHTML I've seen on the web, then that's exactly how the browser will treat it: as just plain old HTML with bucketloads of syntax errors. So in such cases, it may well have been better to just make the document in plain old HTML without all the perceived syntax errors.
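
The idea that the Content-Type decides how a document gets treated can be sketched roughly as follows. This is a deliberately simplified illustration in Python, not a description of how any particular browser works; the function name parsing_mode and the three mode labels are my own assumptions.

```python
# Media types that, in this simplified sketch, are handed to an XML parser.
XML_TYPES = {"application/xml", "text/xml"}


def parsing_mode(content_type):
    """Map a Content-Type header value to 'html', 'xml' or 'other'.

    Assumption: only the media type matters; parameters such as
    charset are ignored, and no content sniffing takes place.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        # XHTML served this way is handled as plain old HTML.
        return "html"
    if media_type in XML_TYPES or media_type.endswith("+xml"):
        # Covers application/xhtml+xml as well as the generic XML types.
        return "xml"
    return "other"


if __name__ == "__main__":
    for header in ("text/html; charset=utf-8", "application/xhtml+xml", "text/plain"):
        print(header, "->", parsing_mode(header))
```

The markup inside the document never enters into it: the same XHTML file is "html" or "xml" depending purely on the header it's served with.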

I know of a few developers that like to use other markup languages defined in XML and combine them with XHTML to create documents that simply would not be possible with HTML alone, be that by combining XHTML with markup languages they've made themselves, or by using the pre-packaged DOCTYPEs from the W3C that bundle together XHTML, SVG and MathML so that you can jump straight in and get busy trying to make your latest and greatest. The ability to combine XHTML with new markup languages made from the same parent language is indeed a great step forward. With any luck we'll see big improvements in support for XML in more browsers so that this sort of thing can really take off and allow greater innovation in web development.
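
As a rough illustration of what mixing vocabularies looks like to an XML processor, here's a small Python sketch using only the standard library. The compound document and the helper name elements_per_namespace are invented for the example; the three namespace URIs are the real ones for XHTML, SVG and MathML.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A made-up compound document mixing XHTML, SVG and MathML.
COMPOUND = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Compound document</title></head>
  <body>
    <p>A circle and a variable:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="40" height="40">
      <circle cx="20" cy="20" r="15"/>
    </svg>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mi>x</mi>
    </math>
  </body>
</html>"""


def elements_per_namespace(markup):
    """Count the elements in the markup, grouped by XML namespace URI."""
    counts = Counter()
    for element in ET.fromstring(markup).iter():
        # ElementTree spells qualified names as "{namespace}localname".
        if element.tag.startswith("{"):
            namespace = element.tag[1:].split("}", 1)[0]
        else:
            namespace = ""
        counts[namespace] += 1
    return dict(counts)


if __name__ == "__main__":
    for namespace, count in elements_per_namespace(COMPOUND).items():
        print(count, "element(s) in", namespace)
```

Each element carries its namespace with it, which is what lets one parser handle all three languages in a single document without guessing which is which.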

It's rather obvious, but XML processors are only required to process XML. An XML processor is no more obliged to accept text/html as valid input than it is tinky/winky. Some implementations may well accept text/html, but I certainly wouldn't recommend relying on such behaviour. Fully- or semi-automated XML processors can include a wide range of applications, from JavaScript implementations retrieving documents and including the content into the existing structure of an already visually rendered document (or Ajax as some people insist on calling it) to clients hoovering up all the data they can find to chuck in a database. So if you're planning for a document to be usable by both people and such processors, then I'd strongly consider using XHTML and serving it as such (application/xhtml+xml is recommended by the W3C, but most cases may well require the use of application/xml or text/xml). It all depends on what you deem to be the most important modes of retrieval and use of the document you're making.
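
For a taste of what such an automated client might do, here's a minimal Python sketch that hoovers the links out of an XHTML document with a bog-standard XML parser. The sample document and the function name extract_links are assumptions made for the example; a real client would also have to fetch the document and check its Content-Type first.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A made-up, well-formed XHTML document.
DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <p>See <a href="/one">one</a> and <a href="/two">two</a>.</p>
  </body>
</html>"""


def extract_links(xhtml):
    """Return (href, link text) pairs for every XHTML a element with an href.

    Raises xml.etree.ElementTree.ParseError if the input is not
    well-formed XML, which is exactly what an XML processor would do.
    """
    root = ET.fromstring(xhtml)
    links = []
    for anchor in root.iter("{%s}a" % XHTML_NS):
        href = anchor.get("href")
        if href is not None:
            links.append((href, "".join(anchor.itertext())))
    return links


if __name__ == "__main__":
    for href, text in extract_links(DOCUMENT):
        print(href, text)
```

Feed the same function a typical text/html page with omitted end tags and it will simply raise a parse error, which is why I wouldn't rely on XML tools coping with HTML.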

Accessibility, I think, is actually XHTML's biggest problem. As I've just said, if you do decide to use XHTML for something other than serving it as broken HTML then you're probably going to have to think carefully about what and who you want to be able to access the document, and possibly why. Aside from (what I hope is) the very obvious side of accessibility that I'd like to think we all consider by default, namely access for the disabled and disadvantaged, XHTML can also throw up major accessibility issues for many more users in general. Here are three of the most prevalent problems for you to consider when attempting to use XHTML.

Making XHTML that works for all intended recipients, and that wouldn't have been better off as HTML in the first place, is not usually an easy task, but hopefully you'll now be a little more aware of the potential pitfalls of using XHTML and wary of the massive amounts of misinformation about which to use. So to sum up, unless you've got a real reason to be using XHTML, there's probably not much point. Just make sure that if you are using it, you're careful about what you're doing and that you do plenty of testing. I would strongly suggest using XHTML to learn about XML (a perfectly valid reason for using XHTML in my opinion). XHTML provides many pre-made working examples of various aspects of XML for you to learn from, tweak and experiment with, but don't feel as though you have to go rushing into anything. HTML is going to be around for a very long time to come.

Stephen Philbin is a freelance web developer and writer that would live at http://www.stephenphilbin.com/ if only he could find the time to build a site for himself.