HTML 4.0: Getting Started

By Joe Burns

Updated to include the 4.01 Specification changes. Editorial comments are enclosed in square brackets - [ ]
 

Use these to jump around or read it all...

[Reader's Questions On HTML 4.0] [On Versions...] [SGML and XML]
[Will You Puh-Leeze Get to HTML 4.0?!]
[New Commands] [New Sub-Commands]
[Deprecated Elements] [Dead Elements] [That's That]

     Apparently the HTML 4.0 buzz words are starting to make the mainstream. The e-mail letters asking questions about what is to come are beginning to pile up. So I looked into it.
     Wow.
     I say "wow" because there is very, very little on the Net regarding HTML 4.0 right now. And what is out there is written in less-than-layman-terms language. So I printed and read and printed and read and finally got smart enough to stop reading and start interviewing. I spoke to the computer and HTML wizards at Susquehanna University and they helped greatly.
     Below are the results of my quest. I tried to write this at the lowest level possible so all could get a handle on it. If I seem to be a little Fisher-Price in some areas -- it's done on purpose. So, let's get started with a few questions from HTML Goodies readers that I saved for just such an occasion.

[As we are now several years past Joe's original writing, there is now plenty of material on the net concerning HTML 4.01.  The Specification can be found at the W3 Consortium, here, and there is a very useful list of HTML 4.01 tags, including deprecated tag information on the Web Developer's Virtual Library, here.]


 

Readers' Questions Regarding HTML 4.0

(Please understand this section was
written before HTML 4.0 came out.)

 

1. What is HTML 4.0? Should I be concerned about it?

     Concerned? As in HTML 4.0 might steal away in the night with your good china? Nah. From everything I've found, it will actually make writing a bit easier on you and on the search engines. The people of the World Wide Web Consortium stepped up to the plate and agreed, in late 1997, that the next version of HTML, version 4.0, should be the accepted version. Now...
     This BY NO MEANS indicates that it is so. Professors I have spoken with tell me that HTML version 4.0 will not be in full use for at least a year (this tutorial was written 3/7/98). Even so, remember whom you are writing for. Contrary to what many Web-heads believe, the vast majority of Web users are still at around version 2.0 level. Watch the level of your writing. You may be above some heads. Actually, most heads.
     Stay with me here. I have every intention of describing the new commands and what they do. But first, let me ease you into it a bit.

2. I use Netscape 4 (or Explorer 4). Does that mean I should be writing in HTML 4.0?

     Ah, logic! Man, that sounds like it should be correct I know, but it isn't. Version numbers of the two main browsers have nothing to do with what version of HTML they use. Now, someone is going to go bonkers at this point and tell me that some elements of HTML 4.0 are available for use in browsers version 4. True -- but the manufacturers of the browsers did not wait for their version 4 number to incorporate HTML 4.0. This is one of those strange scientific synchronicities called a coincidence.
     BUT! With that said, some of these commands can be run using the 4.0 (and some earlier) browsers. When we actually get to the commands, I'll offer some additional pages of examples of the commands. Then you can see if they actually work for your browser or not.

[All the commonly used browsers these days fully support HTML 4.01.]


On Versions...

     Now might be a good time to discuss what all these version numbers mean. There are no hard or fast rules to this, but here's the generally accepted method for giving version numbers to software:
  • If there is a major change to the product, step up the number by one.
  • If there are tweaks to the product, add a point "something" number.

     With good writing like that, it's a wonder I wasn't hired on at Microsoft years ago, huh? For instance:
     I create a software program that counts the number of times you curse at your computer screen while writing HTML code. I call it "Curser". The first version out of the gate is version 1.0. I offer that version for free over the Net so I can get fine people like you to test it out for free. You will be my R&D (research and development) team. This is what companies call a "beta version." It's something they assume will be replaced by a better version. Kind of like beta videotape. Remember that?
     You play with it and find a few bugs. I fix the bugs. Now I have Curser Version 1.1. This is probably the one I sell.
     Six months later, I decide I hate the interface (the look of the screen) and decide to make major changes to my Curser program. I change the look completely, add a Recycle Bin, and call it Curser 95! Just kidding. But, the changes are substantial, so I change the version number. I now have Curser 2.0. Again with the beta testing. You find a few bugs, I fix them. Ta da! Curser 2.1. But wait! You suggest an extra function for the program. I add it. We now have Curser Version 2.2. Get it?

[Accordingly, HTML 4.01 is a relatively minor enhancement to the original HTML 4.0 specification.]


Back to the Questions...

3. What version of HTML are we currently using?

     The last accepted version is HTML 3.2. So, from the discussion above, you can see that HTML has gone through three major overhauls and then a couple of tweaks.
     Have you ever seen one of these:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

     That's a document type declaration. It sits at the very top of the page and proclaims to the browser that displays the page that the following page is using HTML 3.2. See above? The Document Type Definition (DTD) it names is HTML 3.2. When you start to run 4.0, you'll change out the 3.2 with the number 4.0. The "EN" means English, and it refers to the language the DTD itself is written in, not the language of your page.

[Here is a sample HTML 4.01 document type declaration:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
]
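     To see where that line actually lives, here is a minimal sketch of a complete page. The title and paragraph text are made up for illustration. The second quoted string in the declaration is the address of the Strict DTD at the W3C, which the specification lists alongside the public name; the one-line form above leaves it out.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<!-- TITLE is the one element every HTML 4.01 HEAD must have -->
<TITLE>My First HTML 4.01 Page</TITLE>
</HEAD>
<BODY>
<P>Hello from HTML 4.01!</P>
</BODY>
</HTML>
```

     Nothing else on the page has to change just because the declaration does. The declaration simply tells the browser (and any validator you run the page through) which rulebook to check against.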

 

4. Who decided we should all go to HTML 4.0?

     The World Wide Web Consortium. And actually, they don't decide at all. They are the governing body of HTML, among other things. They make suggestions and hopefully the browser-makers follow. But not always. Case in point -- the <BLINK> command works in Netscape, but not in Explorer. Go figure.

[The W3C (World Wide Web Consortium) is also the body responsible for the release of the 4.01 specification. <BLINK>, by the way, was never part of any HTML specification. It is a non-standard tag that most modern browsers ignore.]

5. Every time I hear about HTML 4.0, I also hear about SGML and XML. What are they?

     And you have these conversations with whom? You and your friends are either really, really up on the future of code or you need to get out more.
     Here's the scoop... and I am way ahead of myself here. Please read this little section as entertainment, at the moment. In the near future, I intend to write separate tutorials on both of these topics. But for now, these are the simple explanations.

  • SGML stands for Standard Generalized MarkUp Language.
         And you already use it. SGML is the mother of HTML. Think of it this way: By using SGML, you have the ability to create your own tags. I want the tag <ZORK> to represent text that is bold, italic, and Arial font. By using SGML code, I can set it up. HTML is simply a set standard of tags under the huge SGML umbrella.
         I'll have something soon, but until then see On SGML and HTML, SGML: Getting Started, An Introduction to SGML, or my personal favorite, SoftQuad's SGML primer.

  • XML stands for eXtensible MarkUp Language.

    (I've written my own XML tutorials since I wrote this - check them out too.)

         XML is a subset of SGML, while HTML is one particular set of tags defined with SGML. The best description of the language comes from Ken Kopf, Computer Specialist at Susquehanna University. He describes XML as a very simplified version of SGML. It is a version of the language that people can understand. If SGML is a bear, XML is a kitten. It will allow you to set up your own tags and mathematical equations using commands that you can probably understand.
         Again, I intend to write on the subject in the near future but if you'd like to fill your brain cells now there is no better piece on the language, in my opinion, than the Frequently Asked Questions about the Extensible MarkUp Language page.

     Now, the concern I see coming out of all of this fancy new stuff is that it might do damage to the Web.
     You see, HTML was a stunningly easy language that took computer programming out of the hands of folks with slide rules and gave it to you and me -- the weekend silicon warriors. We understood it. It made some sense.
     Introducing SGML and XML, in my mind, is the first real shot the higher-ups have of driving people away. It's something new and most people are comfortable now. Introducing it might stratify the audience or make people drop out altogether.
     But not to worry. The full incorporation of these languages is years away. By then, there will probably be good, solid, programming software that will do most of the XML work for you.

[The specification for HTML 4.01 has a strong relationship to SGML/XML, but it is not necessary to know either SGML or XML to effectively use HTML 4.01.]


Will You Puh-Leeze Get to HTML 4.0!?

     Stop yelling. I'm there. I should say up front that the majority of my research came from the World Wide Web Consortium's pages on HTML 4.0. There are miles of data available. Here, I'll attempt to boil it down to the basics.

[The W3C HTML 4.01 pages are here.]

     There are four sections below: New Commands, New Sub-Commands, Deprecated Commands, and Dead Commands. After the first two sections, there will be a link to a page containing the commands. Some are actually available today for use. You'll be able to see if your setup does the trick or not with these HTML 4.0 commands.


New Commands

     The following commands are "new" and are incorporated into HTML 4.01:

The Command What It Does
<ABBR> This indicates an abbreviated form of a word. Example:

<ABBR TITLE="National Football League">NFL</ABBR>

The TITLE command produces a rollover title like the ALT command does on pictures.
<ACRONYM> This works the same way as above except it denotes an acronym. Example:

<ACRONYM TITLE="Self-Contained Underwater Breathing Apparatus">SCUBA</ACRONYM>
<BDO> This is difficult to explain. BDO stands for "bidirectional override." Text goes left to right and sometimes right to left, and the browser normally works out the direction on its own from the characters themselves. The BDO command overrides that and forces the enclosed text to display in the direction you name in the DIR sub-command. If you write in Hebrew, a language written right to left, BDO lets you pin down the direction of a stretch of text when the automatic handling would flip it the wrong way. The example here puts it inside PRE tags:

<PRE>
<BDO DIR="LTR">hello</BDO>
</PRE>

LTR means "left to right". Guess what "right to left" is represented by. Yup, RTL.
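
Here's a small sketch that makes the override visible even if you only have English text handy. The two paragraphs are made up for illustration. In a browser that supports BDO, the second "hello" should display with its letters running right to left, as "olleh".

```html
<!-- Plain text: the browser picks the direction itself (left to right) -->
<P>Normal: hello</P>

<!-- BDO with DIR="RTL" forces the letters to run right to left -->
<P>Overridden: <BDO DIR="RTL">hello</BDO></P>
```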

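<BUTTON> This will become standard code for creating form buttons, like the submit button in a guestbook form. Whatever sits between the opening and closing tags becomes the face of the button. Example:

<BUTTON name="submit" value="submit" type="submit">Submit</BUTTON>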

What's more, this format will easily allow for an image to be placed on the button.
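
Here's a rough sketch of a button with an image on it. The form address (/guestbook) and the image file (send.gif) are placeholders I made up, so swap in your own. Whatever you put between the BUTTON tags, picture and text included, shows up on the face of the button.

```html
<FORM ACTION="/guestbook" METHOD="post">
<P>
<!-- The image and the text both sit on the face of the button -->
<BUTTON name="send" value="send" type="submit">
<IMG src="send.gif" alt="Send"> Sign the guestbook
</BUTTON>
</P>
</FORM>
```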

<COLGROUP> This command allows for an entire column of data in a table to be affected by one command rather than using a separate command for each cell. Example:

<COLGROUP WIDTH="30%"></COLGROUP>
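
To see the command in context, here is a sketch of a small three-column table. The cell text is filler. The first COLGROUP sets the width of column one. The second uses the SPAN sub-command to cover the next two columns with a single command.

```html
<TABLE BORDER="1" WIDTH="100%">
<!-- Column one gets 30% of the table width -->
<COLGROUP WIDTH="30%"></COLGROUP>
<!-- SPAN="2" applies the same width to columns two and three -->
<COLGROUP SPAN="2" WIDTH="35%"></COLGROUP>
<TR><TD>Column one</TD><TD>Column two</TD><TD>Column three</TD></TR>
<TR><TD>More text</TD><TD>More text</TD><TD>More text</TD></TR>
</TABLE>
```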
<DEL> Surrounding something with this command will provide a strikethrough over what was deleted. Example:
Version <DEL>3</DEL><INS>4</INS>

Now you have a jump on what the new command INS does. You'll get to it in a couple.
<FIELDSET> This allows people to group controls on a page together, like grouping buttons that affect a certain JavaScript so there won't be any interaction between other scripts on the same page or sections of a guestbook. It works in tandem with the LEGEND command below. An example will be waiting there.
<FRAME> This works the same way as the FRAME command we have today except it has been delegated new powers to denote specific traits to each frame cell. It allows for many more abilities with Style Sheets. The reason this is listed is that it will be a specific subset of commands for use with SGML format styles.
<FRAMESET> Ditto this one, except this deals with larger sections of frame pages. For instance, you have a page with four frame cells. You want only the ones on the left to have green borders. You use this command to set aside those two vertical frames and assign traits to just that section. The reason this is listed is that it will be a specific subset of commands for use with SGML format styles.
<IFRAME> This again works much the same way as the In-Line frames we currently use. Again, the reason this is listed is that it will be a specific subset of commands for use with SGML format styles.
<INS> You saw how this works above. It sets something aside as having been added or "inserted" at a later time. It is denoted by an underline.
<LABEL> This command attaches a label to form commands. Example:

<FORM ACTION="--">
<LABEL for="email">Email Address</LABEL>
<INPUT type="text" name="email_address" id="email">
</FORM>

<LEGEND> Now, we get to the example denoted above from the command FIELDSET. FIELDSET groups form items together. LEGEND denotes those sections. Example:

<FIELDSET>
<LEGEND>Personal Information</LEGEND>
Name: [Input Text Box]
EMAIL: [Input Text Box]
AGE: [Input Text Box]
</FIELDSET>

It keeps it all straight for the computer.
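
Here is the same idea sketched out with real form items in place of the bracketed stand-ins. The form address (/guestbook) and the field names are assumptions for illustration only. Each LABEL points at its text box through the "for" and "id" pairing, so the FIELDSET, LEGEND, and LABEL commands all work together.

```html
<FORM ACTION="/guestbook" METHOD="post">
<FIELDSET>
<!-- LEGEND gives the whole group its caption -->
<LEGEND>Personal Information</LEGEND>
<!-- Each LABEL's "for" matches the "id" of the box it describes -->
<LABEL for="name">Name:</LABEL>
<INPUT type="text" name="name" id="name"><BR>
<LABEL for="email">Email:</LABEL>
<INPUT type="text" name="email" id="email"><BR>
<LABEL for="age">Age:</LABEL>
<INPUT type="text" name="age" id="age">
</FIELDSET>
</FORM>
```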

<NOFRAMES> This denotes text content that displays if the user does not have frame capabilities. It's been around for a while, but now is officially being brought into the fold.
<NOSCRIPT> Ditto above.
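
As a rough sketch of NOFRAMES at work, here is a two-frame page with a fallback message. The file names (menu.html and main.html) are placeholders. Note the different declaration at the top: frame pages use the Frameset DTD.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
   "http://www.w3.org/TR/html4/frameset.dtd">
<HTML>
<HEAD>
<TITLE>Two Frames</TITLE>
</HEAD>
<FRAMESET COLS="30%,70%">
<FRAME SRC="menu.html" NAME="menu">
<FRAME SRC="main.html" NAME="main">
<!-- Shown only when the browser can't (or won't) display frames -->
<NOFRAMES>
<BODY>
<P>Your browser doesn't do frames.
Try the <A HREF="main.html">main page</A> instead.</P>
</BODY>
</NOFRAMES>
</FRAMESET>
</HTML>
```

NOSCRIPT follows the same pattern: you place it near a SCRIPT and fill it with whatever should display when the script can't run.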
<OBJECT> This command will become a replacement command for IMG, ISMAP, APPLET, SCRIPT, and myriad other "objects" that appear on the page. This one command will represent that something is going to be placed on the page. The computer then decides what kind of object it is due to its extension. Example:

<OBJECT data="image.gif" type="image/gif"></OBJECT>

~or~
<OBJECT classid="applet.class"></OBJECT>

~or~
<OBJECT data="movie.avi" type="application/avi"></OBJECT>
<OPTGROUP> How this will be handled is still a little fuzzy, but it appears that this will allow for multiple groups of information inside Pull-Down menus -- much like the menus produced by the W95 "Start" button.
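
For what it's worth, here is a sketch of how OPTGROUP is written in the 4.01 specification. The menu name, group names, and choices are made up. Each OPTGROUP needs a LABEL sub-command, which becomes the heading for that group of choices.

```html
<FORM ACTION="/guestbook" METHOD="post">
<P>
<SELECT name="snack">
<!-- The LABEL text is a heading; users pick the OPTIONs under it -->
<OPTGROUP label="Salty">
<OPTION value="chips">Chips</OPTION>
<OPTION value="pretzels">Pretzels</OPTION>
</OPTGROUP>
<OPTGROUP label="Sweet">
<OPTION value="cookies">Cookies</OPTION>
<OPTION value="candy">Candy</OPTION>
</OPTGROUP>
</SELECT>
</P>
</FORM>
```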
<PARAM> This command will be used with applets to set parameters. It's already in use, but is now being brought into the fold.
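
Here's a hedged sketch of PARAM sitting inside an OBJECT. The applet name (Curser.class) and the parameter (maxCurses) are invented for the example, and whether a given browser actually runs the applet is another matter. The "java:" prefix on the classid follows the pattern used in the specification's own applet examples.

```html
<OBJECT codetype="application/java" classid="java:Curser.class"
        width="200" height="100">
<!-- Each PARAM hands one name/value pair to the applet -->
<PARAM name="maxCurses" value="50">
<!-- Anything else in here displays if the object can't be loaded -->
Your browser can't run this applet.
</OBJECT>
```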
<SPAN> Think of the SPAN element as the in-line counterpart to the <DIV> command. Where DIV sets aside a whole block of the page, SPAN sets aside a span of text inside a line that can then be altered to your heart's content. Example:

<SPAN CLASS="green">This would be green text</SPAN>
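
The CLASS name means nothing until a style sheet defines it, so here is a sketch of a whole page that does. The class name "green" comes from the example above; the rest of the text is filler. The page also uses DIV once so you can compare the two.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>SPAN and DIV</TITLE>
<!-- The style sheet is what actually makes class "green" green -->
<STYLE type="text/css">
.green { color: green; }
</STYLE>
</HEAD>
<BODY>
<!-- DIV wraps a whole block of the page -->
<DIV CLASS="green">
<P>This whole block would be green text.</P>
</DIV>
<!-- SPAN wraps just a few words inside a line -->
<P>Only <SPAN CLASS="green">these few words</SPAN> would be green here.</P>
</BODY>
</HTML>
```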
<TBODY> This command will surround a block of table cells so that you can affect just that section. Keep reading...
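<TFOOT> This will allow you to place a footer below the TBODY sections of a table. Notice the commands inside each section are ordinary TR rows, and each row still holds its TD cells. One quirk: HTML 4.01 wants the TFOOT written before the TBODY in the code, even though it displays at the bottom of the table. Here's an example for both TBODY and TFOOT:

<TABLE>
<TFOOT>
<TR><TD>The above cells...</TD></TR>
</TFOOT>
<TBODY bgcolor="--">
<TR><TD>text</TD></TR>
<TR><TD>text</TD></TR>
</TBODY>
</TABLE>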

<THEAD> This is header information for a group of cells -- used exactly the same way as the TFOOT above -- except it sits above the group of cells set apart by the TBODY command. In the code, THEAD comes first, then TFOOT, then TBODY. Like so:

<TABLE>
<THEAD>
<TR><TD>The following cells...</TD></TR>
</THEAD>
<TFOOT>
<TR><TD>The above cells...</TD></TR>
</TFOOT>
<TBODY bgcolor="--">
<TR><TD>text</TD></TR>
<TR><TD>text</TD></TR>
</TBODY>
</TABLE>

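<Q> The difference between the Q command and the BLOCKQUOTE command is that Q is meant for short quotations that sit inside a line of text, while BLOCKQUOTE sets a longer quotation apart as its own block. The Q command is also much easier to write.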
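
A quick sketch of the two side by side. The sentences being quoted are invented for the example. Q stays inside the line of text, and the specification says browsers should supply the quotation marks themselves, though not every browser does. BLOCKQUOTE stands on its own as a separate block.

```html
<!-- Q: a short quotation that stays inside the sentence -->
<P>My professor said <Q>watch the level of your writing</Q> and left.</P>

<!-- BLOCKQUOTE: a longer quotation set apart as its own block -->
<BLOCKQUOTE>
<P>No matter how cool your page is, if I can't display it,
I can't be impressed by it.</P>
</BLOCKQUOTE>
```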

Take Them For a Test-Drive

     Click below to go to a page that contains the commands listed above. You'll be able to see which ones work with your system and which don't.

Okay. Gimme the keys -- I'll try them out.


Some New Sub-Commands

     In my opinion, this is where HTML shines: The sub-commands. The sub-commands allow a simple table cell to have color and size. They allow an image to have text and set sizes. The sub-commands are where true HTML usage shines. And there are a few new ones to be concerned with in HTML 4.0. Here you go....

The Sub-Command What It Does
<CLASS> This is already in use in Explorer versions 3 and 4. First you set up a class with Style Sheet commands. (See my tutorial on Classes and IDs for how to do it). Then you call for the style sheet using the class command. Example:

<SPAN CLASS="purple">Affected text</SPAN>
<DIR> This was touched on above in the BDO command. The DIR sub-command denotes whether the text is to be read LTR (Left to Right) or RTL (Right to Left).
<ID> The ID can be used in the same manner as the CLASS sub-command above; however, in HTML 4.0 it is also being used to denote sections of the page. In short, it acts like a Page Jump. Example:

<A HREF="#sectionone">Jump to Section One</A>

The command above will jump to this:

<SPAN ID="sectionone">section One</SPAN>

This method is a little better than the page jump because it jumps to a section of text rather than just to a point on the page.
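
Here is a sketch of an ID jump next to the older named-anchor page jump so you can compare them. The section names are made up. The one rule to watch: the text after the # in the link has to match the ID (or NAME) exactly, and each ID may appear only once on a page.

```html
<P>
<A HREF="#sectionone">Jump to Section One</A> |
<A HREF="#sectiontwo">Jump to Section Two</A>
</P>

<!-- New way: the ID sits right on the element you are jumping to -->
<H2 ID="sectionone">Section One</H2>
<P>Text for section one...</P>

<!-- Older way: a named anchor marks the spot on the page -->
<H2><A NAME="sectiontwo">Section Two</A></H2>
<P>Text for section two...</P>
```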
<LANG> This is clever, because it helps the search engines understand different languages as being different languages rather than just misspelled English. First, an Example:

<SPAN LANG="es">Hola! Como esta?</SPAN>

Those of you who remember your high school Spanish know that phrase above loosely translates to "Hi, how ya doin'?"

Now, contrary to what you might be thinking, the LANG sub-command does not translate. You must still write the text in the native tongue. The LANG command just allows the search engines to recognize that section as Spanish text.

In case you're wondering, here are some other codes: ar (Arabic), de (German), el (Greek), fr (French), he (Hebrew), hi (Hindi), ja (Japanese), it (Italian), nl (Dutch), pt (Portuguese), ur (Urdu), ru (Russian), sa (Sanskrit), zh (Chinese).

Yes, there is also a code set aside if you wish to denote a language that doesn't really exist, like Pig-Latin or Klingon. Follow the same format as above except add x- before the name, like so: LANG="x-ubbee dubbie". The "x" means it's an experimental language.

<TITLE> This title command works the same way as the ALT command in an IMG command. It allows you to place a title onto just about anything so that when the mouse remains stationary for a second, a text box pops up. Example:

<SPAN TITLE="National Football League">NFL</SPAN>

Now, every time someone places their mouse on that set of initials, the box will pop up saying "National Football League." It can be very helpful.

Okay, give these a whirl...

See the Sub-Commands in Action


Deprecated Elements

     These are commands that are still good, but there are better ways of getting the effect.

The Command Use This Instead
<APPLET> The <OBJECT> tag
<BASEFONT> Style Sheet Commands
<CENTER> The ALIGN="center" sub-command or Style Sheet Commands
<DIR> Create lists through the <UL> command
<FONT> Style Sheet Commands
<ISINDEX> Create various <INPUT> commands to create the text box ISINDEX creates
<MENU> Create lists through the <UL> command
<S> Create strike-through text using Style Sheet Commands
<STRIKE> Create strike-through text using Style Sheet Commands
<U> Create underlined text using Style Sheet Commands
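
     To make "use Style Sheet Commands instead" a little more concrete, here is a sketch showing the same headline done both ways. The class name "bignews" and the headline text are made up. The page uses the Transitional declaration because that is the flavor of HTML 4.01 that still allows the deprecated commands.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<TITLE>Old Way, New Way</TITLE>
<STYLE type="text/css">
/* One class does the work of CENTER, FONT, and U together */
.bignews { text-align: center; color: red; text-decoration: underline; }
</STYLE>
</HEAD>
<BODY>
<!-- Deprecated way: three presentational commands wrapped around the text -->
<CENTER><FONT COLOR="red"><U>Big News</U></FONT></CENTER>

<!-- Style sheet way: the markup stays clean, the look lives in the class -->
<P CLASS="bignews">Big News</P>
</BODY>
</HTML>
```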


Dead Elements

     In with the good, out with the bad. These three puppies are gone for good.

~~R.I.P~~ Now What?
<LISTING> Use <PRE> instead
<PLAINTEXT> Use <PRE> instead
<XMP> Use <PRE> instead
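
     One catch when you switch: the old XMP command displayed tags as plain text, and PRE does not. A browser still reads any tags it finds inside PRE as commands. If you want the tags themselves to show on the page, a sketch like this should do it, with the character codes &lt; and &gt; standing in for the angle brackets.

```html
<!-- PRE keeps the spacing and line breaks; the character codes
     make the brackets display instead of acting as tags -->
<PRE>
&lt;B&gt;This line shows its tags&lt;/B&gt;
    and this line keeps its indent.
</PRE>
```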


That's That

     Now you know far more about HTML 4.0 than I'm sure you cared to know. If I were to put it all into a few simple sentences, I would say that this is not yet something to get all excited or nervous over. It will probably be a while before these commands are all in common use.

[These commands and sub-commands are now commonplace.  It would serve you well to avoid the deprecated commands, and if you use the dead ones, they will most probably be ignored -- though some browsers may actually throw an error of some sort when these tags are encountered.  There is nothing particularly complicated about using the new tags, so go ahead -- dive in!]

     Let me again say that in any Web page design class, you should be taught the value of understatement. Just because you have all these fancy commands doesn't mean you have to use them.
     Think of your audience. If they are real Web-heads that simply go gah-gah over the newest stuff and have the latest browsers and all the stunning plug-ins, then maybe you should get involved with this. If your audience is a mass of people with greatly varying browsers and platforms, then maybe you should stay at a lower level. No matter how cool your page is, if I can't display it, I can't be impressed by it.

     There. I've now put my soapbox away.

[And again -- most modern browsers fully support HTML 4.01 -- if there is a browser out there that has a problem with it, it is most probably a very obscure browser, and is highly unlikely to be used by any significant portion of your audience.]

 

Enjoy!

 
