Index or Default?

By Vince Barnes

What in the URL is an index for?

To visit a website you simply type in its address, www.htmlgoodies.com for example, and off you go.  The next process, which is hidden from your eye, uses the Domain Name System (DNS) to resolve that name to a server address.  Your browser then formats a request for the specified site, sends it off to the server address obtained, and waits for a response.  The details of what happens at the server end when this request comes in are often overlooked, but those details reveal a few possible tricks that can be very useful to a web designer.  We'll take a look at them now.

What you typed into the browser is called a Uniform Resource Locator, or URL.  The example above is actually an abbreviation for a URL.  The first part of the URL is the protocol.  Our browser (that is, most modern browsers) assumes that if we don't specify anything else, we mean the web protocol, which is the Hypertext Transfer Protocol and is designated by "HTTP://" (without the quotes; it doesn't need capitals).
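As a rough illustration of the point above, here is a minimal Python sketch of how a bare address might be expanded into its parts.  The helper name split_address is made up for this example, and real browsers apply many more rules than this; it only shows the "assume HTTP if nothing else is given" idea and that capitals in the protocol don't matter.

```python
from urllib.parse import urlsplit


def split_address(typed):
    """Split a typed address into (protocol, site name, path).

    Hypothetical helper for illustration only: if no protocol was
    typed, we assume HTTP, as described in the text above.
    """
    if "://" not in typed:
        typed = "http://" + typed
    parts = urlsplit(typed)
    # urlsplit lowercases the protocol, and .hostname lowercases the name.
    # An empty path means "no page was asked for", written here as "/".
    return parts.scheme, parts.hostname, parts.path or "/"


print(split_address("www.htmlgoodies.com"))
print(split_address("HTTP://www.htmlgoodies.com/letters"))
```

Both calls report the protocol as "http", whether it was left out or typed in capitals.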

The next part of the URL is the name of the site itself.  It is actually made up of several discrete components too, but that is a subject for discussion under a DNS heading.  For now, all we need to know is that it points to the server that hosts the named site.  When the request is sent to that server (remember, DNS returns us its actual, numerical address) the name of the requested site is included in the "request headers" -- a part of the request.  Sometimes the numerical address of the server is all that is needed for the server to identify which site is being asked for (a server can have multiple addresses and could use one of them to uniquely identify the site) but most of the time it will read the request header to find out which site you are asking for.
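To see where the site name travels, here is a small Python sketch of a request and of a server reading the name back out of it.  In HTTP/1.1 the site name is carried in the Host header.  The function names build_request and requested_site are invented for this example, and a real server does a good deal more checking than this.

```python
def build_request(host, path="/"):
    """Format a very plain HTTP/1.1 request for the given site and path."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"      # the name of the requested site
        "Connection: close\r\n"
        "\r\n"                   # a blank line ends the headers
    )


def requested_site(raw_request):
    """Return the site name from the Host header, or None if it is absent."""
    # Skip the first line (the request line) and read headers until the blank line.
    for line in raw_request.split("\r\n")[1:]:
        if line == "":
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower()
    return None


request = build_request("www.htmlgoodies.com")
print(request)
print(requested_site(request))
```

A server that hosts several sites on one address can use a lookup like requested_site to decide which site's files to serve.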

Having identified the requested site, the server now looks at its configuration details and finds out which directory (a.k.a. folder) contains the site.  It then goes to that directory and selects...

The trouble is, we never told it which page we wanted from the site.  Since we left out that information, it has to make an assumption, and so it picks the "default page" to be delivered.

In the days of the web's infancy, it was thought that if you pointed a browser to a particular directory, it would be handy to get back a list of the files that are available to you in that directory.  This technique, called "Directory Browsing", while useful for certain types of things, usually creates a variety of security issues.  Additionally, it would be useful to know not only which files are available, but also something about what each contains.  Thus, there would be a page that is an index to the other pages that are to be made available to you in the directory.  What better name for such a page than "index.html"?  This became the first default page.  Having a default allows us to request www.htmlgoodies.com rather than having to specify www.htmlgoodies.com/index.html.  This default page, since it is the first thing you home in on (and it's the place you always want to return to!) is called the home page.

Think a little further about this and how it applies to directories contained within the website directory.  Inside the main HTML Goodies directory is a directory called "letters".  It is used to hold the Goodies To Go newsletter archive.  (By the way, if you haven't subscribed, take a look over there in the left margin; put in your email address and click subscribe -- we never distribute your address to others and you'll enjoy up-to-the-minute news and tips every week!)  To get to the archive you only need the directory name, www.htmlgoodies.com/letters, and not the page name index.html, which will be assumed by the server.  Less typing, less to remember; always a good thing!
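Here is a minimal Python sketch of that assumption at work.  The names DEFAULT_PAGE, SITE_DIRECTORIES and file_for_path are invented for this example, and the set of directories simply stands in for whatever the server finds on its disk; a real server looks at the actual file system.

```python
import posixpath

DEFAULT_PAGE = "index.html"

# Hypothetical directory layout: the site's main directory and "letters".
SITE_DIRECTORIES = frozenset({"/", "/letters"})


def file_for_path(url_path, directories=SITE_DIRECTORIES):
    """Return the file the server would deliver for a requested path."""
    # Treat "/letters" and "/letters/" alike; keep the bare "/" as it is.
    path = url_path.rstrip("/") or "/"
    if path in directories:
        # No page was named, so the default page is assumed.
        return posixpath.join(path, DEFAULT_PAGE)
    # A page was named explicitly, so deliver exactly that.
    return path


print(file_for_path("/"))
print(file_for_path("/letters"))
print(file_for_path("/letters/index.html"))
```

The first two calls have the default page name added for them; the third names its page and is left alone.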

Do you see the relationship between the URL and the directory structure in the site's main directory?  It's an important relationship to be thinking about when it comes to the usability of your site.  If you want to provide different sections of the site that your visitors can go to without having to go through the site's home page, it will help them if you put each section in its own directory.  That way there'll be less for them to remember and to type.

In "ancient history" (the days of Windows NT 3.51) filenames on Microsoft-based systems were stuck in the old DOS naming method, which used eight characters and a three-character suffix (called 8.3 naming).  Obviously, index.html wasn't going to work!  With perhaps a wish to be more leaders than followers, the good folks at Microsoft decided that their default page would, by default, be called "default.htm".  New OS releases eliminated that old naming restriction, but the default default page name remained default.htm (I hope this isn't getting confusing - I promise it's not default of de writer!)

In order to make moving a site from one server to another as easy as possible, and bearing in mind that one server may be running Apache with default pages named index.html while the other runs IIS with default.htm, it was necessary to allow a list of default page names to be specified.  Having a list of defaults necessarily requires that they have a priority: look for this one; if it's not there, look for the next one, and so on.  Most web servers these days allow for such a list.
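The priority rule can be sketched in a few lines of Python.  The function name pick_default_page is made up for this example, and the set of file names stands in for the contents of a directory; a real server also has its own rules about things like upper and lower case in names.

```python
def pick_default_page(files_in_directory, default_names):
    """Return the first name in the default list that exists, or None."""
    for name in default_names:  # the list is searched in priority order
        if name in files_in_directory:
            return name
    return None  # nothing on the list was found in this directory


# A site written for IIS, moved to a server that lists index.html first.
names = ["index.html", "default.htm"]
print(pick_default_page({"default.htm", "logo.gif"}, names))
print(pick_default_page({"index.html", "default.htm"}, names))
```

In the first call index.html is missing, so the search moves on to default.htm; in the second, both exist and the earlier name in the list wins.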

As a web designer, you can take advantage of this list to make your life much easier!  (You will need to know what the list is, however, so if you're not sure, check with your web host.)

Suppose you have an active website sitting up on your hosting service and you want to replace it with a new one with no disruption in service.  Not only can you do this easily by taking advantage of the default page name list, but you can have both versions residing in the same folders simultaneously, allowing you and certain insiders to check out the new stuff while your visitors are still delivered the older pages.

Let's assume that the site has been written such that all hyperlinks back to the home page use only the site's name and not the page name of the default page.  Let's also say for now that the default page name list for the site (on some servers, including IIS, it can be different for different folders) is index.html, default.html, default.asp.  If we rename the site's home page to (or verify that its existing name is) index.html, we can put up the new one with the name default.html.  To see the new page we would include the page name in the URL we ask for.  Visitors not using a page name will still get the old home page since its name is the first name in the default page name list.  All the links from our new home page will take us where they are supposed to, but backward links will take us to their old counterparts (as we would expect; we can override the effect by typing the page names we want).  When we are satisfied with the new pages, all we have to do is rename the old home page, for example to oldindex.html, and the new home page is live!  (Since the first name in the default page name list is not found, the server will look for the second name in the list, find our new home page and deliver that.)
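The whole switchover can be walked through in a short Python sketch.  Everything here is a stand-in: the dictionary plays the part of the site's folder, served_page is an invented name, and the default page name list is the one assumed in the text above.

```python
DEFAULT_NAMES = ["index.html", "default.html", "default.asp"]


def served_page(files, requested=None):
    """Return the name of the page delivered, or None if nothing matches."""
    if requested is not None:
        # A page name was typed, so the default list is not consulted.
        return requested if requested in files else None
    for name in DEFAULT_NAMES:  # otherwise search the list in order
        if name in files:
            return name
    return None


# Old and new home pages sit side by side in the same folder.
files = {"index.html": "old home page", "default.html": "new home page"}

before = files[served_page(files)]                   # what visitors get
preview = files[served_page(files, "default.html")]  # what insiders can check

# Go live: rename the old home page out of the way.
files["oldindex.html"] = files.pop("index.html")
after = files[served_page(files)]

print(before, "|", preview, "|", after)
```

Before the rename, visitors get the old page while the new one can still be asked for by name; after it, the same bare request delivers the new page.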

With a little careful naming of pages and directories, we can take advantage of the default page name list, and of default pages in general, to make navigation of our site a lot more comfortable for our visitors, while making our own lives a little easier at the same time.  Now that's a good thing!


