Tuesday, March 19, 2024

Index or Default?

What in the URL is an index for?

To visit a website you simply type in its
address, www.htmlgoodies.com for example, and off you go.  The next
step, which is hidden from view, uses the Domain Name System (DNS) to
resolve that name to a server address.  Your browser then formats a request
for the specified site, sends it off to the server address obtained, and waits
for a response.  The details of what happens at the server end when this
request comes in are often overlooked, but those details reveal a few
tricks that can be very useful to a web designer.  We’ll take a look at
them now.

What you typed into the browser is called a Uniform Resource Locator, or
URL.  The example above is actually an abbreviation for a URL.  The
first part of the URL is the protocol.  Most modern browsers assume
that if we don’t specify anything else, we mean the web
protocol, which is the Hypertext Transfer Protocol and is designated by
"HTTP://" (without the quotes; capitals are not required).
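As a rough illustration of that abbreviation, here is a minimal Python sketch of how a typed address might be expanded into a full URL.  The function name expand_url is made up for this example, and real browsers do considerably more (this sketch ignores query strings, for instance).

```python
from urllib.parse import urlsplit


def expand_url(typed: str, default_scheme: str = "http") -> str:
    """Return a full URL, adding the protocol and a root path if missing.

    Hypothetical helper for illustration only; query strings and
    fragments are ignored in this sketch.
    """
    # No protocol given: assume the web protocol, as the browser does.
    if "://" not in typed:
        typed = f"{default_scheme}://{typed}"
    parts = urlsplit(typed)
    # The protocol name is not case-sensitive, so HTTP:// and http:// match.
    scheme = parts.scheme.lower()
    # No page or directory given: ask for the top of the site.
    path = parts.path or "/"
    return f"{scheme}://{parts.netloc}{path}"


print(expand_url("www.htmlgoodies.com"))
print(expand_url("HTTP://www.htmlgoodies.com/letters"))
```

The first call fills in both the protocol and the root path; the second only lowers the case of the protocol.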

The next part of the URL is the name of the site itself.  It is actually
made up of several discrete components too, but that is a subject for
discussion under a DNS heading.  For now, all we need to know is that it
points to the server that hosts the named site.  When the request is sent
to that server (remember, DNS gave us its actual, numerical address) the name
of the requested site is included in the "request headers", a part of the
request.  Sometimes the numerical address of the server is all that is
needed for the server to identify which site is being asked for (a server can
have multiple addresses and could use one of them to uniquely identify the site)
but most of the time it will read the request header to find out which site you
are asking for.
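To make the request header idea concrete, the following Python sketch builds the text of a very small request and then reads the site name back out of it, roughly as a server hosting several sites might.  The helper names, the SITES table and the directory path in it are all assumptions invented for this example, not any real server's configuration.

```python
from typing import Optional

# Hypothetical server configuration: site name -> directory holding the site.
SITES = {"www.htmlgoodies.com": "/var/www/htmlgoodies"}


def build_request(host: str, path: str = "/") -> str:
    """Build the text of a minimal HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"      # the name of the site we are asking for
        "Connection: close\r\n"
        "\r\n"                   # a blank line ends the headers
    )


def requested_site(request: str) -> Optional[str]:
    """Return the site named in the Host header, or None if there isn't one."""
    for line in request.split("\r\n")[1:]:   # skip the request line
        if not line:
            break                            # reached the end of the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower()
    return None


request = build_request("www.htmlgoodies.com", "/letters")
site = requested_site(request)
print(site, "->", SITES.get(site))
```

Only the Host line tells the server which of its sites we mean; everything else in the request would look the same for any site on that server.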

Having identified the requested site, the server now looks at its
configuration details and finds out which directory (a.k.a. folder) contains the
site.  It then goes to that directory and selects...

The trouble is, we never told it which page we wanted from the site. 
Since we left out that information, it has to make an assumption, and so it
picks the "default page" to be delivered.

In the days of the web’s infancy, it was thought that if you pointed a
browser to a particular directory, it would be handy to be returned a list of
the files that are available to you in that directory.  This technique,
called "Directory Browsing", while useful for certain types of things, usually
creates a variety of security issues.  Additionally, it would be useful to
not only know which files are available, but also something about what each
contains.  Thus, there would be a page that is an index to the other pages
that are to be made available to you in the directory.  What better name
for such a page than "index.html"?  This became the first default page. 
Having a default allows us to request www.htmlgoodies.com rather than having to
specify www.htmlgoodies.com/index.html.  This default page, since it is the
first thing you home in on (and it’s the place you always want to return to!), is
called the home page.

Think a little further about this and how it applies to directories contained
within the website directory.  Inside the main HTML Goodies directory is a
directory called "letters".  It is used to hold the Goodies To Go
newsletter archive.  (By the way, if you haven’t subscribed, take a look
over there in the left margin; put in your email address and click subscribe —
we never distribute your address to others and you’ll enjoy up to the minute
news and tips every week!)  To get to the archive you only need the
directory name: www.htmlgoodies.com/letters  and not the page name
index.html which will be assumed by the server.  Less typing, less to
remember; always a good thing!
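The way a server fills in the missing page name can be sketched in a few lines of Python.  This is only an illustration under simple assumptions: a single default page name, a made-up function called resolve, and a temporary directory standing in for the site.  A real server would also refuse paths that try to climb out of the site directory.

```python
import tempfile
from pathlib import Path

DEFAULT_PAGE = "index.html"   # assumed default page name for this sketch


def resolve(site_root: Path, url_path: str) -> Path:
    """Map the path part of a URL to a file inside the site's directory."""
    target = site_root / url_path.lstrip("/")
    # The URL named a directory, not a page: assume the default page.
    if target.is_dir():
        target = target / DEFAULT_PAGE
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "letters").mkdir()
    (root / "index.html").write_text("home page")
    (root / "letters" / "index.html").write_text("newsletter archive")

    for url_path in ("/", "/letters", "/letters/index.html"):
        print(url_path, "->", resolve(root, url_path).relative_to(root).as_posix())
```

Both "/letters" and "/letters/index.html" end up at the same file, which is why the shorter address is enough.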

Do you see the relationship between the URL and the directory structure in
the site’s main directory?  It’s an important relationship to be thinking
about when it comes to the usability of your site.  If you want to provide
different sections of the site that your visitors can go to without having to go
through the site’s home page, it will help them if you put each section in its
own directory.  That way there’ll be less for them to remember and to type.

In "ancient history" (the days of Windows NT 3.51) filenames on
Microsoft-based systems were stuck with the old DOS naming method, which used
eight characters and a three-character suffix (called 8.3 naming).  Obviously,
index.html wasn’t going to work!  Perhaps wishing to be leaders rather
than followers, the good folks at Microsoft decided that their default page
would, by default, be called "default.htm".  New OS releases eliminated
that old naming restriction, but the default default page name remained
default.htm (I hope this isn’t getting confusing – I promise it’s not default of
de writer!)

In order to make moving a site from one server to
another as easy as possible, and bearing in mind that one server may be
running Apache with default pages named index.html while the other runs IIS with
default.htm, it was necessary to allow a list of default page names to be specified.
Having a list of defaults requires that they have a priority: look
for this one; if it’s not there, look for the next one, and so on.  Most web
servers these days allow for such a list.
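The priority rule is simple enough to sketch in Python.  The function name pick_default and the file names in the two sample directories are invented for this example; the point is only that the first name in the list that is actually present wins.

```python
from typing import Iterable, Optional, Sequence


def pick_default(files_present: Iterable[str],
                 default_names: Sequence[str]) -> Optional[str]:
    """Return the first default page name that exists, in priority order."""
    present = set(files_present)
    for name in default_names:
        if name in present:
            return name
    return None   # no default page at all; the server must decide what to do


# One list that suits both naming habits (an assumed example list).
default_names = ["index.html", "default.htm"]

site_from_apache = ["index.html", "about.html"]
site_from_iis = ["default.htm", "about.html"]

print(pick_default(site_from_apache, default_names))
print(pick_default(site_from_iis, default_names))
print(pick_default(["about.html"], default_names))
```

With both names in the list, a site can move between the two servers without any page being renamed.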

As a web designer, you can take advantage of this list to make your life much
easier!  (You will need to know what the list is, however, so if you’re not
sure, check with your web host.)

Suppose you have an active website sitting up on your hosting service and you
want to replace it with a new one with no disruption in service.  Not only
can you do this easily by taking advantage of the default page name list, but
you can have both versions residing in the same folders simultaneously, allowing
you and certain insiders to check out the new stuff while your visitors are
still delivered the older pages.

Let’s assume that the site has been written such that all hyperlinks back to
the home page use only the site’s name and not the page name of the default
page.  Let’s also say for now that the default page name list for the site
(on some servers, including IIS,  it can be different for different
folders) is index.html, default.html, default.asp.  If we rename the site’s
home page to (or verify that its existing name is) index.html, we can put up the
new one with the name default.html.  To see the new page we would include
the page name in the URL we ask for.  Visitors not using a page name will
still get the old home page since its name is the first name in the default page
name list.  All the links from our new home page will take us where they
are supposed to, but backward links will take us to their old counterparts (we
expect that, and can override the effect by typing the page names we want.)
When we are satisfied with the new pages, all we have to do is rename the old
home page, for example to oldindex.html, and the new home page is live! 
(Since the first name in the default page name list is not found, the server
will look for the second name in the list, find our new home page and deliver
that.)
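The whole switch-over can be rehearsed in a short Python sketch.  It assumes the default page name list used in the example above and uses a temporary directory in place of a real hosting account; the function names home_page and go_live are made up for the illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional

# The default page name list from the example above, in priority order.
DEFAULT_NAMES = ["index.html", "default.html", "default.asp"]


def home_page(site_root: Path) -> Optional[str]:
    """Return the page a visitor gets when no page name is in the URL."""
    for name in DEFAULT_NAMES:
        if (site_root / name).is_file():
            return name
    return None


def go_live(site_root: Path) -> None:
    """Retire the old home page so the next name in the list takes over."""
    (site_root / "index.html").rename(site_root / "oldindex.html")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.html").write_text("old home page")
    (root / "default.html").write_text("new home page")

    print("before:", home_page(root))   # visitors still see the old page
    go_live(root)
    print("after: ", home_page(root))   # the new page is now the default
```

Nothing is copied or uploaded at the moment of the switch; a single rename changes which page visitors receive.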

With a little careful naming of pages and directories, we can take advantage
of the default page name list, and default pages in general to make navigation
of our site a lot more comfortable for our visitors, while making our own lives
a little easier at the same time.  Now that’s a good thing!
