HTML Goodies: Script Tip: Week 80

By Joe Burns

 

Around the Script in 80 Tips...

     Last week we hit upon the doit() function. That function is actually sitting inside of a larger function named DelHTML(). You can probably guess that the name is shorthand for "Delete HTML". Today we'll figure out how it works and wrap this script up.


The Script's Effect


Here's the Code


     As always, it's easier to understand these functions and scripts if you understand the overriding reason for their existence. What does the author want this function to do?

     The purpose of the script is to eliminate all of the HTML tags and return just the text. That's not that hard of a job because we know that every tag begins with < and ends with >. So we need to create a function that will look for < and > and return to us everything except those characters and what is inside them.

     The process must take place time and time again in order to eliminate all tags. First you eliminate the first HTML tag, then return all the leftover text and work on it, eliminating the next tag. Then you return all of the remaining text and work on that, eliminating the next tag. It goes on until all the tags are gone.
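     The repeat-until-done idea can be sketched as a plain loop. This is only an illustration, not the author's code: the name stripTagsLoop is made up for this example, and unlike the script discussed here it looks for the > only after the < it just found.

```javascript
// Hypothetical helper, not part of the original script.
// Removes one tag per trip through the loop until no "<" is left.
function stripTagsLoop(text) {
  var open = text.indexOf("<");
  while (open !== -1) {
    // Look for the closing bracket starting at the opening one.
    var close = text.indexOf(">", open);
    if (close === -1) {
      close = open; // no ">" left: drop just the "<"
    }
    // Keep what is before the "<" and what is after the ">".
    text = text.substring(0, open) + text.substring(close + 1);
    open = text.indexOf("<");
  }
  return text;
}

var stripped = stripTagsLoop("<b>Hello</b> <i>World</i>"); // "Hello World"
```

     Each trip through the loop makes the text shorter, so the loop always ends.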

     You'll need a couple of extra lines of code that recognize a few abnormalities in HTML:

  • What if the first text is not a tag?
  • What if there's no >?
  • What if there are no tags?

     All of that is handled in the DelHTML() function:

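function DelHTML(HTMLWord)
{
    // Positions of the first "<" and the first ">" (-1 if not found)
    a = HTMLWord.indexOf("<");
    b = HTMLWord.indexOf(">");

    HTMLlen = HTMLWord.length;

    // Everything before the first "<"
    c = HTMLWord.substring(0, a);

    // No ">" at all: fall back to the position of the "<"
    if(b == -1)
        b = a;

    // Everything after the ">"
    d = HTMLWord.substring((b + 1), HTMLlen);

    Word = c + d;

    // Any "<" left? Then run the leftover text through again
    tmp = Word.indexOf("<");

    if(tmp != -1)
        Word = DelHTML(Word);

    return Word;
}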
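
     To see the three abnormalities in action, here is a small test run. The copy of the function below adds "var" to each variable, which is an assumption of this sketch and not in the original, so the example stands on its own. The variable names notATagFirst, noClosing, and noTags are made up for the example.

```javascript
// Copy of DelHTML() from the article. "var" is added here (not in the
// original) so the variables stay local to each call.
function DelHTML(HTMLWord) {
  var a = HTMLWord.indexOf("<");
  var b = HTMLWord.indexOf(">");
  var HTMLlen = HTMLWord.length;
  var c = HTMLWord.substring(0, a);
  if (b == -1) b = a;
  var d = HTMLWord.substring(b + 1, HTMLlen);
  var Word = c + d;
  var tmp = Word.indexOf("<");
  if (tmp != -1) Word = DelHTML(Word);
  return Word;
}

// The first text is not a tag.
var notATagFirst = DelHTML("Hello <b>World</b>"); // "Hello World"

// There is a "<" but no ">": only the "<" itself is dropped.
var noClosing = DelHTML("2 < 3"); // "2  3" (two spaces)

// There are no tags at all: the text comes back unchanged.
var noTags = DelHTML("plain text"); // "plain text"
```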

     To begin with, the function is only triggered to run when the doit() function is enacted. The function acts upon the text sent to it by the doit() function. Again, there is an order to this script that you must keep.

     The DelHTML() function begins by setting the text sent to it to a new variable name, "HTMLWord". It does that by listing that variable name as the parameter, inside the parentheses, so the text from the first textarea box lands in it when it arrives. So now we're no longer working with "Input". It's the same text; we're just calling it "HTMLWord".

     We're looking for HTML tags, so we need to locate the < and >. Their positions are assigned to the variable names "a" and "b" respectively. Notice each variable asks for the indexOf() < and >. That method returns the position of the first match only, so one pass finds just one tag. The script will need to run again for every tag until there aren't any left.
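
     A quick look at what indexOf() hands back may help here. The sample strings and variable names below are made up for illustration.

```javascript
var sample = "<b>Hi</b> there";

// indexOf() reports the position of the FIRST match only, counting from 0.
var firstOpen = sample.indexOf("<");  // 0
var firstClose = sample.indexOf(">"); // 2

// The second "<" (at position 5) is ignored until the text is searched again.

// When the character is not in the text at all, the answer is -1.
var missing = "no tags here".indexOf("<"); // -1
```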

     The variable HTMLlen is assigned to the numeric length of the entire HTMLWord text.

     The next line handles the case where the first character is not a <. The variable "c" is assigned the substring from the start of the text (position 0) up to "a", the position of the first < character. If the text starts with a <, "c" is simply empty.

     Maybe the first time the function runs, the first character is a <. Maybe the second time it runs it isn't. Remember, the second time the function runs, that first HTML tag has been eliminated.

     Next we ask if there are any > characters. When indexOf() can't find what it's looking for, it returns -1. It can't use zero for that, because JavaScript counts starting from zero, so zero means the very first character. By asking if "b" (the position of the > character) is equal to -1, we test if one exists. If not, "b" is set to "a", so only the stray < is dropped and everything after it is kept. Hopefully that never comes into play, but if it does, we're ready for it.

     Next, the variable "d" is assigned everything after the > to the end of the text. It's done by requesting the substring that starts at "b" plus one (so the > isn't included in the return) and runs through the end of the text (HTMLlen).

     Now a new variable is created, "Word". Word is equal to "c" plus "d". The variable "c" was everything before the first <. The variable "d" is everything after the first >. Starting to see how it works?
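
     Here is one pass worked by hand, using the same variable names as the function. The sample text is made up, and "var" is added so the snippet runs on its own.

```javascript
var HTMLWord = "Hi <b>there</b>";

var a = HTMLWord.indexOf("<"); // 3
var b = HTMLWord.indexOf(">"); // 5

// Everything before the first "<"
var c = HTMLWord.substring(0, a); // "Hi "

// Everything after the first ">"
var d = HTMLWord.substring(b + 1, HTMLWord.length); // "there</b>"

// One tag is gone; the closing tag is still waiting for the next pass.
var Word = c + d; // "Hi there</b>"
```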

     Now we have to test the remaining text again for another <. The variable "tmp" is set to the position of the first < in the text. An If statement asks if a < exists. It's done by asking if "tmp" is not equal (!=) to -1, the value indexOf() returns when nothing is found. It seems a little backwards I know, but it works.

     If "tmp" is not equal to -1, meaning a < exists, the text represented by "Word" is sent back up to DelHTML() where it is reassigned the name "HTMLWord" and the process starts all over again.

     If "tmp" is equal to -1, no < is left in the text. The function doesn't call itself again, and the text represented by "Word" is returned to be displayed in the second textarea box.
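
     To watch the function call itself, here is a traced variant. The name traceDelHTML and the "passes" array are made up for this sketch and are not in the original script. The logic follows DelHTML(), with the leftover text saved after every pass.

```javascript
// Hypothetical traced variant of DelHTML(). "passes" collects the text
// that is left over after each pass.
function traceDelHTML(HTMLWord, passes) {
  var a = HTMLWord.indexOf("<");
  var b = HTMLWord.indexOf(">");
  if (b == -1) b = a;
  var Word = HTMLWord.substring(0, a) +
             HTMLWord.substring(b + 1, HTMLWord.length);
  passes.push(Word);
  // Another "<" left? Send the leftover text through again.
  if (Word.indexOf("<") != -1) Word = traceDelHTML(Word, passes);
  return Word;
}

var passes = [];
var result = traceDelHTML("<p>One <b>two</b></p>", passes);

// passes now holds, in order:
//   "One <b>two</b></p>"
//   "One two</b></p>"
//   "One two</p>"
//   "One two"
// and result is "One two".
```

     Four tags, four passes. The last pass finds no < in its result, so it returns without calling itself again.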

     Believe it or not, that's it.

     It's a little involved, but still a very clever script. You'll use it more than you think right now.

     Next time, we'll get into a script that posts colored text to a page. It's a quick, easy way to create colorful looks on your Web pages. The JavaScript is quite involved and gets into some commands we haven't touched on yet. It's a fun script. We'll get to it next.

Next Week: Colored Text

     Do YOU have a Script Tip you'd like to share? How about suggesting a Script Tip to write about? I'd love to hear it. Write me at: jburns@htmlgoodies.com.


Learn to write your own JavaScripts with the HTML Goodies 30-Step Primer Series

