Filter DOM Nodes Using a TreeWalker

One of the most important DOM operations is tree traversal. It is complicated by the wide range of possible node types: there are Text, Element, and Comment nodes, as well as special nodes such as ProcessingInstruction or DocumentType. Most of them won’t have any childNodes, and some carry only a single piece of information. For instance, a Comment node only carries the specified comment string. That’s where the TreeWalker API comes in. Its job is to filter out and iterate through the nodes we want from a DOM tree. In fact, there are two such APIs: NodeIterator and TreeWalker. They’re similar in many ways, but with some notable differences. In today’s tutorial, we’ll learn how to use the TreeWalker, while the NodeIterator will be the subject of the next article.

Instantiating a TreeWalker

The TreeWalker object is created by calling the document.createTreeWalker() factory method; it cannot be constructed directly. The method has historically accepted four parameters, although the current DOM standard defines only the first three. It greatly simplifies tasks that would otherwise take considerably more code using conventional traversal. The syntax of document.createTreeWalker() is as follows:

document.createTreeWalker(root, nodesToShow, filter, entityExpandBol)

Here’s a brief description of each of the four parameters:

root: The node at which the traversal starts. The TreeWalker never moves above this node.

nodesToShow: A bitmask specifying the types of nodes that should be visited by the TreeWalker. It may be one of the following constants, or several of them combined with the bitwise OR operator (|):

  NodeFilter.SHOW_ALL
  NodeFilter.SHOW_ENTITY_REFERENCE
  NodeFilter.SHOW_DOCUMENT_TYPE
 *NodeFilter.SHOW_ELEMENT
  NodeFilter.SHOW_ENTITY
  NodeFilter.SHOW_DOCUMENT_FRAGMENT
  NodeFilter.SHOW_ATTRIBUTE
  NodeFilter.SHOW_PROCESSING_INSTRUCTION
  NodeFilter.SHOW_NOTATION
  NodeFilter.SHOW_TEXT
  NodeFilter.SHOW_COMMENT
  NodeFilter.SHOW_CDATA_SECTION
  NodeFilter.SHOW_DOCUMENT

  *selects all element nodes
  Note: SHOW_ENTITY_REFERENCE, SHOW_ENTITY, and SHOW_NOTATION are legacy constants that no longer match any nodes in current browsers.

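filter (or null): A custom function, or an object with an acceptNode() method, that is called for each candidate node and returns NodeFilter.FILTER_ACCEPT, NodeFilter.FILTER_REJECT, or NodeFilter.FILTER_SKIP. Pass null for no additional filtering.
entityExpandBol: Boolean parameter specifying whether entity references should be expanded. It is deprecated: the current DOM standard no longer defines it and modern browsers ignore it, so it can safely be omitted.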
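
Because nodesToShow is a bitmask, several node types can be requested in one traversal. The following is a minimal sketch of that idea. The names SHOW, elementsAndComments, and masksType() are illustrative, and the fallback object (holding the numeric values from the DOM standard) is included only so that the snippet can also run outside a browser.

```javascript
// In a browser, NodeFilter is a global. The fallback values below are the
// numeric constants from the DOM standard, included only so that this
// sketch can also run where NodeFilter is not defined.
const SHOW = typeof NodeFilter !== "undefined"
  ? NodeFilter
  : { SHOW_ELEMENT: 0x1, SHOW_TEXT: 0x4, SHOW_COMMENT: 0x80 };

// Request element nodes and comment nodes in a single traversal
const elementsAndComments = SHOW.SHOW_ELEMENT | SHOW.SHOW_COMMENT;

// Hypothetical helper: does a mask include a given node type flag?
function masksType(mask, flag) {
  return (mask & flag) !== 0;
}

if (typeof document !== "undefined") {
  // Third argument omitted: no custom filter
  const mixedWalker = document.createTreeWalker(
    document.documentElement,
    elementsAndComments
  );
  while (mixedWalker.nextNode()) {
    // Logs names such as "DIV" for elements and "#comment" for comments
    console.log(mixedWalker.currentNode.nodeName);
  }
}
```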

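Using the above NodeFilter constants together with a filter function, we can select all nodes in the document that are of a certain element type and carry a particular attribute. The constants narrow the traversal down to a node type, while the filter function checks the tag name and the attribute.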
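
As a sketch of how such a selection might look, the hypothetical helper makeTagAttributeFilter() below builds a filter function for a given tag name and attribute. The choice of A elements with a title attribute is only an example, and the comparison against an upper-case tag name assumes an HTML document.

```javascript
// NodeFilter is a browser global; the numeric fallbacks (from the DOM
// standard) only allow this sketch to run outside a browser.
const NF = typeof NodeFilter !== "undefined"
  ? NodeFilter
  : { SHOW_ELEMENT: 0x1, FILTER_ACCEPT: 1, FILTER_SKIP: 3 };

// Hypothetical helper: builds a filter that accepts only elements with the
// given tag name that also carry the given attribute.
function makeTagAttributeFilter(tagName, attributeName) {
  // Assumption: HTML documents report tag names in upper case
  const wanted = tagName.toUpperCase();
  return function (node) {
    // FILTER_SKIP passes over a node but still visits its descendants
    return node.tagName === wanted && node.hasAttribute(attributeName)
      ? NF.FILTER_ACCEPT
      : NF.FILTER_SKIP;
  };
}

if (typeof document !== "undefined") {
  const linkWalker = document.createTreeWalker(
    document.documentElement,
    NF.SHOW_ELEMENT,
    makeTagAttributeFilter("a", "title")
  );
  while (linkWalker.nextNode()) {
    console.log(linkWalker.currentNode.getAttribute("title"));
  }
}
```

FILTER_SKIP is used rather than FILTER_REJECT here because, in a TreeWalker, rejecting an element also excludes everything nested inside it.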

A Couple of Instantiation Examples

Our first example creates a TreeWalker that visits all elements within a DIV element:

<div id="main">
<p>This is a <span>paragraph</span></p>
<b>Bold text</b>
</div>
 
<script type="text/javascript">
// Root of the traversal: the DIV defined above
var mainDiv = document.getElementById("main");
// Visit element nodes only; no custom filter is needed
var walker  = document.createTreeWalker(mainDiv, NodeFilter.SHOW_ELEMENT, null, false);
console.log(walker);
</script>

Our second example is more complex, and only fetches non-empty text nodes:

var treeWalker = document.createTreeWalker(
  mainDiv,
  NodeFilter.SHOW_TEXT,
  function(node) {
    return (node.nodeValue.trim() !== "") 
         ? NodeFilter.FILTER_ACCEPT 
         : NodeFilter.FILTER_REJECT;
  },
  false
);
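
The filter can also be passed as an object with an acceptNode() method instead of a bare function. As a sketch of how such a walker might be put to use, the hypothetical helper collectText() below gathers the trimmed text of every non-empty text node into an array. The doc parameter is an assumption made so that the helper does not depend on the global document.

```javascript
// NodeFilter is a browser global; the numeric fallbacks (from the DOM
// standard) only allow this sketch to run outside a browser.
const FILTERS = typeof NodeFilter !== "undefined"
  ? NodeFilter
  : { SHOW_TEXT: 0x4, FILTER_ACCEPT: 1, FILTER_REJECT: 2 };

// Object form of the filter: the walker calls acceptNode() for each text node
const nonEmptyTextFilter = {
  acceptNode(node) {
    return node.nodeValue.trim() !== ""
      ? FILTERS.FILTER_ACCEPT
      : FILTERS.FILTER_REJECT;
  }
};

// Hypothetical helper: returns the trimmed contents of all non-empty text
// nodes below root, in document order.
function collectText(doc, root) {
  const textWalker = doc.createTreeWalker(root, FILTERS.SHOW_TEXT, nonEmptyTextFilter);
  const texts = [];
  while (textWalker.nextNode()) {
    texts.push(textWalker.currentNode.nodeValue.trim());
  }
  return texts;
}

if (typeof document !== "undefined") {
  const main = document.getElementById("main");
  if (main) {
    console.log(collectText(document, main));
  }
}
```

For a DIV containing only the paragraph and the bold element from the first example, this should log the three strings "This is a", "paragraph", and "Bold text"; the whitespace-only text nodes between the tags are filtered out.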

Traversing the DOM Nodes

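Having created a TreeWalker using document.createTreeWalker(), you can then step through the nodes that pass its filter using the TreeWalker’s traversal methods. The nodes are not collected into a list up front; each method call moves the walker to the next matching node in the live DOM: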

firstChild(): Travels to and returns the first child of the current node that passes the filter.
lastChild(): Travels to and returns the last child of the current node that passes the filter.
nextNode(): Travels to and returns the next node, in document order, that passes the filter.
nextSibling(): Travels to and returns the next sibling of the current node that passes the filter.
parentNode(): Travels to and returns the closest ancestor of the current node that passes the filter, without going above the root.
previousNode(): Travels to and returns the previous node, in document order, that passes the filter.
previousSibling(): Travels to and returns the previous sibling of the current node that passes the filter.
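
The sibling-oriented methods can be combined to list only the direct children of the current node that pass the filter. The helper name filteredChildren() is hypothetical; the sketch assumes it is handed a walker created with document.createTreeWalker(), and it restores the walker’s position before returning.

```javascript
// Hypothetical helper: returns the filtered children of the walker's
// current node without permanently moving the walker.
function filteredChildren(walker) {
  const start = walker.currentNode; // remember where the walker was
  const children = [];
  // firstChild() and nextSibling() return null when there is nothing left
  for (let node = walker.firstChild(); node !== null; node = walker.nextSibling()) {
    children.push(node);
  }
  walker.currentNode = start; // put the walker back
  return children;
}

if (typeof document !== "undefined") {
  const main = document.getElementById("main");
  if (main) {
    const childWalker = document.createTreeWalker(main, NodeFilter.SHOW_ELEMENT);
    console.log(filteredChildren(childWalker).map((el) => el.tagName));
  }
}
```

The position is saved and restored through currentNode rather than by calling parentNode(), since parentNode() only lands on ancestors that pass the filter.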

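Not to be confused with the standard DOM node properties of the same names (such as firstChild or nextSibling), the above methods belong exclusively to the TreeWalker object and navigate only through its filtered nodes. Each method returns null, and leaves the walker’s position unchanged, when there is no matching node to move to.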

Using the same DIV as above, let’s see how to use the traversal methods to walk through the returned nodes:

 
//Log the node the TreeWalker currently points to (the root node)
//displays DIV (the element with id=main)
console.log(walker.currentNode.tagName);
 
//Step through and log all descendant elements
while (walker.nextNode()) {
  //displays P, SPAN, and B.
  console.log(walker.currentNode.tagName);
}

//Go back to the first child node of the collection and display it
//to do that, we must reset the TreeWalker pointer to point to the main DIV
walker.currentNode = mainDiv;
//displays P
console.log(walker.firstChild().tagName);

As we step through each node using the traversal methods, true to its name, the TreeWalker does not only return each node, but travels to it. That’s why, once the while loop has called walker.nextNode() until it returned null, we must reset the TreeWalker’s position back to its root node before trying to retrieve the firstChild of the filtered collection:

//reset TreeWalker pointer to point to main DIV
walker.currentNode = mainDiv;

This is necessary because, after running through the while loop, the TreeWalker’s pointer is directed at the very last node (B element) of the collection. The B element has no child elements, so walker.firstChild() would return null there; and even if it did have one, it would be the first child of the B element rather than of the entire filtered collection.
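
One way to avoid forgetting the reset is to wrap it in a helper. The sketch below uses the walker’s read-only root property to rewind before walking, and puts the walker back where it was afterwards. The function name tagNamesFromRoot() is hypothetical, and the sketch assumes a walker that visits element nodes only, so that every visited node has a tagName.

```javascript
// Hypothetical helper: lists the tag names of all nodes the walker can
// reach from its root, regardless of where the walker currently points.
function tagNamesFromRoot(walker) {
  const saved = walker.currentNode; // remember where the walker was
  walker.currentNode = walker.root; // rewind to the root node
  const names = [];
  while (walker.nextNode()) {
    names.push(walker.currentNode.tagName);
  }
  walker.currentNode = saved;       // restore the previous position
  return names;
}

if (typeof document !== "undefined") {
  const main = document.getElementById("main");
  if (main) {
    const elementWalker = document.createTreeWalker(main, NodeFilter.SHOW_ELEMENT);
    console.log(tagNamesFromRoot(elementWalker));
  }
}
```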

Conclusion

Iterating through the DOM tree is often necessary for DOM manipulation and node retrieval. The TreeWalker API offers one way to do that. If there is a downside to the TreeWalker API, it’s that tree structures are not as simple to work with as one-dimensional arrays. A tree can be flattened into an array, but doing so requires walking its entire structure first, which makes the extra step somewhat redundant.
