A few weeks ago I was looking for a way to parse an XML file into XHTML. This can be helpful if you want to store data for your web site in XML and want a quick scripting language to access and parse it. The same technique can be used to parse RSS, since RSS is XML.
I spent a couple of hours scouring the web for information. I was able to gather bits and pieces from different ends of the Internet to come up with a solution.
First let’s look at the XML document. The data in my XML file contains multiple choice questions and their answers. This means I will need to parse out the data and form a label for the question and radio buttons for the answers. Nothing fancy here: each question node has a number attribute and a text attribute. The answer nodes (the option elements) have a text attribute as well, and a value attribute that determines whether the radio button should be checked by default.
The XML looks like:
<?xml version="1.0" encoding="iso-8859-1"?>
<questions>
    <question number="1" text="How painful is it to parse XML with JavaScript?">
        <option value="Y" text="Ungodly painful"></option>
        <option value="N" text="Extremely painful"></option>
        <option value="N" text="Moderately painful"></option>
        <option value="N" text="Mildly painful"></option>
    </question>
    <question number="2" text="How retarded is JScript?">
        <option value="N" text="Poo flinging retarded"></option>
        <option value="N" text="Extremely retarded"></option>
        <option value="Y" text="Moderately retarded"></option>
        <option value="N" text="Mildly retarded"></option>
    </question>
</questions>
On to the coding. First you need to determine what sort of browser the user will be using to view your parsed XML. The function I found checks for Internet Explorer and for the non-evil-empire browsers (Firefox, Opera, etc.) and loads the XML appropriately.
Internet Explorer uses a different type of JavaScript when scripting a web page. Without boring you with the details, the basic difference is that Microsoft uses its own version of JavaScript called JScript, and IE loads the XML through an ActiveX object instead of the standard DOM. Leave it to Microsoft to make life hard on us all.
I gathered the load XML data function from w3schools.
Here’s the function to check the browser and load the XML document:
//create variable to store the XML document
var xmlDoc;
//set variable to the nodeType value of ELEMENT_NODE
//this way we can easily tell what type of node we are grabbing in our code
var ELEMENT_NODE = 1;
//create function to load the XML data from the specified document
//found this at w3schools http://www.w3schools.com/xml/xml_parser.asp
function loadXmlData()
{
    // code for IE
    if (window.ActiveXObject)
    {
        //set up ActiveX object
        xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
        //tell browser that it's not an asynchronous call
        xmlDoc.async = false;
        //provide URL to the XML document to be parsed
        xmlDoc.load("http://sbiefeld.com/wp-content/uploads/2008/03/questions.xml");
        //call function to parse data
        writeListForPOSBrowsers();
    }
    // code for Mozilla, Firefox, Opera, etc.
    else if (document.implementation && document.implementation.createDocument)
    {
        //set up document variable
        xmlDoc = document.implementation.createDocument("", "", null);
        //register the function that parses the data once the document has loaded
        xmlDoc.onload = writeListForRealBrowsers;
        //provide URL to the XML document to be parsed
        xmlDoc.load("http://sbiefeld.com/wp-content/uploads/2008/03/questions.xml");
    }
    else
    {
        //fire alert if XML cannot be parsed by an unknown browser
        alert('Your browser cannot handle this script');
    }
}
Notice the different function calls per browser, writeListForPOSBrowsers() and writeListForRealBrowsers(). This was to handle the peculiarities of IE and JScript. The writeListForPOSBrowsers() function handles the parsing for IE while the writeListForRealBrowsers() function handles the parsing for other browsers.
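The branching in loadXmlData() is feature detection: it asks what the browser can do rather than what it calls itself. Here is a minimal sketch of that check pulled out on its own. The name pickXmlLoader and its return strings are made up for illustration and are not part of the original script. The window and document objects are passed in as arguments so the check can be tried outside a browser.

```javascript
// Hypothetical helper (not in the original script): reports which XML
// loading strategy is available. `win` and `doc` stand in for the
// browser's window and document objects.
function pickXmlLoader(win, doc) {
  // Old Internet Explorer exposes ActiveXObject on window
  if (win && win.ActiveXObject) {
    return 'activex';
  }
  // Other browsers expose document.implementation.createDocument
  if (doc && doc.implementation && doc.implementation.createDocument) {
    return 'dom';
  }
  // Neither API is present, so the script cannot load the XML
  return 'unsupported';
}
```

In a page you would call it as pickXmlLoader(window, document) and branch on the result, just as loadXmlData() branches on the same two checks.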
I guess I will start with the writeListForPOSBrowsers() function and save the best for last. The basic flow is: loop through the question nodes in the XML file, pull out the question attributes and parse them into HTML elements. For each question node, I loop through its answer nodes and create the radio buttons for the answer choices.
The most frustrating thing is JScript's lack of support for the DOM. Instead of being able to reference HTML elements and append data accordingly, I had to manually put my HTML elements in a string and then append to that string. Once everything is in place, I put that HTML string in the document. See the comments in the code for more specifics.
//parses XML for IE browser
//because of lack of sufficient DOM support in JScript things are done a bit differently
//i.e. hard coding html elements
function writeListForPOSBrowsers()
{
    //searches for and gets the question tag elements in the XML file
    var questions = xmlDoc.getElementsByTagName('question');
    //creates an HTML string and adds an unordered list element
    var htmlCode = '<ul>';
    //loop through all of the question elements
    for (var i = 0; i < questions.length; i++)
    {
        //ensure we are dealing with a node of type ELEMENT_NODE before writing any markup for it
        if (questions[i].nodeType != ELEMENT_NODE) continue;
        //get the question number from the number attribute of the question element
        var questionId = questions[i].getAttribute('number');
        //add list item and question number to HTML string
        htmlCode += '<li>' + '(' + questionId + ') ';
        //store the question data
        var cdata = questions[i].getAttribute('text');
        //add question data and a break to the HTML string
        htmlCode += cdata + '<br/>';
        //time to loop through the answer nodes
        for (var j = 0; j < questions[i].childNodes.length; j++)
        {
            //ensure we are still dealing with a node of type ELEMENT_NODE
            if (questions[i].childNodes[j].nodeType != ELEMENT_NODE) continue;
            //create a unique id for the input element
            var inputId = questionId + '_' + j;
            //gets any answer text data from the node
            var rbtnTxt = questions[i].childNodes[j].getAttribute('text');
            //gets any value data from the node
            var isChkd = questions[i].childNodes[j].getAttribute('value');
            //declare variable to hold isChecked value
            var isChkdValue;
            //checks value data to set checked attribute on radio button
            if (isChkd.toUpperCase() == 'Y')
            {
                //set radio button to checked
                isChkdValue = 'checked';
            }
            else
            {
                //set radio button to unchecked
                isChkdValue = '';
            }
            //adds input element tag and attribute values to the HTML string, closing the tag
            htmlCode += '<input type="radio" id="' + inputId + '" name="question' + i + '" ' + isChkdValue + ' value="' + rbtnTxt + '"/>';
            //adds label element that stores answer text data, then a break, used purely for aesthetic reasons
            htmlCode += '<label>' + rbtnTxt + '</label><br/>';
        }
        //close the list item so the answers stay inside it
        htmlCode += '</li>';
    }
    //close the unordered list
    htmlCode += '</ul>';
    //add questions and answers to the page
    document.getElementById('updateTarget').innerHTML = htmlCode;
}
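One thing to watch with the string approach: any `<`, `&` or quote character in the XML text goes straight into the markup. A small escaping helper avoids that. This is only a sketch, and the name escapeHtml is made up for illustration; it is not part of the original script.

```javascript
// Hypothetical helper (not in the original script): escapes the
// characters that have special meaning in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')   // must run first so later entities are not double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
```

In writeListForPOSBrowsers() it would wrap the values taken from the XML before they are appended to the string, for example escapeHtml(rbtnTxt) and escapeHtml(cdata).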
Now, the best for last: the function to parse XML for non-IE browsers. In this function I could use the DOM. w00t for strongly typed objects, well kind of, since JavaScript is inherently not typed at all, but that is another topic.
I use exactly the same process as before: loop through the question nodes and, for each one, loop through its answer nodes. Then append the parsed items to the page.
//parses XML for non-IE browsers
function writeListForRealBrowsers()
{
    //searches for and gets the question tag elements in the XML file
    var questions = xmlDoc.getElementsByTagName('question');
    //creates an HTML ul element and stores it in a variable for later use
    var ul = document.createElement('ul');
    //loop through all of the question elements
    for (var i = 0; i < questions.length; i++)
    {
        //ensure that the current node is an ELEMENT_NODE before building anything for it
        if (questions[i].nodeType != ELEMENT_NODE) continue;
        //creates a new HTML li element and stores it in a variable
        var li = document.createElement('li');
        //get the question number from the number attribute of the question element
        var questionId = questions[i].getAttribute('number');
        //create a text node to store the question number
        var questionNum = document.createTextNode('(' + questionId + ') ');
        //append the question number to the list item element
        li.appendChild(questionNum);
        //append the list item to the unordered list element
        ul.appendChild(li);
        //create a text node to hold the question data
        var cdata = document.createTextNode(questions[i].getAttribute('text'));
        //append question data to the list item
        li.appendChild(cdata);
        //append a break element to the list item
        li.appendChild(document.createElement('br'));
        //time to loop through the answer nodes
        for (var j = 0; j < questions[i].childNodes.length; j++)
        {
            //ensure we are still dealing with a node of type ELEMENT_NODE
            //(this skips the whitespace text nodes between the option elements)
            if (questions[i].childNodes[j].nodeType != ELEMENT_NODE) continue;
            //creates a new HTML br element and stores it in a variable
            var brk = document.createElement('br');
            //create an input element for the current answer node we are on
            var inpt = document.createElement('input');
            //create a label to hold the answer data
            var lbl = document.createElement('label');
            //specify the type of input element, in this case a radio button
            inpt.type = 'radio';
            //create a unique id for the input element
            var inputId = questionId + '_' + j;
            //set the unique id
            inpt.id = inputId;
            //set the name shared by all answers to this question
            inpt.name = 'question' + i;
            //gets any answer text data from the node
            var rbtnTxt = questions[i].childNodes[j].getAttribute('text');
            //gets any value data from the node
            var isChkd = questions[i].childNodes[j].getAttribute('value');
            //checks value data to set checked attribute on radio button
            inpt.checked = (isChkd.toUpperCase() == 'Y');
            //sets value data
            inpt.value = rbtnTxt;
            //appends answer text data to the label
            lbl.appendChild(document.createTextNode(rbtnTxt));
            //appends input element to list item
            li.appendChild(inpt);
            //appends the label to the list item
            li.appendChild(lbl);
            //appends a break element to the list item
            li.appendChild(brk);
        }
    }
    //appends unordered list to the page
    document.getElementById('updateTarget').appendChild(ul);
}
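Both functions walk the XML in exactly the same way and only differ in how they write the result out. One possible refactoring is to pull that walk into a helper that returns plain objects, so each writer only has to deal with output. The sketch below is an assumption about how that could look, not code from the original script. The name extractQuestions and the shape of the returned objects are made up for illustration.

```javascript
// nodeType value for element nodes, same constant as in the script above
var ELEMENT_NODE = 1;

// Hypothetical helper (not in the original script): reads the question
// and option elements out of the parsed XML document into plain objects.
function extractQuestions(xmlDoc) {
  var result = [];
  var questions = xmlDoc.getElementsByTagName('question');
  for (var i = 0; i < questions.length; i++) {
    var question = questions[i];
    var options = [];
    for (var j = 0; j < question.childNodes.length; j++) {
      var node = question.childNodes[j];
      // skip anything that is not an element, e.g. whitespace text nodes
      if (node.nodeType !== ELEMENT_NODE) continue;
      // treat a missing value attribute as "not checked"
      var value = node.getAttribute('value') || '';
      options.push({
        text: node.getAttribute('text'),
        checked: value.toUpperCase() === 'Y'
      });
    }
    result.push({
      number: question.getAttribute('number'),
      text: question.getAttribute('text'),
      options: options
    });
  }
  return result;
}
```

With this in place, writeListForPOSBrowsers() and writeListForRealBrowsers() could each loop over extractQuestions(xmlDoc) and would no longer need their own nodeType checks.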
Last but not least, the HTML to use the JavaScript:
<body onload="loadXmlData()" id="updateTarget"></body>
Simple enough, right? Well, I hope my gatherings on how to parse XML with JavaScript have been useful and helpful. This is by no means the best way to do it, and it could be refactored. It does however stay true to the title, Quick and Dirty XML Parsing with JavaScript. Grab the XML file, then grab and test the HTML/JavaScript. Thanks much and expect more.