Thursday, March 29, 2012

XML stack for node.js that works on Windows

When I developed wcf.js I used XML operations extensively. Finding the right libraries was not easy, so I thought I would share my findings here. My requirements were DOM-style XML parsing and a stack that is fully multi-platform (read: works on Windows). It turned out that many libraries fulfill one of these requirements, but it was very hard to find one that fulfills both. I also wanted to run XPath operations on the DOM, so again I needed a library that works on Windows and integrates well with the DOM parser.

I started my journey by googling "node.js xml parser". I immediately found node-xml, which is a pure JavaScript SAX parser. Finding other SAX parsers was also easy, but that was not what I had in mind. I then moved on to "node.js xml dom". This led me to the main listing of node libraries sorted by category, and I immediately turned to the XML section. I felt like I was drinking from the firehose: over 15 XML parsers were listed. It was very disappointing to find out that most of them are based on libxml2, which means they work on Windows only via cygwin. That's evil.


xmldom
Just before I started to roll my own XML parser, I found xmldom. xmldom is a pure JavaScript implementation of DOM (and SAX), which makes it portable to any platform node.js runs on.

xpath.js
I also needed an XPath engine. Finding one that is cross-platform was no easier. I finally found xpath.js. It was not originally written as a node.js module (it dates back to 2006), but it was fairly easy to migrate it. I just added this method at the end:

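// Evaluates an XPath expression against a parsed document
// and returns the matching nodes as an array.
function SelectNodes(doc, xpath)
{
  var parser = new XPathParser();
  var expression = parser.parse(xpath);  // compile the expression string
  var context = new XPathContext();
  context.expressionContextNode = doc.documentElement;  // evaluate from the root element
  var res = expression.evaluate(context);
  return res.toArray();
}

exports.SelectNodes = SelectNodes;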

Making it all work together
The following sample shows how to parse an XML document and match an XPath expression against it.
Note that you should include the updated xpath.js as part of your project (e.g. in /lib), and the first line of the sample should reference that path. You should also install xmldom using npm install xmldom.

var select = require('./xpath').SelectNodes   // the path to xpath.js in your project
  , Dom = require('xmldom').DOMParser

var doc = new Dom().parseFromString('<x><y id="1"></y></x>')
  , res = select(doc, "//*[@id]") // select all nodes that have an "id" attribute

if (res.length == 1)
  console.log(res[0].localName); // prints "y"
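
Once a node is matched, a common next step is reading its text. Here is a minimal sketch, assuming the matched nodes expose the standard DOM properties (nodeType, childNodes, data). The textOf helper is hypothetical and not part of xmldom or xpath.js, and the plain object at the end only stands in for a parsed element so the snippet runs without any dependency.

```javascript
// Node type constants as defined by the DOM specification.
var TEXT_NODE = 3, CDATA_SECTION_NODE = 4;

// Hypothetical helper: concatenates the text of a node and all of its
// descendants, in document order. Assumes DOM-shaped nodes.
function textOf(node) {
  if (node.nodeType === TEXT_NODE || node.nodeType === CDATA_SECTION_NODE)
    return node.data;

  var text = ''
    , children = node.childNodes || []; // nodes without children contribute nothing

  for (var i = 0; i < children.length; i++)
    text += textOf(children[i]);

  return text;
}

// Stand-in for the element <y id="1">hello <b>world</b></y>
var sample = {
  nodeType: 1, localName: 'y',
  childNodes: [
    { nodeType: 3, data: 'hello ' },
    { nodeType: 1, localName: 'b', childNodes: [{ nodeType: 3, data: 'world' }] }
  ]
};

console.log(textOf(sample)); // prints "hello world"
```

Under the same assumption, calling textOf(res[0]) on the sample above should give an empty string, since <y id="1"></y> has no text children.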


5 comments:

Dave said...

Yes! Thank you!

Surender Panuganti said...

Thanks a lot.

syed amjad said...

Hi Yaron,

var doc = new DOMParser().parseFromString('amjad');
var y = select(doc, "/*/*[local-name(.)='y']")[0]

With the above I get the element "y". How do I then get the content of this element, that is, the value of the element?

Please help me soon.

Thanks
Amjad

Yaron Naveh (MVP) said...

syed check this out:

https://github.com/yaronn/xpath.js

kimOz said...

Great work ...