Duck Typing: xml

Thursday, May 10, 2007

Snakeskin Pyjamas?

I was browsing around and I came across this project.. which promises that I can "build AJAX apps in Python (like Google did for Java)". A bold claim.. I'll have to try it out and see if it really is that easy.

It's not an encouraging sign that the webpage is borked (most links appear to point to something that went away) but follow the links to download and things improve.. Google Code is lurking back there.

Monday, April 30, 2007

A quick Python recipe for a validating XML parser

ElementTree is now included in Python standard libraries from version 2.5 but as good as it is, it has no support for XMLschema validation and limited support for XPath. For that you need lxml which builds on the foundation of ElementTree.

Here's a few lines of Python to validate an XML document using a schema document using lxml:


from lxml import etree

# Parse the schema document
xsd = etree.ElementTree(file = 'schema.xsd')

# Build an XMLSchema object from the parsed document
xsv = etree.XMLSchema(xsd)

# Validate the document using the schema
doc = etree.ElementTree(file = 'doc.xml')
xsv.validate(doc)

And that's it!

If you also want to perform Xpath operations then here's a few examples:

# continuing from above

# Find all nodes with a tagname amount
nodes = doc.xpath('//amount)

# Find all nodes with a tagname amount and attribute value with value 7
nodes = doc.xpath('//amount[value=7])

# Need a namespace? Supply a dictionary
nodes = doc.xpath('//cdf:amount, {'cdf' : 'http://uri.namespace.org/1.0'})

Later: lxml uses libxml under the hood to do its magic. Apparently, there are some bugs. When trying to validate XCCDF documents, errors are generated. This forced me into actually using C++ to build a schema validator which was kind of useful, seeing as that's what I was supposed to be doing in the first place.

Duck Typing

Thursday, May 10, 2007

Snakeskin Pyjamas?

Monday, April 30, 2007

A quick Python recipe for a validating XML parser

Blog Archive

Other Blogs To Read: