Here are a few lines of Python that validate an XML document against a schema using lxml:
from lxml import etree
# Parse the schema document
xsd = etree.ElementTree(file = 'schema.xsd')
# Build an XMLSchema object from the parsed document
xsv = etree.XMLSchema(xsd)
# Parse the document and validate it against the schema
doc = etree.ElementTree(file = 'doc.xml')
is_valid = xsv.validate(doc)  # True if the document conforms, False otherwise
And that's it!
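If validation fails, you will probably want to know why. Here is a minimal, self-contained sketch of reading the validator's error log. The inline schema, the sample documents and the `check` helper are made up for illustration; they stand in for schema.xsd and doc.xml.

```python
from io import BytesIO
from lxml import etree

# Hypothetical inline schema, standing in for schema.xsd
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="amount" type="xs:integer"/>
</xs:schema>"""

schema = etree.XMLSchema(etree.parse(BytesIO(XSD)))

# Hypothetical documents: one that conforms, one that does not
good = etree.parse(BytesIO(b"<amount>7</amount>"))
bad = etree.parse(BytesIO(b"<amount>seven</amount>"))

def check(doc):
    """Return (is_valid, error messages) for a parsed document."""
    ok = schema.validate(doc)
    # error_log holds the errors from the most recent validation run
    errors = [f"line {e.line}: {e.message}" for e in schema.error_log]
    return ok, errors

for name, doc in (("good", good), ("bad", bad)):
    ok, errors = check(doc)
    print(name, ok, errors)
```

If you would rather get an exception than a boolean, `schema.assertValid(doc)` raises `etree.DocumentInvalid` for a non-conforming document.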
If you also want to perform XPath operations, here are a few examples:
# continuing from above
# Find all nodes with a tagname amount
nodes = doc.xpath('//amount')
# Find all nodes with a tagname amount and attribute value with value 7
nodes = doc.xpath('//amount[@value=7]')
# Need a namespace? Supply a prefix-to-URI dictionary via the namespaces keyword
nodes = doc.xpath('//cdf:amount', namespaces={'cdf' : 'http://uri.namespace.org/1.0'})
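To try these queries without any files on disk, here is a small self-contained sketch. The sample XML is invented for illustration; only the namespace URI is taken from the example above.

```python
from io import BytesIO
from lxml import etree

# Hypothetical sample document; two plain <amount> nodes and one namespaced
XML = b"""<root xmlns:cdf="http://uri.namespace.org/1.0">
  <amount value="7">a</amount>
  <amount value="3">b</amount>
  <cdf:amount value="7">c</cdf:amount>
</root>"""

doc = etree.parse(BytesIO(XML))
NS = {"cdf": "http://uri.namespace.org/1.0"}

# Un-prefixed names only match elements in no namespace
all_amounts = doc.xpath("//amount")

# @value selects the attribute; a bare "value" would mean a child element
sevens = doc.xpath("//amount[@value=7]")

# Prefixed names are resolved through the namespaces mapping
namespaced = doc.xpath("//cdf:amount", namespaces=NS)

# XPath can also return attribute values directly, as strings
values = doc.xpath("//amount/@value")

print([e.text for e in all_amounts], [e.text for e in sevens])
print([e.text for e in namespaced], values)
```

Note that the prefix in the dictionary does not have to match the prefix used in the document; only the URI matters.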
Later: lxml uses libxml2 under the hood to do its magic. Apparently, there are some bugs: when I tried to validate XCCDF documents, it generated errors. This forced me into actually using C++ to build a schema validator, which was kind of useful, seeing as that's what I was supposed to be doing in the first place.