python.pdf

and libxslt installed and run "python setup.py build install" in the module tree.

The distribution includes a set of examples and regression tests for the Python bindings in the python/tests directory. Here are some excerpts from those tests:

tst.py: This is a basic test of the file interface an

    import sys
    import libxml2

    # Parse the file and check the document name.
    doc = libxml2.parseFile("tst.xml")
    if doc.name != "tst.xml":
        print("doc.name failed")
        sys.exit(1)
    # Walk down to the root element, then to its first child.
    root = doc.children
    if root.name != "doc":
        print("root.name failed")
        sys.exit(1)
    child = root.children
    if child.name != "foo":
        print("child.name failed")
        sys.exit(1)
    # Documents must be freed explicitly.
    doc.freeDoc()

The Python module is called libxml2; parseFile is the equivalent of xmlParseFile (most of the bindings are automatically generated: the xml prefix is removed and the casing convention is kept). All nodes seen at the binding level share the same subset of accessors:

name: returns the node name

type: returns a string indicating the node type

content: returns the content of the node; it is based on xmlNodeGetContent() and hence is recursive

parent, children, last, next, prev, doc, properties: point to the associated element in the tree; these may return None if no such link exists
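
The navigation pattern behind these accessors (follow the first-child link, then the next-sibling links until a link is None) can be sketched without the bindings installed. The sketch below uses the standard library's xml.dom.minidom as a stand-in, not libxml2 itself: nodeName, firstChild, nextSibling and parentNode are DOM attribute names that play roughly the role of name, children, next and parent above. The helper child_names and the sample document are illustrative assumptions.

```python
from xml.dom import minidom


def child_names(node):
    """Collect the names of a node's children by walking sibling links."""
    names = []
    child = node.firstChild            # role of `children` in the bindings
    while child is not None:           # a missing link is reported as None
        names.append(child.nodeName)   # role of `name`
        child = child.nextSibling      # role of `next`
    return names


# Hypothetical sample document, chosen to resemble the tst.xml layout.
doc = minidom.parseString("<doc><foo/><bar/></doc>")
root = doc.documentElement
names = child_names(root)

print(root.nodeName, names)
print(root.parentNode is doc)   # the root element's parent is the document
print(doc.parentNode)           # the document itself has no parent link
```

The same loop shape should carry over to the libxml2 wrappers by swapping in the accessor names listed above, although that has not been verified here.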

Also note the need to explicitly deallocate documents with freeDoc(). Reference counting for libxml2 trees would need quite a lot of work to function properly, and rather than risk memory leaks if it is not implemented correctly, it seems safer to have an explicit function to free a tree. The wrapper Python objects such as doc, root or child are then automatically garbage collected.

validate.py: This test checks the validation interfaces and redirection of error messages:

    import libxml2

    # Deactivate error messages from the validation.
    def noerr(ctx, msg):
        pass

    libxml2.registerErrorHandler(noerr, None)

    # Create a parser context, turn validation on, then parse.
    ctxt = libxml2.createFileParserCtxt("invalid.xml")
    ctxt.validate(1)
    ctxt.parseDocument()
    doc = ctxt.doc()
    valid = ctxt.isValid()
    doc.freeDoc()
    if valid != 0:
        print("validity check failed")

The first thing to notice is the call to registerErrorHandler(), which defines a new error handler global to the library. It is used to avoid seeing the error messages when trying to validate the invalid document.

The main interest of th
createFileParserCtxt() and how the behaviour can be changed before calling parseDocument(). Similarly, the information resulting from the parsing phase is also available through context methods.

Contexts, like nodes, are defined as classes and t
C function interfaces in terms of object methods as much as possible. The best way to get a complete view of which methods are supported is to look at the libxml2.py module containing all the wrappers.

push.py: This test shows how to ac

    import libxml2

    # Create the push parser with the first 4 bytes of data.
    ctxt = libxml2.createPushParser(None, "<foo", 4, "test.xml")
    # Push the remaining data; terminate=1 marks the last chunk.
    ctxt.parseChunk("/>", 2, 1)
    doc = ctxt.doc()
    doc.freeDoc()

The context is created with a special call based on xmlCreatePushParser() from the C library. The first argument is an optional SAX callback object, followed by the initial set of data, its length, and the name of the resource in case URI-References need to be computed by the parser.

Then the data are
setting the third argument terminate to 1.

pushSAX.py: This test shows the use of
the parser does not build a document, but provides callback information as the parser makes progress analyzing the data being provided:

    import libxml2

    log = ""

    # Each callback appends a trace of the event to the global log.
    class callback:
        def startDocument(self):
            global log
            log = log + "startDocument:"

        def endDocument(self):
            global log
            log = log + "endDocument:"

        def startElement(self, tag, attrs):
            global log
            log = log + "startElement %s %s:" % (tag, attrs)

        def endElement(self, tag):
            global log
            log = log + "endElement %s:" % (tag)

        def characters(self, data):
            global log
            log = log + "characters: %s:" % (data)

        def warning(self, msg):
            global log
            log = log + "warning: %s:" % (msg)

        def error(self, msg):
            global log
            log = log + "error: %s:" % (msg)

        def fatalError(self, msg):
            global log
            log = log + "fatalError: %s:" % (msg)

    handler = callback()

    # Feed the document in three pieces; the last call sets terminate to 1.
    ctxt = libxml2.createPushParser(handler, "<foo", 4, "test.xml")
    chunk = " url='tst'>b"
    ctxt.parseChunk(chunk, len(chunk), 0)
    chunk = "ar</foo>"
    ctxt.parseChunk(chunk, len(chunk), 1)

    reference = "startDocument:startElement foo {'url': 'tst'}:" + \
                "characters: bar:endElement foo:endDocument:"
    if log != reference:
        print("Error got: %s" % log)
        print("Expected: %s" % reference)

The key object in that test is the handler, it pr
points which can be called by the parser as it makes progress to indicate the information set obtained. The full set of callbacks is larger than what the callback class in that specific example implements (see the SAX definition for a complete list). The wrapper will only call those supplied by the object when activated. The startElement callback receives the name of the element and a dictionary containing the attributes carried by this element.

Also note that the r
single character call even though the string "bar" is passed to the parser from 2 different calls to parseChunk().

xpath.py: This is a basic test of XPath wr


    import sys
    import libxml2

    # Parse the file to query (the same tst.xml as in tst.py is assumed here).
    doc = libxml2.parseFile("tst.xml")
    ctxt = doc.xpathNewContext()
    res = ctxt.xpathEval("//*")
    if len(res) != 2:
        print("xpath query: wrong node set size")
        sys.exit(1)
    if res[0].name != "doc" or res[1].name != "foo":
        print("xpath query: wrong node set value")
        sys.exit(1)
    doc.freeDoc()
    ctxt.xpathFreeContext()

This test parses a file, then creates an XPath context to evaluate an expression on it. The xpathEval() method executes an XPath query and returns the result mapped in a Python way. Strings and numbers are natively converted, and node sets are returned as a tuple of libxml2 Python node wrappers. Like the document, the XPath context needs to be freed explicitly; also note that the result of the XPath query may point back to the document tree, and hence the document must be freed after the result of the query is used.

xpathext.py: T
python:

    import libxml2

    def foo(ctx, x):
        return x + 1

    # Parse a document to attach the XPath context to (tst.xml is assumed here).
    doc = libxml2.parseFile("tst.xml")
    ctxt = doc.xpathNewContext()
    # Register the Python function foo() as the XPath function foo().
    libxml2.registerXPathFunction(ctxt._o, "foo", None, foo)
    res = ctxt.xpathEval("foo(1)")
    if res != 2:
        print("xpath extension failure")
    doc.freeDoc()
    ctxt.xpathFreeContext()

Note how the extension function is registered with the context
part is not yet finalized, this may change slightly in the future).

tstxpath.py:
calls dumpMemory() which saves that list in a .memdump file.
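
As a companion to the pushSAX.py excerpt, the push-style SAX pattern (data fed in chunks, events delivered to a handler object) can be tried without the bindings. This sketch swaps in the standard library's xml.sax module for libxml2, so feed() and close() here are xml.sax calls, not the createPushParser()/parseChunk() API. The LogHandler class and its event tuples are illustrative assumptions. Because a parser may deliver text in one or in several characters() calls when it is split across chunks, the handler buffers the pieces and joins them when the element ends.

```python
import xml.sax


class LogHandler(xml.sax.ContentHandler):
    """Record SAX events, joining text that may arrive in several pieces."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def startElement(self, name, attrs):
        # attrs behaves like a mapping; copy it into a plain dictionary.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, data):
        # Do not assume one call per text node: just buffer the piece.
        self._text.append(data)

    def endElement(self, name):
        text = "".join(self._text)
        self._text = []
        if text:
            self.events.append(("text", text))
        self.events.append(("end", name))


handler = LogHandler()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# Same three chunks as in pushSAX.py: "bar" is split across two of them.
for chunk in ("<foo", " url='tst'>b", "ar</foo>"):
    parser.feed(chunk)
parser.close()

print(handler.events)
```

Buffering in characters() keeps the recorded result independent of how the parser happens to split the text, which is the safer assumption for any SAX-style handler.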
