setDocumentLocator in validating parser (xmlproc)

Cees Wesseling

Hi,

It seems that xmlproc, the default validating parser in my setup, does
not call back to setDocumentLocator. Is there any way to get a locator
in my handler?
Below is an example and its output.

Regards, Cees

# base imports
from xml.sax.handler import ContentHandler
from xml.sax.handler import EntityResolver
import xml.sax
import xml.sax.sax2exts

class BaseHandler(ContentHandler):
    def setDocumentLocator(self, locator):
        print "setDocumentLocator called"
        self.d_locator = locator

    def startElement(self, name, attr):
        print "startElement", name


open('e.dtd','w').write('<!ELEMENT E EMPTY>')
open('e.xml','w').write(
"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE E SYSTEM "e.dtd"><E/>""")

vp = xml.sax.sax2exts.XMLValParserFactory.make_parser()
print "vp type", vp.__class__
vph = BaseHandler()
vp.setContentHandler(vph)
vp.parse("e.xml")

np = xml.sax.make_parser()
print "np type", np.__class__
nph = BaseHandler()
np.setContentHandler(nph)
np.parse("e.xml")

OUTPUT:
vp type xml.sax.drivers2.drv_xmlproc.XmlprocDriver
startElement E
np type xml.sax.expatreader.ExpatParser
setDocumentLocator called
startElement E
 
James Kew

Cees Wesseling said:
it seems that xmlproc, the default Validating parser, in my setup does
not call back to setDocumentLocator. Is there anyway to get a locator
in my handler?

It's a known bug with a simple patch -- I don't know why it wasn't fixed in
PyXML 0.8.4.
http://sourceforge.net/tracker/?func=detail&aid=835638&group_id=6473&atid=106473

I had the same problem a while ago; I ended up doing a monkeypatch to
xml.sax.drivers2.drv_xmlproc to add the missing call:

import xml.sax.drivers2.drv_xmlproc

# Override the set_locator method.
def set_locator(self, locator):
    # Existing code.
    self._locator = locator
    # ...but also call the ContentHandler.
    # drv_xmlproc already implements the Locator interface.
    self._cont_handler.setDocumentLocator(self)

setattr(xml.sax.drivers2.drv_xmlproc.XmlprocDriver, "set_locator",
        set_locator)

HTH,

James Kew
http://jameskew.blogspot.com
 
Cees Wesseling

Thanks James, a perfect patch that works perfectly for me.

On a side note, it seems the locator is not standardized. expat gives
the position of the "<" character of startElement, while xmlproc (PyXML)
gives the position of the ">" character of startElement. And both are
one columnNumber off. A bit annoying when swapping parsers.
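
For anyone who wants to check what their own parser reports, here is a minimal sketch that records the locator position at each startElement. It assumes Python 3 and the standard library's xml.sax (expat) rather than PyXML, and the names PositionHandler and element_positions are made up for this example. I have not listed expected column values, since those are exactly what differs between parsers.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PositionHandler(ContentHandler):
    """Hypothetical helper: records the locator position of each start tag."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.locator = None
        self.positions = []

    def setDocumentLocator(self, locator):
        # Called by the parser before startDocument, if it supports locators.
        self.locator = locator

    def startElement(self, name, attrs):
        if self.locator is None:
            # Parser never supplied a locator (the xmlproc symptom above).
            self.positions.append((name, None, None))
        else:
            self.positions.append((name,
                                   self.locator.getLineNumber(),
                                   self.locator.getColumnNumber()))


def element_positions(xml_bytes):
    """Parse xml_bytes and return a list of (name, line, column) tuples."""
    handler = PositionHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.positions


if __name__ == "__main__":
    for name, line, col in element_positions(b"<root>\n  <E/>\n</root>"):
        print(name, line, col)
```

Running the same document through two parsers and comparing the tuples should show where each one places the position and whether its columns start at 0 or 1.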

Cees
 
