Simon Willison
I'm having a horrible time trying to get xml.dom.pulldom to consume a
UTF-8 encoded XML file. Here's what I've tried so far:
....
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in
position 21: ordinal not in range(128)
xml.dom.minidom can handle the string just fine:
u'<?xml version="1.0" ?><msg>Simon\u2019s XML nightmare</msg>'
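For reference, the minidom check can be reproduced roughly as follows. This is a minimal sketch written for current Python 3 rather than the 2.5 install used here, and the variable names are illustrative; the input is assumed to be the UTF-8 bytes of the message shown above.

```python
from xml.dom import minidom

# Assumed input: the UTF-8 encoded bytes of the document shown above.
xml_bytes = '<?xml version="1.0" ?><msg>Simon\u2019s XML nightmare</msg>'.encode('utf-8')

# minidom hands the bytes to expat, which decodes them as UTF-8
# because the XML declaration names no other encoding.
doc = minidom.parseString(xml_bytes)

# The text content comes back as a unicode string.
msg_text = doc.documentElement.firstChild.data

# Serialising without an explicit encoding also gives a unicode string.
serialized = doc.toxml()
```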
If I pass a unicode string to pulldom instead of a UTF-8 encoded
bytestring, it still breaks:
....
/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xml/dom/pulldom.py in parseString(string, parser)
    346
    347     bufsize = len(string)
--> 348     buf = StringIO(string)
    349     if not parser:
    350         parser = xml.sax.make_parser()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in
position 32: ordinal not in range(128)
Is it possible to consume UTF-8 bytestrings or unicode strings using
xml.dom.pulldom, or should I try something else?
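One avenue that may be worth trying, sketched under assumptions: hand pulldom a byte stream through pulldom.parse instead of calling parseString, so that the parser does the decoding itself. This is written for current Python 3 and has not been checked against the 2.5 install in the traceback above; the helper name read_msg_text and the sample bytes are illustrative only.

```python
import io
from xml.dom import pulldom


def read_msg_text(xml_bytes):
    """Return the text inside the first <msg> element of a UTF-8 XML document."""
    # pulldom.parse accepts a file-like object, so the raw bytes go straight
    # to the parser without passing through a text-mode StringIO first.
    events = pulldom.parse(io.BytesIO(xml_bytes))
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == 'msg':
            # Pull the rest of this element into a small DOM subtree.
            events.expandNode(node)
            # Join the text nodes in case the parser delivered them in pieces.
            return ''.join(
                child.data
                for child in node.childNodes
                if child.nodeType == child.TEXT_NODE
            )
    return None


# Assumed sample input: the UTF-8 bytes of the message from this post.
sample = '<?xml version="1.0" ?><msg>Simon\u2019s XML nightmare</msg>'.encode('utf-8')
result = read_msg_text(sample)
```

If this behaves the same way on the older install, it would suggest the problem is in how parseString wraps its argument rather than in pulldom's handling of non-ASCII text.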
Thanks,
Simon Willison