unicode + xml

Laurent Luce

Hello,

I am trying to do the following:

- Read the list of folders in a specific directory with os.listdir(). Some folder names contain Japanese characters.
- Post the list of folders as XML to a web server. I send it with content-type 'text/xml' and start the XML data with '<?xml version="1.0" encoding="utf-8"?>'.
- On the server side (Django), I get the data using post_data and parse it with minidom.parseString(). I get an exception because of the following in the XML for one of the folder names:
'/ufffdX/ufffd^/ufffd[/ufffdg /ufffd/ufffd/ufffdj/ufffd/ufffd/ufffd['

The weird thing is that I see 5 bytes for each Unicode character, e.g. /ufffdX
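
One possible reading of that string, offered as a guess rather than a diagnosis: U+FFFD is the Unicode replacement character, which a decoder emits for each byte it cannot interpret. If the folder names were Shift-JIS (cp932) bytes that got decoded as UTF-8 somewhere along the way, every lead byte would turn into U+FFFD and the ASCII-range trail bytes (X, ^, [, g, j) would survive. The Python 3 sketch below reproduces that pattern; the helper name `garble`, the choice of cp932, and the sample folder name are all assumptions, not facts from the post.

```python
# Sketch: what happens when cp932 bytes are decoded as UTF-8 with
# errors="replace". Names and encodings here are assumptions.

def garble(name, actual="cp932", assumed="utf-8"):
    """Encode `name` in its real encoding, then decode with the wrong one."""
    raw = name.encode(actual)
    # Each byte that is invalid in UTF-8 becomes U+FFFD.
    return raw.decode(assumed, errors="replace")

sample = "スタート メニュー"  # hypothetical folder name ("Start Menu")
broken = garble(sample)

# ascii() shows the replacement characters as \ufffd escapes.
print(ascii(broken))

# Decoding the original bytes with the right codec recovers the name.
assert sample.encode("cp932").decode("cp932") == sample
```

If that is what happened, the original bytes are already lost by the time U+FFFD appears, so the fix would belong where the names are first read, not in the XML.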

Should I format the data differently inside the XML so that minidom is happy?
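
For reference, here is a minimal sketch of the round trip described above, written for Python 3 where os.listdir() on a str path returns str names. The element names `folders` and `folder` and both function names are hypothetical. The idea is to keep the names as Unicode text until the last moment, let minidom produce the UTF-8 bytes together with the declaration, and hand those same bytes to minidom.parseString() on the other side.

```python
from xml.dom import minidom

def folders_to_xml(names):
    """Build a UTF-8 encoded XML document (bytes) from text folder names."""
    doc = minidom.Document()
    root = doc.createElement("folders")  # hypothetical element name
    doc.appendChild(root)
    for name in names:
        item = doc.createElement("folder")  # hypothetical element name
        item.appendChild(doc.createTextNode(name))
        root.appendChild(item)
    # With an encoding given, toxml() returns bytes and writes a
    # matching encoding declaration at the top of the document.
    return doc.toxml(encoding="utf-8")

def xml_to_folders(payload):
    """Parse the bytes produced above back into a list of names."""
    doc = minidom.parseString(payload)
    return [item.firstChild.data for item in doc.getElementsByTagName("folder")]

names = ["スタート メニュー", "docs"]  # stand-in for os.listdir(path)
payload = folders_to_xml(names)
print(xml_to_folders(payload) == names)
```

The point of the sketch is only that the declared encoding and the actual bytes agree. If the names arrive as byte strings in some other encoding, they would need to be decoded with that encoding before being placed in the document.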

Laurent
 
