Unable to decode file written by C++ wostringstream

Yan Cheng CHEOK

Currently, I have the following text file (https://sites.google.com/site/yanchengcheok/Home/TEST.TXT?attredirects=0&d=1) written by C++ wostringstream.

I want to write a Python script that accepts a request from the user's browser and sends the entire file back as a download. The downloaded file should be exactly the same as the original text file on the server.

The code is as follows:

import cgi

print "Content-Type: text/plain"
print "Content-Disposition: attachment; filename=TEST.txt"
print

filename = "C:\\TEST.TXT"
f = open(filename, 'r')
for line in f:
    print line

However, when I open the downloaded file, it is full of weird characters. I tried the rb flag, but that doesn't work either.

Is there anything I have missed? What I want is for the file (TEST.TXT) that the client downloads through the above script to be exactly the same as the one on the server.

I also tried to specify the encoding explicitly.

import cgi

print "Content-Type: text/plain; charset=UTF-16"
print "Content-Disposition: attachment; filename=TEST.txt"
print

filename = "C:\\TEST.TXT"
f = open(filename, 'r')
for line in f:
    print line.encode('utf-16')

That doesn't work either. Here is a screenshot of the original text file (http://i.imgur.com/S6SjX.png) and of the file after downloading it from a web browser (http://i.imgur.com/l39Lc.png).

Is there anything I have missed?

Thanks and Regards
Yan Cheng CHEOK
 
jmfauth

Currently, I have the following text file (https://sites.google.com/site/yanchengcheok/Home/TEST.TXT?attredirect...) written by C++ wostringstream.

The file is encoded as UTF-16LE. You need to handle
this encoding when you *read* the file, not when
you display its content.
.... r = f.readlines()
....
[u'\n',
 u' 0.000 1.500 3.000 0.526 0.527 0.527 0.00036 0.00109 1381.88 485.07\n',
 u' 0.000 1.500 3.000 1.084 1.085 1.086 0.00037 0.00111 1351.86 978.02\n',
 u' 0.000 1.500 3.000 1.166 1.167 1.168 0.00043 0.00130 1152.71 897.16\n',
 u' -3.000 0.000 3.000 -0.031 -0.029 -0.025 0.00158 0.00475 632.17 626.13\n']
jmf
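
For readers following along, decoding at read time could look roughly like the sketch below. It is not the exact session from the post above: the helper name read_utf16le_lines and the two-line sample data are made up for illustration, and it assumes the file really is UTF-16LE, as stated above.

```python
import io
import os
import tempfile


def read_utf16le_lines(path):
    """Return the file's lines as text, decoding UTF-16LE while reading."""
    # newline="" keeps the original line endings instead of translating them.
    with io.open(path, "r", encoding="utf-16-le", newline="") as f:
        return f.readlines()


# Hypothetical stand-in for TEST.TXT: two short lines encoded as UTF-16LE.
sample = u"\n 0.000 1.500 3.000\n"
fd, demo_path = tempfile.mkstemp(suffix=".TXT")
os.close(fd)
with io.open(demo_path, "wb") as f:
    f.write(sample.encode("utf-16-le"))

lines = read_utf16le_lines(demo_path)
os.remove(demo_path)
print(lines)
```

io.open is available in both Python 2.6+ and Python 3, so the same call should work with the Python 2 scripts shown in the question.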
 
Ulrich Eckhardt

Yan said:
Currently, I have the following text file
(https://sites.google.com/site/yanchengcheok/Home/TEST.TXT?attredirects=0&d=1)
written by C++ wostringstream.

Stringstream? I guess you meant wofstream? Anyway, the output encoding
of C++ iostreams is implementation-defined, so you can't assume that such
code is generally portable. If you want a certain encoding, you need to
tell the ofstream by using the codecvt facet of the locale; a web search
should turn up more info on that.

If you have the data in memory and it is encoded as UTF-16 there (which is
what MS Windows uses for its wchar_t) then you could also use a plain
ofstream, open it with the binary flag and then simply write the memory to
a file.

In any case, you need to know the encoding in order to get the content into
a Python string or unicode object, otherwise you will only get garbage.

Good luck!

Uli
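
If the only goal is a download that is identical to the file on the server, one option is to skip decoding and copy the bytes through unchanged. A likely reason the rb attempt in the question still failed is that print appends a newline after every line, and on Windows a text-mode stdout turns each \n byte into \r\n, which shifts the two-byte units of UTF-16. The sketch below is only an illustration under stated assumptions: send_file_verbatim is a made-up helper name, application/octet-stream is a content type chosen here rather than taken from the thread, and an in-memory stream stands in for the CGI output so the example can run anywhere.

```python
import io
import os
import tempfile


def send_file_verbatim(path, out, download_name="TEST.txt"):
    """Write CGI headers, then the file's raw bytes, to the binary stream `out`."""
    with io.open(path, "rb") as f:
        payload = f.read()  # raw bytes, no decoding and no newline translation
    headers = (
        "Content-Type: application/octet-stream\r\n"  # assumption, see lead-in
        "Content-Disposition: attachment; filename=%s\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (download_name, len(payload))
    )
    out.write(headers.encode("ascii"))
    out.write(payload)
    return payload


# Demo with a hypothetical UTF-16LE file and an in-memory stand-in for stdout.
original = u"\n 0.000 1.500\n".encode("utf-16-le")
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with io.open(demo_path, "wb") as f:
    f.write(original)

out = io.BytesIO()
send_file_verbatim(demo_path, out)
os.remove(demo_path)

# Everything after the first blank line is the response body.
body = out.getvalue().split(b"\r\n\r\n", 1)[1]
print(body == original)
```

In a real CGI script the out argument would have to be a binary stdout: sys.stdout.buffer on Python 3, or on Python 2 under Windows sys.stdout after calling msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY).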
 
