How to read strings containing escape characters from a file and use them as escape sequences?

slomo

How can I read strings containing escape characters from a file and use
them as escape sequences?

For example, a file 'unicodes.txt' has the contents:

\u0050\u0079\u0074\u0068\u006f\u006e

Now, when I read a line from the file, I get the literal text:
\u0050\u0079\u0074\u0068\u006f\u006e

But I want to get a string:

"\u0050\u0079\u0074\u0068\u006f\u006e"

How can I do that?
 
Duncan Booth

slomo said:
\u0050\u0079\u0074\u0068\u006f\u006e

But I want to get a string:

"\u0050\u0079\u0074\u0068\u006f\u006e"

How can I do that?

line.decode('unicode-escape')
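
For readers who want to try this end to end, here is a minimal sketch of the same idea. The helper name `read_unescaped` and the temporary copy of 'unicodes.txt' are illustrative assumptions, not part of the original reply. It goes through `codecs.decode` on bytes so that the same code should run on both Python 2 and Python 3, where `str` no longer has a `.decode` method.

```python
import codecs
import os
import tempfile


def read_unescaped(path):
    """Return the first line of the file with its backslash escapes decoded."""
    # Read bytes so the same code works on Python 2 and Python 3.
    with open(path, "rb") as f:
        raw = f.readline().strip()
    # 'unicode_escape' interprets the escapes the way a Unicode literal would.
    return codecs.decode(raw, "unicode_escape")


# Build a throwaway copy of the file from the question.
demo_path = os.path.join(tempfile.mkdtemp(), "unicodes.txt")
with open(demo_path, "wb") as f:
    f.write(b"\\u0050\\u0079\\u0074\\u0068\\u006f\\u006e\n")

print(read_unescaped(demo_path))  # Python
```

Note that this codec treats any byte that is not part of an escape as Latin-1, so it is only a safe fit when the file is plain ASCII plus escapes.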
 
John Machin

line.decode('unicode-escape')

Amazing what you can find in obscure corners of the obscure docs! BTW,
how many folks know what "bijective" means?

Hmmm ... the encode is documented as "Produce a string that is
suitable as Unicode literal in Python source code", but it *isn't*
suitable. A Unicode literal is u'blah', this gives just blah. Worse,
it leaves the caller to nut out how to escape apostrophes and quotes:

Why would someone bother writing this codec when repr() does the job
properly?

Anyhow, here's a solution to the OP's stated problem from first
principles using basic building blocks:
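
The code from the original post is not reproduced here. As a stand-in, the following is one possible first-principles approach, and it is an assumption rather than what was actually posted: find each backslash-u escape with a regular expression and convert its four hex digits to a character. The names `_U_ESCAPE` and `expand_u_escapes` are hypothetical, and the sketch targets Python 3 (on Python 2, `unichr` would take the place of `chr`).

```python
import re

# Matches a backslash, a 'u', and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def expand_u_escapes(text):
    """Replace each backslash-u escape in text with the character it names."""
    def _to_char(match):
        # Interpret the four captured hex digits as a code point.
        return chr(int(match.group(1), 16))
    return _U_ESCAPE.sub(_to_char, text)


line = r"\u0050\u0079\u0074\u0068\u006f\u006e"
print(expand_u_escapes(line))  # Python
```

Unlike the codec, this handles only the four-digit form and leaves every other backslash sequence untouched.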
 
Bjoern Schliessmann

John said:
Amazing what you can find in obscure corners of the obscure docs!
BTW, how many folks know what "bijective" means?

Everyone that can read and is smart enough to enter "bijective" into
Wikipedia search.

Regards,


Björn
 
Duncan Booth

John Machin said:
Hmmm ... the encode is documented as "Produce a string that is
suitable as Unicode literal in Python source code", but it *isn't*
suitable. A Unicode literal is u'blah', this gives just blah. Worse,
it leaves the caller to nut out how to escape apostrophes and quotes:


Why would someone bother writing this codec when repr() does the job
properly?

I don't know why it was written, but if it helps I can tell you why I have
had occasion to use it: precisely because it does leave the caller to 'nut
out how to escape apostrophes and quotes'.

repr() does a good enough job if you just want a Python source string, but
you can't control whether repr will escape quotes or apostrophes - if the
string contains an apostrophe and no double-quote then the repr will
enclose it in double-quotes, otherwise it always uses single quotes.
(u'"', u'"\'', u'\'"', u"'")

If you want to force a particular quoting convention then unicode-escape
gets you half way there and you can get the rest of the way with a couple
of replace calls.
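
A rough sketch of that idea, written for Python 3 and assuming the goal is a literal that is always wrapped in double quotes. The function name `double_quoted_literal` is hypothetical; only the `unicode_escape` codec and `str.replace` come from the discussion above.

```python
import ast


def double_quoted_literal(text):
    """Return Python source for text, always wrapped in double quotes."""
    # unicode_escape handles backslashes, control characters and non-ASCII,
    # but leaves both kinds of quote untouched.
    body = text.encode("unicode_escape").decode("ascii")
    # Escape the one quote character that would end the literal early.
    body = body.replace('"', '\\"')
    return '"' + body + '"'


sample = 'He said "it\'s fine"\n'
literal = double_quoted_literal(sample)
print(literal)
# The literal evaluates back to the original text.
assert ast.literal_eval(literal) == sample
```

The same pattern with the apostrophe escaped instead would force single quotes, which is the kind of control repr() does not offer.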
 
