How to 'de-slashify' a string?

AK

Hi, if I have a string '\\303\\266', how can I convert it to '\303\266'
in a general way?

The problem I'm running into is that I'm connecting with pygresql to a
postgres database and when I get fields that are of 'char' type, I get
them in unicode, but when I get fields of 'byte' type, I get the text
with quoted slashes, e.g. '\303' becomes '\\303' and so on.

I saw posts online to do

cursor.execute("set client-encoding to unicode")

before running queries but this command causes an error.

So I think I need a way to de-quote the slashes or alternatively
some way to tell pygresql not to quote slashes for 'byte' fields.

Any help, hints, etc. appreciated.
 
Steven D'Aprano

AK said:
Hi, if I have a string '\\303\\266', how can I convert it to '\303\266'
in a general way?

It's not clear what you mean.

Do you mean you have a string '\\303\\266', that is:

backslash backslash three zero three backslash backslash two six six

If so, then the simplest way is:
\303\266


Another possibility:
\303\266

So s is:
backslash three zero three backslash two six six

and you don't need to do any more.
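The interactive sessions in this reply did not survive, so here is a minimal sketch of the two readings. It is written for Python 3 (the thread itself uses Python 2), and the variable names are illustrative only.

```python
# Reading 1: the data really contains doubled backslashes.
# A raw literal keeps every backslash, so this string has 10 characters.
doubled = r'\\303\\266'

# Collapse each pair of backslashes into a single backslash.
collapsed = doubled.replace('\\\\', '\\')
print(collapsed)        # \303\266

# Reading 2: the doubled backslashes were only source-code escaping.
# In a normal (non-raw) literal, '\\' already denotes ONE backslash.
literal = '\\303\\266'
print(literal)          # \303\266

print(len(doubled), len(collapsed), len(literal))   # 10 8 8
```

In both cases the result is still eight ordinary characters; nothing here interprets `\303` as an octal escape.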

AK said:
The problem I'm running into is that I'm connecting with pygresql to a
postgres database and when I get fields that are of 'char' type, I get
them in unicode, but when I get fields of 'byte' type, I get the text
with quoted slashes, e.g. '\303' becomes '\\303' and so on.

Is pygresql quoting the backslash, or do you just think it is quoting the
backslashes? How do you know? E.g. if you have '\\303', what is the
length of that? 4 or 5?
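One way to answer that question is to compare `repr()` with `len()`. A small sketch in Python 3, with made-up sample values:

```python
# repr() shows a string the way it would be typed in source code, so every
# real backslash appears doubled. len() counts the actual characters.
one_backslash = '\\303'      # backslash, 3, 0, 3
two_backslashes = '\\\\303'  # backslash, backslash, 3, 0, 3

print(repr(one_backslash), len(one_backslash))      # '\\303' 4
print(repr(two_backslashes), len(two_backslashes))  # '\\\\303' 5
print(one_backslash)                                # \303
```

A length of 4 means the value holds a single literal backslash followed by three digits, and the doubled backslash is only how the interpreter displays it.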
 
AK

Steven said:
It's not clear what you mean.

Do you mean you have a string '\\303\\266', that is:

backslash backslash three zero three backslash backslash two six six

If so, then the simplest way is:

\303\266


Another possibility:

\303\266

So s is:
backslash three zero three backslash two six six

and you don't need to do any more.

Well, I need the string itself to become '\303\266', not to print
that way. In other words, when I do 'print s', it should display
unicode characters if my term is set to show them, instead of
showing \303\266.

Steven said:
Is pygresql quoting the backslash, or do you just think it is quoting the
backslashes? How do you know? E.g. if you have '\\303', what is the
length of that? 4 or 5?

Length is 4, and I need it to be length of 1. E.g.:
1


What I get from pygresql is x, what I need is s. Either by asking
pygresql to do this or convert it afterwards. I can't do
replace('\\303', '\303') because it can be any unicode character.
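The example session is missing here. The following is a sketch of the situation described, using Python 3 bytes literals; the names `x` and `s` come from the post, and the values are the ones discussed above.

```python
# s: what is wanted, a single byte with octal value 303 (0xC3).
s = b'\303'
# x: what arrives, four ASCII characters: backslash, '3', '0', '3'.
x = b'\\303'

print(len(s), len(x))   # 1 4
print(s == x)           # False
```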
 
Vlastimil Brom

2009/8/22 AK said:
Well, I need the string itself to become '\303\266', not to print
that way. In other words, when I do 'print s', it should display
unicode characters if my term is set to show them, instead of
showing \303\266.


Length is 4, and I need it to be length of 1. E.g.:

1


What I get from pygresql is x, what I need is s. Either by asking pygresql
to do this or convert it afterwards. I can't do replace('\\303', '\303')
because it can be any unicode character.


Hi,
do you mean something like
̃
̃ (dec.: 771) (hex.: 0x303) ̃ COMBINING TILDE (Mark, Nonspacing)
?

vbr
 
AK

Vlastimil said:
Hi,
do you mean something like

̃
̃ (dec.: 771) (hex.: 0x303) ̃ COMBINING TILDE (Mark, Nonspacing)
?

vbr

Yes, something like that except that it starts out as '\\303\\266', and
it's good enough for me if it turns into '\303\266', in fact that's
rendered as one unicode char. In other words, when you do:
'\303\266'

I need that result to become a python string, i.e. the slashes need to
be converted from literal slashes to escape slashes.
 
Steven D'Aprano

AK said:
Length is 4, and I need it to be length of 1. E.g.:

1


What I get from pygresql is x, what I need is s. Either by asking
pygresql to do this or convert it afterwards. I can't do
replace('\\303', '\303') because it can be any unicode character.


Use the 'unicode-escape' codec to decode the byte-string to Unicode.
2
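The session that went with this suggestion is missing. A sketch of what the codec does, in Python 3, where the input has to be a bytes object (in the Python 2 of this thread the equivalent would be calling `.decode('unicode-escape')` on a plain str):

```python
# Eight ASCII bytes: backslash 3 0 3 backslash 2 6 6.
s = b'\\303\\266'
print(len(s))                     # 8

# unicode_escape interprets the escape sequences; each octal escape
# becomes the code point with that value (0o303 -> U+00C3, 0o266 -> U+00B6).
u = s.decode('unicode_escape')
print(len(u))                     # 2
print([hex(ord(c)) for c in u])   # ['0xc3', '0xb6']
```

Note that the result is two code points, not one: the codec undoes the escaping but does not interpret the two values as a UTF-8 byte sequence.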
 
Vlastimil Brom

2009/8/22 AK said:
Yes, something like that except that it starts out as '\\303\\266', and it's
good enough for me if it turns into '\303\266', in fact that's rendered as
one unicode char. In other words, when you do:

'\303\266'

I need that result to become a python string, i.e. the slashes need to
be converted from literal slashes to escape slashes.

Not sure whether this is the right way of handling such text data, but maybe:

It might be an IDLE issue, but it still isn't one unicode glyph.

I guess you have to ensure that the input data is valid and that the
right encoding is used.
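The code that followed "maybe:" is missing from this post. One approach that yields a single character is to undo the escaping first and then decode the resulting bytes as UTF-8. This is an assumption, not a reconstruction of the original code; it is written for Python 3, and `unescape_utf8` is a hypothetical helper name.

```python
def unescape_utf8(raw: bytes) -> str:
    """Undo backslash escaping, then decode the real bytes as UTF-8.

    Hypothetical helper: assumes the escapes spell out UTF-8 encoded text.
    """
    # Step 1: interpret the backslash escapes (yields code points 0-255).
    text = raw.decode('unicode_escape')
    # Step 2: map those code points back to single bytes.
    data = text.encode('latin-1')
    # Step 3: decode the actual bytes as UTF-8.
    return data.decode('utf-8')

result = unescape_utf8(b'\\303\\266')
print(len(result), hex(ord(result)))   # 1 0xf6
```

U+00F6 is LATIN SMALL LETTER O WITH DIAERESIS, so the two escaped bytes come out as one character. If the escaped bytes are not valid UTF-8, step 3 raises `UnicodeDecodeError`, which is where the point about valid input and the right encoding applies.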

hth
vbr
 
AK

Vlastimil said:
Not sure whether this is the right way of handling such text data, but maybe:

ö

It might be an IDLE issue, but it still isn't one unicode glyph.

I guess you have to ensure that the input data is valid and that the
right encoding is used.

hth
vbr

Actually, this works perfectly for me. It prints as one character in
gnome-terminal, and when I write it to a text file and open it as
UTF-8 in gnumeric, it also shows up properly.

Thanks to all who helped! -AK
 
