Convert hexadecimal characters to ASCII

Durand

Hi,

I've got this weird problem where, in some strings, parts of the string are in hexadecimal, or that's what I think they are. I'm not exactly sure... I get something like this: 's\x08 \x08Test!' from parsing a log file. From what I found on the internet, \x08 is the backspace character, but I'm still not sure.
Anyway, I need to clean up this string to get rid of any hexadecimal characters so that it just looks like 'Test!'. Are there any functions to do this?

Thanks =)
 
Mensanator

Here's one:
>>> a = ''.join([chr(i) for i in range(64)])
>>> a
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?'
>>> b = ''.join([i for i in a if ord(i) > 32])
>>> b
'!"#$%&\'()*+,-./0123456789:;<=>?'
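
Applied to the string from the question, a filter of this kind might look like the sketch below. `drop_control_chars` is a made-up helper name, not a standard library function, and unlike the `ord(i)>32` test it keeps the space character. Note that filtering only removes the backspace characters themselves, not the characters they were meant to erase, so the result is not quite 'Test!'.

```python
def drop_control_chars(text):
    """Return text without ASCII control characters (codes 0-31 and 127)."""
    # Hypothetical helper for illustration; keeps spaces and printable ASCII.
    return ''.join(c for c in text if ord(c) >= 32 and ord(c) != 127)


q = 's\x08 \x08Test!'
cleaned = drop_control_chars(q)
print(repr(cleaned))  # 's Test!'
```

The leading 's ' survives because nothing here interprets the backspaces; it only discards them.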
 
John Machin

I've got this weird problem where, in some strings, parts of the string are in hexadecimal, or that's what I think they are. I'm not exactly sure... I get something like this: 's\x08 \x08Test!' from parsing a log file. From what I found on the internet, \x08 is the backspace character, but I'm still not sure.

What you have is the output of the repr() function, which gives an
unambiguous representation in printable ASCII of the string, with the
extra bonus that it's a valid Python string constant that can be used
in code to produce exactly the same value. What your example means is:
the string contains 's', a backspace, a space, and a backspace,
followed by 'Test!'. Try this at the Python interactive prompt:

| >>> q = 's\x08 \x08Test!'
| >>> len(q)
| 9
Note that there are only 4 characters in front of 'Test!'.
| >>> q
| 's\x08 \x08Test!'
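
To see the difference between what the string actually contains and how repr() displays it, a small sketch along these lines may help. The variable names are only illustrative.

```python
q = 's\x08 \x08Test!'

# repr() shows non-printable characters as escape sequences.
shown = repr(q)
print(shown)  # 's\x08 \x08Test!'

# Each \x08 escape stands for one character whose code is 8 (backspace).
codes = [ord(c) for c in q]
print(codes)

# The string itself is shorter than its displayed form.
print(len(q), len(shown))
```

The list of character codes makes it plain that the "hexadecimal" parts are single control characters, not literal backslashes and digits.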

What you have looks like very raw keyboard input:
s
oops
space
oops
T
e
etc

Anyway, I need to clean up this string to get rid of any hexadecimal characters so that it just looks like 'Test!'. Are there any functions to do this?

Pardon the pedantry, but you don't need to "get rid of any hexadecimal
characters" ... hexadecimal characters are 0123456789ABCDEFabcdef :)

I guess that what you would like to do is simulate the keyboard
processing of backspaces:

| >>> def unbs(strg):
| ...     stack = []
| ...     for c in strg:
| ...         if c == '\x08':
| ...             if stack:
| ...                 stack.pop()
| ...         else:
| ...             stack.append(c)
| ...     return ''.join(stack)
| ...
| >>> unbs(q)
| 'Test!'

BTW, '\b' means the same as '\x08'; saves keystrokes when testing.

| >>> unbs('abc\b\b\bxyz!\b')
| 'xyz'
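
For use in a script rather than at the prompt, the same stack idea could be written as below and applied to each line read from the log. This is only a sketch: the sample lines are made up for illustration, and real code would read them from the log file instead.

```python
def unbs(strg):
    """Simulate backspace handling: each '\\b' erases the previous kept character."""
    stack = []
    for c in strg:
        if c == '\b':
            if stack:  # a backspace with nothing left to erase is ignored
                stack.pop()
        else:
            stack.append(c)
    return ''.join(stack)


# Hypothetical log lines, standing in for lines read from a file.
lines = ['s\b \bTest!', 'helo\blo world', '\b\bok']
cleaned = [unbs(line) for line in lines]
print(cleaned)  # ['Test!', 'hello world', 'ok']
```

A list used as a stack keeps the work linear in the length of the line, since each character is appended and popped at most once.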

HTH,
John
 
