Extract data from ASCII file

Ren · Feb 22, 2004

Suppose I have a file containing several lines similar to this:

:10000000E7280530AC00A530AD00AD0B0528AC0BE2

The data I want to extract are 8 hexadecimal strings, the first of
which is E728, like this:

:10000000 E728 0530 AC00 A530 AD00 AD0B 0528 AC0B E2

Also, the bytes in the string are reversed. The E728 needs to be 28E7,
0530 needs to be 3005 and so on.

I can do this in C++ and Pascal, but it seems like Python may be more
suited for the task.

How is this accomplished using Python?

Mike C. Fletcher · Feb 22, 2004

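With Python 2.3:

>>> line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"
>>> def process(line):   # 'process' is an assumed name
...     line = line[9:] # skip prefix
...     while line:
...         prefix, line = line[:4],line[4:]
...         yield prefix[2:]+prefix[:2]
...
>>> for number in process(line):
...     print number
...
28E7
3005
00AC
30A5
00AD
0BAD
2805
0BAC
E2
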
If you want to convert the hexadecimal strings to actual integers, use
int( prefix, 16 ).
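
A minimal sketch of that conversion, written for Python 3 (where print is a function) rather than the 2.3 used above. The helper name swapped_words and the offset default of 9 are made up for illustration; only int(text, 16) is the point here:

```python
def swapped_words(line, offset=9):
    """Yield byte-swapped 4-character hex strings from one line (hypothetical helper)."""
    data = line.strip()[offset:]          # drop the newline and the ':10000000' prefix
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4]
        yield chunk[2:] + chunk[:2]       # 'E728' -> '28E7'

line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"
values = [int(word, 16) for word in swapped_words(line)]   # base-16 strings -> ints
print(values)
```

The trailing two-character chunk (E2) comes out as an ordinary integer too, since int() does not care how many digits it is given.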

HTH,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/

Irmen de Jong · Feb 22, 2004

Ren said:
Suppose I have a file containing several lines similar to this:

:10000000E7280530AC00A530AD00AD0B0528AC0BE2

Say the file is called data.txt
Try this:
---------------------------------
def process(line):
    line=line[9:]
    result=[]
    for i in range(0,32,4):
        result.append( line[i+2:i+4] + line[i:i+2] )
    return result

for line in open("data.txt"):
    print process(line)
---------------------------------
For your single example data line, it prints
['28E7', '3005', '00AC', '30A5', '00AD', '0BAD', '2805', '0BAC']

It's a list containing the 8 extracted hexadecimal strings.
Instead of printing the list you can do whatever you want with it.
If you need more info, just ask.

--Irmen de Jong

Josiah Carlson · Feb 22, 2004

How is this accomplished using Python?

Check the struct documentation.
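
One way the struct route might look, as a rough sketch in Python 3 (bytes.fromhex does not exist in the 2.x releases discussed in this thread). It assumes the line always carries exactly 32 hex digits of data after the 9-character prefix, which is true of the sample line but is an assumption about the rest of the file:

```python
import struct

line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"

raw = bytes.fromhex(line[9:41])       # 32 hex digits -> 16 raw bytes
words = struct.unpack("<8H", raw)     # eight little-endian unsigned 16-bit integers
hex_words = ["%04X" % w for w in words]
print(hex_words)
```

Reading the bytes as little-endian is what performs the swap: the pair E7 28 is interpreted as the number 0x28E7. The final E2 is deliberately left out by the slice.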

- Josiah

wes weston · Feb 23, 2004

Ren,
If you go here:

http://www.python.org/doc/current/tut/node5.html#SECTION005120000000000000000

about half way down the page it talks about string slicing.
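
For the sample line, the slices involved would look roughly like this (a small sketch in Python 3 syntax; the variable names are only illustrative):

```python
line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"

header = line[:9]                  # everything before index 9
first = line[9:13]                 # four characters starting at index 9
swapped = first[2:] + first[:2]    # last two characters, then first two
tail = line[-2:]                   # negative indices count from the end

print(header, first, swapped, tail)
```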

wes

eleyg · Feb 23, 2004

The first response only works with python-2.3 (yield is a newly
reserved word).

The second response did not work for me and left off the last couple
values.

You might want to try this. It walks down the line four characters at
a time, swaps the two 2-character halves and appends the result to a
list. It also takes a second list argument in which to store the first
8 digits (lists are mutable, so the caller sees the change).

-------------------------------------------------------
from types import *

def process(line,key):
    """ Pass in a string type (line) and
    an empty list to store the key """
    if type(key) is ListType and key == []:
        key.append(line[1:9])   # the 8 digits after the leading ':'
    else:
        print "Key not ListType or not empty"
    result=[]
    line=line[9:]
    while line:
        k2,k1 = line[:2],line[2:4]
        line=line[4:]
        result.append(k1+k2)
    return result
-------------------------------------------------------

William Park · Feb 23, 2004

1. Use FIELDWIDTHS in (GNU) Awk.

2. Use string slice in Python.

3. Use substring parameter expansion, ${var:offset:length}, in (Bash) shell.

Anton Vredegoor · Feb 23, 2004

The first response only works with python-2.3 (yield is a newly
reserved word).

The second response did not work for me and left off the last couple
values.

The third response uses typechecking and stores a value in an
unreachable place ...

Maybe the feachur-less code is better (tested very lightly):

def asBytes(line,offset):
    """split a line into 2-char chunks, starting at offset"""
    res = []
    for i in range(offset,len(line),2):
        res.append(line[i:i+2])
    return res

def asWords(line,offset=0,swapbytes=0):
    """split a line into words that have maximally 4 chars,
    starting at offset, optionally swapping 2-char chunks"""
    res = []
    flip = 0
    for b in asBytes(line,offset):
        if flip:
            if swapbytes:
                res.append(b+prev)
            else:
                res.append(prev+b)
        else:
            prev = b
        flip = 1-flip
    if flip:
        # odd number of chunks: keep the unpaired last one
        res.append(b)
    return res

def test():
    line =":10000000E7280530AC00A530AD00AD0B0528AC0BE2"
    print asWords(line,offset=9,swapbytes=1)

if __name__=='__main__':
    test()

output is:

['28E7', '3005', '00AC', '30A5', '00AD', '0BAD', '2805', '0BAC', 'E2']

Anton

Ren · Feb 23, 2004

What is 'prefix' used for? I searched the docs and didn't come up with
anything that seemed appropriate.


Irmen de Jong · Feb 23, 2004

Ren said:
What is 'prefix' used for? I searched the docs and didn't come up with
anything that seemed appropriate.

Umm... it's just a variable name

--Irmen

Mike C. Fletcher · Feb 23, 2004

Ren said:
What is 'prefix' used for? I searched the docs and didn't come up with
anything that seemed appropriate.

It's just the name (variable) I used to store the "prefix" of the rest
of the line. It could just as easily have been called "vlad", but using
simple, descriptive names for variables makes the code easier to read
(in most cases, this being the obvious counter-example). In Python when
you assign to something:

x, y = v, t

you are creating a (possibly new) bound name (if something of the same
name exists in a higher namespace it is shadowed by this bound name, so
even if there was a built-in function called "prefix" my assignment to
the name would have shadowed the name).

This line here says:

prefix, line = line[:4],line[4:]

that is, assign the name "prefix" to the result of slicing the line from
the starting index to index 4, and assign the name "line" to the result
of slicing from index 4 to the ending index. Under the covers the
right-hand side of the statement creates a two-element tuple, then
that tuple is unpacked to assign its elements to the two variables on
the left-hand side.
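
Spelled out in two steps, the same assignment might look like the sketch below (Python 3 syntax; the name "pair" is introduced only for illustration and is not something the one-line form actually creates):

```python
line = "E7280530AC00"

# The right-hand side is evaluated first and packed into a tuple ...
pair = (line[:4], line[4:])

# ... which is then unpacked into the names on the left.
prefix, line = pair

swapped = prefix[2:] + prefix[:2]
print(prefix, swapped, line)
```

Because both slices are taken before any name is rebound, "line" on the right still refers to the old, longer string.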

Python is a fairly small language; if a linguistic construct works a
particular way in one context it *normally* works that way in every
context (unless the programmer explicitly changes that (and that's
generally *only* done by meta-programmers seeking to create
domain-specific functionality, and even then as a matter of style, it's
kept to a minimum to avoid confusing people (and in this particular
case, AFAIK there's no way to override variable assignment (though (evil)
people have proposed adding such a hook on numerous occasions)))).

The later line is simply manipulating the (string) object now referred
to as "prefix":

result.append( prefix[2:]+prefix[:2] )

that is, take the result of slicing from index 2 to the end and add it
to the result of slicing from the start to index 2. This has the effect
of reversing the order of the 2-byte hexadecimal encodings of "characters".

Oh, and since someone took issue with my use of (new in Python 2.2)
yield (luddites), here's a non-generator version using the same
basic code pattern:

>>> def process(line):   # 'process' is an assumed name
...     line = line[9:] # skip prefix
...     result = []
...     while line:
...         prefix, line = line[:4],line[4:]
...         result.append( prefix[2:]+prefix[:2] )
...     return result
...
>>> process(":10000000E7280530AC00A530AD00AD0B0528AC0BE2")
['28E7', '3005', '00AC', '30A5', '00AD', '0BAD', '2805', '0BAC', 'E2']

Have fun,
Mike


Anton Vredegoor · Feb 24, 2004

Mike C. Fletcher said:
Oh, and since someone took issue with my use of (new in Python 2.2)
yield (luddites), here's a non-generator version using the same
basic code pattern:
...     line = line[9:] # skip prefix
...     result = []
...     while line:
...         prefix, line = line[:4],line[4:]
...         result.append( prefix[2:]+prefix[:2] )
...     return result

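The basic problem with this code pattern is that every pass through the
loop copies the whole remainder of the line (line[4:]). With a small
line there is no problem, but the total amount of copying grows
roughly with the square of the line length, so it doesn't scale well.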

After reconsidering all alternatives I finally favor a variant of
Irmen's code, but without slicing the whole line and -after all-
definitely *using* yield because it seems appropriate here.

def process(line,offset):
    for i in xrange(offset,len(line),4):
        yield line[i+2:i+4] + line[i:i+2]

def test():
    line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"
    print '\n'.join(process(line,9))

if __name__=='__main__':
    test()

output is:

28E7
3005
00AC
30A5
00AD
0BAD
2805
0BAC
E2

Anton
