Hi,
I need to parse a binary file produced by an embedded system, whose
content consists of a sequence of events laid out like this:
<event 1> <data 1> <event 2> <data 2> ... <event n> <data n>
Every "event" is a single byte in size, and it indicates how long the
associated "data" is. Thus, to parse all events in the file, I need to
treat it as a stream and read one event at a time, consuming bytes
according to the event value and jumping to the next event, until
EOF is reached.
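To make the layout concrete, here is a minimal Python 3 sketch of that
read loop. The DATA_LENGTHS table, the iter_events name, and the sample
bytes are all made up for illustration; the real event numbers and data
sizes depend on the embedded system.

```python
import io

# Hypothetical table: event byte -> number of data bytes that follow it.
DATA_LENGTHS = {1: 2, 2: 0, 3: 4}

def iter_events(stream):
    """Yield (event, data) pairs until the stream is exhausted."""
    while True:
        head = stream.read(1)
        if not head:              # an empty read signals EOF
            return
        event = head[0]           # indexing bytes gives an int in Python 3
        length = DATA_LENGTHS[event]
        data = stream.read(length)
        if len(data) != length:   # file ended in the middle of an event
            raise ValueError("truncated data for event %d" % event)
        yield event, data

# Made-up sample: event 1 (2 bytes), event 2 (no data), event 3 (4 bytes).
sample = io.BytesIO(b"\x01\xaa\xbb\x02\x03\x00\x01\x02\x03")
parsed = list(iter_events(sample))
```

The stream must be opened in binary mode for the empty-read EOF check
and the byte indexing to behave this way.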
Since there are dozens of almost completely heterogeneous events, and
each one of them may require different actions from the program parsing
the file, I thought it would be convenient to have one class
encapsulating the logic for each event. The parser would then sit in a
loop, creating objects of different classes and calling a method (say
"execute"). That method (different in every class) is responsible for
consuming the bytes associated with the event.
Hence, as the class the parser needs to instantiate in each iteration
is not known in advance, a factory should be implemented. Somehow the
factory should know how to map an event to a class. I don't know the
best way to do that in Python. I made an attempt along the
following lines:
1. Create a base class for the events;
2. For every descendant class, declare (in the class body) a public
attribute "eventNum" and assign it the value of the event it will be
responsible for;
3. At runtime, the factory constructor scans the event class hierarchy
and builds a dictionary mapping "eventNum" values to classes.
A draft of the implementation follows:
#################################
##### <events.py module> #####
class EvtBase:
    def __init__(self, file):
        self.file = file
    def execute(self):
        pass

class Evt1(EvtBase):
    eventNum = 1
    def execute(self):
        ...

class Evt2(EvtBase):
    eventNum = 2
    def execute(self):
        ...
....
class EvtN(EvtBase):
    eventNum = N
    def execute(self):
        ...

##### <factory.py module> #####
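import inspect
import events

class Factory:
    def __isValidEventClass(self, obj):
        # Accept only proper subclasses of EvtBase that define eventNum.
        if inspect.isclass(obj) and obj != events.EvtBase and \
                events.EvtBase in inspect.getmro(obj):
            for m in inspect.getmembers(obj):
                if m[0] == 'eventNum':
                    return True
        return False

    def __init__(self):
        # Build the eventNum -> class mapping from the events module.
        self.__eventDict = {}
        for m in inspect.getmembers(events, self.__isValidEventClass):
            cls = m[1]
            self.__eventDict[cls.eventNum] = cls

    def parseEvents(self, file):
        # File objects have no eof() method; an empty read marks EOF.
        # Assumes the file was opened in binary mode.
        while True:
            ev = file.read(1)
            if not ev:
                break
            # read(1) returns a one-byte string; ord() gives the integer key.
            self.__eventDict[ord(ev)](file).execute()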
#################################
I'm using the inspect module to find the event classes. One drawback of
this approach is the need to keep the event classes in a module
different from that of the factory, because the inspect.getmembers
function expects an already parsed object or module. (The advantage is keeping
the event number near the class declaration.) I've already had to make
the solution generic and I found it was not straightforward to separate
the common logic while avoiding the need to keep the factory and the
events in two distinct modules.
Is there anything better I can do? I don't have much experience with
Python, so I don't know whether it offers a more obvious way to
address my problem.
Thanks in advance.
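
For comparison, here is a rough Python 3 sketch of one arrangement that
might avoid the two-module constraint: each class registers itself
through a class decorator, so nothing has to be scanned with inspect.
The names EVENT_CLASSES, handles and parse_events, and the data sizes
read in each execute method, are hypothetical and only serve the
illustration.

```python
import io

EVENT_CLASSES = {}   # hypothetical registry: event number -> class

def handles(event_num):
    """Class decorator that records the class under the given event number."""
    def register(cls):
        if event_num in EVENT_CLASSES:
            raise ValueError("duplicate handler for event %d" % event_num)
        cls.eventNum = event_num      # keep the number visible on the class
        EVENT_CLASSES[event_num] = cls
        return cls
    return register

class EvtBase:
    def __init__(self, file):
        self.file = file
    def execute(self):
        raise NotImplementedError

@handles(1)
class Evt1(EvtBase):
    def execute(self):
        return self.file.read(2)      # assumed: event 1 carries 2 data bytes

@handles(2)
class Evt2(EvtBase):
    def execute(self):
        return self.file.read(1)      # assumed: event 2 carries 1 data byte

def parse_events(file):
    """Dispatch each event byte to its registered class until EOF."""
    results = []
    while True:
        head = file.read(1)
        if not head:                  # an empty read signals EOF
            break
        results.append(EVENT_CLASSES[head[0]](file).execute())
    return results

results = parse_events(io.BytesIO(b"\x01\xaa\xbb\x02\xcc"))
```

The event number still sits next to the class declaration, and the
registry and the classes can share a module because registration
happens as each class statement runs.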