Parsing data from pyserial

Lone Wolf

I'm trying to get data through my serial port from a CMUcam.
This gizmo tracks a color and returns a packet of data. The
packet has nine data points (well, really eight since the first
point is just a packet header) separated by spaces as follows:

M xxx xxx xxx xxx xxx xxx xxx xxx

Here is the code I am using (Python 2.4):

import serial

ser=serial.Serial('com1',baudrate=115200, bytesize=8,
parity='N', stopbits=1,xonxoff=0, timeout=1)

ser.write("PM 1") #This sets the CMUcam to poll mode

for i in range(0,100,1):
    ser.write("TC 016 240 100 240 016 240\r\n")
    reading = ser.read(40)
    print reading
    components = reading.split()
    print components
ser.close()

Here is an example output:

M 37 79 3 4 59 124 86 25
['59', '123', '87', '25', 'M', '37', '79', '3', '4', '59',
'124', '86', '25', 'M
']
M 38 77 3 2 59 124 86 25
['39', '85', '26', 'M', '38', '77', '3', '2', '59', '124', '86',
'25', 'M', '38'
, '7']

My problem is that I am trying to get each data point of the
packet into a separate variable. Ordinarily, this would be easy,
as I would just parse the packet, read the array and assign each
element to a variable eg. mx = components[1]. However, that
doesn't work here because the original packet and the array that
I got from using the split() method are different. If I were to
try to read the array created in the first example output, mx would
be 123 instead of 37 like it is in the packet. In the second
example, the array gives 85 while the packet has 38.

As near as I can figure out, pyserial is reading a stream of
data and helpfully rearranging it so that it fits the original
packet format M xxx xxx xxx xxx xxx xxx xxx xxx. I would have
thought the split() method that I used on original packet (ie
the "reading" variable) would have just returned an array with
nine elements like the packet has. This is not the case, and I
am at a loss about how to fix this.

I've searched the archive here and elsewhere with no luck. Any
help REALLY appreciated!

Wolf :)

 
Dennis Lee Bieber

print reading
print repr(reading) # to see the complete string (nonprintables included)
M 37 79 3 4 59 124 86 25
['59', '123', '87', '25', 'M', '37', '79', '3', '4', '59',
'124', '86', '25', 'M
']
M 38 77 3 2 59 124 86 25
['39', '85', '26', 'M', '38', '77', '3', '2', '59', '124', '86',
'25', 'M', '38'
, '7']
As near as I can figure out, pyserial is reading a stream of
data and helpfully rearranging it so that it fits the original

It doesn't do any "rearranging"... It is simply reading a stream of
characters. Looking at what you show, I suspect you are missing the
start of a stream of packetS and picking up in the middle -- and
collecting n-bytes of whatever arrives.

packet format M xxx xxx xxx xxx xxx xxx xxx xxx. I would have
thought the split() method that I used on original packet (ie
the "reading" variable) would have just returned an array with
nine elements like the packet has. This is not the case, and I
am at a loss about how to fix this.

You don't, from what I see, "know" that the read data is just one
"packet".

For example of what I mean, run (in a normal command line console):

-=-=-=-=-=-=-
Data1 = "Is this one line\nOr is this two lines"
Data2 = "Is this one line\rOr is this two lines"

print "Data1"
print Data1
print "Data2"
print Data2
print "end"
-=-=-=-=-=-=-
C:\DOCUME~1\DENNIS~1>python t.py
Data1
Is this one line
Or is this two lines
Data2
Or is this two lines
end

C:\DOCUME~1\DENNIS~1>

Note how the second one LOOKS like one line when "print"ed.
ALSO note that .split() will strip out the \r (carriage return, which
may be what your remote is sending at the end of each short packet).
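To make the point concrete, here is a small sketch in Python 3 syntax (the thread's own code is Python 2). The `reading` string is made up for illustration and assumes the camera ends each packet with a bare carriage return; it was not captured from a real CMUcam.

```python
# Hypothetical 40-character chunk: the tail of one packet, one whole
# packet, and the head of the next, each ended by '\r' (an assumption).
reading = "59 123 87 25\rM 37 79 3 4 59 124 86 25\rM "

# repr() shows the carriage returns that a plain print would hide.
print(repr(reading))

# split() with no argument treats '\r' as whitespace, so the packet
# boundaries vanish and everything merges into one flat list.
components = reading.split()
print(components)

# Splitting on '\r' instead keeps the packet boundaries visible.
print(reading.split("\r"))
```

The flat list from `split()` matches the shape of the first list in the original post, which is why indexing into it gives values from the wrong packet.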

--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
John Machin

Lone said:
I'm trying to get data through my serial port from a CMUcam.
This gizmo tracks a color and returns a packet of data. The
packet has nine data points (well, really eight since the first
point is just a packet header) separated by spaces as follows: M
xxx xxx xxx xxx xxx xxx xxx xxx

Here is the code I am using (python v24):

import serial

ser=serial.Serial('com1',baudrate=115200, bytesize=8,
parity='N', stopbits=1,xonxoff=0, timeout=1)

ser.write("PM 1") #This sets the CMUcam to poll mode

for i in range(0,100,1):
ser.write("TC 016 240 100 240 016 240\r\n")
reading = ser.read(40)

You are asking for 40 bytes of data. You will get 40 bytes of data.

However your packets are (presumably) variable length, (presumably)
terminated by CR and/or LF. What does the documentation for the device
tell you?
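If the packets do turn out to be terminated by a single CR, one alternative to a fixed 40-byte read is to read up to the terminator. The sketch below is in Python 3 syntax and is only an illustration: `read_packet`, its `max_len` limit, and the `'\r'` terminator are assumptions, not something taken from the device documentation. It relies on nothing but a `read(n)` method that returns bytes, so `io.BytesIO` stands in for the serial port here.

```python
import io

def read_packet(port, terminator=b"\r", max_len=64):
    """Read single bytes from `port` until `terminator` arrives.

    `port` is any object with a read(n) method that returns bytes.
    Returns the packet without the terminator, or None if a read
    came back empty (timeout or end of data) or max_len was reached.
    """
    buf = bytearray()
    while len(buf) < max_len:
        ch = port.read(1)
        if not ch:              # empty read: timeout or end of data
            return None
        if ch == terminator:    # single-byte terminator assumed
            return bytes(buf)
        buf += ch
    return None

# Stand-in for the serial port, preloaded with two made-up packets.
fake_port = io.BytesIO(b"M 37 79 3 4 59 124 86 25\rM 38 77 3 2 59 124 86 25\r")
print(read_packet(fake_port))
print(read_packet(fake_port))
print(read_packet(fake_port))   # nothing left, so this is None
```

Reading one byte at a time is slow but simple; it trades throughput for never straddling a packet boundary.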
print reading

What you see from the print statement is not necessarily what you've
got.
Change that to print repr(reading) and show us what you then see.

components = reading.split()
print components
ser.close

Here is an example output:

M 37 79 3 4 59 124 86 25
['59', '123', '87', '25', 'M', '37', '79', '3', '4', '59',
'124', '86', '25', 'M
']
M 38 77 3 2 59 124 86 25
['39', '85', '26', 'M', '38', '77', '3', '2', '59', '124', '86',
'25', 'M', '38'
, '7']

Let's try to reconstruct "reading":

| >>> a = ['59', '123', '87', '25', 'M', '37', '79', '3', '4', '59',
| ... '124', '86', '25', 'M']
| >>> astrg = ' '.join(a)
| >>> astrg
| '59 123 87 25 M 37 79 3 4 59 124 86 25 M'
| >>> len(astrg)
| 39 <<<<<==== ooh! almost 40!!
| >>> b = ['39', '85', '26', 'M', '38', '77', '3', '2', '59', '124', '86',
| ... '25', 'M', '38'
| ... , '7']
| >>> bstrg = ' '.join(b)
| >>> bstrg
| '39 85 26 M 38 77 3 2 59 124 86 25 M 38 7'
| >>> len(bstrg)
| 40 <<<<<==== ooh! exactly 40!!!

My guess: the device is pumping out packets faster than you can handle
them. So you are getting 40-byte snatches of bytes. A snatch is long
enough to cover a whole packet with possible fragments of packets at
each end. You will need to discard the fragments. If you need all the
data, you'd better get some help on how to implement flow control --
I've never used pyserial and I'm not going to read _all_ the docs for
you :)
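Discarding the fragments could look roughly like this (Python 3 syntax). It is a sketch under two assumptions that the thread has not yet confirmed: that packets end with `'\r'`, and that a complete packet always has nine space-separated items starting with `M`. The function name `complete_packets` is made up for the example.

```python
def complete_packets(chunk, terminator="\r", fields=9):
    """Return only the whole packets found in a chunk of text.

    A piece counts as a packet if it starts with 'M' and has exactly
    `fields` space-separated items; anything else is dropped.
    """
    packets = []
    pieces = chunk.split(terminator)
    # The last piece was not followed by a terminator, so it may be
    # cut short even if it looks plausible; skip it.
    for piece in pieces[:-1]:
        items = piece.split()
        if len(items) == fields and items[0] == "M":
            packets.append(items)
    return packets

# 40 characters, shaped like the second sample in the original post.
chunk = "39 85 26\rM 38 77 3 2 59 124 86 25\rM 38 7"
print(complete_packets(chunk))
```

This only salvages what is in each chunk; it does nothing about packets lost between reads.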

I'm very interested to see what print repr(reading) actually shows. I'm
strongly suspecting there is a CR (no LF) at the end of each packet; in
the two cases shown, this would cause the "print reading" to appear as
only one packet ... think about it: carriage return, with no linefeed,
would cause overwriting. It is a coincidence with those two samples
that the first part of the line doesn't appear strange, with a 4, 5, or
6-digit number showing up where the trailing fragment ends
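The overwriting effect can be simulated without a terminal. The sketch below (Python 3 syntax) uses a made-up helper, `render_line`, and two strings rebuilt from the split() results with `'\r'` placed where the packet boundaries are suspected to be; the exact position of those carriage returns is an assumption.

```python
def render_line(text):
    """Simulate how a terminal shows text with bare carriage returns.

    A carriage return moves the cursor to column 0 without clearing
    the line, so later characters overwrite earlier ones.
    """
    line = []
    col = 0
    for ch in text:
        if ch == "\r":
            col = 0
        elif col < len(line):
            line[col] = ch      # overwrite what is already there
            col += 1
        else:
            line.append(ch)     # extend the line
            col += 1
    return "".join(line)

first = "59 123 87 25\rM 37 79 3 4 59 124 86 25\rM"
second = "39 85 26\rM 38 77 3 2 59 124 86 25\rM 38 7"
print(render_line(first))
print(render_line(second))
```

Both lines come out as a single clean-looking packet, which is consistent with what "print reading" displayed.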

My problem is that I am trying to get each data point of the
packet into a separate variable. Ordinarily, this would be easy,
as I would just parse the packet, read the array and assign each
element to a variable eg. mx = components[1].

better would be:

mx, foo, bar, ......, eighth_vbl = components[start:start + 8]
once you have worked out what start should be, e.g. start =
components.index('M') + 1
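Spelled out on the first sample list, the idea might look like this (Python 3 syntax). The names `v2` through `v8` are placeholders, since the thread does not say what each field means, and the length check is an added safeguard for the case where the first 'M' begins a truncated packet.

```python
components = ['59', '123', '87', '25', 'M', '37', '79', '3', '4', '59',
              '124', '86', '25', 'M']

start = components.index('M') + 1       # first value after the header
values = components[start:start + 8]

if len(values) == 8 and 'M' not in values:
    # Placeholder names; the real meaning of each field has to come
    # from the CMUcam documentation.
    mx, v2, v3, v4, v5, v6, v7, v8 = values
    print(mx, v8)
else:
    print("the first 'M' starts an incomplete packet")
```

Note that `index('M')` raises ValueError if no header is present at all, so real code would need to handle that case too.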
However, that
doesn't work here because the original packet and the array that
I got from using the split() method are different. If I were to
try read the array created in the first example output, mx would
be 123 instead of 37 like it is in the packet. In the second
example, the array is 85 while the packet is 38.

As near as I can figure out, pyserial is reading a stream of
data and helpfully rearranging it so that it fits the original
packet format M xxx xxx xxx xxx xxx xxx xxx xxx.

How, if you've read the docstring for the Serial.read() method, did you
come to that conclusion?

pyserial knows nothing about your packet format.
I would have
thought the split() method that I used on original packet (ie
the "reading" variable) would have just returned an array with
nine elements like the packet has. This is not the case, and I
am at a loss about how to fix this.

I've searched the archive here and elsewhere with no luck. Any
help REALLY appreciated!

With a bit of repr() and a bit of RTFM, one can often manage without
help :)

Cheers,
John
 
Grant Edwards

import serial

ser=serial.Serial('com1',baudrate=115200, bytesize=8,
parity='N', stopbits=1,xonxoff=0, timeout=1)

ser.write("PM 1") #This sets the CMUcam to poll mode

for i in range(0,100,1):
ser.write("TC 016 240 100 240 016 240\r\n")
reading = ser.read(40)
print reading
components = reading.split()
print components
ser.close

Here is an example output:

M 37 79 3 4 59 124 86 25
['59', '123', '87', '25', 'M', '37', '79', '3', '4', '59',
'124', '86', '25', 'M
']
M 38 77 3 2 59 124 86 25
['39', '85', '26', 'M', '38', '77', '3', '2', '59', '124', '86',
'25', 'M', '38'
, '7']

My problem is that I am trying to get each data point of the
packet into a separate variable. Ordinarily, this would be
easy, as I would just parse the packet, read the array and
assign each element to a variable eg. mx = components[1].
However, that doesn't work here because the original packet
and the array that I got from using the split() method are
different.

I doubt it. Try printing `reading` instead of reading. I
suspect that the string you're getting from ser.read() has a
carriage return in it that you aren't seeing when you do print
reading.

If I were to try read the array created in the first example
output, mx would be 123 instead of 37 like it is in the
packet. In the second example, the array is 85 while the
packet is 38.

As near as I can figure out, pyserial is reading a stream of
data and helpfully rearranging it so that it fits the original
packet format M xxx xxx xxx xxx xxx xxx xxx xxx.

No, it isn't. I wrote the Posix low-level code that's in
pyserial. I've used pyserial extensively on both Windows and
Linux. It doesn't rearrange anything.

I would have thought the split() method that I used on
original packet (ie the "reading" variable) would have just
returned an array with nine elements like the packet has. This
is not the case, and I am at a loss about how to fix this.

When something odd seems to be happening with strings, always
print `whatever` rather than whatever
 
Si Ballenger

I'm trying to get data through my serial port from a CMUcam.
This gizmo tracks a color and returns a packet of data. The
packet has nine data points (well, really eight since the first
point is just a packet header) separated by spaces as follows: M
xxx xxx xxx xxx xxx xxx xxx xxx

Here is the code I am using (python v24):

import serial

ser=serial.Serial('com1',baudrate=115200, bytesize=8,
parity='N', stopbits=1,xonxoff=0, timeout=1)

ser.write("PM 1") #This sets the CMUcam to poll mode

for i in range(0,100,1):
ser.write("TC 016 240 100 240 016 240\r\n")
reading = ser.read(40)
print reading
components = reading.split()
print components
ser.close

In my dealing with serial gizmos I have to put a delay between
the request sent to the gizmo and the reading of the serial input
buffer for returned data. Serial ports and gizmos need some time
to do their thing.
 
Grant Edwards

In my dealing with serial gizmos I have to put a delay between
the request sent to the gizmo and the reading of the serial input
buffer for returned data. Serial ports and gizmos need some time
to do their thing.

I doubt that's the issue. He's reading with a 1-second timeout
value.
 
Si Ballenger

I doubt that's the issue. He's reading with a 1-second timeout
value.

I would think a time delay would be needed between the below two
lines in the code if he expects to get a useable data string back
from the gizmo for the command sent to it.

ser.write("TC 016 240 100 240 016 240\r\n")
reading = ser.read(40)
 
John Machin

Grant said:
When something odd seems to be happening with strings, always
print `whatever` rather than whatever

:)

Unholy perlism, Batman!

For the benefit of gentle readers who are newish and might not have
seen the ` character in Python code outside a string literal, or for
those who'd forgotten, there is a cure:

| >>> import re
| >>> re.sub(r"`(.*?)`", r"repr(\1)", "print `whatever`, `foo`, `bar`")
| 'print repr(whatever), repr(foo), repr(bar)'


:)
 
Grant Edwards

I would think a time delay would be needed between the below two
lines in the code if he expects to get a useable data string back
from the gizmo for the command sent to it.

ser.write("TC 016 240 100 240 016 240\r\n")
reading = ser.read(40)

No. A delay isn't needed as long as the device responds within
1 second. The read() call will wait up to 1 second for the
first byte of the response.
 
Fredrik Lundh

Si Ballenger wrote:

I would think a time delay would be needed between the below two
lines in the code if he expects to get a useable data string back
from the gizmo for the command sent to it.

ser.write("TC 016 240 100 240 016 240\r\n")
reading = ser.read(40)

why's that? if the gizmo is busy "doing its thing", read() will wait
for up to one second before giving up.

</F>
 
Si Ballenger

No. A delay isn't needed as long as the device responds within
1 second. The read() call will wait up to 1 second for the
first byte of the response.

Per what was posted (below), it appears that the appropriate
data is being received. It may be possible that the cam may be
sending in a mode that is not in alignment with the binary
transmission mode of the serial port. As a test I'd jumper
between the Tx and Rx pin on the serial port and then send out
the "M" line being received, then see if it will parse as
expected.

Here is an example output:

M 37 79 3 4 59 124 86 25
['59', '123', '87', '25', 'M', '37', '79', '3', '4', '59',
'124', '86', '25', 'M
']
M 38 77 3 2 59 124 86 25
['39', '85', '26', 'M', '38', '77', '3', '2', '59', '124', '86',
'25', 'M', '38'
, '7']
 
John Machin

Si said:
Per what was posted (below), it appears that the appropriate
data is being received. [snip]

Here is an example output:

M 37 79 3 4 59 124 86 25
['59', '123', '87', '25', 'M', '37', '79', '3', '4', '59',
'124', '86', '25', 'M
']
M 38 77 3 2 59 124 86 25
['39', '85', '26', 'M', '38', '77', '3', '2', '59', '124', '86',
'25', 'M', '38'
, '7']

Based on the split() results (presumably much more reliable than the
"print reading" results) what appears to me is:
fragment '59', '123', '87', '25'
packet 'M', '37', '79', '3', '4', '59', '124', '86', '25'
fragment 'M', '39' [see note]
fragment '85', '26'
packet 'M', '38', '77', '3', '2', '59', '124', '86', '25',
fragment 'M', '38'

[note] the 39 obviously aligns with the 37 and 38s, not with the 123
and 124s. However the boundary of the 2 split() results lies before the
39, not after. Puzzling.

In any case, I wouldn't call that "the appropriate data is being
received" -- looks like chunks missing to me.
 
Si Ballenger

In any case, I wouldn't call that "the appropriate data is being
received" -- looks like chunks missing to me.

Well, below is the posted expected return data format from the
cam and below that is what has been reported to be returned from
the cam when it is polled, which appears to be a fairly
reasonable match. I assume that each xxx is a decimal number
representing a single byte. In the binary mode each x in the
string might be considered a byte in itself and possibly evaluated
as such. Anyhow it should be easy to see if the received string
can be parsed on its own correctly when not being received via the
serial port. That would start to narrow down where something not
understood is coming into play.

M xxx xxx xxx xxx xxx xxx xxx xxx
M 37 79 3 4 59 124 86 25
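The off-line check described here can be done without any hardware. A minimal sketch in Python 3 syntax, where `parse_packet` is a made-up helper and the nine-item layout is taken from the packet format quoted in the thread:

```python
def parse_packet(line):
    """Parse one 'M xxx xxx ...' line into a list of eight ints.

    Raises ValueError if the header or the field count is wrong.
    """
    items = line.split()
    if len(items) != 9 or items[0] != "M":
        raise ValueError("not a complete M packet: %r" % line)
    return [int(x) for x in items[1:]]

print(parse_packet("M 37 79 3 4 59 124 86 25"))
# A trailing carriage return is harmless, since split() drops it.
print(parse_packet("M 37 79 3 4 59 124 86 25\r"))
```

A single well-formed line parses fine either way, which suggests the trouble lies in how the 40-byte reads cut across packets and not in the parsing itself.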
 
John Machin

Si said:
Well, below is the posted expected return data format from the
cam and below that is what has been reported to be returned from
the cam when it is polled, which appears to be a fairly
reasonable match. I assume that each xxx is a decimal number
representing a single byte. In the binary mode each x in the
string might be considered a byte in itself and possibly evaluated
as such. Anyhow it should be easy to see if the received string
can be parsed on its own correctly when not being received via the
serial port. That would start to narrow down where something not
understood is coming into play.

M xxx xxx xxx xxx xxx xxx xxx xxx
M 37 79 3 4 59 124 86 25

Try reading previous posts. The OP reported that to be returned from
the cam, based on print forty_bytes, not print repr(forty_bytes). I
think everybody (including possibly even the OP) is willing to believe
that the cam is *generating* correct parseable stuff, followed by '\r'
-- the problem now is how to get as many samples per second as is
reasonable in the face of problems like lack of buffering, flow
control, etc.
 
Grant Edwards

Try reading previous posts. The OP reported that to be returned from
the cam, based on print forty_bytes, not print repr(forty_bytes). I
think everybody (including possibly even the OP) is willing to believe
that the cam is *generating* correct parseable stuff, followed by '\r'
-- the problem now is how to get as many samples per second as is
reasonable in the face of problems like lack of buffering,

What lack of buffering?
 
