Bulletproof parsing of numeric values from an NMEA data stream

Doug Gray

Folks,
I am looking for a fast but, most importantly, bulletproof method to parse
ASCII numeric strings from an NMEA data stream (GPS output). The best I can
offer is:

    def fint(a):
        try: return int(float(a))
        except: return 0

The reason for this is that the quality of the data varies considerably
across the huge variety of GPS units available. Some units do not follow the
standard, and I want to parse the data as best I can without hanging the
code on an oddball data value.

Can anyone suggest better?

For example, each of the following raises an exception and so does not
return the correct value:

int('00.')
int(' 00.')
float('- 00')
float(' - 00')
float(' - 00')
float(' - 00.')
float('- 00.')
float('- 10.')
float('- 10.')
float('- 10.')
int('- 10.')
int('- 10.')
float('- 10.')
int('1.0')

Also, why should I consider the string module? Is it faster/better?

TIA,
Doug
 
Steve Holden

Try something like

    def fint(s):
        return float(s.replace(" ", ""))

I really don't think it's a good idea to silently ignore conversion
errors in GPS positioning.

regards
Steve
 
Steven D'Aprano

Doug said:
I am looking for a fast but, most importantly, bulletproof method to parse
ASCII numeric strings from an NMEA data stream (GPS output). The best I can
offer is:

    def fint(a):
        try: return int(float(a))
        except: return 0


Will your application calculate the wrong results if it starts getting a
whole lot of spurious zeroes? Wouldn't it be better to signal "this value is
invalid" rather than return a false zero?
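One way to signal an invalid value, sketched here under the assumption that None is an acceptable "no reading" marker for the calling code (the name parse_field is made up for illustration):

```python
def parse_field(text):
    """Return the field as a float, or None if it cannot be converted."""
    try:
        # Drop all whitespace first, since stray spaces are the common fault.
        return float("".join(text.split()))
    except ValueError:
        return None


# Hypothetical sample fields, not taken from a real GPS unit.
readings = ["12.5", " - 10.", "N/A", ""]
parsed = [parse_field(r) for r in readings]

# The caller decides what to do with the gaps, e.g. drop them.
valid = [v for v in parsed if v is not None]
```

Unlike returning 0, a None cannot be mistaken for a genuine reading, and any arithmetic attempted on it fails immediately instead of quietly producing a wrong position.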

Do you actually want ints? It seems to me that if your data stream is
delivering floats, you're throwing away a lot of data. For example, if the
data stream is:

2.4, 5.7, 3.9, 5.1, ...

you're getting:

2, 5, 3, 5, ...

which is possibly not even the right way to convert to ints. Shouldn't you
be rounding to nearest (i.e. 2, 6, 4, 5, ...)?
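The difference between the two conversions can be checked directly on the sample values above (a small illustration, not part of the original code):

```python
data = [2.4, 5.7, 3.9, 5.1]

# int() simply drops the fractional part (truncates toward zero).
truncated = [int(x) for x in data]

# round() picks the nearest whole number; int() turns the result into an int.
rounded = [int(round(x)) for x in data]

print(truncated)  # [2, 5, 3, 5]
print(rounded)    # [2, 6, 4, 5]
```

Note that truncation goes toward zero, so int(-2.7) gives -2, and that in Python 3 round() sends exact halves to the nearest even number (round(2.5) is 2).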

[snip]
For example, each of the following throw the exception so do not return
the correct value:
[snip examples]

All your examples include spurious whitespace. If that is the only
problem, here's a simple fix:

    def despace(s):
        """Remove whitespace from string s."""
        return

    def fix_data(value):
        """Fix a GPS value string and return as a float."""
        return float(''.join(value.split()))


If only a few values are malformed, you might find this is faster:

    def fix_data2(value):
        try:
            return float(value)
        except ValueError:
            return float(''.join(value.split()))

Only measurement with actual realistic data will tell you which is faster.
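A rough way to take that measurement is the standard timeit module. The sketch below uses a made-up sample list with one malformed value in four; real captured data should be substituted, since the ratio of good to bad values decides the outcome. The function names are placeholders for the two approaches above.

```python
import timeit


def fix_always(value):
    """Always strip whitespace before converting."""
    return float("".join(value.split()))


def fix_on_error(value):
    """Try the plain conversion first, strip whitespace only on failure."""
    try:
        return float(value)
    except ValueError:
        return float("".join(value.split()))


# Hypothetical sample: mostly clean values with an occasional malformed one.
sample = ["12.5", "003.2", "-7.25", "- 10."] * 250


def run(func):
    return [func(v) for v in sample]


t_always = timeit.timeit(lambda: run(fix_always), number=20)
t_on_error = timeit.timeit(lambda: run(fix_on_error), number=20)
print("always strip:   %.4f s" % t_always)
print("strip on error: %.4f s" % t_on_error)
```

The absolute numbers depend on the machine and the Python version, so only the comparison between the two lines is meaningful.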

If you expect to get random non-numeric characters, then here's another
solution:

    import string
    # initialize some global data
    table = string.maketrans("", "")  # all 8 bit characters
    keep = "1234567890.-+"
    dontkeep = ''.join([c for c in table if c not in keep])

    def fix_data3(value):
        try:  # a fast conversion first
            return float(value)
        except ValueError:  # fall-back conversion
            return float(string.translate(value, table, dontkeep))

Once you've built the character tables, the translate function itself is
executed in C and is very fast.
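The code above relies on string.maketrans and string.translate, which exist only in Python 2. On Python 3 the same idea might look like the sketch below, using str.maketrans and the str.translate method; the name fix_data3_py3 is invented here to keep it apart from the original.

```python
KEEP = "1234567890.-+"

# Every 8-bit character that is not part of a number gets deleted.
DELETE = "".join(chr(i) for i in range(256) if chr(i) not in KEEP)

# With three arguments, str.maketrans maps each character of the third
# argument to None, which str.translate treats as "delete".
TABLE = str.maketrans("", "", DELETE)


def fix_data3_py3(value):
    try:  # a fast conversion first
        return float(value)
    except ValueError:  # fall-back conversion
        return float(value.translate(TABLE))
```

This still raises ValueError when nothing numeric survives the clean-up (for example "abc" becomes an empty string), so the caller has to decide how to handle that case.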

Also, why should I consider the string module? Is it faster/better?

Most of the time you should use string methods, e.g.:

"hello world".upper()

instead of

string.upper("hello world")

The only time you should use the string module is when you need one of the
functions (or data objects) that don't exist as string methods (e.g.
translate).
 
Steven D'Aprano

All your examples include spurious whitespace. If that is the only
problem, here's a simple fix:

    def despace(s):
        """Remove whitespace from string s."""
        return

Gah! Ignore that stub. I forgot to delete it :(


While I'm at it, here's another solution: simply skip invalid values,
using a pair of iterators, one to collect raw values from the device and
one to strip out the invalid results.

    def raw_data():
        """Generator to collect raw data and pass it on."""
        while True:
            # grab a single value
            value = grab_data_value_from_somewhere()
            if value == "":  # Some special "END TRANSMISSION" value.
                return
            yield value

    def converter(stream):
        """Generator to strip out values that can't be converted to float."""
        for value in stream:
            try:
                yield float(value)
            except ValueError:
                pass

    values_safe_to_use = converter(raw_data())

    for value in values_safe_to_use:
        print(value)


Naturally you can extend the converter to try harder to convert the string
to a float before giving up.
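As one possible extension along those lines, the converter could retry with all whitespace removed before skipping a value. This is only a sketch; tolerant_converter is an invented name and the input list stands in for the real stream.

```python
def tolerant_converter(stream):
    """Yield floats, retrying without whitespace before skipping a value."""
    for value in stream:
        try:
            number = float(value)
        except ValueError:
            try:
                # Second attempt: remove every whitespace character.
                number = float("".join(value.split()))
            except ValueError:
                continue  # still not a number, so skip it
        yield number


# Hypothetical input standing in for the raw device stream.
cleaned = list(tolerant_converter(["1.5", "- 10.", "bogus", " 00."]))
print(cleaned)  # [1.5, -10.0, 0.0]
```

The yield sits outside the try blocks so that only conversion errors are swallowed, not exceptions raised by whoever consumes the generator.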
 
Doug Gray

Thanks, a very helpful response. I'll need some time to fully digest it.
Yes, I will need a float variant; the int version was by way of example. I
can deal with the rounding etc. as necessary, but I was after a Pythonic
view of the generic problem.

Re the examples: whitespace and mispositioned signs and decimal points
would be the obvious errors I might expect, but of course this is
speculation. The few GPS units I have tried have all tripped up my
first-cut, less tolerant code. I am going to revise the interface to work
around potential problems, and my preliminary test efforts highlighted more
exceptions than I expected.

Doug
 
