Kamilche
I was looking for a way to speed up detecting invalid characters in my
TCP string, and thought of yet another use for the translate function!
If you were to 'translate out' the bad characters, and compare string
lengths afterwards, you would know whether or not the line contained
invalid characters. The new method is more than 10x faster than the
standard 'if char in string' test! So - here's the code plus sample
timings:
'''
Translate Speed Test
This code looks for invalid characters in a string,
and raises an exception when it finds one.
I'm testing 2 methods: the 'if char in string' method,
and one based on using the 'translate' function and
comparing string lengths afterwards.
Wow, what a difference! Translate is over 10x faster!
Function     Loops    Seconds    Loops/sec
***********************************************
In           10000      0.171        58479
Translate    10000      0.016       624998
'''
# Python 2 code: it relies on the print statement and on the
# two-argument form str.translate(table, deletechars).
import mytime   # not in the standard library; apparently the poster's own timer module
import string

_allchars = None
_deletechars = None
_validchars = string.ascii_letters + string.digits + \
              "!@#$%^&*()`~-_=+[{]}\\|;:\'\",<.>/?\t "

def init():
    # Build the identity translation table (_allchars) and the
    # string of every character that is NOT in the whitelist.
    global _allchars, _deletechars
    l = []
    a = []
    for i in range(256):
        a.append(chr(i))
        if not chr(i) in _validchars:
            l.append(chr(i))
    _deletechars = ''.join(l)
    _allchars = ''.join(a)

def test():
    max = 10000
    tmr = mytime.Timer()
    r = range(max)
    s = "This is a string to test for invalid characters."
    print tmr.heading

    # Method 1: test each character with 'in'.
    tmr.startit()
    for i in r:
        for c in s:
            if c in _deletechars:
                raise Exception("Invalid character found!")
    tmr.stopit(max)
    print tmr.results('In')

    # Method 2: delete the invalid characters, then compare lengths.
    tmr.startit()
    for i in r:
        s2 = s.translate(_allchars, _deletechars)
        if len(s2) != len(s):
            raise Exception("Invalid character found!")
    tmr.stopit(max)
    print tmr.results('Translate')

init()
if __name__ == "__main__":
    test()
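
For anyone on Python 3, here is a rough sketch of the same two checks. It is not the code from the post. It assumes the TCP data arrives as bytes, uses bytes.translate(None, delete) in place of the old two-argument str.translate, and uses the standard timeit module in place of mytime. The names VALID, INVALID, has_invalid_in and has_invalid_translate are made up for this example, and timings will vary by machine and Python version.

```python
import string
import timeit

# Same whitelist as in the post, encoded to bytes.
VALID = (string.ascii_letters + string.digits +
         "!@#$%^&*()`~-_=+[{]}\\|;:'\",<.>/?\t ").encode("ascii")

# Every byte value 0..255 that is not in the whitelist.
INVALID = bytes(b for b in range(256) if b not in VALID)


def has_invalid_in(data: bytes) -> bool:
    """Per-byte membership test (the 'in' method)."""
    for b in data:
        if b in INVALID:
            return True
    return False


def has_invalid_translate(data: bytes) -> bool:
    """Delete the invalid bytes, then compare lengths (the 'translate' method)."""
    return len(data.translate(None, INVALID)) != len(data)


if __name__ == "__main__":
    sample = b"This is a string to test for invalid characters."
    for fn in (has_invalid_in, has_invalid_translate):
        secs = timeit.timeit(lambda: fn(sample), number=10000)
        print(f"{fn.__name__:24s} {secs:.3f} s")
```

Both functions return True when the data contains at least one byte outside the whitelist. Raising an exception, as the original does, is a one-line change at the call site.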