Emanuele D'Arrigo
Sorry for the previous post, hit the Enter button by mistake... here's
the complete one:
Hi everybody!
I've written the code below to test the differences in performance
between compiled and non-compiled regular expression matching but I
don't quite understand the results. It appears that the compiled
pattern only takes 2% less time to process the match. Is there some
caching going on in the uncompiled section that prevents me from
noticing its otherwise lower speed?
Manu
------------
import re
import time

## Setup
pattern = "<a>(.*)</a>"
compiledPattern = re.compile(pattern)
longMessage = "<a>" + "a" * 100000 + "</a>"
numberOfRuns = 1000

## TIMED FUNCTIONS
# time.clock() was removed in Python 3.8; time.perf_counter() replaces it here.
startTime = time.perf_counter()
for i in range(0, numberOfRuns):
    # Module-level call: the pattern string is passed in on every iteration.
    re.match(pattern, longMessage)
patternMatchingTime = time.perf_counter() - startTime

startTime = time.perf_counter()
for i in range(0, numberOfRuns):
    # Method call on the pattern object compiled once in the setup.
    compiledPattern.match(longMessage)
compiledPatternMatchingTime = time.perf_counter() - startTime

ratioCompiledToNot = compiledPatternMatchingTime / patternMatchingTime

## PRINT OUTS
print("")
print("           Pattern Matching Time: " + str(patternMatchingTime))
print("(Compiled) Pattern Matching Time: " + str(compiledPatternMatchingTime))
print("")
print("Ratio Compiled/NotCompiled: " + str(ratioCompiledToNot))
print("")
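
One way to probe the caching question is sketched below. It relies on two documented facts about the re module: the module-level functions such as re.match() cache recently compiled patterns, and re.purge() clears that cache. Everything else is an assumption made for illustration: the short test string (chosen so that per-call overhead is not swamped by scanning 100000 characters), the run counts, and the helper names time_module_level, time_precompiled and time_uncached, which are hypothetical. No timings are quoted because they will differ from machine to machine.

```python
import re
import timeit

PATTERN = "<a>(.*)</a>"
COMPILED = re.compile(PATTERN)
# Assumption: a short subject string, so the cost of finding or compiling
# the pattern is visible next to the cost of the match itself.
SHORT_MESSAGE = "<a>aaa</a>"


def time_module_level(runs=10000):
    """Time re.match(), which may reuse a cached compiled pattern."""
    return timeit.timeit(lambda: re.match(PATTERN, SHORT_MESSAGE), number=runs)


def time_precompiled(runs=10000):
    """Time the match() method of a pattern object compiled once up front."""
    return timeit.timeit(lambda: COMPILED.match(SHORT_MESSAGE), number=runs)


def time_uncached(runs=200):
    """Time re.match() with the cache cleared before every call."""
    def run():
        re.purge()  # empty the module's pattern cache, forcing a recompile
        re.match(PATTERN, SHORT_MESSAGE)
    return timeit.timeit(run, number=runs)


if __name__ == "__main__":
    print("module-level (cache allowed): " + str(time_module_level()))
    print("precompiled pattern object:   " + str(time_precompiled()))
    print("module-level (cache purged):  " + str(time_uncached()))
```

Note that the purged variant runs fewer iterations by default, so its total should be divided by the run count before comparing it with the other two.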