Joshua Landau
I don't know what you mean by that, but since the joke appears to have
flown over your head, I'll explain it. Steven's "pos" was clearly
mea
What? I don't understand.
<Unjustified Insult>. [anumuson from Stack Overflow] has deleted all
my postings regarding Python regular expression matching being
extremely slow compared to Perl. Additionally my account has been
suspended for 7 days. <Unjustified Insult>.
Not a troll. It's just hard to convince Python users that their beloved
language would have inferior regular expression performance to Perl.
Steven D'Aprano said:That's by design. We don't want to make the same mistake as Perl, where
every problem is solved by a regular expression:
http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-prime-numbers/
so we deliberately make regexes as slow as possible so that programmers
will look for a better way to solve their problem. If you check the
source code for the re engine, you'll find that for certain regexes, it
busy-waits for anything up to 30 seconds at a time, deliberately
wasting
cycles.
The same with Unicode. We hate French people, you see, and so in an
effort to drive everyone back to ASCII-only text, Python 3.3 introduces
some memory optimizations that ensure that Unicode strings work
correctly and are up to four times smaller than they used to be. You
should get together with jmfauth, who has discovered our dastardly plot
and keeps posting benchmarks showing how on carefully contrived micro-
benchmarks using a beta version of Python 3.3, non-ASCII string
operations can be marginally slower than in 3.2.
dickwad.
I cannot imagine why he would have done that.
Op 10-07-13 11:03, Mats Peterson schreef:
All right, you have convinced me. Now what? Why should I care?
Python Secret Underground.
Antoon Pardon said:Op 10-07-13 11:03, Mats Peterson schreef:
All right, you have convinced me. Now what? Why should I care?
Joshua Landau said:<Unjustified Insult>. [anumuson from Stack Overflow] has deleted all
my postings regarding Python regular expression matching being
extremely slow compared to Perl. Additionally my account has been
suspended for 7 days. <Unjustified Insult>.
Whilst I don't normally respond to trolls, I'm actually curious.
Do you have any non-trivial, properly benchmarked real-world examples
that this affects, remembering to use full Unicode support in Perl (as
Python has it by default)?
Remember to try on both major CPython versions, and PyPy -- all of
which are in large-scale usage. Remember not just to use the builtin
re module, as most people also use https://pypi.python.org/pypi/regex
and https://code.google.com/p/re2/ when they are appropriate, so
pathological cases for re aren't actually a concern anyone cares
about.
If you actually can satisfy these basic standards for a comparison (as
I'm sure any competent person with so much bravado could) I'd be willing
to converse with you. I'd like to see these results where Python compares
as "extremely slow". Note that, by your own wording, a 30% drop is irrelevant.
Steven D'Aprano said:That's by design. We don't want to make the same mistake as Perl, where
every problem is solved by a regular expression:
http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-prime-numbers/
so we deliberately make regexes as slow as possible so that programmers
will look for a better way to solve their problem. If you check the
source code for the re engine, you'll find that for certain regexes, it
busy-waits for anything up to 30 seconds at a time, deliberately wasting
cycles.
The same with Unicode. We hate French people, you see, and so in an
effort to drive everyone back to ASCII-only text, Python 3.3 introduces
some memory optimizations that ensure that Unicode strings work
correctly and are up to four times smaller than they used to be. You
should get together with jmfauth, who has discovered our dastardly plot
and keeps posting benchmarks showing how on carefully contrived micro-
benchmarks using a beta version of Python 3.3, non-ASCII string
operations can be marginally slower than in 3.2.
I cannot imagine why he would have done that.
Right. Why should you. And who cares about you?
Chris Angelico said:You do? And you haven't noticed the inferior performance of regular
expressions in Python compared to Perl? Then you obviously haven't
used them a lot.
That would be correct. Why have I not used them all that much? Because
Python has way better ways of doing many things. Regexps are
notoriously hard to debug, largely because a nonmatching regex can't
give much information about _where_ it failed to match, and when I
parse strings, it's more often with (s)scanf notation instead - stuff
like this (Pike example as Python doesn't, afaik, have scanf support):
(5) Result: 42data="Hello, world! I am number 42.";
sscanf(data,"Hello, %s! I am number %d.",foo,x); (3) Result: 2
foo; (4) Result: "world"
x;
Or a more complicated example:
sscanf(Stdio.File("/proc/meminfo")->read(),"%{%s: %d%*s\n%}",array data);
mapping meminfo=(mapping)data;
That builds up a mapping (Pike terminology for what Python calls a
dict) with the important information out of /proc/meminfo, something
like this:
([
"MemTotal": 2026144,
"MemFree": 627652,
"Buffers": 183572,
"Cached": 380724,
..... etc etc
])
So, no. I haven't figured out that Perl's regular expressions
outperform Python's or Pike's or SciTE's, because I simply don't need
them all that much. With sscanf, I can at least get a partial match,
which tells me where to look for the problem.
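Python has no direct equivalent of that Pike one-liner, but for comparison, here is a short sketch (my own, not from the thread) that builds the same kind of dict from /proc/meminfo text; the sample input and the helper name are illustrative only.

```python
import re

def parse_meminfo(text):
    """Build a dict of {field: kB value} from /proc/meminfo-style text.

    Fields with punctuation in their names, e.g. "Active(anon)", would
    need a broader pattern than \\w+; this sketch keeps it simple.
    """
    info = {}
    for match in re.finditer(r'^(\w+):\s+(\d+)', text, re.MULTILINE):
        info[match.group(1)] = int(match.group(2))
    return info

sample = (
    "MemTotal:        2026144 kB\n"
    "MemFree:          627652 kB\n"
)
print(parse_meminfo(sample))  # {'MemTotal': 2026144, 'MemFree': 627652}
```

A split-based loop over text.splitlines() would work just as well here; the point either way is that each line is parsed independently, so a malformed line is easy to locate.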
ChrisA
You're obviously trying hard to be funny. It fails miserably.
Or the Dutch.
Or us Brits.
Or us Brits.
Ahhh.... so this is pos, right? Telling the truth? Interesting.