Perl / python regex / performance comparison

Ivan

Hello everyone,

I know this is not a direct python question, forgive me for that, but
maybe some of you will still be able to help me. I've been told that
for my application it would be best to learn a scripting language, so
I looked around and found perl and python to be the nicest. Their syntax
and "way" are not similar, though.
So, I was wondering, could any of you please elaborate on the
following, so as to ease my dilemma:

1. Although it is all relatively similar, there are differences
between regexes of these two. Which do you believe is the more
powerful variant (maybe an example) ?

2. They are both interpreted languages, and I can't really be sure how
they measure in speed. In your opinion, for handling large files,
which is better ?
(I'm processing files of numerical data of several hundred mb - let's
say 200mb - how would python handle file of such size ? As compared to
perl ?)

3. This last one is somewhat subjective, but what do you think, in the
future, which will be more useful. Which, in your (humble) opinion
"has a future" ?

Thank you for all the info you can spare; I am especially grateful for
the time you take in doing so.
-- Ivan
 
Ciprian Dorin, Craciun

I can answer your second question (whether Python will handle large
files). In my case I use Python to create statistics from some trace
files from a genetic algorithm, and my current data size is up to 20MB
for about 40 files. I do the following:
* use regular expressions to identify each line type and extract the
information (as numbers);
* either create statistics on the fly, or load the dumped data
into an Sqlite3 database (which got up to a couple of hundred MB);
* everything has worked fine so far.
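
A minimal sketch of the workflow in the list above. The trace format, the pattern names, and the table layout are all assumptions made up for illustration; the real trace files are not shown in this thread.

```python
import re
import sqlite3

# Hypothetical line formats (assumption): one regex per line type.
PATTERNS = {
    "fitness": re.compile(r"^gen=(\d+) best=([-+]?\d+(?:\.\d+)?)$"),
    "mutation": re.compile(r"^gen=(\d+) mutations=(\d+)$"),
}

def classify(line):
    """Return (line_type, numbers) for a recognised line, else None."""
    text = line.strip()
    for line_type, pattern in PATTERNS.items():
        match = pattern.match(text)
        if match:
            return line_type, tuple(float(g) for g in match.groups())
    return None

def load_into_sqlite(lines, conn):
    """Insert every recognised line into a single 'trace' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trace (kind TEXT, gen REAL, value REAL)"
    )
    # A generator keeps only one row in memory at a time.
    rows = (
        (kind, nums[0], nums[1])
        for kind, nums in filter(None, map(classify, lines))
    )
    conn.executemany("INSERT INTO trace VALUES (?, ?, ?)", rows)
    conn.commit()

sample = ["gen=1 best=0.25", "gen=1 mutations=7", "noise", "gen=2 best=0.75"]
conn = sqlite3.connect(":memory:")
load_into_sqlite(sample, conn)
avg_best = conn.execute(
    "SELECT AVG(value) FROM trace WHERE kind = 'fitness'"
).fetchone()[0]
print(avg_best)
```

For real files, `lines` could simply be an open file object, so the file is read one line at a time rather than loaded whole.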

I've also used Python (or rather, an application built in Python
with cElementTree?) that took the Wikipedia XML dumps (7GB? I'm not
sure, but a couple of GB) and created a custom format file, from
which I've tried to create SQL inserts... And everything worked well.
(Of course it took some time to do all the processing.)
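
For readers who want to try something similar, here is a rough sketch of streaming a large XML file with the standard library's `xml.etree.ElementTree.iterparse` (in current Python versions the separate `cElementTree` module is gone and `ElementTree` uses the C implementation when available). The `<page>`/`<title>` layout below is a simplified assumption; real Wikipedia dumps use XML namespaces and more fields, and this is not the application described above.

```python
import io
import xml.etree.ElementTree as ET

def iter_titles(source):
    """Yield the <title> text of each <page> without building the whole tree."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "page":
            yield elem.findtext("title")
            # Drop the page's children so memory use stays roughly constant.
            elem.clear()

# Tiny in-memory stand-in for a multi-gigabyte dump file (assumed layout).
sample = io.BytesIO(
    b"<mediawiki>"
    b"<page><title>A</title><text>x</text></page>"
    b"<page><title>B</title><text>y</text></page>"
    b"</mediawiki>"
)
titles = list(iter_titles(sample))
print(titles)
```

Passing a file path or an open binary file to `iter_titles` works the same way as the in-memory sample.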

So my conclusion is that if you keep your in-memory data
small and use the right solution for the problem, you can
use Python without (big) overhead.
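
One way to keep in-memory data small for a large numeric file is to compute statistics in a single pass. The sketch below uses Welford's running mean/variance update; the whitespace-separated file format and the function name `running_stats` are assumptions for illustration only.

```python
import io
import math

def running_stats(lines):
    """Return (count, mean, sample standard deviation) in one pass.

    Only three numbers are kept in memory, however large the input is.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for line in lines:
        for token in line.split():
            x = float(token)
            count += 1
            delta = x - mean
            mean += delta / count
            m2 += delta * (x - mean)
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, math.sqrt(variance)

# A small in-memory stand-in for a file of several hundred MB.
sample = io.StringIO("1 2 3\n4 5\n")
count, mean, std = running_stats(sample)
print(count, mean, std)
```

With a real file, `running_stats(open(path))` reads one line at a time, so a 200MB input never has to fit in memory at once.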

As another side note, I've also used Python (with NumPy) to implement
neural networks (in fact clustering with ART), where I had about 20
thousand training elements (arrays of thousands of elements), and it
worked remarkably well (I would say better than in Java, and comparable
with C/C++).

I hope I've helped you,
Ciprian Craciun.

P.S. If you just need a single regular-expression transformation,
or a regular-expression search, then just use sed
or grep, as you will not get anything better than them.
 
Terry Reedy

Ivan said:
Hello everyone,

I know this is not a direct python question, forgive me for that, but
maybe some of you will still be able to help me. I've been told that
for my application it would be best to learn a scripting language, so
I looked around and found perl and python to be the nice. Their syntax
and "way" is not similar, though.
So, I was wondering, could any of you please elaborate on the
following, as to ease my dilemma:

Which way are *you* more comfortable with? There are people who
regularly use both, and many who do not.

1. Although it is all relatively similar, there are differences
between regexes of these two. Which do you believe is the more
powerful variant (maybe an example) ?

This is not relevant to your application below. In any case, the
differences are in rather esoteric details.

2. They are both interpreted languages, and I can't really be sure how
they measure in speed. In your opinion, for handling large files,
which is better ?
(I'm processing files of numerical data of several hundred mb - let's
say 200mb - how would python handle file of such size ? As compared to
perl ?)

For one file and simple processing, the time difference should be less
than the time you spent asking the question. For complex processing or
multiple files, a Python user might use numpy, scipy, or other
pre-written analysis extensions.

3. This last one is somewhat subjective, but what do you think, in the
future, which will be more useful. Which, in your (humble) opinion
"has a future" ?

Python ;-) at least for me.

Terry Jan Reedy
 
