E
Edward Elliott
This is just anecdotal, but I still find it interesting. Take it for what
it's worth. I'm interested in hearing others' perspectives, just please
don't turn this into a pissing contest.
I'm in the process of converting some old perl programs to python. These
programs use some network code and do a lot of list/dict data processing.
The old ones work fine but are a pain to extend. After two conversions,
the python versions are noticeably shorter.
The first program does some http retrieval, sort of a poor-man's wget with
some extra features. In fact it could be written as a bash script with
wget, but the extra processing would make it very messy. Here are the
numbers on the two versions:
Raw -Blanks -Comments
lines chars lines chars lines chars
mirror.py 167 4632 132 4597 118 4009
mirror.pl 309 5836 211 5647 184 4790
I've listed line and character counts for three forms. Raw is the source
file as-is. -Blanks is the source with blank lines removed, including
lines with just a brace. -Comments removes both blanks and comment lines.
I think -Blanks is the better measure because comments are a function of
code complexity, but either works.
By the numbers, the python code appears roughly 60% as long by line and 80%
as long by characters. The chars percentage being (higher relative to line
count) doesn't surprise me since things like list comprehensions and
explicit module calling produce lengthy but readable lines.
I should point out this wasn't a straight line-for-line conversion, but the
basic code structure is extremely similar. I did make a number of
improvements in the Python version with stricter arg checks and better
error handling, plus added a couple minor new features.
The second program is an smtp outbound filtering proxy. Same categories as
before:
Raw -Blanks -Comments
lines chars lines chars lines chars
smtp-proxy.py 261 7788 222 7749 205 6964
smtp-proxy.pl 966 24110 660 23469 452 14869
The numbers here look much more impressive but it's not a fair comparison.
I wasn't happy with any of the cpan libraries for smtp sending at the time
so I rolled my own. That accounts for 150 raw lines of difference. Another
70 raw lines are logging functions that the python version does with the
standard library. The new version performs the same algorithms and data
manipulations as the original. I did do some major refactoring along the
way, but it wasn't the sort that greatly reduces line count by eliminating
redundancy; there is very little redundancy in either version. In any
case, these factors alone don't account for the entire difference, even if
you take 220 raw lines directly off the latter columns.
The two versions were written about 5 years apart, all by me. At the time
of each, I had about 3 years experience in the given language and would
classify my skill level in it as midway between intermediate and advanced.
IOW I'm very comfortable with the language and library reference docs (minus
a few odd corners), but generally draw the line at mucking with interpreter
internals like symbol tables.
I'd like to here from others what their experience converting between perl
and python is (either direction). I don't have the sense that either
language is particularly better suited for my problem domain than the
other, as they both handle network io and list/dict processing very well.
What are the differences like in other domains? Do you attribute those
differences to the language, the library, the programmer, or other
factors? What are the consistent differences across space and time, if
any? I'm interested in properties of the code itself, not performance.
And just what is the question to the ultimate answer to life, the universe,
and everything anyway?
it's worth. I'm interested in hearing others' perspectives, just please
don't turn this into a pissing contest.
I'm in the process of converting some old perl programs to python. These
programs use some network code and do a lot of list/dict data processing.
The old ones work fine but are a pain to extend. After two conversions,
the python versions are noticeably shorter.
The first program does some http retrieval, sort of a poor-man's wget with
some extra features. In fact it could be written as a bash script with
wget, but the extra processing would make it very messy. Here are the
numbers on the two versions:
Raw -Blanks -Comments
lines chars lines chars lines chars
mirror.py 167 4632 132 4597 118 4009
mirror.pl 309 5836 211 5647 184 4790
I've listed line and character counts for three forms. Raw is the source
file as-is. -Blanks is the source with blank lines removed, including
lines with just a brace. -Comments removes both blanks and comment lines.
I think -Blanks is the better measure because comments are a function of
code complexity, but either works.
By the numbers, the python code appears roughly 60% as long by line and 80%
as long by characters. The chars percentage being (higher relative to line
count) doesn't surprise me since things like list comprehensions and
explicit module calling produce lengthy but readable lines.
I should point out this wasn't a straight line-for-line conversion, but the
basic code structure is extremely similar. I did make a number of
improvements in the Python version with stricter arg checks and better
error handling, plus added a couple minor new features.
The second program is an smtp outbound filtering proxy. Same categories as
before:
Raw -Blanks -Comments
lines chars lines chars lines chars
smtp-proxy.py 261 7788 222 7749 205 6964
smtp-proxy.pl 966 24110 660 23469 452 14869
The numbers here look much more impressive but it's not a fair comparison.
I wasn't happy with any of the cpan libraries for smtp sending at the time
so I rolled my own. That accounts for 150 raw lines of difference. Another
70 raw lines are logging functions that the python version does with the
standard library. The new version performs the same algorithms and data
manipulations as the original. I did do some major refactoring along the
way, but it wasn't the sort that greatly reduces line count by eliminating
redundancy; there is very little redundancy in either version. In any
case, these factors alone don't account for the entire difference, even if
you take 220 raw lines directly off the latter columns.
The two versions were written about 5 years apart, all by me. At the time
of each, I had about 3 years experience in the given language and would
classify my skill level in it as midway between intermediate and advanced.
IOW I'm very comfortable with the language and library reference docs (minus
a few odd corners), but generally draw the line at mucking with interpreter
internals like symbol tables.
I'd like to here from others what their experience converting between perl
and python is (either direction). I don't have the sense that either
language is particularly better suited for my problem domain than the
other, as they both handle network io and list/dict processing very well.
What are the differences like in other domains? Do you attribute those
differences to the language, the library, the programmer, or other
factors? What are the consistent differences across space and time, if
any? I'm interested in properties of the code itself, not performance.
And just what is the question to the ultimate answer to life, the universe,
and everything anyway?