Python vs Ruby


Kent Johnson

Bryan said:
I would not say Sion's ratio of 5:1 is dubious. For what it's worth,
I've written a pretty complex program in Jython over the last year.
Jython compiles to Java source code, and the ratio of generated Java
lines to Jython lines is 4:1.

Ugh. The code generated by jythonc is *nothing like* the code you would write by hand to do the same thing. This is a meaningless comparison.

Kent
 

Ed Jensen

Bryan said:
I would not say Sion's ratio of 5:1 is dubious. For what it's worth, I've
written a pretty complex program in Jython over the last year. Jython compiles
to Java source code, and the ratio of generated Java lines to Jython lines
is 4:1.

Most code generators are known to produce non-optimal code.
 

Steven D'Aprano

I suspect it is considerably less than that, although it depends on the
specific code being written.

Perl is more like a CISC CPU. There are a million different commands.
Python is more RISC-like.
Line count comparisons = pointless.


Not so.

Every line = more labour for the developer = more cost and time.
Every line = more places for bugs to exist = more cost and time.

I find it sometimes helps to imagine extreme cases. Suppose somebody comes
to you and says "Hi, I want you to develop a web scraping application to
run on my custom hardware." You look at the project specifications and
realise that the hardware has no OS, no TCP/IP, no file manager, no
compiler. So you have to quote the potential customer on writing all these
layers of software, potentially tens of millions of lines of code.
Even porting an existing OS to the new hardware is not an insignificant
job. Think how much time and money it would take.

On the other extreme, the client comes to you and asks the same thing,
except the hardware is a stock-standard Linux-based PC. Your development
environment already contains an operating system, a file manager,
TCP/IP, compilers, frameworks... and wget. The work you need to do is
potentially as little as writing down the command "man wget" on a slip of
paper and pushing it across the table to your customer.

As programming languages go, C is closer to the first extreme, C++ a
little further away, Java further away still, because Java provides
more capabilities already built in that the C programmer would have to
create from scratch. For many tasks, Python provides even more
capabilities, in a language that demands less syntactic scaffolding to make
things happen. Every line of code you don't have to write is not only a
bug that can never happen; it also saves time and labour.
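
To give a tiny, concrete flavour of that (a minimal sketch only, using
today's Python 3 standard library; the URL is just a placeholder and
error handling is omitted):

# Fetch a page using nothing but the standard library.
from urllib.request import urlopen

with urlopen("http://example.com/") as response:
    html = response.read().decode("utf-8", errors="replace")

print(len(html), "characters fetched")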
 

Mike Meyer

Steven D'Aprano said:
Not so.

Every line = more labour for the developer = more cost and time.
Every line = more places for bugs to exist = more cost and time.

There were studies done in the 70s that showed that programmers
produced the same number of debugged lines of code a day no matter
what language they used. So a language that lets you build the same
program with fewer lines of code will let you build the program in
less time.
I find it sometimes helps to imagine extreme cases. Suppose somebody comes
to you and says "Hi, I want you to develop a web scraping application to
run on my custom hardware." You look at the project specifications and
realise that the hardware has no OS, no TCP/IP, no file manager, no
compiler. So you have to quote the potential customer on writing all these
layers of software, potentially tens of millions of lines of code.
Even porting an existing OS to the new hardware is not an insignificant
job. Think how much time and money it would take.

Then factor in the profits to be reaped from selling the ported
OS/compilers :).

<mike
 

Alex Martelli

Mike Meyer said:
There were studies done in the 70s that showed that programmers
produced the same number of debugged lines of code a day no matter
what language they used. So a language that lets you build the same
program with fewer lines of code will let you build the program in
less time.

Of course, these results only apply where the "complexity" (e.g., the
number of operators) in a single line of code is constant. There
is no complexity advantage to wrapping up code to take fewer LINES as
such -- e.g., in Python:

for item in sequence: blaap(item)

or

for item in sequence:
    blaap(item)

are EXACTLY as easy (or hard) to write, maintain, and document -- it's
totally irrelevant that the number of lines of code has "doubled" in the
second (more standard) layout of the code!-)

This effect is even more pronounced in languages which allow or
encourage more extreme variation in "packing" of code over lines; e.g.,
C, where

for(x=0; x<23; x++) { a=seq[x]; zap(a); blup(a); flep(a); }

and

for(x=0;
    x<23;
    x++)
{
    a=seq[x];
    zap(a);
    blup(a);
    flep(a);
}

are both commonly used styles -- the order of magnitude difference in
lines of code is totally "illusory".


Alex
 

Mike Meyer

Of course, these results only apply where the "complexity" (e.g., the
number of operators) in a single line of code is constant.

I'm not sure what you're trying to say here. The tests ranged over
things from PL/I to assembler. Are you saying that those two languages
have the same "complexity in a single line"?
for item in sequence: blaap(item)

or

for item in sequence:
    blaap(item)

are EXACTLY as easy (or hard) to write, maintain, and document -- it's
totally irrelevant that the number of lines of code has "doubled" in the
second (more standard) layout of the code!-)

The studies didn't deal with maintenance. They only dealt with
documentation in so far as code was commented.

On the other hand, studies of reading comprehension have shown that
people can read and comprehend faster if the line lengths fall within
certain ranges. While it's a stretch to assume those studies apply to
code, I'd personally be hesitant to assume they don't apply without
some research. If they do apply, then your claims about the difficulty
of maintaining and documenting being independent of the textual line
lengths are wrong. And since writing code inevitably involves
debugging it - and the studies specified debugged lines - then the
line length could affect how hard the code is to write as well.

<mike
 

Max M

Mike said:
There were studies done in the 70s that showed that programmers
produced the same number of debugged lines of code a day no matter
what language they used. So a language that lets you build the same
program with fewer lines of code will let you build the program in
less time.

In my experience the LOC count is *far* less significant than the number
of levels of indirection.

E.g. how many levels of abstraction do I have to understand to follow a
traceback, or to understand what a method really does in a complex system?


--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
 

Alex Martelli

Mike Meyer said:
I'm not sure what you're trying to say here. The tests ranged over
things from PL/I to assembler. Are you saying that those two languages
have the same "complexity in a single line"?

Not necessarily, since PL/I, for example, is quite capable of usages at
extremes of operator density per line. So, it doesn't even have "the
same complexity as itself", if used in widely different layout styles.

If the studies imply otherwise, then I'm reminded of the fact that both
Galileo and Newton published alleged experimental data which can be
shown to be "too good to be true" (fits the theories too well, according
to chi-square tests etc)...

The studies didn't deal with maintenance. They only dealt with
documentation in so far as code was commented.

On the other hand, studies of reading comprehension have shown that
people can read and comprehend faster if the line lengths fall within
certain ranges. While it's a stretch to assume those studies apply to
code, I'd personally be hesitant to assume they don't apply without
some research. If they do apply, then your claims about the difficulty
of maintaining and documenting being independent of the textual line
lengths are wrong. And since writing code inevitably involves
debugging it - and the studies specified debugged lines - then the
line length could affect how hard the code is to write as well.

If time to code depends on textual line lengths, then it cannot solely
depend on number of lines at the same time. If, as you say, the studies
"prove" that speed of delivering debugged code depends strictly on the
LOCs in the delivered code, then those studies would also be showing
that the textual length of the lines is irrelevant to that speed (since,
depending on coding styles, in most languages one can trade off
textually longer lines for fewer lines).

OTOH, the following "mental experiment" shows that the purported
deterministic connection of coding time to LOC can't really hold:

say that two programmers, Able and Baker, are given exactly the same
task to accomplish in (say) language C, and end up with exactly the same
correct source code for the resulting function;

Baker, being an honest toiling artisan, codes and debugs his code in
"expansive" style, with lots of line breaks (as lots of programming
shops practice), so, given the final code looks like:
while (foo())
{
    bar();
    baz();
}
(etc), he's coding 5 lines for each such loop;

Able, being able, codes and debugs extremely crammed code, so the same
final code looks, when Able is working on it, like:
while (foo()) { bar(); baz(); }
so, Able is coding 1 line for each such loop, 5 times less than Baker
(thus, by hypothesis, Able must be done 5 times faster);

when Able's done coding and debugging, he runs a "code beautifier"
utility which runs in negligible time (compared to the time it takes to
code and debug the program) and easily produces the same "expansively"
laid-out code as Baker worked with all the time.

So, Able is 5 times faster than Baker yet delivers identical final code,
based, please note, not on any substantial difference in skill, but
strictly on a trivial trick hinging on a popular and widely used kind of
code-reformatting utility.


Real-life observation suggests that working with extremely crammed code
(to minimize number of lines) and beautifying it at the end is in fact
not a sensible coding strategy and cannot deliver such huge increases in
coding (and debugging) speed. Thus, either those studies or your
reading of them must be fatally flawed in this respect (most likely,
some "putting hands forward" footnote about coding styles and tools in
use was omitted from the summaries, or neglected in the reading).

Such misunderstandings have seriously damaged the practice of
programming (and the management of programming) in the past. For example,
shops evaluating coders' productivity in terms of lines of code have
convinced their coders to distort their style to emit more lines of code
in order to be measured as more productive -- it's generally trivial to
do so, of course, in many cases, e.g.
for i in range(100):
    a[i] = i*i
can easily become 100 lines "a[0] = 0" and so on (easily produced by
copy and paste or editor macros, or other similarly trivial means). At
the other extreme, some coders (particularly in languages suitable for
extreme density, such as Perl) delight in producing "one-liner"
(unreadable) ``very clever'' equivalents of straightforward loops that
would take up a few lines if written in the obvious way instead.
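
A tame Python illustration of how the same behaviour can occupy very
different numbers of physical lines (the names are arbitrary):

numbers = list(range(20))

# Spread over several lines:
total = 0
for n in numbers:
    if n % 2 == 0:
        total += n

# Packed into a single physical line, with identical behaviour:
total = sum(n for n in numbers if n % 2 == 0)

print(total)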

The textual measure of lines of code is extremely easy to obtain, and
pretty easy to adjust to account for some obvious first-order effects
(e.g., ignoring comments and whitespace, counting logical lines rather
than physical ones, etc), and that, no doubt, accounts for its undying
popularity -- but it IS really a terrible measurement for "actual
program size and complexity".
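
A toy sketch of the kind of crude adjustment described above -- counting
only non-blank, non-comment lines (an illustration only, not a serious
metric):

def count_loc(source_text):
    # Count lines that are neither blank nor pure comments.
    count = 0
    for line in source_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count

sample = "x = 1\n\n# just a comment\ny = x + 1\n"
print(count_loc(sample))   # prints 2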

Moreover, even if you normalized "program size" by suitable language
specific factors (number of operators, decision points, cyclomatic
complexity, etc), the correlation between program size and time to code
it would still only hold within broadly defined areas, not across the
board. I believe "The mythical man-month" was the first widely read
work to point out how much harder it is to debug programs that use
unrestrained concurrency (in today's terms, think of multithreading
without any of the modern theory and helpers for it), which Brooks
called "system programs", compared to "ordinary" sequential code (which
Brooks called "application programs" -- the terminology is quite dated,
but the deep distinction remains valid). Also: one huge monolithic
program using global variables for everything is going to have
complexity (and time to delivery of debugged code) that grows way more
than linearly with program size; to keep a relation that's close to
linear (though in no case can exact linearity be repeatably achieved for
sufficiently large programming systems, I fear), we employ a huge
variety of techniques to make our software systems more modular.


It IS important to realize that higher-level languages, by making
programs of equivalent functionality (and with comparable intrinsic
difficulty, modularity, etc) "textually smaller" (and thereby
"conceptually" smaller), raise programmer productivity. But using "lines
of code", without all the appropriate qualifications, for these
measurements is not appropriate. Even the definition of a language's
level in terms of LOCs per function point is too "rough and ready" and
thus suffers from this issue (function points as a language-independent
measure of a coding task's "size" have their own issues, but much
smaller ones than LOCs as a measure of a delivered code's size).


Consider the analogy of measuring a writing task (in, say, English) by
number of delivered words -- a very popular measure, too. No doubt, all
other things being equal, it may take a writer about twice as long to
deliver 2000 copy-edited words as to deliver 1000. But... all other
things are rarely equal. To me, for example, it often comes most
natural to take about 500 words to explain and illustrate one concept;
but when I need to be concise, I will then take a lot of time to edit
and re-edit that text until just about all of the same issues are put
across in 200 words or less. It may take me twice as long to rewrite
the original 500 words into 200, as it took to put down the 500 words in
the first place -- which helps explain why many of my posts are so
long, as I don't take all the time to re-edit them, and why it takes so
long to write a "Nutshell" series book, where conciseness is crucial.

Nor is it only my own issue... remember Pascal's "Lettres Provinciales",
and the famous apology about "I am sorry that this letter is so long,
but I did not have the time to write a shorter one"!-)


Alex
 

bruno modulix

Michael said:
+1. Python is easily applicable to most of the problem domain of Java,
but solves problems much more elegantly. It just isn't shipped embedded
in all leading browsers :-(.

It's been a long time since I last saw a Java applet on a website.
 

bruno modulix

Scott said:
I would say Simula is the forefather of modern OOPLs, and Smalltalk is
the toofather.

Err... I'm afraid I don't understand this last word (and Google has not
been of much help here)
 

bruno modulix

Alex Martelli wrote:
(snip)
Here's a tiny script showing some similarities and differences:

def f()
  i = 0
  while i < 1000000
    j = 923567 + i
    i += 1
  end
end

f()

Comment out the 'end' statements, add colons at the end of the def
and while statements, and this is also valid Python.
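
Spelled out, the transformed snippet looks like this (the commented-out
'end' lines are kept only to show the correspondence):

def f():
    i = 0
    while i < 1000000:
        j = 923567 + i
        i += 1
    #end
#end

f()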
 

Iain King

Tom said:
+1 QOTW


Someone should really try posting a similar question on c.l.perl and
seeing how they react ...

tom

SSshhhhhhhhh! Xah Lee might be listening!

Iain
 

Scott David Daniels

bruno said:
Err... I'm afraid I don't understand this last word (and Google has not
been of much help here)

Sorry, I was being too cute by half. If Simula is the forefather
(4 away) then Smalltalk is half as far (2) away. Hence the "toofather":
by analogy with the homophones "fore" and "four", we use the
homophones "two" and "too".

--Scott David Daniels
(e-mail address removed)
 

Michele Simionato

Alex said:
... remember Pascal's "Lettres Provinciales",
and the famous apology about "I am sorry that this letter is so long,
but I did not have the time to write a shorter one"!-)

This observation applies to code too. I usually spend most of my time
making programs short that would otherwise have been long. This means:

cutting off non-essential features (and you can discover that a feature
is non-essential only after having implemented it)

and/or

rethinking the problem at a higher level of abstraction (only
possible after you have implemented the lower level of abstraction).

Michele Simionato
 

Neil Hodgson

Scott David Daniels:
Sorry, I was being too cute by half. If Simula is the forefather
(4 away) then Smalltalk is half as far (2) away. Hence the "toofather":
by analogy with the homophones "fore" and "four", we use the
homophones "two" and "too".

We could smear the homophones further and say OCaml is the "nextfather".

Neil
 

Alex Martelli

Michele Simionato said:
This observation applies to code too. I usually spend most of my time
making programs short that would otherwise have been long. This means:

Absolutely true.
cutting off non-essential features (and you can discover that a feature
is non-essential only after having implemented it)

This one is difficult if you have RELEASED the program with the feature
you now want to remove, sigh. You end up with lots of "deprecated"s...
somebody at Euro OSCON was saying that this was why they had dropped
Java, many years ago -- each time they upgraded their Java SDK they
found out that half their code used now-deprecated features.

Still, I agree that (once in a while) accepting backwards
incompatibility by removing features IS a good thing (and I look
forward a lot to Python 3.0!-). But -- the "dream" solution would be
to work closely with customers from the start, XP-style, so features go
into the code in descending order of urgency and importance and it's
hardly ever necessary to remove them.
and/or

rethinking the problem at a higher level of abstraction (only
possible after you have implemented the lower level of abstraction).

Yep, this one is truly crucial.

But if I had to nominate ONE use case for "making code smaller" it would
be: "Once, And Only Once" (aka "Don't Repeat Yourself"). Scan your code
ceaselessly and mercilessly looking for duplications, and refactor just as
mercilessly when you find them, "abstracting them up" into functions,
base classes, etc...
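
A minimal Python sketch of that kind of refactoring (hypothetical names,
just to show the shape of the change):

# Before: the same validation pasted into two functions.
def save_user(name):
    if not name or len(name) > 80:
        raise ValueError("bad name")
    print("saving", name)

def rename_user(old, new):
    if not new or len(new) > 80:
        raise ValueError("bad name")
    print("renaming", old, "to", new)

# After: the duplication "abstracted up" into a single helper.
def check_name(name):
    if not name or len(name) > 80:
        raise ValueError("bad name")

def save_user(name):
    check_name(name)
    print("saving", name)

def rename_user(old, new):
    check_name(new)
    print("renaming", old, "to", new)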


Alex
 

Jorge Godoy

forward a lot to Python 3.0!-). But -- the "dream" solution would be
to work closely with customers from the start, XP-style, so features go
into the code in descending order of urgency and importance and it's
hardly ever necessary to remove them.

We do that often with two of our customers here. After the first changes,
they asked for more, and then some more, and when it finally ended, the
project was like we had suggested; but instead of doing this directly, the
client wanted to waste more money... :-( Even if we earned more money, I'd
rather have had the first proposal accepted instead of wasting time working on
what they called "essential features".
But if I had to nominate ONE use case for "making code smaller" it would
be: "Once, And Only Once" (aka "Don't Repeat Yourself"). Scan your code
ceaselessly and mercilessly looking for duplications, and refactor just as
mercilessly when you find them, "abstracting them up" into functions,
base classes, etc...

And I'd second that. Code can be drastically reduced this way and, even
better, it can be made more generic, more useful and more robust.
 

Michele Simionato

Alex Martelli:
Michele Simionato:
This one is difficult if you have RELEASED the program with the feature
you now want to remove, sigh.

Yeah, but I used the wrong word "features", which typically means "end
user features". Instead, I had in mind "developer features", i.e. core
features that will be called later in "client" code (I am in a framework
mindset here).

Typically I start with a small class, then the class becomes larger as
I add features that will be useful for client code, then I discover that
the class has become difficult to maintain. So I cut the features and
implement them outside the class, and the class becomes short again.
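
A small sketch of that cycle (hypothetical names): a convenience that
crept into the class is later pulled out as a plain function, leaving the
class short again.

# The class after it has grown a feature meant for client code:
class Table:
    def __init__(self, rows):
        self.rows = rows

    def to_csv(self, sep=","):
        return "\n".join(sep.join(map(str, row)) for row in self.rows)

# After the cut: the class is short again, the feature lives outside it.
class Table:
    def __init__(self, rows):
        self.rows = rows

def table_to_csv(table, sep=","):
    return "\n".join(sep.join(map(str, row)) for row in table.rows)

print(table_to_csv(Table([[1, 2], [3, 4]])))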

However, I *cannot* know from the beginning the minimal set of features
needed to keep the client code short, until I have written a lot of
client code. I can make things short only after I have made things long.
I think this applies to me, to you, to Pascal and to everybody in
general. It is impossible to start from the beginning with the short
program, unless you already know the solution (that is, you have already
written the long version in the past). Still, some naive people think
there is a silver bullet or an easy way to avoid the hard work. They are
naive ;)

Michele Simionato
 

Alex Martelli

Jorge Godoy said:
We do that often with two of our customers here. After the first changes,
they asked for more, and then some more, and when it finally ended, the
project was like we had suggested; but instead of doing this directly, the
client wanted to waste more money... :-( Even if we earned more money, I'd
rather have had the first proposal accepted instead of wasting time working on
what they called "essential features".

The customer is part of the team; if any player in the team is not
performing well, the whole team's performance will suffer -- that's
hardly surprising. You may want to focus more on _teaching_ the
customer to best play his part in the feature-selection game, in the
future... not easy, but important.

And I'd second that. Code can be drastically reduced this way and, even
better, it can be made more generic, more useful and more robust.

I'll second all of your observations on this!-)


Alex
 

bruno modulix

Scott said:
Sorry, I was being too cute by half.
tss...

If Simula is the forefather
(4 away) then Smalltalk is half as far (2) away. Hence the "toofather":
by analogy with the homophones "fore" and "four", we use the
homophones "two" and "too".

My my my...
 
