Python Productivity Gain?

Terry Reedy

Harry George said:
It is common for a ComSci prof or grad student to crank up such a
study, using undergrad and grad students as the subjects. These
subjects can generally be coerced to participate ("it is required for
the course"). For "novice programmer" research, high school students
are often used. These tend to be self-selected, and are not
representative of the general population.

The key feature that makes a study/experiment statistically analyzable is
randomization of subjects to treatments. So, to compare two languages
(simplest case), you have everyone write programs in both languages, but
randomize the order (half one way, the other half the other way) or you
have one half do language A and the other half B, again randomizing the
assignment. Also, if there is any subjectivity in the evaluation of
results, then the judges should not know the language when judging the
output.
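
As a rough illustration of the two designs described above (a between-subjects split and a crossover with randomized order), plus blinding of the judges, here is a minimal Python sketch. The function names, the subject labels and the opaque "S000"-style IDs are all made up for this example; it is not taken from any of the studies discussed.

```python
import random


def assign_between_subjects(subjects, seed=None):
    """Randomly split the subjects into two halves, one per language."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}


def assign_crossover(subjects, seed=None):
    """Everyone uses both languages; a random half starts with A, the rest with B."""
    groups = assign_between_subjects(subjects, seed)
    order = {name: ("A", "B") for name in groups["A"]}
    order.update({name: ("B", "A") for name in groups["B"]})
    return order


def blind(submissions, seed=None):
    """Hide who produced each output and in which language.

    submissions maps (subject, language) -> program output.
    Returns (blinded, key): judges see only blinded (ID -> output);
    key (ID -> (subject, language)) stays with the experimenter.
    """
    rng = random.Random(seed)
    items = list(submissions.items())
    rng.shuffle(items)
    blinded, key = {}, {}
    for i, (who, output) in enumerate(items):
        label = f"S{i:03d}"  # hypothetical opaque ID format
        blinded[label] = output
        key[label] = who
    return blinded, key


if __name__ == "__main__":
    subjects = [f"student{i}" for i in range(8)]
    print(assign_between_subjects(subjects, seed=42))
    print(assign_crossover(subjects, seed=42))
```

Passing a seed only makes the assignment reproducible for auditing; the point is that neither the experimenter nor the subjects choose who gets which language.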

Do the studies you speak of meet these criteria?

Terry J. Reedy
 
Matthias

Harry George said:
It is common for a ComSci prof or grad student to crank up such a
study, using undergrad and grad students as the subjects. These
subjects can generally be coerced to participate ("it is required for
the course"). For "novice programmer" research, high school students
are often used. These tend to be self-selected, and are not
representative of the general population.

Are these studies published? I see the limitations, but still would
be interested in the results. So far, I found only the papers of Lutz
Prechelt et al. on the web.

Matthias
 
Harry George

Matthias said:
Are these studies published? I see the limitations, but still would
be interested in the results. So far, I found only the papers of Lutz
Prechelt et al. on the web.

Matthias

I was talking about "programmer productivity" studies in general, not
specifically Python. My point was that the experimental model is
convenient but not a very good representation of reality.

Google for "programmer productivity"
 
Harry George

Terry Reedy said:
The key feature that makes a study/experiment statistically analyzable is
randomization of subjects to treatments. So, to compare two languages
(simplest case), you have everyone write programs in both languages, but
randomize the order (half one way, the other half the other way) or you
have one half do language A and the other half B, again randomizing the
assignment. Also, if there is any subjectivity in the evaluation of
results, then the judges should not know the language when judging the
output.

Do the studies you speak of meet these criteria?

Terry J. Reedy

The key is to use a valid experimental model in the first place. All
the randomizing, DOE, double blind, ANOVA, discriminant analysis, factor
analysis, etc in the world doesn't help if the model is a poor
rendition of the intended target.

I'm not in this field myself, and only know of it from scanning the
literature on language selection some time ago. My impression was
that the academic papers using students as subjects did an honest job
of designing the experiment and probably got honest results for their
experimental population. And they usually discussed their concerns
that the model was not very representative of industry practice.
 
Matthias

Harry George said:
I was talking about "programmer productivity" studies in general, not
specifically Python. My point was that the experimental model is
convenient but not a very good representation of reality.

When people started to approach medicine scientifically I'm sure they
also were overwhelmed by the complexities of their endeavor at first.

I don't question that reality must be strongly simplified in order to
do experiments. But you always have to simplify in order to
comprehend, explain, communicate. The question is whether
simplifications are made explicitly and consciously or not.

Google for "programmer productivity"

I checked the first 30 out of the 16,000 pages found. It is about
what I expected: Mostly advertisement for products and "solutions".
Maybe one of the pages might qualify as an empirical study.

This industry is on a random walk... ;-))
 
Paul Prescod

Matthias said:
...

I checked the first 30 out of the 16,000 pages found. It is about
what I expected: Mostly advertisement for products and "solutions".
Maybe one of the pages might qualify as an empirical study.

This industry is on a random walk... ;-))

Despite the smiley I think it is worth putting this in perspective. Take
a random language implementation from 1975 and try implementing
something in it. I think that you will find strong anecdotal evidence
that Things Are Getting Better. It is a frustratingly slow and
inefficient process, but one that nevertheless seems to move in the right direction.

Paul Prescod
 
Magnus Lyckå

kbass said:
In different articles that I have read, people have constantly alluded to
the productivity gains of Python. One person stated that Python's
productivity gain was 5 to 10 times over Java in some cases. The
strange thing that I have noticed is that there were no examples of this
productivity gain (i.e., projects, programs, etc.). Can someone give me
some real-life examples of productivity gains using Python as opposed to other
programming languages?

I don't think a tenfold programmer productivity increase over Java
is typical, but there are certainly examples of significant
productivity gains in converting to Python from some other language.

See here for instance:
http://www.thinkware.se/cgi-bin/thinki.cgi/PythonQuotes

E.g.

"I was amazed by the amount [of] flexibility and self-awareness that
Python had. When a 20,000 line project went to approximately 3,000
lines overnight, and came out being more flexible and robust once it
had been completed, I realized I was on to something really good."

--Glyph Lefkowitz (Developer of the Twisted network server
framework)

This 6-7-fold improvement was in going from C++ I think.

"...However, it did provide a hard measurement on the benefits of
using Python instead of C++: the lines of Python code was 10% of the
equivalent C++ code. ... From a software engineering standpoint, this
was a tremendous success. Bug counts are always proportional to the
number of lines of code, meaning that the Python version should have
10% of the bugs of the C++ version. Further, the fewer lines of code
meant that it would have a smaller and more understandable "footprint"
in the developers' minds. The Python code was arguably more
maintainable due to its improved readability and rapid edit-test cycle
(no compile and link step). Lastly, the server could also be shown to
be more robust - being entirely in Python, it was not subject to
memory-related coding errors such as null pointers, buffer
misallocation and overruns, or unfreed or doubly-freed memory..."

--Greg Stein, eShop (which was later sold to Microsoft)

Here you have a 10-fold gain, going from C++.

Another aspect of any X-fold programmer productivity improvement is
that there is a lot more than just programming going on in a project.
If requirements capture, analysis, design, testing, documentation,
planning etc takes the same amount of time, you will still not be
able to influence the total project cost a lot.
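
The point above is essentially Amdahl's law applied to a project schedule. A small sketch makes the arithmetic concrete; the coding shares used below (20%, 50%, 80% of total effort) are assumptions chosen only for illustration, not figures from any real project.

```python
def overall_speedup(coding_share, coding_speedup):
    """Whole-project speedup when only the coding part gets faster.

    coding_share:   fraction of total effort spent coding (0..1)
    coding_speedup: how many times faster coding becomes (> 0)
    """
    if not 0 <= coding_share <= 1:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Effort that remains: the untouched part plus the shrunken coding part.
    remaining = (1 - coding_share) + coding_share / coding_speedup
    return 1 / remaining


if __name__ == "__main__":
    for share in (0.2, 0.5, 0.8):  # assumed shares, for illustration only
        print(f"coding = {share:.0%} of effort, 10x faster coding -> "
              f"{overall_speedup(share, 10):.2f}x faster project")
```

With coding at an assumed 20% of the effort, a tenfold coding speedup makes the whole project only about 1.22 times faster, because the other 80% is untouched.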

On the other hand, using a tool like Python doesn't just influence
the programmers, but the whole project! If prototyping and coding
in general becomes significantly faster, the trade off for how much
analysis and design you should do will change. Why spend weeks at
a conference table arguing about different design alternatives if
the programmers can supply several different implementations within
a day?

The sooner a prototype can be put in the hands of the end users, the
faster mistakes in the requirements gathering will be sorted out,
and new needs will be discovered and can be weighed in before it's
too late.

The ability to play interactively with Python objects and to develop
really rapidly means that the roundtrip from end user request, to
a new prototype for her to try out, can be reduced from hours to
minutes or from days to hours. There is no reason to even leave the
end user's computer to add and demonstrate a new or changed feature.
All we need is there...

Testing, deployment, data conversion etc are also parts of software
development projects that can gain a lot from having a tool like
Python available.

Other examples of productivity boosts with Python can be found here:
http://pythonology.org/success

Few mention numbers though. I guess that there are some organisations
that use the Capability Maturity Model for Software who would be able
to find useful metrics if they used Python for a project similar to
one where they had previously used Java, but a) few use CMM with such
rigor, and b) if they did, they might not want to tell! Let the
competition continue to waste their time coding Java. :) Finally, c)
organisations where CMM is popular are probably organisations where
static typing, waterfall development style and other rigid and archaic
ideas are more popular than agile methods and tools.

Still, benchmarks always have limited value.

I've looked a bit at benchmarks, such as The Great Computer Language
Shootout, and looking at lines of code there gives much smaller
differences than five to ten times. I compared C++ and Python: C++
varied from 25% fewer to 500% more lines of code, and on average the C++
programs were around 80% longer. Less than a twofold gain, it seems...
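
For anyone who wants to repeat this kind of comparison, the calculation can be sketched as below. The line counts are invented placeholders, picked only so that they span the same -25% to +500% range; they are NOT the Shootout's actual numbers, and the benchmark names are just labels.

```python
from statistics import mean

# Hypothetical line counts: benchmark -> (python_lines, cpp_lines).
# Invented for illustration; not real Shootout data.
line_counts = {
    "nested_loops": (20, 15),
    "method_call": (30, 33),
    "echo_server": (25, 80),
    "spell_check": (10, 60),
}


def percent_longer(python_lines, cpp_lines):
    """How much longer the C++ program is, as a percentage of the Python length."""
    return 100.0 * (cpp_lines - python_lines) / python_lines


def summarize(counts):
    """Return (smallest, largest, average) percentage difference over all benchmarks."""
    diffs = [percent_longer(py, cpp) for py, cpp in counts.values()]
    return min(diffs), max(diffs), mean(diffs)


if __name__ == "__main__":
    low, high, average = summarize(line_counts)
    print(f"C++ ranges from {low:+.0f}% to {high:+.0f}%, average {average:+.0f}%")
```

Note that an average of per-benchmark percentages weights a 10-line program the same as a 1,000-line one, so it says little about large projects.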

But when we study the material in more detail, we see some relevant
things:

The more "realistic" the benchmarks are, the bigger the difference:
For plain algorithm tests and things like "nested loops", "call a
method", "instantiate an object" etc, there is almost no difference.
For things like "echo client/server", "spell check", file handling
etc, the difference is between 2.4 and 5.1 times.

I don't know how strong the Java standard library is, but several of
the benchmarks are about reimplementing builtin things in Python, such
as sorting and random number generation. Completely meaningless! A
real life implementation would be much shorter, since most of the
needed code is already in a standard library module!
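
To make that concrete, here is a small sketch contrasting a hand-written sort, of the kind a "same algorithm in every language" benchmark asks for, with what one would normally write. The insertion sort is just a stand-in chosen for this example, not the benchmark's actual code.

```python
import random


def insertion_sort(values):
    """Hand-written sort: what a 'reimplement it yourself' benchmark measures."""
    result = list(values)
    for i in range(1, len(result)):
        item = result[i]
        j = i - 1
        # Shift larger elements one slot to the right to make room for item.
        while j >= 0 and result[j] > item:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = item
    return result


# The standard library's random module stands in for a hand-rolled generator.
data = [random.randint(0, 999) for _ in range(100)]

# In real code the whole function above collapses to the builtin sorted().
assert insertion_sort(data) == sorted(data)
```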

For fun, I've made Python programs that achieved the same end result
without trying to use the same (meaningless) methods as the other
programs in the benchmark, and they are often 10 times shorter.
Sometimes they are also much faster and more scalable.
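
As an illustration of the "same end result, different method" idea, a spell check can be written around a set lookup in a handful of lines. This is only a sketch of the general approach; it is not one of the programs mentioned above, and the function name and word list are invented for the example.

```python
import re


def misspelled(text, dictionary_words):
    """Return words from text that are not in the dictionary, in order of first appearance."""
    known = {word.lower() for word in dictionary_words}  # set gives fast membership tests
    seen = set()
    unknown = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in known and word not in seen:
            seen.add(word)
            unknown.append(word)
    return unknown


if __name__ == "__main__":
    words = ["i", "guess", "you", "will", "form", "an", "opinion"]
    print(misspelled("I geuss you will form an opinion", words))  # prints ['geuss']
```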

There are three big reasons that Python programs are typically short
and easy to read.
* The Python syntax and data types are at a higher level of
abstraction, and don't have a lot of noise. It's also designed
with the objective of making it easy to do the right thing, rather
than making it difficult to do the wrong thing.
* The dynamic nature of Python makes it easy to write very flexible
code, and avoid a lot of code redundancy and twisting that is
common as you have to fight against the limitations of more static
and low level languages.
* Python's standard library is rich and reasonably easy to use.

In addition to that, the absence of compile and link steps in Python
makes development much faster than languages like C++. I've worked
in large C++ projects where building major applications could take
an hour due to the extensive dependencies. In Python this is a
non-issue. I don't know enough about Java development to compare with
that.

Since Python programs are shorter and easier to understand, and
faster to write than programs written in most other languages, it's
usually viable to change or rewrite code which is slow or otherwise
non-optimal, while it's common in projects using other languages
that code which is known to be bad is kept because it's too costly
to fix it.

Anyway, if you try Python in some project, I guess you will form
an educated opinion. I'm sure there is no one objective truth about
this. Different people have different preferences, and different
languages have different sweetspots. If you are writing drivers for
hardware, or encryption code, Python is probably not the main language,
but it might still be very useful for various tools and one shot
hacks that you do while you develop the "real" software in some
other language.

Some people think Ruby is "purer" and "prettier" than Python.
Personally, I find the Perl-like features, such as various #@$%
etc and mixing regular expressions in the syntax of the language
rather awkward.
 
Robin Munn

Peter Hansen said:
Harry George wrote in a thought-provoking post:

My background is (roughly in order) APL, FORTRAN, BASIC, Assembly, C,
university :), Pascal, C++, Object Pascal, Java, LabVIEW, and Python
(with a dozen others I forget) and I'm telling you Python is a really
great language. I've also dumped my previously favourite languages
(to wit, BASIC, C, C++, Delphi, and Java) to focus on Python.

Now all you need are 19 others and we'll have a significant data point.
(Signifying what? That's what I want to know. ;-)

I would say "Signifying nothing", but that would mean that all this is
"a tale told by an idiot, full of sound and fury," if you believe the
Bard. I don't feel furious, and I don't think I'm an idiot, so never
mind. (Although if you have other opinions on the latter subject, feel
free *not* to let me know.) <very big grin>

Anyway -- I started with BASIC at age six. Learned Pascal and C on my
own. Also learned C++, but never really grokked object-oriented
programming until much later in my programmer's development. Next came
assembler, which on register-starved Intel hardware was way more
complicated than it really needed to be. In college I was exposed to
Java and Lisp, but never did much with them beyond that one class,
because I had discovered Perl! Here was a language with arrays that
automatically re-sized themselves to fit the amount of data you put in
them -- bliss! And built-in hash table data types were pretty cool too;
they made several different algorithms a lot easier to code up. I also
discovered PHP, which I also thought was very cool.

Then I got out of college and started using these languages every day,
for a living. I still liked Perl, but it was beginning to get a bit hard
for me to read my own code six months later. Not that I was writing
"write-only" code, but the multiplicity of $ and @ symbols mixed in with
my code was actually distracting me from the code's meaning. I found I
was having to devote part of my brainpower to parsing the syntax, and
that was slowing me down. And then a co-worker introduced me to Python.
I was weirded out by the "indentation thing" at first, but quickly
learned to like not having to look for braces. And Python just "felt"
clean. I can't explain it very well, but Python's syntax just never got
in the way of my reading, which left me free to concentrate all my
attention on what the code was actually doing. I dumped my previously
favorite language, Perl, in favor of Python and haven't looked back
since.
 
Matthias

Robin Munn said:
Peter Hansen said:
Now all you need are 19 others and we'll have a significant data point.
(Signifying what? That's what I want to know. ;-)

I would say "Signifying nothing", but that would mean that all this is
"a tale told by an idiot, full of sound and fury," if you believe the
Bard. I don't feel furious, and I don't think I'm an idiot, so never
mind. (Although if you have other opinions on the latter subject, feel
free *not* to let me know.) <very big grin>

Anyway -- I started with BASIC at age six. Learned Pascal and C on my
own. Also learned C++, but never really grokked object-oriented
[...]

Just in case you guys are interested: A similar thing (people telling
personal stories how they got to language X after having suffered
under languages A, B, and Y) has been going on in the Lisp community for a
while. The project page is http://alu.cliki.net/RtL Highlight Film

I admit that even if it's not science it's an interesting read.
 
Matthias

Paul Prescod said:
Despite the smiley I think it is worth putting this in
perspective. Take a random language implementation from 1975 and try
implementing something in it. I think that you will find strong
anecdotal evidence that Things Are Getting Better. It is a
frustratingly slow and inefficient process, but one that nevertheless seems to
move in the right direction.

In 1975 I would have chosen Scheme to implement something. Had people
in 1975 added indentation-based syntax and some useful-and-documented
libs to it... ;-)

But I agree that I wouldn't want to trade my hardware/software
environment for one from 25 years ago. Or 3 for that matter.
 
