Alan said:
And how long does a typical do-nothing browser HTTP transaction and
CGI invocation need in comparison?
The need is to review the overall process, including server
invocation from the client and the subsequent CGI process creation,
which I'm afraid your benchmarks don't do.
What the need is depends on what you are actually trying to measure.
The conclusion I draw from my benchmark is that the *absolute*
time it takes to parse a query string is significant if you use
CGI.pm, while it's negligible if you use my code. Whether the factor
is 20, 30 or 50 is something I pay little regard to since, as you
point out, I did not measure the whole process.
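For reference, the kind of hand-rolled parsing under discussion can be sketched in a couple of dozen lines. This is not the poster's actual code, just a minimal illustration (the name `parse_query` is made up): split on `&` or `;`, then URL-decode each half.

```perl
use strict;
use warnings;

# Minimal query-string parser (illustrative sketch, not the poster's
# actual code). Splits on '&' or ';', decodes '+' and %XX escapes.
# Note: keeps only the last value of a repeated parameter.
sub parse_query {
    my ($qs) = @_;
    my %param;
    for my $pair (split /[&;]/, $qs) {
        my ($name, $value) = split /=/, $pair, 2;
        for ($name, $value) {            # $_ aliases each in turn
            next unless defined;
            tr/+/ /;                     # '+' means space in form data
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;   # decode %XX escapes
        }
        $param{$name} = $value if defined $name;
    }
    return %param;
}

my %p = parse_query('a=1&b=hello+world&c=%2Fpath');
print "$p{b}\n";    # prints "hello world"
```

A real implementation would also need to collect multiple values for a repeated field name, which CGI.pm handles and this sketch deliberately omits.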
My program supports a certain kind of web application, and is
typically used on web sites that are hosted on shared servers.
Sometimes it's used in a way that results in thousands of calls per day.
Now, if you have a busy web site on a shared hosting account, there is
always a limit where the hosting provider says: "This is too much, our
other customers are affected adversely." That's why I'm anxious to
watch the server load, and to me, 0.2 seconds appears to be
significant if there are thousands of daily calls.
mod_perl would of course reduce the server load further. It's just
that mod_perl is very rarely available on shared web hosting
accounts. Of course, you can always say that the program should have
been written in PHP instead. However, that's not the case here.
In fact with some rough benchmarking of the overall process (using
LWP::Simple to run the tests against a local webserver), it seemed
to me as if our (otherwise lightly-loaded) server could run about
14 invocations per second of wallclock with your economy-model
script, compared with some 7 per second with CGI.pm, so - a factor
of around 2 (wallclock) overall, compared with your measurement of
30 (cpu) for some portion of the process. On Windows, I even got a
factor approaching 3 between them for the overall process. (Server
in both cases was a version of Apache 1.3.*, on linux and on
Win2000 respectively).
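An end-to-end timing of that sort could be sketched roughly as below. This is not Alan's actual test; the helper name `requests_per_second` and the URL are placeholders, and the fetcher is passed in as a code ref so the timing logic can be tried without a live server (with a real server one would pass `sub { LWP::Simple::get($url) }`).

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Rough wallclock throughput measurement, in the spirit of the
# LWP::Simple test described above. $fetch is a code ref standing in
# for one complete request cycle.
sub requests_per_second {
    my ($fetch, $count) = @_;
    my $start = time;
    $fetch->() for 1 .. $count;
    my $elapsed = time - $start;
    return $elapsed > 0 ? $count / $elapsed : undef;
}

# Dummy fetcher doing token work, so the sketch runs standalone:
my $rate = requests_per_second(sub { my $x = 0; $x += $_ for 1 .. 100 }, 1000);
printf "%.1f calls/second\n", $rate if defined $rate;
```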
While I must admit the factor is somewhat larger than I had
expected, this does put your measurement of a factor of 30
into a rather more realistic context, I feel.
As regards "realistic", see above.
It surprises me that your server would allow 7 invocations per second
with CGI.pm when you run the whole process, while I found that it
would allow only 5 per second when just the Perl part is taken into
account.
Maybe the server you used is significantly faster. Btw, are you sure
that you captured the compilation time?
Anyway, this is interesting additional info. Thanks, Alan! I suppose
it indicates that, provided that the factor is 2, I would double the
server load by starting to use CGI.pm. The difference appears to be
significant also when you look at it from this angle.
These benchmarks demonstrate that the design of CGI.pm is surprisingly
'expensive'.
I will concede that any rule can have exceptions. What I usually
say about CGI.pm is that those who have genuinely got the expertise
to *not* use CGI.pm will know why they are doing that, and will
need no advice from me. On the other hand anyone who's in a
position to seek advice is going to get my best advice (and you
know what that's going to be, in the overwhelming proportion of
cases).
I'm well aware that CGI.pm is in no way magical - the code
doesn't do anything that one couldn't just as well code for
oneself. And the author admits that it's grown too big, and might
benefit from being modularised. I've found the odd bug in it
myself on occasion. So this is not the uncritical adulation that
some trolls accuse us of. Nevertheless, it's overall the best thing
available for doing CGI in Perl, because the author is actively
working on it and is actively adapting it to the changing
situation, to encapsulate the gathered knowledge of browser bugs,
workarounds etc.
A proportion of extra CPU cycles isn't usually too high a price to
pay for that. And as we've seen - if it _is_ too high a price to
pay, then the most productive place to make real savings is
elsewhere.
Can we call a truce on this, then?
I hear what you say.
And it makes much sense.
Let me try to summarize my view from a different angle:
Good advice is a good thing, and using Perl modules is a convenient
way to reuse code. Personally I use several modules, but when I'm able
to do something with just a couple of lines of Perl code, I sometimes
do so instead of loading hundreds of lines of code by using a module.
I don't feel that I risk getting bashed for doing so, and nobody
demands that I *prove* that my choices are right.
That is, there is one exception: The 'sacred cow' CGI.pm. Even if you
say that "the code doesn't do anything that one couldn't just as well
code for oneself" and "it's grown too big", your reasoning above
presupposes that you are able to explicitly justify your decision if
you choose to not use CGI.pm for parsing CGI data. That makes little
sense to me. The presumption that people don't know what they are
doing if they don't use CGI.pm is patronizing.
If the explanation is the security implications of CGI, I'd like to
see the focus moved to the desirability that you
- *learn* about the implied risks with CGI scripts,
- don't use code copied from random sources if you don't understand
how it works,
- carefully consider the risks with your own applications, and
validate the data accordingly, and
- enable taint mode.
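The last two points go together: under taint mode, data from the outside world must be validated before it can reach the shell or the filesystem, and the approved way to untaint it is to capture it from an explicit regex match. A small sketch (the function name, field, and pattern are illustrative; a real script would run under `perl -T`):

```perl
use strict;
use warnings;

# Untainting sketch: under taint mode (-T), CGI input is tainted until
# it is validated. Capturing from a regex match untaints the capture.
# The field name and pattern here are illustrative only.
sub untaint_username {
    my ($raw) = @_;
    # Accept only 1-16 word characters, anchored at both ends.
    return $1 if defined $raw && $raw =~ /\A(\w{1,16})\z/;
    return undef;    # reject everything else
}

print untaint_username('alice'), "\n";        # prints "alice"
defined untaint_username('../etc/passwd')
    or print "rejected\n";                    # prints "rejected"
```

The key design point is that the pattern states what is *allowed* rather than trying to enumerate what is dangerous.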
I feel that these things, which I take for granted that we can agree
upon, tend to be forgotten in the 'campaign' for using CGI.pm.