John Doty
I realized that I have a little job on the table that is a fine test of
the Python versus Standard Forth code availability and reusability issue.
Note that I have little experience with either Python or Standard Forth
(but I have much experience with a very nonstandard Forth). I've noodled
around a bit with both gforth and Python, but I've never done a serious
application in either. In my heart, I'm more of a Forth fan: Python is a
bit too much of a black box for my taste. But in the end, I have work to
get done.
The problem:
I have a bunch of image files in FITS format. For each raster row in
each file, I need to determine the median pixel value and subtract it
from all of the pixels in that row, and then write out the results as
new FITS files.
This is a real problem I need to solve, not a made-up toy problem. I was
originally thinking of solving it in C (I know where to get the pieces
in that language), but it seemed like a good test problem for the Python
versus Forth issue.
I looked to import FITS reading/writing, array manipulation, and median
determination. From there, the solution should be pretty easy.
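A minimal sketch of the core step, leaving FITS reading and writing aside: it assumes the image is already in memory as a list of rows of numbers, and it uses the standard library's exact median rather than any of the modules discussed here. The name subtract_row_medians is made up for illustration.

```python
from statistics import median


def subtract_row_medians(image):
    """Return a new image with each row's median subtracted from that row.

    `image` is assumed to be a list of rows, each row a list of numbers.
    """
    result = []
    for row in image:
        row_median = median(row)  # exact median of this raster row
        result.append([pixel - row_median for pixel in row])
    return result


# Tiny made-up image: two rows of four pixels each.
image = [
    [10, 12, 11, 50],
    [3, 3, 4, 5],
]
cleaned = subtract_row_medians(image)
```

With an array library the loop would likely collapse into a single expression, but the per-row logic is the same.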
So, first look for a median function in Python. A little googling finds:
http://www.astro.cornell.edu/staff/loredo/statpy/
Wow! This is good stuff! An embarrassment of riches here! There are even
several FITS modules, and I wasn't even looking for those yet. And just
for further gratification, the page's author is an old student of mine
(but I'll try not to let this influence my attitude). So, I followed the
link to:
http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html
From there, I downloaded stats.py, and the two other modules the page
says it requires, and installed them in my site-packages directory. Then
"from stats import median" got me a function to approximately determine
the median of a list. It just worked. The approximation is good enough
for my purposes here.
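To illustrate what an approximate median can look like next to an exact one, here is a sketch of one common approach: bin the values into a histogram and interpolate inside the bin that holds the halfway point. This is an assumption made for illustration, not the code from stats.py; approx_median and its numbins parameter are hypothetical names.

```python
from statistics import median


def approx_median(values, numbins=1000):
    """Estimate the median from a histogram (illustrative sketch only)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return float(lo)  # all values identical, nothing to bin
    width = (hi - lo) / numbins
    counts = [0] * numbins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), numbins - 1)
        counts[idx] += 1
    half = len(values) / 2
    cumulative = 0
    for i, count in enumerate(counts):
        if cumulative + count >= half:
            # Linear interpolation inside the bin holding the halfway point.
            return lo + width * (i + (half - cumulative) / count)
        cumulative += count


values = list(range(1, 102))      # 1..101, exact median is 51
estimate = approx_median(values)  # close to 51, within about one bin width
exact = median(values)
```

The estimate is only as good as the bin width, which is why it is worth checking that the error is small compared with the pixel values involved.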
PyFITS required a little more resourcefulness, in part because STScI's
ftp server was down yesterday, but I got it installed too. It helps that
when something is missing, the error message gives you a module name. It
needs the numarray module, so I got array manipulation as a side effect.
I haven't finished the program, but I've tried out the pieces and all
looks well here.
OK, now for Forth. Googling for "forth dup swap median" easily found:
http://www.taygeta.com/fsl/library/find.seq
At first blush, this looked really good for Forth. The search zeroed in
on just what I wanted, no extras. The algorithm is better than the one
in the Python stats module: it gives exact results, so there's no need
to check that an approximation is good enough. But then, the
disappointment came.
What do you do with this file? It documents the words it depends on, but
not where to get them. I'm looking at a significant effort to assemble
the pieces here, an effort I didn't suffer through with Python. So, my
first question was: "Is it worth it?"
The answer came from searching for FITS support in Forth. If it exists
in public, it must be really well hidden. That's a "show stopper", so
there was no point in pursuing the Forth approach further.
In the end, it was like comparing a muzzle-loading sharpshooter's rifle
with a machine gun: Forth got off one really good shot, but Python just
mowed the problems down.
The advocates of the idea that Standard Forth has been a boon to code
reusability seem mostly to be people with large private libraries of
Forth legacy code. No doubt to them it really has been a boon. But I
think this little experiment shows that for the rest of us, Python has a
published base of reusable code that puts Forth to shame.