1st program -- how would you improve this?

  • Thread starter Esmail Bonakdarian
  • Start date
E

Esmail Bonakdarian

Hi,

I just wrote my first Ruby program and I would be very interested in
some feedback as to style and how to write things more in the
Ruby-way.

I know I could use more methods and/or possibly classes, and that
will happen, but for now I would like to hear what betrays my
programming style coming from other languages.

Thanks,
Esmail



Briefly, this tiny program grabs a HTML for a random date within the
last 10 years from the Astro Picture of the Day (APOD) site
(http://antwrp.gsfc.nasa.gov/apod/archivepix.html).

It then parses the HTML file to extract the name of the larger .jpg or
..gif file (there is usually a smaller version too), generates the
correct URL for the image, and then fires up eog (a Linux program to
display images) to fetch and display the image.

Works .. but it's not pretty, at least in terms of lack of methods
etc. Other ways to improve this?


----------------

#!/usr/bin/ruby
require 'net/http'

# method found on-line, slightly modified

def random_date(years_back=5)
year = Time.now.year - rand(years_back) - 1
month = rand(12) + 1
day = rand(31) + 1
year = year.to_s.gsub(/20|19/,'')
sprintf("%02s%02d%02d", year,month,day)
end

r_date=random_date(10) # go back 10 years

puts r_date

#
# get the html file from the apod site
#

text=Net::HTTP.start( 'antwrp.gsfc.nasa.gov', 80 ) do |http|
http.get( "/apod/ap#{r_date}.html" ).body
end

#
# get the HTML and process contents line by line
#

found_title = false
found_image = false

text.each { |line|

#
# find the title of the current pic
#

if !found_title && line =~ /<TITLE>/i
title = $'

puts "----------------------------------------------"
printf("%s\n", title.strip)
puts "----------------------------------------------\n\n"

found_title = true
end



#
# now start searching for the image name
#

if !found_image && line =~ /\.(jpg|gif) *\">/i

if line.include? "jpg" # is there a way to case do insensitve comparison here?
image = $`+".jpg"
else
image = $`+".gif"
end

url="http://apod.nasa.gov/apod/"
image=image.gsub(/<a +href="/i,'')
fetch_url=url+image

printf("%s\n", fetch_url)

found_image = true
system("eog " + fetch_url + " >& /dev/null &")
end

}
 
S

Sander Land

Hi,

I just wrote my first Ruby program and I would be very interested in
some feedback as to style and how to write things more in the
Ruby-way.

I know I could use more methods and/or possibly classes, and that
will happen, but for now I would like to hear what betrays my
programming style coming from other languages.

Thanks,
Esmail
...
Works .. but it's not pretty, at least in terms of lack of methods
etc. Other ways to improve this?

Hi Esmail,

Here's my shot at rewriting your program.
Note that this is simply how I would write it, not everything changed
is necessarily an 'improvement'.

require 'open-uri'
def random_date(years_back=5)
Time.at(Time.now - rand(years_back * 365 * 24 * 3600)).strftime('%y%m%d')
end

r_date=random_date(10)
puts r_date

open("http://antwrp.gsfc.nasa.gov/apod/ap#{r_date}.html") {|f|
text = f.read
if text =~ /<TITLE>([^<]*)/i
puts "----------------------------------------------"
puts $1.strip
puts "----------------------------------------------\n\n"
end

if text =~ /href=\"([^\"]+(jpg|gif))\">/i
image = $1
fetch_url = "http://apod.nasa.gov/apod/" + image
puts fetch_url
# system(...)
end
}

Comments:

- your random_date generated invalid dates sometimes, 01/30/02 and
such. Using timestamps for everything is a bit more stable (Although
slightly off if lots of leap years are in the range).
- puts does to_s automatically, so printf("%s\n",..) is never needed
- reading the text per line was meant as an optimization? This is
probably not necessary.
- $' / $` and such are rarely used in Ruby, try to avoid them when the
more readable $1 or String#scan will do.
- more methods / classes are not really needed for such a small program.
 
D

David Vallner

--------------enigA6EB6D28A0F0C3339CE66AE7
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

I'd probably have used OpenURI and then

text =3D open('http://www.example.com/').read

which I find a little more readable.

Oh, and probably also a HTML parser like Hpricot to make getting the
title tag more foolproof.

I also strongly disapprove of the Perl-ish use of global variables that
get set as a side-effect of a regex match. Notably because that's a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

David Vallner


--------------enigA6EB6D28A0F0C3339CE66AE7
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (MingW32)

iD4DBQFFJX4Gy6MhrS8astoRAotyAJdV+x2/G7GKUEec9wTwdJHkWwzbAJ4qkHq/
iWyBrjiMynIk03lUQPXGEw==
=GZOx
-----END PGP SIGNATURE-----

--------------enigA6EB6D28A0F0C3339CE66AE7--
 
T

Timothy Goddard

Esmail said:
Hi,

I just wrote my first Ruby program and I would be very interested in
some feedback as to style and how to write things more in the
Ruby-way.

I know I could use more methods and/or possibly classes, and that
will happen, but for now I would like to hear what betrays my
programming style coming from other languages.

Thanks,
Esmail



Briefly, this tiny program grabs a HTML for a random date within the
last 10 years from the Astro Picture of the Day (APOD) site
(http://antwrp.gsfc.nasa.gov/apod/archivepix.html).

It then parses the HTML file to extract the name of the larger .jpg or
.gif file (there is usually a smaller version too), generates the
correct URL for the image, and then fires up eog (a Linux program to
display images) to fetch and display the image.

Works .. but it's not pretty, at least in terms of lack of methods
etc. Other ways to improve this?
<<snip>>

You could use Hpricot to do this much more easily and reliably. I've
whipped up an example of using it for this. Note also that your random
dat function could produce a date in the future up to the end of the
current year. Error checking when using any of the network libraries is
also a must.

For a first program that's pretty good though! You obviously have the
hang of using Ruby, you just need to pick up on a few of the libraries
out there and how best to use them.

Here's how I would do it:

# Remove the rubygems require if you manually installed hpricot
require 'rubygems'
require 'hpricot'
require 'net/http'

class Time
def self.random(years_back = 10)
# Set start and end times
end_time = Time.now
start_time = Time.mktime(end_time.year - 10, end_time.month,
end_time.day, 0, 0,0,0)

Time.at(start_time + rand(end_time - start_time))
end
end

# Select a date
r_date = Time.random.strftime("%y%m%d")
puts r_date

# Retrieve the page
response = Net::HTTP.get_response("antwrp.gsfc.nasa.gov",
"/apod/ap#{r_date}.html")
unless (200..299) === response.code.to_i
puts "There was an error retrieving the document"
puts "Response code: #{response.code}"
exit
end

# Parse the document
doc = Hpricot(response.body)

# Extract title
title = (doc / "title").first.inner_html
puts "Title: #{title}"

# Extact all images
images = (doc/"img")
if images.length == 0
puts "Could not extract the image from the document."
exit
end

# Assume the first image is the one we want.
image = images.first["src"]

puts "Running eog (whatever that is) to load image #{image}."

# Execute the program. Note that it depends on the image path being
relative.
system("eog http://apod.nasa.gov/apod/#{image}")
 
G

gwtmp01

I also strongly disapprove of the Perl-ish use of global variables
that
get set as a side-effect of a regex match. Notably because that's a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

Just because they start with a $ doesn't mean they are global. I
believe
$1, $2 and so on are actually per thread local variables.

Gary Wright
 
E

Esmail Bonakdarian

David said:
I'd probably have used OpenURI and then

text = open('http://www.example.com/').read
which I find a little more readable.

ah .. cool .. I got what I had before from some sample
code I found either in a book or on line. Lots of ways
to do things, hence my query, and goal to find out :)
Oh, and probably also a HTML parser like Hpricot to make getting the
title tag more foolproof.

Some one else had mentioned Hpricot .. but that's an add-on, right?
Ie not part of the standard Ruby distribution? My goal is to write
code that will run pretty much everywhere w/o special requirements,
but I will look into Hpricot since it seems very useful.
I also strongly disapprove of the Perl-ish use of global variables that
get set as a side-effect of a regex match. Notably because that's a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

Oh, I'm all for explicitly coding too .. (ie don't rely on default
initial values for variables, always explicitly initialize them myself
- good habit in case you work with a language that doesn't do that for
you, like C. or use parens etc.)

So, by globals, are you referring to the $` and $' ?
 
M

MonkeeSage

David said:
I also strongly disapprove of the Perl-ish use of global variables that
get set as a side-effect of a regex match. Notably because that's a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

Hi David,

Ruby embraces the side-effect. This doesn't mean unpredictability as
many functional programmers assume; it means that the conditions which
*always cause the same side-effect* should be incorporated into the
control flow, leading to the same output from the same input, always.

s = 'tree'
if s =~ /(.ree)/
p $1
end
# => "tree"
if s =~ /(free)/
p $1
end
p $1 # => nil

$1 will *always* be non-nil after a grouped match, and nil after a
non-match or an ungrouped match. So the side-effect is fully
predictable.

Regarding global variables; I don't see anything wrong with them *in
concept*; granted you don't always need them, and you can explicitly
pass around a local variable (or use an object attribute like an
instance variable) to get the same effect in most cases, but that is
the same thing *conceptually* as using the global scope to implicity
pass the value. Also, granted that you should not clutter up the global
namespace with a bunch of junk, especially not vast quantities of
large, short-lived data; but the result of a regexp match which is
released after the next match attempt doesn't seem to be in that
category.

But those issues aside, I also favor being explicit for the sake of
clarity.

m = /(.ree)/.match('tree')
if m and m[1]
p m[1]
end

But even then, $1 still gets set. ;)

Regards,
Jordan
 
D

David Vallner

--------------enig2775682B8151561AC39C888D
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Just because they start with a $ doesn't mean they are global. I belie= ve
$1, $2 and so on are actually per thread local variables.
=20

There's still the readability problem with that. You're communicating
information in the program via an implicitly assumed point instead of an
explicitly declared one.

David Vallner


--------------enig2775682B8151561AC39C888D
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (MingW32)

iD8DBQFFJZlBy6MhrS8astoRAh9YAJ4u7rBoZLsU2n0KyUWJ+u/zsH8SuQCbBwCm
RcuYxfkperaGyuqQFKR4obo=
=UseD
-----END PGP SIGNATURE-----

--------------enig2775682B8151561AC39C888D--
 
E

Esmail Bonakdarian

Sander said:
Here's my shot at rewriting your program.
Note that this is simply how I would write it, not everything changed
is necessarily an 'improvement'.

Hi

I do appreciate the feedback!
require 'open-uri'
def random_date(years_back=5)
Time.at(Time.now - rand(years_back * 365 * 24 * 3600)).strftime('%y%m%d')
end

ah yes, I thought about that after I posted this. Didn't know about
strftime, hence my C sprintf standby :) I may slightly adjust this
to start with a date in 1995 and work forward to the present.
r_date=random_date(10)
puts r_date

open("http://antwrp.gsfc.nasa.gov/apod/ap#{r_date}.html") {|f|
text = f.read
if text =~ /<TITLE>([^<]*)/i
puts "----------------------------------------------"
puts $1.strip
puts "----------------------------------------------\n\n"
end

if text =~ /href=\"([^\"]+(jpg|gif))\">/i
image = $1
fetch_url = "http://apod.nasa.gov/apod/" + image
puts fetch_url
# system(...)
end
}

nice ...
Comments:

- your random_date generated invalid dates sometimes, 01/30/02 and
such. Using timestamps for everything is a bit more stable (Although
slightly off if lots of leap years are in the range).
- puts does to_s automatically, so printf("%s\n",..) is never needed
- reading the text per line was meant as an optimization? This is
probably not necessary.

no, not really, just thought the string matching would work
better with line-by-line.
- $' / $` and such are rarely used in Ruby, try to avoid them when the
more readable $1 or String#scan will do.
- more methods / classes are not really needed for such a small program.

Thanks for taking time to make these suggestions/comments, they are
very useful and just exactly what I was looking for.

Esmail
 
E

Esmail Bonakdarian

Timothy said:
You could use Hpricot to do this much more easily and reliably. I've
whipped up an example of using it for this. Note also that your random
dat function could produce a date in the future up to the end of the
current year. Error checking when using any of the network libraries is
also a must.

Agreed .. more error checking needs to be done .. as you mention
below, I need to browse the libraries and find out what is available
and use it :)
For a first program that's pretty good though! You obviously have the
hang of using Ruby, you just need to pick up on a few of the libraries
out there and how best to use them.

I've been programming for quite a while, but each language has
its own idiomatic ways for doing things ... which is why this
feedback is so valuable.
Here's how I would do it:

# Remove the rubygems require if you manually installed hpricot
require 'rubygems'
require 'hpricot'
require 'net/http'

class Time
def self.random(years_back = 10)
# Set start and end times
end_time = Time.now
start_time = Time.mktime(end_time.year - 10, end_time.month,
end_time.day, 0, 0,0,0)

Time.at(start_time + rand(end_time - start_time))
end
end

# Select a date
r_date = Time.random.strftime("%y%m%d")
puts r_date

# Retrieve the page
response = Net::HTTP.get_response("antwrp.gsfc.nasa.gov",
"/apod/ap#{r_date}.html")
unless (200..299) === response.code.to_i
puts "There was an error retrieving the document"
puts "Response code: #{response.code}"
exit
end

# Parse the document
doc = Hpricot(response.body)

# Extract title
title = (doc / "title").first.inner_html
puts "Title: #{title}"

# Extact all images
images = (doc/"img")
if images.length == 0
puts "Could not extract the image from the document."
exit
end

# Assume the first image is the one we want.
image = images.first["src"]

puts "Running eog (whatever that is) to load image #{image}."

# Execute the program. Note that it depends on the image path being
relative.
system("eog http://apod.nasa.gov/apod/#{image}")

Thank you!

Esmail
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,189
Members
46,735
Latest member
HikmatRamazanov

Latest Threads

Top