Mechanize select list help...?

Andy Pipes

Hi.

I'm using the excellent WWW::Mechanize to screen scrape a site for UK
frost dates (don't ask ;)

There are a lot of issues with the HTML not being grand, so I thought
that was where I was going wrong in my code, but I'd be really grateful
if somebody could give me a steer on this, as I've been trying for hours
and the documentation only gets me half-way :)

All I want to do is select each of the 100 or so towns in the select
list, follow the link via the submit button and scrape the first and
last frost dates from the resulting page.

Here's the code:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get('http://www.gardenaction.co.uk/main/weather1.asp')


town_results = page.form_with(:action => 'create_cookie.asp') do |e|
  e.fields.name('Town').options.each do |s|
    s.select
  end
end.submit

p town_results.search("/<p align=\"left\">HOME TOWN:(.*)<Form Method=Post Action=\"create_cookie.asp\">/")

I think what I'm actually getting back is the page itself, not the
results page (which should be
http://www.gardenaction.co.uk/main/weather1-results.asp)

Can anyone give me some advice here? It should be obvious I'm new to
Ruby and OO so am fully expecting to have gone wrong here with instance
variables or the like :)

thanks in advance.

andy
 
Mark Thomas

I don't think it's the ruby; you need to think it through a bit more.
How many times will you need to submit the form? Once per town,
correct? Therefore, the submit and parse should be inside the loop.

Try this for starters:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get('http://www.gardenaction.co.uk/main/weather1.asp')

form = page.form_with(:action => 'create_cookie.asp')
form.fields.name('Town').options.each do |town|
  form['Town'] = town   # choose this town in the select list
  page2 = form.submit   # one submit per town, inside the loop
  puts page2.body
  exit # remove when you're ready to process them all
end
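
Once page2 comes back, the dates still have to be pulled out of it. Note that search expects a CSS selector or XPath expression, not a regexp wrapped in a string, so a pattern like the one in the original post will not match anything. One option is to run plain Ruby regexps over page2.body instead. The sketch below is only a guess at the approach: the sample markup, the "Last frost date" / "First frost date" labels and the extract_frost_dates helper are all hypothetical, since the real results page isn't shown in this thread, so the patterns will need adjusting after looking at the actual body.

```ruby
# Hypothetical results markup: the real weather1-results.asp page may differ,
# so adjust the patterns after inspecting page2.body.
SAMPLE_BODY = <<~HTML
  <p align="left">HOME TOWN: Aberdeen</p>
  <p>Last frost date: Late May</p>
  <p>First frost date: Early October</p>
HTML

# Pull the town and both frost dates out of a results page body.
# Returns a Hash; a value is nil when its pattern does not match.
# (Hypothetical helper, not part of Mechanize.)
def extract_frost_dates(body)
  {
    town:        body[/HOME TOWN:\s*([^<]+)/i, 1],
    last_frost:  body[/Last frost date:\s*([^<]+)/i, 1],
    first_frost: body[/First frost date:\s*([^<]+)/i, 1]
  }.transform_values { |v| v && v.strip }
end

p extract_frost_dates(SAMPLE_BODY)
```

Inside the loop above, that would become something like extract_frost_dates(page2.body) in place of puts page2.body, collecting one hash per town.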
 
Andy Pipes

Thanks for the help, Mark. You're right, I needed to think it through a
bit more. Plus, I was unnecessarily using the select method.

Now I've got to find a way to grab the stuff I need from the following
pages...on to the docs again.

cheers, andy
 
