scrubyt scraper help

Corey Watts · Oct 1, 2010

Hello all. I'm trying to build a simple web scraper to mine some data
off the yellow pages. Specifically, this link:
http://www.yellowpages.com/santa-barbara-ca/restaurants

I'm scraping all the information that I need correctly. I'm very
pleased about that! However, I'm only able to scrape the first page. I
want my script to automatically go to the next page after the first one
has been scraped, and the next after that. Scrubyt's "next_page"
function can do this, but it can only use a full URL. On this website,
however, the "Next" link at the bottom is a relative link. Is there any
way I might be able to grab the URL of the website and add the relative
link onto it, and then go to the next page? Or is there another way of
doing it? I really appreciate the help! Thanks so much.

My code is as follows:

require 'rubygems'
require 'scrubyt'

yellowpages_data = Scrubyt::Extractor.define do

  # Perform the action(s)
  fetch 'http://www.yellowpages.com/santa-barbara-ca/restaurants'

  # This part does the scraping
  listing "//div[@class='listing_content']" do
    name "Pascucci"
    #street "792 State St,"
    street "//span[@class='street-address']"
    city "//span[@class='locality']"
    state "//span[@class='region']"
    zip_code "//span[@class='postal-code']"
    phone "//span[@class='business-phone phone']"

    # This is the function I was talking about. It needs a full
    # link to work, but I only have a relative one!
    next_page "Next", :limit => 2
  end
end

puts yellowpages_data.to_xml.write($stdout, 1)
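
For the "grab the URL and add the relative link onto it" idea, here is a minimal sketch using URI.join from Ruby's standard library. The helper name absolute_url and the example href value are assumptions made for illustration; the real "Next" link on the site may look different, and this only shows the URL-joining step, not how to feed the result back into scRUBYt.

```ruby
require 'uri'

# Resolve an href (relative or absolute) against the URL of the page
# it was found on. Hypothetical helper, not part of scRUBYt.
def absolute_url(page_url, href)
  URI.join(page_url, href).to_s
end

base = 'http://www.yellowpages.com/santa-barbara-ca/restaurants'

# Assumed href value for illustration only; check the actual "Next" link.
next_href = '/santa-barbara-ca/restaurants?page=2'

puts absolute_url(base, next_href)
```

URI.join follows the standard reference-resolution rules, so an href that is already a full URL comes back unchanged, and a path-relative href is resolved against the directory of the base URL.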
