Corey Watts
Hey there everyone. I'm having a slight problem using Mechanize. I'm
trying to scrape yellowpages.com and extract information about each
business listing. I'm extracting all the information I want, except for
one small portion: the business's website. It is the href inside of a
link that I am trying to scrape. As far as I know, I'm following the
correct xpath rules, but I can't seem to get the part I want. One
tricky thing that I've had to deal with is that not every listing has a
website. The website link and the "learn more" link are very similar,
xpath-wise, so I have to use an if statement to check the inner text of
both of them to make sure that I'm extracting the right one.
I'm scraping from
http://yellowpages.com/santa-barbara-ca/restaurants?page=1 and my code
is attached.
Thanks so much for your help!
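
For anyone reading without the attachment, here is a minimal sketch of the approach described above: walk each listing, pick the link whose inner text is "Website", and read its href, falling back to nil when a listing has no website. The markup, the class name "listing", the business names, and the helper name websites are all made up for illustration, since the real yellowpages.com HTML is not shown here. It uses REXML from the Ruby standard distribution so it runs on its own; with Mechanize the same idea would presumably go through page.search and Nokogiri nodes instead.

```ruby
require "rexml/document"

# Hypothetical markup: NOT the real yellowpages.com HTML, just two fake
# listings where only the first one has a "Website" link.
SAMPLE = <<~HTML
  <div>
    <div class="listing">
      <h3>Cafe One</h3>
      <a class="track" href="http://cafe-one.example">Website</a>
      <a class="track" href="/cafe-one/details">Learn More</a>
    </div>
    <div class="listing">
      <h3>Cafe Two</h3>
      <a class="track" href="/cafe-two/details">Learn More</a>
    </div>
  </div>
HTML

# Hypothetical helper: maps each business name to its website href,
# or to nil when the listing has no "Website" link.
def websites(xml)
  doc = REXML::Document.new(xml)
  result = {}
  REXML::XPath.match(doc, "//div[@class='listing']").each do |listing|
    name = listing.elements["h3"].text.strip
    # Both links look alike structurally, so tell them apart by inner text.
    link = listing.get_elements("a").find { |a| a.text.to_s.strip == "Website" }
    # The URL lives in the href attribute, not in the element's text.
    result[name] = link && link.attributes["href"]
  end
  result
end

p websites(SAMPLE)
```

The key point is that the URL is read from the link's href attribute rather than from its text, and that a missing link yields nil instead of accidentally picking up the "Learn More" href.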