Marc Farber
Ruby newbie here
I have successfully used Hpricot to scrape the correct <div> from the desired page,
http://www.montgomeryadvertiser.com/section/obits, using:
doc = Hpricot(uri above)
...
@grab1 = doc.search("//div[@class='article-bodytext']")
The target data is in the following logical form:
<div>
<h3>name of funeral home</h3>
<p>deceased1</p>
<div>advertising crap</div>
<h3>funeral home 2</h3>
<p>deceased 2</p>
<p>deceased 3</p>
</div>
I'm struggling to iterate through this div and build an array or hash from
which I can feed a database, with each record being a funeral home and a person.
I was thinking I could go through each of the @grab1 elements, process it
according to its tag type, and establish the "record" logic simply by
knowing that a new record starts with each new h3 tag.
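To make the idea concrete, here is a rough sketch of the grouping logic I have in mind. It deliberately leaves Hpricot out and works on plain [tag, text] pairs so the logic can be checked on its own. The method name build_records and the pair format are placeholders I made up, and I am only assuming that such pairs could be built from the children of the scraped div (for example from each element's name and inner_text).

```ruby
# Hypothetical helper (not part of Hpricot): turns a flat list of
# [tag, text] pairs into one record per deceased person, using the
# most recent h3 as the funeral home.
def build_records(nodes)
  records = []
  current_home = nil

  nodes.each do |tag, text|
    case tag
    when "h3"
      # Each h3 starts a new funeral home section.
      current_home = text.strip
    when "p"
      # Each p is one person belonging to the current funeral home.
      # A p that appears before any h3 is skipped.
      if current_home
        records << { :funeral_home => current_home, :deceased => text.strip }
      end
    end
    # Any other tag (such as the advertising div) is ignored.
  end

  records
end

# Sample input mirroring the logical form shown above.
# Assumption: with Hpricot these pairs might come from something like
# [child.name, child.inner_text] for each element child of the div.
nodes = [
  ["h3",  "name of funeral home"],
  ["p",   "deceased1"],
  ["div", "advertising crap"],
  ["h3",  "funeral home 2"],
  ["p",   "deceased 2"],
  ["p",   "deceased 3"]
]

records = build_records(nodes)
records.each { |r| puts "#{r[:funeral_home]}: #{r[:deceased]}" }
```

Each hash in records would then map to one database row. Whether the real page is as regular as this sample is something I have not verified.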
Any help for a newbie with a first Ruby script?
Thx