Adam Akhtar
Hi, I'm starting to use Hpricot and I'm having problems extracting
descriptions of various concert events from a page. Here is a sample of
the HTML:
<p>
<a name="concerts"/>
<span class="heading">Concerts</span>
<br/>
<span class="subheading">POPULAR</span>
<br/>
<br/>
<span class="textbold">Middle Field! Vol.4</span >
<br/>
Featuring electric-pop band The Stealth, Mac and Masaru, and others. Mar
28, 7pm, ¥2,500 (adv)/ ¥3,000 (door). Shibuya O-Nest. Tel: 03-3498-9999.
<br/>
<br/>
<span class="textbold">Philip Woo featuring Brenda Vaughn</span>
<br/>
Japanese pianist and soul singer performing with Andy Wulf and Kaori
Kobayashi. Mar 28 & 29, 7 & 9:30pm, ¥3,150. Cotton Club, Marunouchi.
Tel: 03-3215-1555.
<br/>
...
...
...
etc
I can get the artist/band names fine using
names = doc.search("//span[@class='textbold']")
but I can't get the descriptions. The descriptions aren't
individually wrapped in any tags; they are just clumped together
under the paragraph tag, separated by line breaks (<br/>).
So I thought I'd just try
descriptions =
doc.search("/html/body/div/table/tbody/tr[4]/td/table/tbody/tr/td[2]/table/tbody/tr/td/span/p")
but when I try to puts descriptions, nothing is printed to the screen.
How would I go about getting this info? Any tips or ideas?
Thanks
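
To make the problem concrete, here is a rough sketch of one way the names and descriptions could be paired up. It relies only on the layout in the sample above: each textbold span is followed by a <br/> and then the bare description text. It uses plain Ruby string matching rather than any Hpricot call, and extract_events is a made-up helper name, not part of any library. The html string is pasted from the sample; in a real script it would presumably come from the parsed page (for example the inner HTML of the <p> element). It would likely break if a description itself contained markup.

```ruby
# Sample fragment copied from the post. In practice this string would come
# from the fetched page rather than being pasted in (assumption).
html = <<~HTML
  <span class="textbold">Middle Field! Vol.4</span >
  <br/>
  Featuring electric-pop band The Stealth, Mac and Masaru, and others. Mar
  28, 7pm, ¥2,500 (adv)/ ¥3,000 (door). Shibuya O-Nest. Tel: 03-3498-9999.
  <br/>
  <br/>
  <span class="textbold">Philip Woo featuring Brenda Vaughn</span>
  <br/>
  Japanese pianist and soul singer performing with Andy Wulf and Kaori
  Kobayashi. Mar 28 & 29, 7 & 9:30pm, ¥3,150. Cotton Club, Marunouchi.
  Tel: 03-3215-1555.
  <br/>
HTML

# Hypothetical helper: pairs each textbold span with the untagged text that
# follows its <br/>, stopping at the next tag.
def extract_events(html)
  # Group 1: the span contents (the name).
  # Group 2: everything after the following <br/> up to the next "<".
  pattern = %r{<span class="textbold">(.*?)</span\s*>\s*<br\s*/?>\s*([^<]*)}m
  html.scan(pattern).map do |name, description|
    # Collapse the line breaks inside a description into single spaces.
    { name: name.strip, description: description.gsub(/\s+/, " ").strip }
  end
end

events = extract_events(html)
events.each { |event| puts "#{event[:name]}: #{event[:description]}" }
```

As for the long absolute path returning nothing, one possible cause (a guess, not verified against the real page) is that paths copied from a browser often include tbody elements that the browser adds itself and that are not in the raw HTML.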