Extracting the URL from an Hpricot element

Thread starter Nikita Ratlos
Start date Dec 10, 2008

I want to get a list of URLs from a webpage as follows:

First I create the Hpricot element as follows:

doc = Hpricot(open(searchurl))
links = doc/"//html//body//div[6]//div[2]//a[@id='p-1']"

Next I want to append the URLs to an array as such:

results << links.map.each{|link| puts link.attributes['href'] }

That line prints the URLs exactly as I need them, but it then puts
the whole HTML link element into the results array instead of the URL.

Any ideas on how to get just the URLs (without the HTML) into my results array?
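
One likely explanation, offered as a sketch rather than a definitive answer: `puts` returns nil, so the block never hands the href back to `map`, and chaining `map.each` means the value that reaches `results` is not the block's result (what it is instead depends on the Ruby version). Passing the block straight to `map` and leaving out `puts` should collect the href strings. The example below uses a stand-in `FakeLink` struct and a helper named `extract_hrefs`, both hypothetical names introduced only so the snippet runs without Hpricot or a network connection; with Hpricot, `links` would be the result of the `doc/"..."` search above.

```ruby
# Stand-in for an Hpricot element (an assumption for illustration only):
# it models just the `attributes` lookup used in the question.
FakeLink = Struct.new(:attributes)

# Return the href of every link as a plain array of strings.
# map collects the value of the block, so the block must return the href.
# `puts` is deliberately left out because it returns nil.
def extract_hrefs(links)
  links.map { |link| link.attributes['href'] }
end

links = [
  FakeLink.new({ 'href' => 'http://example.com/a' }),
  FakeLink.new({ 'href' => 'http://example.com/b' })
]

results = []
# concat appends each URL individually; `results << array` would instead
# push the whole array as a single nested element.
results.concat(extract_hrefs(links))

# Print separately, once the URLs are safely in the array.
results.each { |url| puts url }
```

If printing while collecting is still wanted, the block can call `puts` first and then return the href on its last line.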
