Josh Cheek
I'm trying to write a script that pulls an image out of a yfrog page.
This is what I have so far:
require 'rubygems'
require 'hpricot'
require 'open-uri'
url = 'http://yfrog.com/03gssacj'
doc = Hpricot(open(url))
(doc%"#main_image").attributes['src'] # => "/img3/7036/gssac.jpg"
The problem is that the path is relative.
I've done a little googling, searched my Ruby and Rails mailing-list
archives, glanced at the Hpricot code, and looked through the method lists
for open-uri and Hpricot.
So far, I don't see anything that looks very useful.
Is there a way to have it give me the absolute path so that I can reference
the picture later?
The only thing I've found that works so far involves string manipulation,
which seems like a brittle workaround to replace something that probably
exists if I could just find it.
url = 'http://yfrog.com/03gssacj'
page = open(url)
base = page.base_uri.to_s[ /(?:http:\/\/)?[^\/]*\// ] # => "http://img3.yfrog.com/"
relative = (Hpricot(page)%"#main_image").attributes['src'] # => "/img3/7036/gssac.jpg"
absolute = URI.join( base , relative )
absolute.to_s # => "http://img3.yfrog.com/img3/7036/gssac.jpg"
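One variation that might avoid the regex: as far as I can tell, URI.join resolves a reference against a full base URI, so a src with a leading "/" should replace the whole path of the base. This is only a sketch. The base_uri value below is a hypothetical stand-in for page.base_uri (its path is made up), so that it runs without hitting the network.

```ruby
require 'uri'

# Hypothetical stand-in for page.base_uri.to_s (the URI after redirects);
# the path portion is invented for illustration.
base_uri = 'http://img3.yfrog.com/some/page.php'

# The relative src pulled from the #main_image element above.
relative = '/img3/7036/gssac.jpg'

# URI.join resolves the reference against the base, so the leading "/"
# replaces the base's path and only the scheme and host are kept.
absolute = URI.join(base_uri, relative)

puts absolute.to_s
```

With the real page, passing page.base_uri.to_s as the first argument should behave the same way, though I haven't checked it against every kind of src value.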
Anyone know of a better solution?