Kjell Olsen
I'm trying to make requests to Google Translate
[http://google.com/translate_t] to translate words from a little Ruby
script. I can't get open-uri to work on URLs with accented characters
(åéîòü etc.).
Just calling open() gives:
URI::InvalidURIError: bad URI(is not URI?): http://google.com/translate_t?langpair=en|fr&text=élire
from /usr/local/lib/ruby/1.8/uri/common.rb:432:in `split'
from /usr/local/lib/ruby/1.8/uri/common.rb:481:in `parse'
from /usr/local/lib/ruby/1.8/open-uri.rb:29:in `open'
from (irb):2
from :0
Running the URL through URI.encode() breaks the characters down into
gunk (é => %C3%A9), which the translator doesn't understand.
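To make the "gunk" concrete, here is a small sketch of where %C3%A9 comes from. It is only an illustration: percent_encode_bytes is a made-up helper, not part of URI or open-uri, and the idea that the translator might be expecting Latin-1 bytes instead is a guess on my part, not something I've confirmed. It also relies on String#encode, which needs Ruby 1.9 or later (on 1.8, Iconv would be the rough equivalent).

```ruby
# Percent-encode every byte of a string.
# Hypothetical helper, for illustration only (not a URI/open-uri method).
def percent_encode_bytes(str)
  str.bytes.map { |b| format('%%%02X', b) }.join
end

e_acute = "\u00E9" # "é" as a UTF-8 string

# In UTF-8, "é" is two bytes (0xC3 0xA9), so it becomes two escapes.
utf8_form = percent_encode_bytes(e_acute)

# In ISO-8859-1 (Latin-1), "é" is the single byte 0xE9.
latin1_form = percent_encode_bytes(e_acute.encode('ISO-8859-1'))

puts utf8_form   # %C3%A9
puts latin1_form # %E9
```

So the escaped URL is at least a valid URI; whether the server decodes those bytes as UTF-8 or as something else is a separate question.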
I've tried fooling with $KCODE and switching to net/http, neither to
any avail. Can anyone lend me a hand? I'll paste the code in full
below in case anyone wants to see it.
-kjell
--- translate.rb ------------
#!/usr/local/bin/ruby
%w[rubygems open-uri hpricot active_support readline].each {|lib| require lib}
$KCODE = 'u'

# can't handle â/é/utf8 chars because open-uri won't let us have
# special chars in a query, nor will net/http
class GoogleTranslator
  attr_reader :doc
  @@langs = 'fr|en'

  def initialize(text, langs=@@langs)
    @text, @langs = text.chomp, langs
    # store in @doc (lowercase) so attr_reader :doc and #result can see it
    @doc = Hpricot(open(URI.encode("http://google.com/translate_t?langpair=#{@langs}&text=#{@text}")))
  end

  def result
    @result ||= @doc.search('#result_box').inner_html
    write_history_line
    @result
  end

  # keep a file with all the translations, just for kicks.
  def write_history_line
    `echo "#{"[#{@langs}]\t#{@text}\t\t->\t#{@result}"}" | cat >> '/Users/kjell/mess/2007/03/translation-history.txt'`
  end

  class << self
    include Readline

    def prompt; "[#{@@langs}] > "; end

    def interact!
      while text = readline(prompt, true)
        # either switch languages or spit out a translation
        (text =~ /lang: (.*)/) ? @@langs = $1 : puts("#{GoogleTranslator.new(text).result}")
      end
    end
  end
end

# if called with an argument, translate that argument; else set up for interaction
ARGV[0] ? puts(GoogleTranslator.new(*ARGV[0..1]).result) : GoogleTranslator.interact!