find a button using mechanize

Li Chen

Hi all,

I wonder how I can find a button on a page with Mechanize. The form shows:
{buttons
#<WWW::Mechanize::Form::Button:0x3a69908 @name=nil, @value="Go">}

Thanks,

Li
 
Lex Williams

Well, you should iterate through the fields of each form:

my_form = nil

catch(:FoundForm) do
  agent.page.forms.each do |form|
    form.fields.each do |field|
      if(field.value == "Go")
        my_form = form
        throw FoundForm
      end
    end
  end
end

And now you would have the form containing the button, and could
operate on it.
 
Lex Williams

I'm sorry, instead of:
throw FoundForm
you should have:
throw :FoundForm

So sorry about that.
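
For anyone following along, the difference between the two spellings can be tried without Mechanize. The sketch below uses hypothetical Form and Field structs as stand-ins for the real Mechanize objects (they are assumptions for illustration, not the library's classes). It shows throw :FoundForm leaving both loops, and a bare FoundForm raising NameError, but only once that line is actually reached. If no field ever matches, the faulty line never runs, which may be why the first version seemed to run silently.

```ruby
# Hypothetical stand-ins for Mechanize forms and fields (illustration only).
Field = Struct.new(:name, :value)
Form  = Struct.new(:name, :fields)

# Returns the first form that has a field whose value equals `wanted`, or nil.
def find_form(forms, wanted)
  found = nil
  catch(:FoundForm) do
    forms.each do |form|
      form.fields.each do |field|
        if field.value == wanted
          found = form
          throw :FoundForm   # a Symbol: unwinds both loops up to the catch
        end
      end
    end
  end
  found
end

sample_forms = [
  Form.new("login",  [Field.new("user", "")]),
  Form.new("search", [Field.new("q", "Go")])
]

puts find_form(sample_forms, "Go").name

# A bare FoundForm is a constant lookup, so Ruby raises NameError
# when execution reaches it.
begin
  catch(:FoundForm) { throw FoundForm }
rescue NameError => e
  puts "NameError: #{e.message}"
end
```

find_form returns nil when nothing matches, so the caller can check the result before using it.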
 
Li Chen

Hi Lex,
I followed your script but it prints nothing. I guess the inner loop
is not right. According to this line:
{buttons
#<WWW::Mechanize::Form::Button:0x3a6702c @name=nil, @value="Go">}

I changed the inner loop to go through the buttons. Now I get all the
buttons and I am able to set @name or @value. But I still have a
problem: I can't print the whole line as
{buttons
#<WWW::Mechanize::Form::Button:0x3a6702c @name=nil, @value="Go">}

I can only print part of it:
#<WWW::Mechanize::Form::Button:0x3a6702c>

BTW: where can I find some information about the relationship among
forms, fields and buttons, and other background?

Thanks,

Li

######################
page.forms.each do |form|
  form.buttons.each do |button|
    if(button.value == 'Go')
      puts button.name
      puts button.value
      button.name='1'
      puts button.name
      puts button
    end
  end
end



###output

nil
Go
1
#<WWW::Mechanize::Form::Button:0x3ab4868>
nil
Go
1
#<WWW::Mechanize::Form::Button:0x3a9ac24>
nil
Go
1
#<WWW::Mechanize::Form::Button:0x3a807ac>
nil
Go
1
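
A note on the shortened output: puts calls to_s on its argument, and the default to_s shows only the class and object id, while inspect (which p uses) also lists the instance variables. Below is a minimal sketch with a hypothetical Button class standing in for WWW::Mechanize::Form::Button. It assumes the real class does not override to_s, which the output above suggests but which is not confirmed here.

```ruby
# Hypothetical stand-in for a Mechanize button (illustration only).
class Button
  attr_accessor :name, :value

  def initialize(name, value)
    @name = name
    @value = value
  end
end

button = Button.new(nil, "Go")

puts button          # uses to_s: class name and object id only
puts button.inspect  # also lists @name and @value
p button             # same text as puts button.inspect
```

So in the loop above, replacing puts button with p button (or puts button.inspect) should print the full line.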
 
Lex Williams

Could you please post the link to the site so that I can see where the
script is going wrong?
 
Lex Williams

You must have missed my update on that method, if you're using the code
you pasted above. It should throw :FoundForm instead of FoundForm.
Like this:

my_form = nil

catch(:FoundForm) do
  agent.page.forms.each do |form|
    form.fields.each do |field|
      if(field.value == "Go")
        my_form = form
        throw :FoundForm
      end
    end
  end
end
 
Li Chen

Lex said:
You must have missed my update on that method, if you're using the code
you pasted above. It should throw :FoundForm instead of FoundForm.
Like this:

Lex,

No, I didn't miss it. I tried both of your scripts but they don't work.
For me exception handling is not a priority, and without it a Ruby
script still works very well (unlike Java...).

Here is the webpage:
http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary

I am trying to:
1) type a word such as "abacus"
2) click the "Go" button (there are four "Go" buttons but I am
only interested in the first one)
3) retrieve the definition (I might use Hpricot to do that)


Thanks for the follow-up,

Li
 
Lex Williams

Li, I wouldn't try to find a form by searching for its button.
Rather, try to search for the name of the input field associated with
the button. Here is how I automated
http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary :

require "rubygems"
require "mechanize"

mech = WWW::Mechanize.new
mech.user_agent_alias = "Windows IE 6"
mech.get("http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary")

# Take the first form that has a field named aj_text1.
form = mech.page.forms.select {|f| f.has_field?("aj_text1")}.first

if(form == nil)
  abort "could not find form"
end

# Fill in the word and submit the form.
form.aj_text1 = "abacus"
new_page = mech.submit(form)
puts new_page.body
 
Li Chen

Hi Lex,

Thank you very much.
Now the script is working. But I want to use hpricot to extract some
info from the retrieved page. How can I do that?


Li
 
Lex Williams

Li, please post examples. What is it you want to extract? From what
page? Why Hpricot (I assume you want it for XPath, but I can't be
sure)?
 
Li Chen

Lex said:
Li, please post examples. What is it you want to extract? From what
page? Why Hpricot (I assume you want it for XPath, but I can't be
sure)?

I need to 1) extract the definition of 'abacus' (see the following) from
this page:

abacus (n.) A manual computing device consisting of a frame holding
parallel rods strung with movable counters.
abacus (n.) A slab on the top of the capital of a column.

2) download the 'wav' file for this word and save it to my computer (so
that I can play it later).

Since I have a little experience with Hpricot, I think using Hpricot
might help me extract the definition of 'abacus'. But I am not sure
whether Mechanize can do the same thing. This is the reason I want to
try Hpricot.

Thanks,


Li
 
Lex Williams

Li, here is the script that downloads the wav file. It's kind of late
here, and I'm not really in the mood to extract definitions right
now. Maybe tomorrow. Here's the code:

require "rubygems"
require "mechanize"

mech = WWW::Mechanize.new
mech.user_agent_alias = "Windows IE 6"
mech.get("http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary")

# Take the first form that has a field named aj_text1.
form = mech.page.forms.select {|f| f.has_field?("aj_text1")}.first

if(form == nil)
  abort "could not find form"
end

form.aj_text1 = "abacus"
new_page = mech.submit(form)

# Take the first link on the result page that ends in .wav and save it.
wav_link = mech.page.links.select {|link| link.href =~/\.wav$/i}.first
puts "downloading #{wav_link.href}"
mech.get(wav_link).save_as(File.basename(wav_link.href))
 
Li Chen

Hi Lex,

I ran your script and downloaded a wav file of 'abacus'. I find that I
cannot play the wav file with WMP. But I can play it if it is
downloaded directly from the IE browser. This is bizarre. I wonder how
to fix it.

Thanks,

Li
 
Martin DeMello

Compare the two files and see if they are the same. The site might
have some sort of referrer-based protection against automated
downloads.

martin
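
One way to do that comparison from Ruby itself is sketched below. It relies only on the standard library and on the fact that WAV files start with the bytes "RIFF" and carry "WAVE" at offsets 8 to 11. The method name describe_download is made up for this example, and the HTML check is a rough heuristic rather than a real parser.

```ruby
# Summarise a downloaded file: its size, whether it starts like a WAV file,
# and whether its first bytes look like an HTML page (rough heuristic).
def describe_download(path)
  header = File.binread(path, 12) || ""
  start  = File.binread(path, 512) || ""
  {
    size: File.size(path),
    wav:  header[0, 4] == "RIFF" && header[8, 4] == "WAVE",
    html: !!(start =~ /<html|<!doctype/i)
  }
end

# Pass the two downloaded files on the command line.
ARGV.each do |path|
  puts "#{path}: #{describe_download(path).inspect}"
end
```

Running it on both copies shows at a glance whether the sizes differ and which of the two is really audio.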
 
Li Chen

Martin said:
Compare the two files and see if they are the same. The site might
have some sort of referrer-based protection against automated
downloads.

martin

The file downloaded directly is about 9 KB and the one downloaded by the
script is 44 KB.

Li
 
Martin DeMello

It's probably not the wav file, then. Does opening it up in notepad
reveal anything?

martin
 
Li Chen

Martin said:
It's probably not the wav file, then. Does opening it up in notepad
reveal anything?

martin

The one downloaded directly is a binary file and the one downloaded by
the script is an HTML page.

Since I used the following script (by Lex) to download the file, I
wonder how to fix it.

Thanks,

Li


##################
require "rubygems"
require "mechanize"

mech = WWW::Mechanize.new
mech.user_agent_alias = "Windows IE 6"
mech.get("http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary")

form = mech.page.forms.select {|f| f.has_field?("aj_text1")}.first

if(form == nil)
  abort "could not find form"
end

form.aj_text1 = "abacus"
new_page = mech.submit(form)

wav_link = mech.page.links.select {|link| link.href =~/\.wav$/i}.first
puts "downloading #{wav_link.href}"
mech.get(wav_link).save_as(File.basename(wav_link.href))
 
Martin DeMello

puts "downloading #{wav_link.href}"
mech.get(wav_link).save_as(File.basename(wav_link.href))

Fetching the displayed href with wget works, so I'm guessing it's the
call to mech.get that's failing. There's probably a cleaner way to do it
via mechanize, but this works:

File.open(File.basename(wav_link.href), 'w') {|f|
f.puts(mech.get_file(wav_link.href))}

martin
 
Li Chen

Martin said:
wgetting the href displayed works, so I'm guessing it's the call to
mech.get that's failing. there's probably a cleaner way to do it via
mechanize, but this works:

File.open(File.basename(wav_link.href), 'w') {|f|
f.puts(mech.get_file(wav_link.href))}

martin


Hi Martin,

Now the wav file can be played properly.

I have another question: the file downloaded via script is 9.36 kb and
the one from browser is 9.35kb. What causes the discrepancy?

Another question: I also need to retrieve the definition of the word
corresponding to the wav file from the same page. Can Mechanize do that?
I plan to use Hpricot to extract the info. I wonder if you or others
have any suggestions.

Thank you very much,

Li
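
On the size discrepancy: one plausible cause (an assumption, nobody in the thread confirmed it) is that the file was opened in text mode ('w') and written with puts. puts appends a newline when the data does not already end with one, and on Windows text mode also expands each "\n" byte inside the data to "\r\n". Opening the file with 'wb' and using write avoids both. The sketch below uses made-up bytes in place of the real download, and written_sizes is a name invented for this example.

```ruby
require "tmpdir"

# Writes `data` once in text mode with puts and once in binary mode with
# write, then returns the resulting file sizes in bytes.
def written_sizes(data)
  Dir.mktmpdir do |dir|
    text_path   = File.join(dir, "text_mode.wav")
    binary_path = File.join(dir, "binary_mode.wav")

    # Text mode plus puts: may add a trailing newline and, on Windows,
    # rewrite "\n" bytes inside the data.
    File.open(text_path, "w") { |f| f.puts(data) }

    # Binary mode plus write: the bytes land on disk unchanged.
    File.open(binary_path, "wb") { |f| f.write(data) }

    { text: File.size(text_path), binary: File.size(binary_path) }
  end
end

# Made-up bytes standing in for the downloaded WAV data (illustration only).
data  = "RIFF\x24\x00\x00\x00WAVE\x0A\x0D".b
sizes = written_sizes(data)

puts "original:    #{data.bytesize} bytes"
puts "text mode:   #{sizes[:text]} bytes"
puts "binary mode: #{sizes[:binary]} bytes"
```

Applied to Martin's snippet, that would mean opening the file with 'wb' and calling f.write instead of f.puts on the result of mech.get_file.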
 
