I've been trying to programmatically issue MSN searches and process
the results. I'm having a hell of a time doing it and was wondering
whether anyone had some coding or debugging advice.
First, I tried wsdl2ruby to generate some classes to work with, but it
pukes:
    C:\temp>wsdl2ruby.rb --wsdl http://soap.search.msn.com/webservices.asmx?wsdl --type client --force
    ignored element: {http://www.w3.org/2001/XMLSchema}list
    ignored attr: {}default
    ignored attr: {http://schemas.xmlsoap.org/ws/2004/08/addressing}Action
    I, [2006-10-10T16:36:52.259000 #2608] INFO -- app: Creating class definition.
    W, [2006-10-10T16:36:52.259000 #2608] WARN -- app: File 'default.rb' exists but overrides it.
    F, [2006-10-10T16:36:52.275000 #2608] FATAL -- app: Detected an exception. Stopping ... incomplete simpleType (ArgumentError)
    C:/program files/ruby/lib/ruby/1.8/wsdl/xmlSchema/simpleType.rb:33:in `base'
    C:/program files/ruby/lib/ruby/1.8/wsdl/soap/classDefCreator.rb:217:in
    [snip]
(BTW, wsdl2ruby works with http://api.google.com/GoogleSearch.wsdl.)
Second, I tried this code:
    require 'soap/wsdlDriver'

    wsdl_url = 'http://soap.search.msn.com/webservices.asmx?wsdl'
    soap = SOAP::WSDLDriverFactory.new(wsdl_url).create_rpc_driver

    # Search request parameters (AppID masked)
    msn_params = {
      'AppID'       => '1064081Cxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
      'Query'       => 'ruby programming language',
      'CultureInfo' => 'en-US',
      'SafeSearch'  => 'Strict',
      'Flags'       => 'None',
      'Requests'    => {
        'SourceRequest' => {
          'Source'       => 'Web',
          'Offset'       => 0,
          'Count'        => 10,
          'ResultFields' => 'All'
        }
      }
    }

    soap.search(:Request => msn_params)
And got this:
    irb(main):020:0* soap.search(:Request => msn_params)
    ArgumentError: incomplete simpleType
            from c:/Program Files/ruby/lib/ruby/1.8/wsdl/xmlSchema/simpleType.rb:25:in `check_lexical_format'
            from c:/Program Files/ruby/lib/ruby/1.8/soap/mapping/wsdlliteralregistry.rb:113:in `simpleobj2soap'
    [snip]
Note: the Python equivalent of this code works just fine, so I think it
has something to do with the way Ruby is processing SOAP.
Third, I tried to do it without SOAP:
    require 'rubygems'
    require 'open-uri'
    require 'rubyful_soup'

    url = "http://search.live.com/results.aspx?q=ruby+programming+language&mkt=en-us&FORM=LVSP&go.x=0&go.y=0&go=Search"
    page = open(url)
    page_content = page.read
    soup = BeautifulSoup.new(page_content)
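In case the hand-assembled query string is part of the problem, here is a minimal sketch that lets the standard library do the escaping. The helper name live_search_url is hypothetical (it is not from any library), and URI.encode_www_form assumes a Ruby newer than the 1.8 install shown in the traces. The parameter names and values are simply the ones from the URL above.

```ruby
require 'uri'

# Hypothetical helper (an illustration, not a library call): builds the search
# URL from [name, value] pairs so the escaping is handled by URI.
def live_search_url(query, market = 'en-us')
  params = [
    ['q',    query],
    ['mkt',  market],
    ['FORM', 'LVSP'],
    ['go.x', '0'],
    ['go.y', '0'],
    ['go',   'Search']
  ]
  'http://search.live.com/results.aspx?' + URI.encode_www_form(params)
end

url = live_search_url('ruby programming language')
```

The resulting url can be passed to open(url) exactly as before; only the way the string is built changes.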
The BeautifulSoup.new call then fails with:
    irb(main):007:0> soup = BeautifulSoup.new(page_content)
    ArgumentError: invalid value for Integer: "0183"
            from c:/Program Files/ruby/lib/ruby/gems/1.8/gems/htmltools-1.10/lib/html/sgml-parser.rb:335:in `Integer'
            from c:/Program Files/ruby/lib/ruby/gems/1.8/gems/htmltools-1.10/lib/html/sgml-parser.rb:335:in `handle_charref'
            from c:/Program Files/ruby/lib/ruby/gems/1.8/gems/htmltools-1.10/lib/html/sgml-parser.rb:159:in `goahead'
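For what it's worth, the "0183" failure looks like Ruby's Integer() treating a leading zero as an octal prefix, so a character reference such as &#0183; is rejected because 8 is not an octal digit. Below is a minimal sketch of a possible workaround: strip the leading zeros from decimal character references before handing the page to the parser. The helper name strip_charref_zeros is hypothetical (it is not part of htmltools or rubyful_soup), and I'm assuming handle_charref passes the digits straight to Integer(), which is what the trace suggests.

```ruby
# Hypothetical helper (not part of htmltools or rubyful_soup): drop leading
# zeros from decimal character references such as &#0183; so that a parser
# calling Integer() on the digits no longer sees an octal-looking string.
# Hex references (&#x...;) and references without leading zeros are untouched.
def strip_charref_zeros(html)
  html.gsub(/&#0+(\d+);/, '&#\1;')
end

# Reproduce the underlying error in isolation.
begin
  Integer('0183') # leading 0 means octal, and 8 is not an octal digit
rescue ArgumentError => e
  octal_error = e.message
end

cleaned = strip_charref_zeros('caf&#0233; &#0183; &#183; &#x0183;')
```

With that in place, the idea would be to call page_content = strip_charref_zeros(page_content) before BeautifulSoup.new(page_content). I have not verified this against the live results page, so other parse problems may still turn up.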
My next step is to try HTree/REXML, but I'd much rather use SOAP or
BeautifulSoup to do this. Anyone got ideas?