Hand-crafting SOAP for Google APIs

Hudson

Hi all, today I went looking to do something with the Google APIs.
But...all the examples I found used SOAP::Lite and that is not
in my @INC on my server.

So I banged out this example...doesn't seem so hard. Is this all there
is to it? Any advice, critiques, or feedback?

 
Hudson

whoops...kind of messed up on that post...here is the code:

The script must be named nph-somefile.cgi (nph = non-parsed headers) because it prints its own HTTP status line.

#!/usr/bin/perl -w
use strict;
use CGI;
use CGI::Carp qw(fatalsToBrowser);
use IO::Socket;

my $sleep_time = 1;

my $query = new CGI;
print "HTTP/1.0 200 OK\r\n";
print "Content-Type: text/html\r\n\r\n";
print "<html><head></head><body>";
print "<hr noshade width=\"65%\">";
print "<center><h2>Google's top ten URLs via web services (SOAP)<\/h2>";
print "Enter one keyword or phrase per line (limit of 1,000 searches a day)";
&print_prompt($query);
&do_work($query);

print $query->end_html;

# --------------------------------------------------------- subroutines: cgi.pm

sub print_prompt {

my($query) = @_;
print $query->startform;
print $query->textarea(-name=>'domains',
-rows=>8,
-columns=>30);
print "<br><br>";
print $query->submit(-name=>'find those dirty dogs');
print $query->endform;
}


sub do_work {
my($query) = @_; # use the CGI object that was passed in
unless ($query->param) {
return;
}
my $query_string = $query->param('domains');
my @domains = split(/\n/, $query_string);

metatags (@domains);
# the closing tags are printed by end_html in the caller

}


# -------------------------------------------------------- subroutine: get data

sub metatags {
$|=1;

my @keywords = @_;
foreach my $keyword (@keywords) {
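$keyword =~ s/\s+$//;
next unless length $keyword; # skip blank lines
# escape markup characters before the user input goes into XML and HTML
$keyword =~ s/&/&amp;/g;
$keyword =~ s/</&lt;/g;
$keyword =~ s/>/&gt;/g;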

my $soap_request .= "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\">\n";
$soap_request .= "<SOAP-ENV:Body>\n";
$soap_request .= "<ns1:doGoogleSearch xmlns:ns1=\"urn:GoogleSearch\" \n";
$soap_request .= "SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\">\n";
$soap_request .= "<key xsi:type=\"xsd:string\">***_your_key_here_***</key>\n";
$soap_request .= "<q xsi:type=\"xsd:string\">$keyword</q>\n";
$soap_request .= "<start xsi:type=\"xsd:int\">0</start>\n";
$soap_request .= "<maxResults xsi:type=\"xsd:int\">10</maxResults>\n";
$soap_request .= "<filter xsi:type=\"xsd:boolean\">true</filter>\n";
$soap_request .= "<restrict xsi:type=\"xsd:string\"></restrict>\n";
$soap_request .= "<safeSearch xsi:type=\"xsd:boolean\">false</safeSearch>\n";
$soap_request .= "<lr xsi:type=\"xsd:string\"></lr>\n";
$soap_request .= "<ie xsi:type=\"xsd:string\">latin1</ie>\n";
$soap_request .= "<oe xsi:type=\"xsd:string\">latin1</oe>\n";
$soap_request .= "</ns1:doGoogleSearch>\n";
$soap_request .= "</SOAP-ENV:Body>\n";
$soap_request .= "</SOAP-ENV:Envelope>";

my $request = "POST /search/beta2 HTTP/1.1\r\n";
$request .= "Host: api.google.com\r\n";
$request .= "Connection: close\r\n"; # so the read loop below sees EOF
$request .= "Accept-Encoding: identity\r\n";
$request .= "Content-Length: ".length($soap_request)."\r\n";
$request .= "SOAPAction: \"urn:GoogleSearchAction\"\r\n";
$request .= "Content-Type: text/xml; charset=utf-8\r\n\r\n";
$request .= $soap_request;

my $socket = IO::Socket::INET->new(PeerAddr => "api.google.com",
PeerPort => 80,
Proto => "tcp",
Type => SOCK_STREAM,
Timeout => 5)
or die "cannot connect to api.google.com: $@";
print $socket $request;

print "<b>$keyword:</b><br>";

while (<$socket>) {

if ($_ =~ /<URL xsi:type=\"xsd:string\">(.*)<\/URL>/) {
print "<a href=\"$1\">$1</a><br>";
}
}
print "<hr>";
close($socket);
sleep $sleep_time;
}
}
 
Tad McClellan

Hudson said:
But...all the examples I found used SOAP::Lite and that is not
in my @INC on my server.


That's OK, it does not need to be in (the default) @INC on your server.


perldoc -q module

How do I keep my own module/library directory?
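
A minimal sketch of what that FAQ entry describes, assuming the module has been unpacked under a private directory. The path /home/hudson/perl/lib is hypothetical; substitute wherever the module actually lives.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical private library directory. "use lib" adds it to @INC at
# compile time, before any later "use" or "require" is resolved.
use lib '/home/hudson/perl/lib';

# Load SOAP::Lite only if it can be found, so the script still runs
# (and reports what happened) when the module is absent.
my $have_soap = eval { require SOAP::Lite; 1 } ? 1 : 0;

my $in_inc = grep { $_ eq '/home/hudson/perl/lib' } @INC;
print "private directory is in \@INC\n" if $in_inc;
print $have_soap ? "SOAP::Lite loaded\n" : "SOAP::Lite not found in \@INC\n";
```

Setting the PERL5LIB environment variable has a similar effect, but a CGI script usually cannot rely on the web server's environment, so "use lib" inside the script is the more dependable choice there.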
 
Tad McClellan

&print_prompt($query);


perldoc -q "&"

What's the difference between calling a function as &foo and foo()?
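
One of the differences that FAQ entry covers can be shown with a short sketch (the sub names here are made up for illustration): calling &foo with no parentheses hands the caller's current @_ to foo, while foo() starts with an empty argument list.

```perl
use strict;
use warnings;

# Report how many arguments this sub received.
sub count_args { return scalar @_ }

# foo(): the callee gets a fresh, empty @_.
sub with_parens { return count_args() }

# &foo with no parentheses: the callee sees the caller's current @_.
sub with_ampersand { return &count_args }

print with_parens(1, 2, 3), "\n";      # prints 0
print with_ampersand(1, 2, 3), "\n";   # prints 3
```

The &foo(...) form with parentheses passes its own argument list, but it also bypasses any prototype on foo, which is another reason the plain foo(...) form is usually preferred.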

my $query_string = $query->param('domains');
my @domains = split(/\n/, $query_string);


You don't need a temporary variable:

my @domains = split(/\n/, $query->param('domains'));

my $soap_request .= "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\">\n";
$soap_request .= "<SOAP-ENV:Body>\n";
$soap_request .= "<ns1:doGoogleSearch xmlns:ns1=\"urn:GoogleSearch\" \n";
$soap_request .= "SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\">\n";
$soap_request .= "<key xsi:type=\"xsd:string\">***_your_key_here_***</key>\n";
$soap_request .= "<q xsi:type=\"xsd:string\">$keyword</q>\n";
$soap_request .= "<start xsi:type=\"xsd:int\">0</start>\n";
$soap_request .= "<maxResults xsi:type=\"xsd:int\">10</maxResults>\n";
$soap_request .= "<filter xsi:type=\"xsd:boolean\">true</filter>\n";
$soap_request .= "<restrict xsi:type=\"xsd:string\"></restrict>\n";
$soap_request .= "<safeSearch xsi:type=\"xsd:boolean\">false</safeSearch>\n";
$soap_request .= "<lr xsi:type=\"xsd:string\"></lr>\n";
$soap_request .= "<ie xsi:type=\"xsd:string\">latin1</ie>\n";
$soap_request .= "<oe xsi:type=\"xsd:string\">latin1</oe>\n";
$soap_request .= "</ns1:doGoogleSearch>\n";
$soap_request .= "</SOAP-ENV:Body>\n";
$soap_request .= "</SOAP-ENV:Envelope>";



Holy crap!

You really need to learn about "here-doc"s, see perlop.pod.

Look Ma! No backslashes!

my $soap_request =<<ENDSOAP;
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP-ENV:Body>
<ns1:doGoogleSearch xmlns:ns1="urn:GoogleSearch"
....
ENDSOAP
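
For reference, here is a sketch of the whole request from the original post written as a single here-doc. The sub names build_soap_request and xml_escape are made up for illustration, and escaping the keyword is an addition that the original code does not have.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build the doGoogleSearch envelope; $key and $keyword are interpolated
# by the double-quoted here-doc.
sub build_soap_request {
    my ($key, $keyword) = @_;
    ($key, $keyword) = map { xml_escape($_) } $key, $keyword;
    return <<"ENDSOAP";
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP-ENV:Body>
<ns1:doGoogleSearch xmlns:ns1="urn:GoogleSearch"
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<key xsi:type="xsd:string">$key</key>
<q xsi:type="xsd:string">$keyword</q>
<start xsi:type="xsd:int">0</start>
<maxResults xsi:type="xsd:int">10</maxResults>
<filter xsi:type="xsd:boolean">true</filter>
<restrict xsi:type="xsd:string"></restrict>
<safeSearch xsi:type="xsd:boolean">false</safeSearch>
<lr xsi:type="xsd:string"></lr>
<ie xsi:type="xsd:string">latin1</ie>
<oe xsi:type="xsd:string">latin1</oe>
</ns1:doGoogleSearch>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
ENDSOAP
}

print build_soap_request('***_your_key_here_***', 'perl & soap');
```

Unlike the ".=" version, the here-doc ends with a trailing newline, so Content-Length should be computed from the finished string rather than assumed.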
 
Hudson

That's OK, it does not need to be in (the default) @INC on your server.


ah...but I like doing it by hand! ;-)

(oh, and you should have seen me last week installing openssl in my
directory trying to get Net::SSLeay working...hehe)
 
Hudson

> perldoc -q "&"
>
> What's the difference between calling a function as &foo and foo()?

to tell you the truth, I really don't understand CGI.pm and rather
like doing it by hand, but this is old code from last month.

> Holy crap!
>
> You really need to learn about "here-doc"s, see perlop.pod.
>
> Look Ma! No backslashes!
>
> my $soap_request =<<ENDSOAP;
> <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema">
> <SOAP-ENV:Body>
> <ns1:doGoogleSearch xmlns:ns1="urn:GoogleSearch"
> ...
> ENDSOAP

yes, but with the ".=" method, you get to see all your request on one
page ;-)

I agree, though...backslashes suck....
 
Tad McClellan

Hudson said:
to tell you the truth, I really don't understand the cgi.pm and rather
like doing it by hand, but this is old code from last month.


That's nice, but it has nothing to do with the text that you quoted.

I didn't say anything about CGI.pm or modules or even CGI programming.

You are calling your functions incorrectly.

yes, but with the ".=" method, you get to see all your request on one
page ;-)


Huh?

You can see _more_ of the string with a here-doc than with cat-equals,
so I dunno what you are talking about there...
 
Tad McClellan

Hudson said:
I am much better off parsing the cgi myself.


No you're not.

Get some liability insurance if you are getting paid for
this home-rolled form parsing.
 
Uri Guttman

H> that's not true...I understand CGI parsing very well. CGI.pm I
H> don't understand, but understand that it is not so great. I don't
H> use it. So the code I posted I wrote last month before I got into
H> CGI.

i doubt you do. there is much more to it than just splitting query
strings. you have admitted you are a total beginner, so please drop the
attitude about how easy everything is. you don't know anything and
accept that truth for a long while. getting a simple setup like cgi or
soap to work on a one-shot basis is one thing. getting them to work for
all possible cases is another. but at least you are using CGI.pm now.

H> I know how to taint and untaint and what the dangers are...it is no
H> big deal.

H> Stop being a troll with me...thank you.

hahah, i will stop helping you is what you mean. you don't know from
trolls either. just read moronzilla's stuff to see a real troll.

goodbye little kiddie,

uri
 
Hudson

Huh?

You can see _more_ of the string with a here-doc than with cat-equals,
so I dunno what you are talking about there...

Hmmmm...I am sure you are probably right and I am just being
strong-headed because I am in love with my own methods ;-)
 
Tassilo v. Parseval

Also sprach Hudson:
> that's not true...I understand CGI parsing very well. CGI.pm I don't
> understand, but understand that it is not so great. I don't use it. So
> the code I posted I wrote last month before I got into CGI.

The decision to hand-roll your own solution is amongst the poorest you
can make. And you really think it takes less time to write a robust and
standard-compliant parser yourself than learning the relevant bits of
CGI.pm? Which are actually just:

use CGI qw/param/;

my @params = param(); # all parameters in the query string
for my $p (@params) {
my $val = param($p); # single value for parameter $p
my @vals = param($p); # all values for parameter $p
}

That's not too complicated, is it?

But naturally, you are free to write your own parser. Please post it
here when you are done with it so that we can all have a good laugh.

> I know how to taint and untaint and what the dangers are...it is no big deal.

Tainting is a different issue altogether. CGI.pm doesn't handle
taintedness for you at all.

> Stop being a troll with me...thank you.

The most reliable way to get dropped in many killfiles is calling a
well-respected regular a troll.

Tassilo
 
Hudson

you have admitted you are a total beginner

stop it, uri..........!!

I am not a total beginner...ok, so I am not the expert either, but I think you
should knock off this crap.

Thanks,

Steve
 
Hudson

but at least you are using CGI.pm now

I never, ever use CGI.pm...you are being the total troll with me! I used it last
month before I did tons of research into doing it by hand.

Geez.....hehe...I am starting to hate you, uri ;-)
 
Tassilo v. Parseval

Also sprach Hudson:
> this is basically what I am using

Ah, eventually some code!

> my %kv = (); # key-value kv
> my @form_varibles = qw(form_value_1
> form_value_2
> etc
> );
>
> read (STDIN, my $input, $ENV{'CONTENT_LENGTH'});

At this point you are lacking the distinction between GET and POST. This
only handles POST.

> my @kv = split (/&/, $input);

Should be: (/[&;]/, $input);

> for my $kv (@kv) {
> (my $key, my $value) = split (/=/, $kv);
>
> if (length($value) < 1) { zero_value ($key) }
> if (length($value) > 255) { string_too_big ($key) }

Not sure what zero_value() and string_too_big() do (exception
handling?). However, a query string such as 'key1=&key2=val' is
valid.

> $value =~ tr/+/ /;
> $value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg;
>
> for my $varible (@form_varibles) {
> if ($key eq "$varible") {
> $kv{$key} = $value;
> }

This will only return the last value in a query string such as

key=val1&key=val2

So what you need to add is the infrastructure to handle multiple
occurrences of the same parameter.
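
To make the points so far concrete, here is a minimal sketch of a parser that splits on both separators, accepts empty values, and keeps every value of a repeated key. The name parse_query is made up for illustration; this is not how CGI.pm does it and it is not a replacement for CGI.pm.

```perl
use strict;
use warnings;

# Parse a query string into a hash of array refs, so repeated keys
# keep every value in the order they appeared.
sub parse_query {
    my ($input) = @_;
    my %params;
    for my $pair (split /[&;]/, $input) {
        next unless length $pair;
        # A limit of 2 keeps any further "=" inside the value.
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;   # "key1=" and a bare "key1" are valid
        for ($key, $value) {
            tr/+/ /;                               # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # decode %XX escapes
        }
        push @{ $params{$key} }, $value;
    }
    return \%params;
}

my $params = parse_query('key=val1&key=val2;key1=&q=a%20b+c');
print join(',', @{ $params->{key} }), "\n";   # prints val1,val2
```

It still only covers the parsing step; deciding between GET and POST input and limiting the input size are separate concerns.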

No, sorry, right now it's not good enough. It has some security holes in
that you don't have means to restrict CONTENT_LENGTH. Adding to that new
forms of exploits (as the one recently discussed on the p5porters list,
search for "Algorithmic complexity attacks", which perl5.8.1 will guard
you against, though), you make it relatively easy for malicious
attackers to launch a denial-of-service attack against servers running
the above code.
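
A minimal sketch of restricting CONTENT_LENGTH before reading, and of checking that read() actually delivered the data. The sub name read_post_body and the 64 KB cap are assumptions chosen for illustration only.

```perl
use strict;
use warnings;

# Hypothetical cap on accepted POST bodies; pick whatever the form needs.
my $MAX_BODY = 64 * 1024;

# Read exactly $length bytes from $fh, refusing oversized or malformed
# lengths and checking that read() delivered what was promised.
sub read_post_body {
    my ($fh, $length, $max) = @_;
    die "missing or malformed CONTENT_LENGTH\n"
        unless defined $length && $length =~ /^\d+$/;
    die "request body too large\n" if $length > $max;

    my $input = '';
    while (length($input) < $length) {
        # Append at the current end of $input until $length bytes are in.
        my $got = read($fh, $input, $length - length($input), length($input));
        die "read failed: $!\n" unless defined $got;
        die "short read\n"      if $got == 0;   # EOF before $length bytes
    }
    return $input;
}

# Demo on an in-memory filehandle; a CGI script would pass \*STDIN
# and $ENV{CONTENT_LENGTH} instead.
my $body = 'key=val1&key=val2';
open my $fh, '<', \$body or die "open: $!";
print read_post_body($fh, length $body, $MAX_BODY), "\n";
```

This only bounds the size of the body; it does nothing about the hash-collision attack mentioned above.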

A glance at the source of CGI.pm would probably give you an idea how
much work Lincoln put into doing it both correctly and securely. A
re-write is certainly possible but it requires more than a few lines of
code.

> if you look back, he called me a total beginner, a troll and a script
> kiddie...so I was just responding in kind

Live with it. Not taking the sometimes rough tone in technical groups
personally is amongst the first things to learn. The attitude of a message
should however not distract you from the content it tries to convey.

Tassilo
 
Keith Keller

Tassilo v. Parseval ([email protected]) wrote on
MMMDCXXXVII September MCMXCIII in <URL:
||
|| This will only return the last value in a query string such as
||
|| key=val1&key=val2
||
|| So what you need to add is the infrastructure to handle multiple
|| occurances of the same parameter.


No. All you need to do is make sure your forms don't have multiple
inputs with the same name. ;-)

Unless that's the behaviour you want from the client. :)

e.g.: a "Delete marked" button, with a submission like

delete=2;delete=4;delete=7

There are probably many other scenarios where multiple values for
the same key in the query would be useful.

--keith

--
(e-mail address removed)-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://wombat.san-francisco.ca.us/cgi-bin/fom

 
Alan J. Flavell

On Sat, Aug 16, Uri Guttman inscribed on the eternal scroll:

Abigail quoth:
A> No. All you need to do is make sure your forms don't have multiple
A> inputs with the same name. ;-)

> checkboxes are allowed to have multiple values but only one name. and
> they are required to send them all as key/value pairs.

Perfectly correct: but you're aiming to implement a general-purpose
script (as any self-respecting professional would do), whereas the
kiddie is trying to write a partial implementation, thinking that
it'll be as much as they'd need for the present, and ignoring that
many related tasks will arise later and need the script expanding in
various ways.

Oh, c'mon, you don't seriously think Abigail doesn't know all this;
and if you're of a mind to quote specifications, then let's have a
decent one, not that fly-blown old carcass of HTML3.2 (spit).

> so supporting duplicate names is required

If you're a professional - of course it is.

> any cgi parser that doesn't do that is broken or a toy or only for play
> by the author.

Well, that's what we have been dealing with all along on this thread,
as if you hadn't already noticed ;-)

Now please, let's stop baiting the would-be troll, and get on with
something more productive.
 
hudson

H> read (STDIN, my $input, $ENV{'CONTENT_LENGTH'});

bug alert, no checking if read worked.

thank you uri for all your bug alerts...I'm a little burnt out on the
whole topic...but I did read your reply and thanks for the input
 
Tassilo v. Parseval

Also sprach Abigail:
Tassilo v. Parseval ([email protected]) wrote on
MMMDCXXXVII September MCMXCIII in <URL:
|| Also sprach Hudson:
||
|| > for my $varible (@form_varibles) {
|| > if ($key eq "$varible") {
|| > $kv{$key} = $value;
|| > }
||
|| This will only return the last value in a query string such as
||
|| key=val1&key=val2
||
|| So what you need to add is the infrastructure to handle multiple
|| occurances of the same parameter.


No. All you need to do is make sure your forms don't have multiple
inputs with the same name. ;-)

Thanks for this little piece of satire. :) And now it's time to
dispense some ultimate truths about programming...

Incomplete or partial solutions are ok as long as they are flagged to
solve the problem only partially. However, most often something else
happens: a problem is targeted, solved unsatisfactorily and after that
the job description of the solution is tweaked to match the
implementation. It should be the other way round: make an initial
specification of what is required and then implement something that does
exactly that. Naturally, turning the sequence around is much more
convenient. It therefore seems to be part of a beginner's mindset.

Tassilo
 
hudson

Thanks for this little piece of satire. :) And now it's time to
dispense some ultimate truths about programming...


you, my friend, are the little piece of satire...think about it a
little bit and you will understand what I mean ;-)
 
hudson

> Perfectly correct: but you're aiming to implement a general-purpose
> script (as any self-respecting professional would do), whereas the
> kiddie is trying to write a partial implementation, thinking that
> it'll be as much as they'd need for the present, and ignoring that
> many related tasks will arise later and need the script expanding in
> various ways.

"kiddie"

> Oh, c'mon, you don't seriously think Abigail doesn't know all this;
> and if you're of a mind to quote specifications, then let's have a
> decent one, not that fly-blown old carcass of HTML3.2 (spit).
>
> If you're a professional - of course it is.
>
> Well, that's what we have been dealing with all along on this thread,
> as if you hadn't already noticed ;-)

hmmm

> Now please, let's stop baiting the would-be troll, and get on with
> something more productive.

now...just who is baiting who ;-)
 
