xfx.publishing
Heya. I'm up against what is turning out to be a tough one...
I'm trying to get a Perl script with LWP to get information from AOL
profiles.
What I'm doing is having an LWP user agent log in with my AIM
screenname and password at
https://my.screenname.aol.com/_cqr/login/login.psp, grabbing the
cookies, and then requesting
http://memberdirectory.aol.com/aolus/profile?sn=$screenname
The login part is working, and it looks like I'm caching the cookies
fine, but the results I get back from
http://memberdirectory.aol.com/aolus/profile?sn=$screenname are not the
same thing I'm getting in my real browser (Firefox), and they don't
contain the info I want to parse out.
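For comparing against a browser trace, a minimal sketch of logging what LWP itself sends might look like the following. It assumes a libwww-perl recent enough to provide add_handler. The example.com hosts, the EXAMPLE_SESSION cookie, and the canned responses are all made up so that the sketch runs offline; none of them reflect what AOL actually returns.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Response;

my @sent;    # one entry per outgoing request: request line plus headers

my $ua = LWP::UserAgent->new(agent => 'header-logger-sketch/0.1');
$ua->cookie_jar(HTTP::Cookies->new);

# request_send runs after the cookie jar has added its Cookie header.
# Returning an HTTP::Response here skips the network entirely, which keeps
# the sketch offline; return nothing instead to let the request go out.
$ua->add_handler(request_send => sub {
    my ($request) = @_;
    push @sent, $request->method . ' ' . $request->uri . "\n"
              . $request->headers->as_string;
    my $stub = HTTP::Response->new(200, 'OK',
                                   ['Content-Type' => 'text/plain'], "stub\n");
    # Pretend the login POST hands back a session cookie (made-up name).
    $stub->header('Set-Cookie' =>
                  'EXAMPLE_SESSION=abc123; path=/; domain=.example.com')
        if $request->method eq 'POST';
    return $stub;
});

# Hypothetical login followed by a hypothetical profile fetch.
$ua->post('http://login.example.com/login',
          [loginId => 'someone', password => 'secret']);
$ua->get('http://profiles.example.com/profile?sn=someone');

print "$_\n" for @sent;
```

With the stub response removed, the same handler records the headers of real requests, which can then be compared line by line with the HTTP logger output from the browser.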
From what I can tell there's some weird and utterly pointless
JavaScripting going on. I set up a logger on my XP box (first free HTTP
logger I could find, so XP it was -- but I'm writing the code under
Debian Linux, just so you know) and I'm seeing cookies being sent that
I don't have -- which suggests that some JavaScript somewhere is
making these cookies up on the fly. This gives
me the strong impression that AOL takes the opposite philosophy of
Perl, and possibly has occasional meetings where people get together
and decide 'We don't have enough code -- we need more code to do the
same thing. Let's try and complicate things more.' I bet they have
buzzwords like 'complexify'. It's asinine to do so much stupid crap
just to set up login validation.
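If the missing cookies really are created client-side, one thing to try is copying them from the logger output into the cookie jar by hand. A rough sketch, assuming the cookie is scoped to .aol.com with path /; JS_MADE_COOKIE and its value are placeholders rather than real AOL cookie names, and whether the server accepts a hand-copied value is untested:

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;

my $jar = HTTP::Cookies->new;

# set_cookie($version, $key, $val, $path, $domain, $port,
#            $path_spec, $secure, $maxage, $discard)
# Name and value below are placeholders for whatever the logger shows.
$jar->set_cookie(0, 'JS_MADE_COOKIE', 'valueFromLogger', '/', '.aol.com',
                 undef, 1, 0, undef, 1);

# Build the profile request and let the jar attach any matching cookies.
my $req = HTTP::Request->new(
    GET => 'http://memberdirectory.aol.com/aolus/profile?sn=example');
$jar->add_cookie_header($req);

print $req->header('Cookie'), "\n";
```

In the real script the same set_cookie call would go on the jar attached to the user agent, after the login and before the profile request.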
But, rant aside, if someone can figure out what's going on with this...
Here's the intended subroutine -- you will notice commented bits of
hair-pulling in places and debuggy bits, no doubt.
But in theory this should be able to set up a hashref in the
$obj->{profile} key that contains all their AOL user info. In theory.
And I just can't get back the right info (and yes I double checked my
AIM password and stuff, and got distinctly different results putting in
a nonsense password just to make sure there wasn't manglement going on
there).
Anyway, well... this definitely counts as a challenge, if nothing else.
I'd like to have this working, but for the time being, I just can't get
the right page back.
# Modules the subroutine relies on
use LWP::UserAgent;
use HTTP::Cookies;
use HTML::PullParser;

sub SetInfo {
    my $obj  = shift;
    my $user = shift;
    my $aol_login = $obj->{aol_login};
    my $aol_pass  = $obj->{aol_pass};
    unless ($aol_login and $aol_pass) {
        $obj->{errstr} = 'AOL or AIM Screen Name and Password required to '.
                         'access.';
        return undef;
    }
    # We is a BIG LIAH saying we's Firefox. In British. On XP. Just what
    # they expect. But some of these sites act like assholes so screw it
    # better that than this potentially not working if they block us
    # (new() takes key/value options, so the string goes in as 'agent')
    my $web = LWP::UserAgent->new(
        agent => 'Mozilla/5.0 (Windows; U; Windows NT 5.1;'.
                 ' en-GB; rv:1.8.0.3) Gecko/20060426 '.
                 'Firefox/1.5.0.3');
    $web->cookie_jar(HTTP::Cookies->new(file => "/usr/local/WPCookies.lwp",
                                        autosave       => 1,
                                        ignore_discard => 1));
    #$web->cookie_jar({});
    my $login = $web->post(
        'https://my.screenname.aol.com/_cqr/login/login.psp',
        [sitedomain => 'memberdirectory-beta.estage.aol.com',
         siteId     => '',
         lang       => 'en',
         locale     => 'us',
         authLev    => '1',
         siteState  =>
           "OrigUrl%3Dhttp%253A%252F%252Fmemberdirectory.aol.com%252Faolus%252Fprofile%253Fsn%253D$user",
         isSiteStateEncoded => 'true',
         mcState    => 'initialized',
         usrd       => '1889976',
         loginId    => $aol_login,
         password   => $aol_pass,
         rememberMe => 'off']);
    unless ($login->is_success) {
        $obj->{errstr} = 'AOL login failed: '. $login->status_line;
        return undef;
    }
    $obj->{login_page} = $login->content;
    $obj->{ua} = $web;
    #=stop
    $obj->{login_headers} = $login->as_string;
    $obj->{login_headers} =~ s/\n\n.*//gsm;
    $obj->{login_cookies} = [];
    for my $h (split /[\r\n]+/, $obj->{login_headers}) {
        # Split into two fields only, so a ': ' inside a cookie value survives
        my ($k, $v) = split /:\s+/, $h, 2;
        push @{$obj->{login_cookies}}, {$k => $v} if lc $k eq 'set-cookie';
    }
    #=cut
    if ($obj->{login_page}
        =~ /You have entered an invalid Screen Name or password/) {
        $obj->{errstr} = 'AOL login failed: invalid login info.';
        return undef;
    }
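    $obj->{pull_success} = "Didn't try yet.";
    my $url_base = 'http://memberdirectory.aol.com/aolus/profile?sn=';
    # The cookie jar attached to $web already sends back whatever the login
    # set, so the Set-Cookie headers don't need to be passed along by hand
    # (and they would be the wrong header for a request anyway).
    my $resp = $web->get("$url_base$user");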
    if ($resp->is_success) {
        my $prof;
        $obj->{pull_success} = "Successful.";
        $obj->{page} = $resp->content;
        # 'script' is left out of ignore_elements: ignoring the element
        # would suppress the very tags that report_tags asks for.
        my $p = HTML::PullParser->new(doc   => $resp->content,
                                      start => 'tagname, event, attr',
                                      end   => 'tagname, event, skipped_text',
                                      ignore_elements => [qw(style applet embed
                                                             object)],
                                      report_tags => ['script']);
        while (my $token = $p->get_token) {
            my $type = $token->[1];
            next unless ($type eq 'end');
            my $script = $token->[2];
            if ($script =~ /var\s+nameString\s*=/) {
                # this is the right script with the data in it
                # that is easy to read
                # (list assignment gives undef on a failed match instead of
                # reusing the previous $1)
                ($prof->{online})  = $script =~ /var\s+memMessage\s*=\s*"I am (\w+)\."/;
                ($prof->{name})    = $script =~ /var\s+nameDetails\s*=\s*"([^"]*)"/;
                ($prof->{loc})     = $script =~ /var\s+locDetails\s*=\s*"([^"]*)"/;
                ($prof->{gender})  = $script =~ /var\s+genderDetails\s*=\s*"([^"]*)"/;
                ($prof->{marital}) = $script =~ /var\s+maritalDetails\s*=\s*"([^"]*)"/;
                ($prof->{hobbies}) = $script =~ /var\s+hobbiesDetails\s*=\s*"([^"]*)"/;
                ($prof->{gadgets}) = $script =~ /var\s+gadgetsDetails\s*=\s*"([^"]*)"/;
                ($prof->{occ})     = $script =~ /var\s+occDetails\s*=\s*"([^"]*)"/;
                ($prof->{quote})   = $script =~ /var\s+quoteDetails\s*=\s*"([^"]*)"/;
                ($prof->{links})   = $script =~ /var\s+linksDetails\s*=\s*"([^"]*)"/;
            }
            for my $k (keys %{$prof}) {
                # Strip out annoying HTML tags in profiles
                $prof->{$k} =~ s/<[^>]*>//gsm if defined $prof->{$k};
            }
            $obj->{profile} = $prof;
        }
    }
    else {
        $obj->{pull_success} = 'Failed';
        $obj->{errstr} = "Can't retrieve AOL member page for $obj->{user}.\n";
        return undef;
    }
}
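One thing to watch with regex captures in Perl: after a failed match, $1 still holds the previous successful capture, so assigning $1 unconditionally can copy a neighbouring field into the wrong key. A table-driven version of the extraction step avoids that and is shorter. This is only a sketch: extract_profile is a hypothetical helper name, and the sample script text is made up to resemble the variables the subroutine above looks for.

```perl
use strict;
use warnings;

# Profile key => JavaScript variable name, as used in the subroutine above.
my %js_var = (
    name    => 'nameDetails',
    loc     => 'locDetails',
    gender  => 'genderDetails',
    marital => 'maritalDetails',
    hobbies => 'hobbiesDetails',
    gadgets => 'gadgetsDetails',
    occ     => 'occDetails',
    quote   => 'quoteDetails',
    links   => 'linksDetails',
);

# Hypothetical helper: pull the quoted values out of one script's text.
sub extract_profile {
    my ($script) = @_;
    my %prof;
    for my $key (keys %js_var) {
        # Assign only on a successful match so a stale $1 never leaks in.
        next unless $script =~ /var\s+\Q$js_var{$key}\E\s*=\s*"([^"]*)"/;
        (my $value = $1) =~ s/<[^>]*>//g;    # strip HTML tags
        $prof{$key} = $value;
    }
    $prof{online} = $1
        if $script =~ /var\s+memMessage\s*=\s*"I am (\w+)\."/;
    return \%prof;
}

# Made-up sample of what the inline script might contain.
my $sample = <<'JS';
var nameString = "Name:";
var memMessage = "I am online.";
var nameDetails = "<b>Example Person</b>";
var locDetails = "Somewhere";
JS

my $prof = extract_profile($sample);
print "$_ => $prof->{$_}\n" for sort keys %$prof;
```

Fields that are absent from the page simply don't appear in the hash, which makes it easy to tell a missing value from an empty one.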