Using Perl to get data from website

fiazidris

I previously wrote a Perl script to fetch data from this URL:

http://www.bangkokflightservices.com/our_cargo_track.php

Some sample MAWBs (Master Air Waybill numbers):

724-26332482
724-61480672
724-61441122

and this was the final URL:

http://203.151.118.123:8090/showc_track.php?m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=

But the website has since changed, and the same script can no longer
extract the data. One change I noticed is that the URL is now:

<iframe src="http://203.151.118.123:8090/showc_track.php?m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=&ecy=e076438db64c6190f7b9689a379b7f7093368f1652d14db65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc072b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b3fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af74be&ch="
frameborder="0" scrolling="yes" height="700" width="100%"> </iframe>

How can I programmatically obtain data for a list of MAWBs?

Here is a sample script that I wrote which previously worked:

#!/usr/bin/perl

# Token copied from the iframe URL on the tracking page
$ecy = 'e076438db64c6190f7b9689a379b7f7093368f1652d14db65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc072b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b3fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af74be';

while (<>) {
    chomp;

    # Input lines look like 724-26332482: prefix, hyphen, serial number
    $mprefix = substr($_, 0, 3);
    $msn = substr($_, 4, 8);

    if (length($mprefix) != 3) { next; }

    $currurl = 'http://203.151.118.123:8090/showc_track.php?m_prefix='
        . $mprefix . '&m_sn=' . $msn
        . '&h_prefix=HWB&h_sn=&ecy=' . $ecy . '&ch=';

    $currresult = qx{curl -s '$currurl'};

    # Walk the response line by line
    while ( $currresult =~ m#(.*)#g ) {
        $currline = $1;

        # Data cells are marked with "style12"; keep the text between the tags
        if ($currline =~ m#style12#i) {
            if ($currline =~ m#.*>(.*?)<.*#i) {
                $result = $result . " / " . $1;
            }
        }
    }
    print "***$result\n";
    $result = '';
}
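
The extraction step of the script above can be tried without the network by moving it into a function. This is only a sketch: the sample markup is hypothetical (the real page's HTML is not shown in this thread), and it assumes, as the script does, that each data cell sits on its own line and mentions "style12". Unlike the original, it joins the cells without a leading " / ".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the text of each tag on a line that mentions "style12".
sub extract_style12 {
    my ($html) = @_;
    my @cells;
    for my $line (split /\n/, $html) {
        next unless $line =~ /style12/i;
        # Same pattern as the original script: text between a '>' and the next '<'
        if ($line =~ m#.*>(.*?)<.*#) {
            push @cells, $1;
        }
    }
    return @cells;
}

# Hypothetical markup, for illustration only
my $sample = join "\n",
    '<td class="style12">724-26332482</td>',
    '<td class="other">ignored</td>',
    '<td class="style12">DELIVERED</td>';

print '***', join(' / ', extract_style12($sample)), "\n";
```

A regex is fragile against markup changes; an HTML parser module would be sturdier if one is available.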
 
Ben Morrow

Quoth fiazidris:
Previously, I have written a perl script to access data from this URL:

http://www.bangkokflightservices.com/our_cargo_track.php

Some sample: MAWB - Master Airwaybill Number

724-26332482
724-61480672
724-61441122

and this was the final URL:

http://203.151.118.123:8090/showc_track.php?m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=

But, now there is a change on the website and I couldn't extract
through the same script. One change I noticed is the URL has changed
to:
[url trimmed]
<iframe src="http://203.151.118.123:8090/showc_track.php?
m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=&ecy=e076438db64c61..."
frameborder="0" scrolling="yes" height="700" width="100%"> </iframe>

How can I programmatically obtain data for a list of MAWBs.

Yuck, what a horrible page. <input> without <form>... I would use
something like

#!/usr/bin/perl

use strict;
use warnings;
use WWW::Mechanize;

my $baseurl =
    'http://www.bangkokflightservices.com/our_cargo_track&trace.php';
my $hawb = 'h_prefix=HAWB&h_sn=';

# autocheck makes a failed request die instead of passing silently
my $M = WWW::Mechanize->new(autocheck => 1);

while (<>) {
    chomp;

    # MAWBs look like 724-26332482: 3-char prefix, hyphen, 8-char serial
    my ($mprefix, $msn) = /(...)-(........)/ or do {
        warn "invalid MAWB: '$_'";
        next;
    };

    $M->get("$baseurl?m_prefix=$mprefix&m_sn=$msn&$hawb");
    # The results live in an iframe; follow its src
    $M->follow_link(url_regex => qr/showc_track/);
    my $content = $M->content;

    # process $content as before
}

You may need to adjust the follow_link call if there are several links on
the same page that match that regex; see perldoc WWW::Mechanize for the
arguments. If the server checks the Referer, you may also need to ->get
/our_cargo_track.php first.

Ben
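
The MAWB check in the loop above can be isolated and tested on its own. The sketch below assumes, going only by the three samples in the original post, that a MAWB is exactly three digits, a hyphen, and eight digits; parse_mawb is a hypothetical helper name, not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split "724-26332482" into (prefix, serial).
# Returns an empty list when the input does not have that shape.
sub parse_mawb {
    my ($mawb) = @_;
    return $mawb =~ /^\s*(\d{3})-(\d{8})\s*$/ ? ($1, $2) : ();
}

for my $mawb ('724-26332482', '72426332482', 'bogus') {
    if (my ($prefix, $serial) = parse_mawb($mawb)) {
        print "$mawb -> m_prefix=$prefix&m_sn=$serial\n";
    }
    else {
        warn "invalid MAWB: '$mawb'\n";
    }
}
```

Anchoring the pattern with ^ and $ rejects input with extra characters, which an unanchored /(...)-(........)/ would accept.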
 
ifiaz

You may need to adjust the follow_link call if there are several links on
the same page that match that regex; see perldoc WWW::Mechanize for the
arguments. If the server checks the Referer, you may also need to ->get
/our_cargo_track.php first.

Ben
----

Thank you for your prompt response.

When I used the code with minor modifications, I still could not reach
the data: the process redirects me to another page, as shown below.

This is what $content contains:

<script> window.open ('http://www.bangkokflightservices.com/our_cargo_track.php') ;
setTimeout("window.close();", 10);
</script>

How do I get to the actual data page? Please guide me, as I am a
newbie.

I don't know how to set the Referer and so on.


### This is the complete code I used.
#!/usr/bin/perl

use WWW::Mechanize;

my $baseurl =
    'http://www.bangkokflightservices.com/our_cargo_track&trace.php';
my $hawb = 'h_prefix=HAWB&h_sn=';

my $M = WWW::Mechanize->new(autocheck => 1);

## Added code for testing only
my $F = WWW::Mechanize->new(autocheck => 1);
$F->get("http://www.bangkokflightservices.com/our_cargo_track.php");
my $contentF = $F->content;
#print "$contentF\n";
#$M->add_header(Referer => 'http://www.bangkokflightservices.com/our_cargo_track.php');

while (<>) {
    chomp;

    my ($mprefix, $msn) = /(...)-(........)/ or do {
        warn "invalid MAWB: '$_'";
        next;
    };

    print "$mprefix $msn\n";

    $M->get("$baseurl?m_prefix=$mprefix&m_sn=$msn&$hawb");
    $M->follow_link(url_regex => qr/showc_track/);
    my $content = $M->content;

    print "$content\n"; # for debugging

    # process $content as before
    #
    while ( $content =~ m#(.*)#g ) {
        $currline = $1;

        if ($currline =~ m#style12#i) {

            $currline =~ m#.*>(.*?)<.*#i;
            $result = $result . " / " . $1;
        }
    }
    print "***$result\n";
    $result = '';
}
 
fiazidris

Also, just so you know, in

my $baseurl =
'http://www.bangkokflightservices.com/our_cargo_track&trace.php';
my $hawb = 'h_prefix=HAWB&h_sn=';

h_prefix should be HWB, not HAWB.

I have fixed that in my code, but the problem remains: I am still
redirected to a different page.

I have got as far as the following URL, which works in a browser (the
prefix and serial number can be changed):

http://203.151.118.123:8090/showc_t...94fd938ea4dd2b28c53fea6af74be&ch=%A0%A0%A0%A0

but this URL doesn't return results when fetched with Perl or curl.

Ben Morrow, please help.
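
One thing worth trying, sketched below, is to read the iframe's src (including whatever ecy token the page currently carries) out of the outer page instead of hard-coding it, and to send a Referer with each request, as suggested earlier in the thread. This is untested against the live site and rests on assumptions: whether the server checks the Referer, or ties the token to a session, is not known from this thread. HTTP::Tiny is swapped in for WWW::Mechanize only because it ships with Perl; iframe_src and fetch_tracking are hypothetical helper names, and TOKEN in the sample is a placeholder, not a real value.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Return the src of the first iframe that points at showc_track.php,
# or nothing if the page has no such iframe.
sub iframe_src {
    my ($html) = @_;
    my ($src) = $html =~ m{<iframe\b[^>]*?\bsrc\s*=\s*"([^"]*showc_track\.php[^"]*)"}is
        or return;
    $src =~ s/[ \t\r\n]+//g;   # the attribute may be wrapped across lines
    $src =~ s/&amp;/&/g;       # undo entity-encoded ampersands, if any
    return $src;
}

# Fetch the outer page, then the iframe it embeds, sending a Referer each time.
sub fetch_tracking {
    my ($outer_url, $landing_url) = @_;
    my $http = HTTP::Tiny->new(agent => 'Mozilla/5.0');

    my $outer = $http->get($outer_url, { headers => { Referer => $landing_url } });
    die "outer page failed: $outer->{status}\n" unless $outer->{success};

    my $src = iframe_src($outer->{content})
        or die "no showc_track iframe found\n";

    my $inner = $http->get($src, { headers => { Referer => $outer_url } });
    die "inner page failed: $inner->{status}\n" unless $inner->{success};
    return $inner->{content};
}

# Offline check of the parsing step; TOKEN is a placeholder.
my $sample = qq{<iframe src="http://203.151.118.123:8090/showc_track.php?\n}
           . qq{m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=&ecy=TOKEN&ch=\n}
           . qq{" frameborder="0" scrolling="yes"> </iframe>};
print iframe_src($sample), "\n";

# Network use: perl script.pl OUTER_URL LANDING_URL
print fetch_tracking(@ARGV) if @ARGV == 2;
```

The URL that works in the browser ends in ch=%A0%A0%A0%A0, so the ch value in the page may not be empty; the sketch passes the attribute through as found and does not try to reproduce that encoding.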
 
