perl curl get data from website

SVCitian

These 3 URLs work in a browser and return the same results in both
Firefox and IE.

But I want to retrieve this programmatically using curl or Perl,
with the prefix and serial number (sn) changed each time. How can I
make it work?

Can you provide a simple curl command line, or a Perl HTTP GET, to
demonstrate the retrieval? Thanks.


http://www.bangkokflightservices.co...94fd938ea4dd2b28c53fea6af74be&ch=%A0%A0%A0%A0

http://www.bangkokflightservices.co...efix=176&m_sn=75064953&h_prefix=HWB&h_sn=&ch=

http://www.bangkokflightservices.co...m_prefix=176&m_sn=75064953&h_prefix=HWB&h_sn=
 
Jürgen Exner

SVCitian said:
These 3 URLs work on a browser.. and return the same results... both
Firefox and IE.

But, I want to retrieve this programmatically using curl or perl..
with the prefix and sn serial number changed each time... How can i
make it work..

Can you provide a simple curl command line .. or perl get http.. to
demonstrate the retrieval.. thanks.

See the FAQ: perldoc -q "HTML file"
"How do I fetch an HTML file?"

jue
 
SVCitian

See the FAQ: perldoc -q "HTML file"
        "How do I fetch an HTML file?"

jue

Actually, I know how to use curl from within Perl, or to use Perl's
HTTP modules.

But the problem is that the above URL doesn't work even in the simplest
case of:

curl "http://www.google.com/url?sa=D&q=http://www.bangkokflightservices.com/our_cargo_track%26trace.php%3Fm_prefix%3D176%26m_sn%3D75064953%26h_prefix%3DHWB%26h_sn%3D&usg=AFQjCNFh02ikp7CSs9lxi_S7ec0Edw9m5g"

I even tried the "Tamper Data" Firefox add-on to see the GET, POST,
etc. requests behind the scenes, but I can't get any further than the
URLs given above.

Why? It may have something to do with Ajax, cookies, the user agent, or
whatever. I have tried some combinations, but none works.

It works in the browser right out of the box, even when changing the
prefix and serial numbers.

So I want to find out what I am missing that prevents the data
retrieval.
 
Jürgen Exner

SVCitian said:
But, the problem is the above URL doesn't work even in the simplest
case of:

curl "http://www.google.com/url?sa=D&q=http://
www.bangkokflightservices.com/our_cargo_track%26trace.php%3Fm_prefix%3D176%26m_sn%3D75064953%26h_prefix%3DHWB%26h_sn%3D&usg=AFQjCNFh02ikp7CSs9lxi_S7ec0Edw9m5g"

I even tried to user "tamper data" firefox add to get behind the
scenes of GET, POST, etc... but I can't proceed any further than the
URLs given above.

An HTTP request using that URL above returns

<!-- This page yong codeing. Please don't copy idea or code before owne
argee. if you copy this code than see it's you code and you project. you
is fucking man -->
<script> window.open
('http://www.bangkokflightservices.com/our_cargo_track.php') ;
setTimeout("window.close();", 10);
</script>

Were you expecting something different?
It works on the browser just right out of the box.. even changing the
prefix and serial numbers.

Please define "works"/"doesn't work".
I am getting just a blank page (FireFox 3.6.8; yeah, I am going to
update now) which is not surprising given the HTTP response above.

To me that is "doesn't work", but of course YMMV.
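
For what it's worth, the stub above also explains what curl sees: the only "redirect" in it is a JavaScript window.open() call, and curl does not run JavaScript, so it stops at the stub. A minimal sketch, assuming the response body looks like the one quoted above (the helper name js_popup_target is made up for illustration), that pulls the target URL out of such a body:

```perl
use strict;
use warnings;

# Hypothetical helper: return the URL passed to window.open(...) in an
# HTML/JavaScript body, or undef if there is no such call.
sub js_popup_target {
    my ($body) = @_;
    return $body =~ /window\.open\s*\(\s*(['"])(.*?)\1/s ? $2 : undef;
}

# Body modelled on the response quoted above (leading comment omitted).
my $body = <<'HTML';
<script> window.open
('http://www.bangkokflightservices.com/our_cargo_track.php') ;
setTimeout("window.close();", 10);
</script>
HTML

my $target = js_popup_target($body);
print defined $target ? "$target\n" : "no window.open call found\n";
```

This only shows where the page wants to send a browser; it does not fetch the tracking data itself.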
So, i want to find out what i am missing that hinders the data
retrieval.

Are you getting something different from your script?

jue
 
SVCitian

An HTTP request using that URL above returns

<!--  This page yong codeing. Please don't copy idea or code before owne
argee. if you copy this code than see it's you code and you project. you
is fucking man  -->
                <script> window.open
('http://www.bangkokflightservices.com/our_cargo_track.php') ;
                        setTimeout("window.close();", 10);
                </script>

Were you expecting something different?


Please define "works"/"doesn't work".
I am getting just a blank page (FireFox 3.6.8; yeah, I am going to
update now) which is not surprising given the HTTP response above.

To me that is "doesn't work", but of course YMMV.


Are you getting something different from your script?

jue

I am afraid we are not getting the same response in our browsers, due
to cookies or whatever.
Please try this:
shortened version: http://goo.gl/FlGU
full version:
http://www.google.com/url?sa=D&q=ht...%A0%A0&usg=AFQjCNHTMCnorOy2WILngV1qdOWYyp-gkg

What you should expect to see is about 7 lines of transaction records
from 14/10/2010 05:37pm to 09:54pm. If you don't get this result on
your end, then you will have to start from scratch, so I suggest you
go to the

Homepage: http://www.bangkokflightservices.com/our_cargo_track.php

and put 176 - 75064953 in the MAWB prefix and suffix fields and click "Search".


I want to get exactly the same results as the resulting page through
curl or Perl HTTP.

It doesn't work for me when I use the above URL (this is as far as I
have got) with
curl "... above url..."
but it works for me in Firefox with the same URL.

Let me know if you need more clarification.


Yes, it returns this on some occasions, but this is not what I
expect. I expect about 7 lines of transaction records. You will know
what I mean when you start from scratch with the URL above and enter
the prefix and suffix yourself.

<!-- This page yong codeing. Please don't copy idea or code before
owne
argee. if you copy this code than see it's you code and you project.
you
is fucking man -->


Thanks.
 
Ilya Zakharevich

Don't overlook the BOLD text on the page I linked to.

It's easy to do, I overlooked it at first too :)

A wonderful example of steganography. And very well balanced - if it
were 4 times longer, I would run a "Search" on the page...

Thanks,
Ilya
 
SVCitian

You might want to try it with the Web Scraping Proxy:

   http://www2.research.att.com/sw/tools/wsp/

which is nice because it logs the traffic in the form of
Perl code that you can copy/paste/modify to suit your needs.

--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.

I tried this a long time ago, and I couldn't make it work.

That in itself set off a whole search through forums on how to make it
work.

If anyone out there has used wsp (and still has it on their computer),
could you run my site through it and report your findings? I think it
would take just a few minutes of your time if you have already made
wsp work for you.

Will appreciate your assistance.

Thank you.
 
sln

I have tried this long time back, and I couldn't make it work and also
failed with the attempt.

This in itself generated a whole search in forums for making it work.

If anyone out there who has used wsp (and still have it on their
computers), could you run my site through it and advise your findings.
I think it just takes few minutes of your time if you have already
made the wsp work for you.

Will appreciate your assistance.

Thank you.

I think the key to using a buggy wsp.pl is to install OpenSSL. Even then,
it's buggy, as there's so much dependency on browser settings and caches.
You might have to use a separate machine for the proxy. I used it locally
(127.0.0.1) with the browser set to its lowest security/privacy level and
all advanced options disabled. Still buggy; I have to end the process in
Task Manager.

After disabling everything in advanced options in IE6 (it had problems with
png file downloads), this was captured, with awkward line breaks and a
possibly unknown encoding (probably utf-8).

I'm sure this won't help.

-sln

--- Proxy server running on rcx port: 5364

# Request:
http://www.bangkokflightservices.com/TrackTrace/showc_track.php?m_prefix=176&m_s
n=75064953&h_prefix=HWB&h_sn=&ecy=e076438db64c6190f7b9689a379b7f7093368f1652d14d
b65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc0
72b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b3
fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af74
be&ch=%A0%A0%A0%A0
# Cookie (NO Set-Cookie): 'PHPSESSID', '1831c0a805050e73bff5a54e0fa017d5
'
$request = new HTTP::Request('GET' =>
"http://www.bangkokflightservices.com/TrackTrace/showc_track.php?m_prefix=176&m_
sn=75064953&h_prefix=HWB&h_sn=&ecy=e076438db64c6190f7b9689a379b7f7093368f1652d14
db65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc
072b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b
3fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af7
4be&ch=%A0%A0%A0%A0");
# Table 1: 11 rows; table nesting: 5
# Saving web page as w4

# Request:
http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_prefix=176&m_sn
=75064953&h_prefix=HWB&h_sn=&ch= &id=0.015485021941311072
# Referer:
http://www.bangkokflightservices.com/TrackTrace/showc_track.php?m_prefix=176&m_s
n=75064953&h_prefix=HWB&h_sn=&ecy=e076438db64c6190f7b9689a379b7f7093368f1652d14d
b65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc0
72b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b3
fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af74
be&ch=%A0%A0%A0%A0
# Cookie: 'PHPSESSID', '1831c0a805050e73bff5a54e0fa017d5
'
$request = new HTTP::Request('GET' =>
"http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_prefix=176&m_s
n=75064953&h_prefix=HWB&h_sn=&ch= &id=0.015485021941311072");
# Table 1: 5 rows
# Table 2: 9 rows
# Saving web page as w5
 
SVCitian

I have no clue of how to make heads or tails of the result.

If you could post the result in a more helpful format.. I would
appreciate it.

Thanks.
 
sln

I have no clue of how to make heads or tails of the result.

If you could post the result in a more helpful format.. I would
appreciate it.

Thanks.

Read the 1 page DDJ article, then the simple WSP page.

http://www.drdobbs.com/184405362;jsessionid=0YIUE10HQDGWRQE1GHPSKHWATMY32JVN?queryText=wsp
http://www2.research.att.com/sw/tools/wsp/

The emitted LWP 'GET' => "long ass string"); lines (there are 2) are what
fetch the web page information you were looking for.

Simple as that.

If you want to run the wsp proxy yourself, it's not that hard to do.
You need to install the OpenSSL binary (the source is available) and
Net::SSLeay (via the ppm repository, because it has other module
dependencies), then run wsp (version 2) downloaded from the above site.
You can run it locally. Set the browser's LAN connection to use a proxy and
give it 127.0.0.1 (localhost) with the default port wsp.pl runs on (5364).

Clear your browser cache and cookies, set advanced options to not
download pictures, and, for the scary part, lower all security and privacy
settings. If you use a gigantic hosts file (it's a text file) that filters
spam IPs, back it up, then clear the original.

The main reason the proxy scraper is valuable is that it lets the page's
JavaScript do its thing (especially with cookies) and then records the result
as POST/GET LWP commands, bypassing the need to worry about JS. Plus it takes
the guesswork out of duplicating the sequence so it can then be automated
with LWP.

-sln
 
sln

I have no clue of how to make heads or tails of the result.

If you could post the result in a more helpful format.. I would
appreciate it.

Thanks.

This may be better, my first afternoon with LWP.

-sln

----------------------
use strict;
use warnings;

use HTML::TableExtract;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST GET);
use LWP::UserAgent;

my $show_content = 0;
my ($content1, $content2);

# Create cookies
my $jar = HTTP::Cookies->new();

# Create user agent
my $ua = LWP::UserAgent->new();
$ua->timeout( 10 );
$ua->cookie_jar( $jar );
$ua->agent("Microsoft Internet Explorer/6.0");


# Create a first request: "get track table"
# ---------

my $request = HTTP::Request->new('GET' =>
join '', qw{
http://www.bangkokflightservices.com/TrackTrace/showc_track.php?m_prefix=176&m_s
n=75064953&h_prefix=HWB&h_sn=&ecy=e076438db64c6190f7b9689a379b7f7093368f1652d14d
b65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc0
72b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b3
fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af74
be&ch=%A0%A0%A0%A0 } );

# Pass request to agent

my $res = $ua->request( $request );
if ( $res->is_success ) {
    print "\nContent-1 .. OK\n\n";
    if ($show_content) {
        print $res->content, "\n\n";
    }
    $content1 = $res->content;
}
else {
    print "Request-1 Failed\n";
    print $res->status_line, "\n\n";
    die;
}

print '='x20, "\n\n";


# Create a second request: "get search table"
# ---------

$request = HTTP::Request->new('GET' =>
join '', qw{
http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_prefix=176&m_s
n=75064953&h_prefix=HWB&h_sn=&ch= } );

# Pass the request to agent

$res = $ua->request( $request );
if ( $res->is_success ) {
    print "Content-2 .. OK\n\n";
    if ($show_content) {
        print $res->content, "\n\n";
    }
    $content2 = $res->content;
}
else {
    print "Request-2 Failed\n";
    print $res->status_line, "\n\n";
    die;
}

print '='x20, "\n\n";
print "Done!\n\n\n";
print "Content 1 tables:\n", '-'x20, "\n\n";
print_tables( $content1 );
print "\nContent 2 tables:\n", '-'x20, "\n\n";
print_tables( $content2 );

exit;

## Table extract Util from wsp
##
sub print_tables {
    my ($table, $row, $cell);
    my $tc = 0;
    my $table_extractor = HTML::TableExtract->new();
    $table_extractor->parse($_[0]);
    foreach $table ($table_extractor->table_states) {
        print "TABLE $tc:\n"; $tc++;
        my $rc = 0;
        foreach $row ($table->rows) {
            print "ROW $rc:\n"; $rc++;
            foreach $cell ( @$row ) {
                $cell = '' unless defined $cell;
                $cell =~ s/\n/ /g;
                $cell =~ s/[ \t]+/ /g;
                $cell =~ s/^[ \t]//;
                $cell =~ s/[ \t]$//;
                $cell =~ s/ *<\/td *//g;
                print "$cell|";
            }
            print "\n";
        }
    }
}
__END__


Content-1 .. OK

====================

Content-2 .. OK

====================

Done!


Content 1 tables:
--------------------

TABLE 0:
ROW 0:
|á||
TABLE 1:
ROW 0:
á|||
ROW 1:
á||á|
ROW 2:
á|||
TABLE 2:
ROW 0:
|
ROW 1:
|
TABLE 3:
ROW 0:
á|
ROW 1:
|
TABLE 4:
ROW 0:
|
ROW 1:
|
TABLE 5:
ROW 0:
|

Content 2 tables:
--------------------

TABLE 0:
ROW 0:
á||||
ROW 1:
á|Enter Master Air Waybill (MAWB)|
ROW 2:
Optional (For Import MAWB Only)|
ROW 3:
á||||
ROW 4:
||* Master Air Waybill number example 123 - 12345678||
TABLE 1:
ROW 0:
|||||||||||
ROW 1:
Item|AWB No|Flight No|Flight Date|Origin|Dest|ULD No|Status|Pieces|Weight|Time|
ROW 2:
1|176-75064953|EK 419|Oct 15 2010|BKK|DXB|Flight Changeá|Export Transshipment|3|
743.00|Oct 14 2010 5:37PM|
ROW 3:
2|176-75064953|EK 419|Oct 15 2010|BKK|DXB|á|Accepted|3|743.00|Oct 14 2010 5:37PM
|
ROW 4:
3|176-75064953|EK 373|Oct 15 2010|BKK|DXB|Flight Changeá|Export Transshipment|3|
743.00|Oct 14 2010 6:12PM|
ROW 5:
4|176-75064953|EK 373|Oct 15 2010|BKK|DXB|SHCá|Export Transshipment|3|743.00|Oct
14 2010 6:12PM|
ROW 6:
5|176-75064953|EK 373|Oct 14 2010|BKK|DXB|Flight Changeá|Export Transshipment|3|
743.00|Oct 14 2010 6:42PM|
ROW 7:
6|176-75064953|EK 373|Oct 14 2010|BKK|DXB|PMC31131EKá|Manifested|3|743.00|Oct 14
2010 6:57PM|
ROW 8:
7|176-75064953|EK 373|Oct 14 2010|BKK|DXB|á|Departed|3|743.00|Oct 14 2010 9:54PM
|
 
sln

This may be better, my first afternoon with LWP.

-sln

----------------------
use strict;
use warnings;

use HTML::TableExtract;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST GET);
use LWP::UserAgent;

my $show_content = 0;
my ($content1, $content2);

[snip code]
# Create a second request: "get search table"
# ---------

$request = HTTP::Request->new('GET' =>
join '', qw{
http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_prefix=176&m_s
n=75064953&h_prefix=HWB&h_sn=&ch= } );

## Or, to create a variable AWB lookup
# my $WBNprefix = '176';
# my $WBN = '75064953';
# $request = HTTP::Request->new('GET' =>
# "http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_prefix=" .
# $WBNprefix. "&m_sn=" . $WBN . "&h_prefix=HWB&h_sn=&ch= ");
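
The variable lookup above can also be written as a small helper. This is only a sketch, assuming the query parameters stay as in the request shown above (with ch left empty); the names escape_param and awb_url are made up for illustration. Keeping the waybills as a list of [prefix, serial] pairs means the same prefix can appear more than once.

```perl
use strict;
use warnings;

# Percent-encode everything except RFC 3986 unreserved characters.
sub escape_param {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $value;
}

# Build the search_awb.php URL for one waybill. The parameter names and
# the fixed h_prefix value are copied from the request above.
sub awb_url {
    my ($prefix, $serial) = @_;
    my @query = (
        m_prefix => $prefix,
        m_sn     => $serial,
        h_prefix => 'HWB',
        h_sn     => '',
        ch       => '',
    );
    my @pairs;
    while (my ($key, $val) = splice @query, 0, 2) {
        push @pairs, "$key=" . escape_param($val);
    }
    return 'http://www.bangkokflightservices.com/TrackTrace/search_awb.php?'
         . join '&', @pairs;
}

# One [prefix, serial] pair per waybill to look up.
my @waybills = ( [ '176', '75064953' ] );
print awb_url(@$_), "\n" for @waybills;
```

The resulting string can be handed to HTTP::Request->new('GET' => ...) in place of the hand-built one.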

-sln
 
SVCitian

This may be better, my first afternoon with LWP.
use HTML::TableExtract;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST GET);
use LWP::UserAgent;
my $show_content = 0;
my ($content1, $content2);

[snip code]
# Create asecond request: "get search table"
# ---------
$request = HTTP::Request->new('GET' =>
 join '', qw{
http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_pref...
n=75064953&h_prefix=HWB&h_sn=&ch= } );

## Or, to create a variable AWB lookup
#  my $WBNprefix = '176';
#  my $WBN       = '75064953';
#   $request = HTTP::Request->new('GET' =>
#  "http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_prefix=" .
#   $WBNprefix. "&m_sn=" . $WBN . "&h_prefix=HWB&h_sn=&ch= ");

-sln

Thank you for your efforts... I will try out the perl code tomorrow.

By the way, I am not on a Linux machine. I am on Windows XP using
cygwin / perl.

So I don't know if the proxy and all the rest of it will work. Anyway,
I will try it and see if I have any success.

Any other helpful pointers for Windows / Cygwin / Perl will be
appreciated too.

Thanks.
 
SVCitian

[snip code]
# Create asecond request: "get search table"
# ---------
$request = HTTP::Request->new('GET' =>
 join '', qw{
http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_pref....
n=75064953&h_prefix=HWB&h_sn=&ch= } );
## Or, to create a variable AWB lookup
#  my $WBNprefix = '176';
#  my $WBN       = '75064953';
#   $request = HTTP::Request->new('GET' =>
#  "http://www.bangkokflightservices.com/TrackTrace/search_awb.php?m_prefix=" .
#   $WBNprefix. "&m_sn=" . $WBN . "&h_prefix=HWB&h_sn=&ch= ");

Thank you for your efforts... I will try out the perl code tomorrow.

By the way, I am not on a linux machine.. I am on Windows XP using
cygwin / perl.

So, I don't know if the proxy and all the rest of it could work. Any
way I will try if I ever get successful.

Any other helpful pointers for Windows / Cygwin / Perl... will be
appreciated too.

Thanks.

Thank you sln.

First, I had a hard time making cpan work with Windows 7 / cygwin /
perl. The problem turned out to be the cpan.pm module and the mirror
site. I got that working and installed the required modules.

The next step was to run your code. When I copied and pasted your code
from Google Groups, I had an issue with some "..." where the html was
not formatted correctly, so I had to use my best judgment and fix the
m_prefix and m_sn numbers.

another sample:
m_prefix=081&m_sn=75133844

And it works just as needed. Thank you so much.

I have no idea why my initial direct "curl" call doesn't work. Can you
please explain why a direct GET doesn't work with the URL,

and why your code had to be used instead? What does the web site
developer do to prevent a direct GET from returning the result? Is it
mainly to do with the cookie, the user agent, some form of Ajax issue,
etc.?

Also, can you please explain a bit about your code and what it does?
Just some comments.

Thank you.. much appreciated.
 
sln

Thank you sln.

First I had a hard time making cpan work with Windows 7 / cygwin /
perl.. the problem was found to be cpan.pm module and mirror site.
made at that work.. and installed the required modules.

And, the next step was to run your code... When i copied and pasted
your code from the google groups.. I had issue with some "..." which
the html was not formatted correctly... so, i had to make the best
judgment and fixed the m_prefix and m_sn number correctly.

another sample:
m_prefix=081&m_sn=75133844

And it works just as per the needs. Thank you so much.

I have no idea why my initial direct "curl" execution cannot execute
correctly... Can you please explain why a direct GET doesn't work with
the URL.

and why your code had to be instead.. what does the web site developer
do to avoid getting direct GET result. Is it mainly to do with the
cookie, or user agent or some form ajax issues, etc.??

Also, can you please explain a bit about your code and what it does..
just some comments.

Thank you.. much appreciated.

Not a problem. I'm learning as I go.

What's going on here is that the site uses JavaScript and Ajax.
The first GET loads a minimal html page with embedded JS
that calls an Ajax layer. At this point it also establishes a session id
that is only good as long as the page is loaded.

The html is sparse and contains a table "container". One of the elements,
a single <td> with an id of 'output', is used as a placeholder into
which more html/JS is added dynamically by the next GET call.
This is called an html/code fragment; it's not a new page, just the
dynamic loading of table data. Each new WBN sent in subsequent GETs
returns data (an html fragment) for that <td> element (id="output").

So rendering the full page is at least a two-step process:
loading the main html frame, then loading the html code fragment (table data
for the Air Bill). Subsequent GETs (without leaving the page) just update
the table data to contain the information for a new Air Bill.

That's the way it works in the browser, where JavaScript is run.
It takes the url input and "constructs" a new url. The "new" url is put
into a new request, an XMLHttpRequest() object (similar to an LWP request).
This Ajax request object goes out and does a normal GET. What's returned is a
fragment of html, in this case table data containing info about the shipment
for the particular Way Bill.

So that's the reason a single GET (curl or LWP) didn't work: the main page is
just a shell for the dynamic data loaded later.
WSP, however, sees two requests, one for the main page, the other for the data
fragment. WSP doesn't need to execute JS/Ajax; it just records the result of
the interaction between the client and server.

On the bottom of the main html page, we see this:

<script>
searchajax2('./search_awb.php?m_prefix=176&m_sn=75064953&h_prefix=HWB&h_sn=&ch=    ');
reloadpage();
</script>

This is the first thing that is run.

We see that the function searchajax2() first creates an Ajax request object
(using that url). Then it arranges for the Ajax response to be assigned to
the innerHTML of the <td id="output"> element. The Ajax request is opened,
then sent:

ajaxRequest.open("GET", url , true);
ajaxRequest.send(null);

Finally, searchajax2() function returns, then reloadpage() is called to render
the DOM.
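
The same step can be imitated without a browser by reading the fragment URL out of the main page. A rough sketch, assuming the main page always ends with a searchajax2('...') call like the one shown above; the helper name ajax_fragment_url is made up, and the page URL in the example has its query string shortened for readability.

```perl
use strict;
use warnings;

# Hypothetical helper: find the relative URL handed to searchajax2() in the
# main page and resolve it against the directory of the page's own URL.
sub ajax_fragment_url {
    my ($page_url, $html) = @_;
    my ($rel) = $html =~ /searchajax2\s*\(\s*'([^']*)'/ or return;
    $rel =~ s/^\s+|\s+$//g;      # drop ASCII whitespace padding around the URL
    $rel =~ s{^\./}{};           # "./search_awb.php" -> "search_awb.php"
    my $base = $page_url;
    $base =~ s/[?#].*\z//s;      # drop query string and fragment
    $base =~ s{[^/]*\z}{};       # drop the file name, keep the directory
    return $base . $rel;
}

# Script block as quoted above.
my $html = <<'HTML';
<script>
searchajax2('./search_awb.php?m_prefix=176&m_sn=75064953&h_prefix=HWB&h_sn=&ch=    ');
reloadpage();
</script>
HTML

# Query string shortened here; the real page URL carries more parameters.
my $page = 'http://www.bangkokflightservices.com/TrackTrace/showc_track.php?m_prefix=176';
my $url  = ajax_fragment_url($page, $html);
print defined $url ? "$url\n" : "no searchajax2() call found\n";
```

The URL it returns is the one the second GET has to request, with the same cookie jar as the first.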

Apparently, with regard to LWP, it's a two-step process: first load the
main page skeleton and establish a cookie, then make successive calls to load
each fragment with a new WBN's info. The html fragments returned each contain
specific information (mostly table data html) related to the WBN.

I hope I am clear; I'm trying not to add too much noise to the group.
I am new to this too, but it doesn't look like rocket science.

-sln

Ps. Here is a fleshed-out example with some comments and an added construct
to fetch multiple Way Bills' data.
To see the content, set $show_content = 1;
and maybe redirect the output to a file:
perl lwp.pl > mycapture.txt

-----------------------------------------------

use strict;
use warnings;

use HTML::TableExtract;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST GET);
use LWP::UserAgent;

my $show_content = 0; # 1 = shows response content (html)
my ( $content1, $content2 );

# Create cookies
my $jar = HTTP::Cookies->new();

# Create user agent
my $ua = LWP::UserAgent->new();
$ua->timeout( 10 );
$ua->cookie_jar( $jar );
$ua->agent( "Microsoft Internet Explorer/6.0" );


# Create a first request: "get track table framework"
# Note - this will establish a session with the server.
# ---------

my $request = HTTP::Request->new('GET' =>
join '', qw{
http://www.bangkokflightservices.com/TrackTrace/showc_track.php?m_prefix=176&m_s
n=75064953&h_prefix=HWB&h_sn=&ecy=e076438db64c6190f7b9689a379b7f7093368f1652d14d
b65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc0
72b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b3
fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af74
be&ch=%A0%A0%A0%A0 &id=1.2405164500620218} );

# Pass request to agent
# Note - the response is just a JavaScript/Ajax-laced
# html document with a skeleton table. One of the table's <td> elements has
# an id of "output" that receives the real table data from the next request.
# Apparently this establishes a cookie.

my $res = $ua->request( $request );
if ( $res->is_success ) {
    print "\nHtml main Content .. OK\n\n";
    if ($show_content) {
        print $res->content, "\n\n";
    }
    $content1 = $res->content;
}
else {
    print "Request (Html main Content) Failed\n";
    print $res->status_line, "\n\n";
    die;
}

print '='x20, "\n\n";

# Create a second request: "get track table body"
# Note - When running as an html document, JS/Ajax are used
# to dynamically load table data (html) to put in <td id="output" ..>
# already loaded with the first request (the main html).
# The html that is returned is a dynamic html fragment. This contains
# the table data for a single prefix/serial no.
# ---------

# Loop, get the data for a couple of Way Bill Numbers.

my %wbhash = ( '176'=>'75064953', '081'=>'75133844' );

while (my ( $WBNprefix, $WBN ) = each %wbhash)
{
    $request = HTTP::Request->new('GET' =>
        join '', (
            "http://www.bangkokflightservices.com/TrackTrace/search_awb.php?",
            "m_prefix=$WBNprefix",
            "&m_sn=$WBN",
            "&h_prefix=HWB",
            "&h_sn=&ch= ")
    );

    # Pass request to agent

    $res = $ua->request( $request );
    if ( $res->is_success ) {
        print "\nWay Bill fragment .. OK\n";
        if ($show_content) {
            print $res->content, "\n\n";
        }
        $content2 = $res->content;
    }
    else {
        print "Request (Way Bill html fragment Content) Failed\n";
        print $res->status_line, "\n\n";
        die;
    }
    print "Way Bill ($WBNprefix - $WBN) Content tables:\n", '-'x20, "\n\n";
    print_tables( $content2 );
    print "\n";
}

print '='x20, "\n\n";
print "Done!\n\n\n";

exit;

## Table extract Util from wsp
##
sub print_tables {
    my ( $table, $row, $cell );
    my $tc = 0;
    my $table_extractor = HTML::TableExtract->new();
    $table_extractor->parse( $_[0] );
    foreach $table ( $table_extractor->table_states ) {
        print "TABLE $tc:\n"; $tc++;
        my $rc = 0;
        foreach $row ( $table->rows ) {
            print "ROW $rc:\n"; $rc++;
            foreach $cell ( @$row ) {
                $cell = '' unless defined $cell;
                $cell =~ s/\n/ /g;
                $cell =~ s/[ \t]+/ /g;
                $cell =~ s/^[ \t]//;
                $cell =~ s/[ \t]$//;
                $cell =~ s/ *<\/td *//g;
                print "$cell|";
            }
            print "\n";
        }
    }
}
__END__

Html main Content .. OK

====================


Way Bill fragment .. OK
Way Bill (081 - 75133844) Content tables:
--------------------

TABLE 0:
ROW 0:
á||||
ROW 1:
á|Enter Master Air Waybill (MAWB)|
ROW 2:
Optional (For Import MAWB Only)|
ROW 3:
á||||
ROW 4:
||* Master Air Waybill number example 123 - 12345678||
TABLE 1:
ROW 0:
||||||||||
ROW 1:
Item|AWB No|Flight No|Flight Date|Origin|Dest|Status|Pieces|Weight|Time|
ROW 2:
1|081-75133844|JQ 029|Oct 19 2010|MEL|BKK|Delivered|2|1,480.00|Oct 20 2010 - 125
5|


Way Bill fragment .. OK
Way Bill (176 - 75064953) Content tables:
--------------------

TABLE 0:
ROW 0:
á||||
ROW 1:
á|Enter Master Air Waybill (MAWB)|
ROW 2:
Optional (For Import MAWB Only)|
ROW 3:
á||||
ROW 4:
||* Master Air Waybill number example 123 - 12345678||
TABLE 1:
ROW 0:
|||||||||||
ROW 1:
Item|AWB No|Flight No|Flight Date|Origin|Dest|ULD No|Status|Pieces|Weight|Time|
ROW 2:
1|176-75064953|EK 419|Oct 15 2010|BKK|DXB|Flight Changeá|Export Transshipment|3|
743.00|Oct 14 2010 5:37PM|
ROW 3:
2|176-75064953|EK 419|Oct 15 2010|BKK|DXB|á|Accepted|3|743.00|Oct 14 2010 5:37PM
|
ROW 4:
3|176-75064953|EK 373|Oct 15 2010|BKK|DXB|Flight Changeá|Export Transshipment|3|
743.00|Oct 14 2010 6:12PM|
ROW 5:
4|176-75064953|EK 373|Oct 15 2010|BKK|DXB|SHCá|Export Transshipment|3|743.00|Oct
14 2010 6:12PM|
ROW 6:
5|176-75064953|EK 373|Oct 14 2010|BKK|DXB|Flight Changeá|Export Transshipment|3|
743.00|Oct 14 2010 6:42PM|
ROW 7:
6|176-75064953|EK 373|Oct 14 2010|BKK|DXB|PMC31131EKá|Manifested|3|743.00|Oct 14
2010 6:57PM|
ROW 8:
7|176-75064953|EK 373|Oct 14 2010|BKK|DXB|á|Departed|3|743.00|Oct 14 2010 9:54PM
|

====================

Done!
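
The stray á in several cells is most likely a non-breaking space (character 0xA0) from the page, displayed as á by a console using code page 437. That is a guess from the output above, not something checked against the site. A small sketch under that assumption, with a made-up helper name clean_cell, that could stand in for the substitutions inside print_tables:

```perl
use strict;
use warnings;

# Hypothetical cell cleaner: turn non-breaking spaces into plain spaces,
# collapse whitespace runs, and trim both ends.
sub clean_cell {
    my ($cell) = @_;
    $cell = '' unless defined $cell;
    $cell =~ s/\xA0/ /g;     # non-breaking space -> plain space
    $cell =~ s/\s+/ /g;      # collapse runs of whitespace, newlines included
    $cell =~ s/^ | $//g;     # trim the single space left at either end
    return $cell;
}

# Sample cells modelled on the output above.
my @cells = ( "Flight Change\xA0", "\xA0", " EK\n 373 ", undef );
print join( '|', map { clean_cell($_) } @cells ), "\n";
```

If the page turns out to use a different encoding, the 0xA0 substitution would need adjusting.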
 
