LWP::UserAgent and JavaScript function

Francis Sylvester

Dear all,

I'm trying to scrape a site with LWP::UserAgent which contains a JavaScript
function (__doPostBack) within the links. From investigation - this appears
to be an ASP.NET function to validate the client. Does anybody know if a
JavaScript module exists for Perl? ...or if anybody is familiar with
scraping this function or could point me to a site with more info I'd really
appreciate it. The function appears to post an __EVENTTARGET and an
__EVENTARGUMENT to the server. I've tried setting these in the LWP header
but no joy.

Thanks,
Francis
 
Brian Wakem

Francis said:
Dear all,

I'm trying to scrape a site with LWP::UserAgent which contains a
JavaScript function (__doPostBack) within the links. From investigation -
this appears to be an ASP.NET function to validate the client. Does
anybody know if a JavaScript module exists for Perl? ...or if anybody is
familiar with scraping this function or could point me to a site with more
info I'd really appreciate it. The function appears to post an
__EVENTTARGET and an __EVENTARGUMENT to the server. I've tried setting
these in the LWP header but no joy.


Automating stuff on ASP.NET sites is a nightmare.

I think last time I did it I captured the __VIEWSTATE and __EVENT... stuff
with regexes and then POSTed the data to the appropriate URL, which worked.
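
The capture step described above might look something like this rough sketch. The sub names (hidden_fields, postback_args) and the sample HTML are made up for illustration. It assumes the page writes its hidden inputs with double-quoted attributes and its links with plain single quotes, which is common for ASP.NET output but not guaranteed; a real HTML parser would be more robust.

```perl
use strict;
use warnings;

# Collect name/value pairs of every <input type="hidden"> in a page.
# Assumption: attributes are double-quoted, as ASP.NET usually emits them.
sub hidden_fields {
    my ($html) = @_;
    my %field;
    while ($html =~ /<input\b([^>]*)>/gi) {
        my $attrs = $1;
        next unless $attrs =~ /\btype="hidden"/i;
        my ($name)  = $attrs =~ /\bname="([^"]*)"/i or next;
        my ($value) = $attrs =~ /\bvalue="([^"]*)"/i;
        $field{$name} = defined $value ? $value : '';
    }
    return %field;
}

# Pull the two arguments out of a link such as
#   href="javascript:__doPostBack('target','argument')"
# Returns an empty list if no such call is found.
sub postback_args {
    my ($text) = @_;
    return $text =~ /__doPostBack\('([^']*)','([^']*)'\)/;
}

# Made-up sample page, only to show the shape of the input.
my $html = <<'HTML';
<form method="post" action="list.aspx">
<input type="hidden" name="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ozs+" />
<input type="hidden" name="__EVENTTARGET" value="" />
<a href="javascript:__doPostBack('grid$next','')">Next</a>
</form>
HTML

my %field = hidden_fields($html);
my ($target, $argument) = postback_args($html);

print "$_ = $field{$_}\n" for sort keys %field;
print "target = $target, argument = '$argument'\n";
```

The hash from hidden_fields, with __EVENTTARGET and __EVENTARGUMENT set to the two values from the link, is the data that then gets POSTed.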

MS are changing the way this works in newer versions of .NET, as they
realised it's bloody stupid to post a 6KB VIEWSTATE to a page which does
very little.

The whole idea is ridiculous anyhow as ~10% of people won't have JS and
therefore can't use any of these sites.
 
Tad McClellan

Francis Sylvester said:
I'm trying to scrape a site with LWP::UserAgent which contains a JavaScript
function (__doPostBack) within the links. From investigation - this appears
to be an ASP.NET function to validate the client.

Does anybody know if a
JavaScript module exists for Perl?


You don't need one if you can do in Perl what the Javascript does.

...or if anybody is familiar with
scraping this function or could point me to a site with more info I'd really
appreciate it.


You first do a GET, and then do a POSTback to the same URL with...

The function appears to post an __EVENTTARGET and an
__EVENTARGUMENT to the server.


...some values that were added in the GET's response.
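
A minimal sketch of that GET-then-POST sequence, under some assumptions: the sub names (postback_form, follow_postback), the URL, and the control name 'grid$next' are placeholders, and the hidden inputs are assumed to use double-quoted attributes. The point it illustrates is that __EVENTTARGET and __EVENTARGUMENT travel as form fields in the POST body, next to __VIEWSTATE, rather than as HTTP headers.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Build the form that __doPostBack would submit: every hidden field from
# the fetched page, with the two event fields overwritten.
sub postback_form {
    my ($html, $target, $argument) = @_;
    my %form;
    while ($html =~ /<input\b([^>]*)>/gi) {
        my $attrs = $1;
        next unless $attrs =~ /\btype="hidden"/i;
        my ($name)  = $attrs =~ /\bname="([^"]*)"/i or next;
        my ($value) = $attrs =~ /\bvalue="([^"]*)"/i;
        $form{$name} = defined $value ? $value : '';
    }
    $form{__EVENTTARGET}   = $target;
    $form{__EVENTARGUMENT} = $argument;
    return \%form;
}

# GET the page, then POST the form back to the same URL.
sub follow_postback {
    my ($ua, $url, $target, $argument) = @_;

    my $page = $ua->get($url);
    die 'GET failed: ', $page->status_line, "\n" unless $page->is_success;

    my $form  = postback_form($page->decoded_content, $target, $argument);
    my $reply = $ua->post($url, $form);    # fields go in the request body
    die 'POST failed: ', $reply->status_line, "\n" unless $reply->is_success;

    return $reply->decoded_content;
}

# Hypothetical usage; the URL and control name are placeholders.
# my $ua = LWP::UserAgent->new(cookie_jar => {});
# print follow_postback($ua, 'http://www.example.com/list.aspx', 'grid$next', '');
```

The cookie jar in the usage lines is there because such sites often tie the postback to a session cookie set by the first GET; whether a given site needs it is something to check case by case.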

I've tried setting these in the LWP header
but no joy.


Show code if you want help fixing the code.

Or, just have this write the Perl code for you:

Web Scraping Proxy

http://www.research.att.com/~hpk/wsp/
 
