How to screen scrape for results?

  • Thread starter Swanand Mokashi

Swanand Mokashi

Hi all --

I would like to create an application (call it Application "A") that exactly
mimics a form on a foreign system (Application "F"). Application "F" is on
the web, so I cannot control it. Application "A" will have a form identical
to the one on Application "F" and will submit it to the URL of Application
"F". I would then like to screen scrape the confirmation page returned after
the form is submitted to Application "F".

I can easily screen scrape a static page, but I am not sure how to screen
scrape a form result.

Any ideas?

TIA
Swanand
 

sloan

Here are the hints you need:

You will not have my FormPostCollection and FormPostItem objects, but they
are simple. A FormPostItem just holds the key and value you want to post.
FormPostCollection derives from CollectionBase and keeps a collection of
FormPostItems.

Even without them, you can figure out what is going on.
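
For readers without those two classes, here is a minimal sketch of what they might look like. Only the class names and the CollectionBase base class come from the post; the members (Key, Value, Add, the indexer) are assumptions made so the code further down has something to compile against.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for FormPostItem: one form field (name and value).
public class FormPostItem
{
    private readonly string _key;
    private readonly string _value;

    public FormPostItem(string key, string value)
    {
        _key = key;
        _value = value;
    }

    public string Key { get { return _key; } }
    public string Value { get { return _value; } }
}

// Hypothetical stand-in for FormPostCollection, derived from CollectionBase
// as described in the post. CollectionBase supplies Count and enumeration.
public class FormPostCollection : CollectionBase
{
    // Adds an item and returns the index it was stored at.
    public int Add(FormPostItem item)
    {
        return List.Add(item);
    }

    // Typed access to the item at the given position.
    public FormPostItem this[int index]
    {
        get { return (FormPostItem)List[index]; }
    }
}
```

A caller would fill the collection with one FormPostItem per input on the target form, using the same field names the form's HTML uses.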

This took me about 3 days to figure out (the GET and querystring was easy;
the form/post was the tough one).
So post a "thank you" for this one.

I got the code for everything outside of the FORM/POST part from another
developer, but the form/post part is what I figured out myself.




// Requires: using System; using System.IO; using System.Net; using System.Text;
// m_enc (Encoding) and myCollectionOfFormPostValues (FormPostCollection)
// are fields of the containing class.
public void WriteTextFile(string Url, string FilePath, long BufferSize)
{
    // create a web request
    HttpWebRequest oHttpWebRequest = (HttpWebRequest) System.Net.WebRequest.Create(Url);

    // set the timeout; the value is in milliseconds, so 100 is only 0.1 s
    // and will likely need raising for a real site
    oHttpWebRequest.Timeout = 100; //m_ConnectTimeout;

    // write the form fields into the request body (this makes it a POST)
    this.postDataToHttpWebRequest(oHttpWebRequest, myCollectionOfFormPostValues);

    // create a response object that we can read a stream from;
    // GetResponse throws a WebException on failure, it does not return null
    using (HttpWebResponse oHttpResponse = (HttpWebResponse) oHttpWebRequest.GetResponse())
    {
        // Define the encoding type
        try
        {
            // see if the page will give us back a charset; ContentEncoding
            // holds the Content-Encoding header (e.g. gzip), not the charset,
            // so CharacterSet is the property to read
            if (!string.IsNullOrEmpty(oHttpResponse.CharacterSet))
                m_enc = Encoding.GetEncoding(oHttpResponse.CharacterSet);
            else
                m_enc = Encoding.GetEncoding(1252);
        }
        catch (ArgumentException)
        {
            // *** Invalid encoding passed
            m_enc = Encoding.GetEncoding(1252);
        }

        // buffer that stores characters while the file is downloading
        char[] DownloadedCharChunk = new char[BufferSize];

        // stream reader for the text we get over HTTP, and a stream writer
        // for the file on disk; "using" closes both even if an error occurs
        using (StreamReader sr = new StreamReader(oHttpResponse.GetResponseStream(), m_enc))
        using (StreamWriter sw = new StreamWriter(FilePath, false, m_enc))
        {
            sw.AutoFlush = false;

            // Read returns 0 once the download has finished
            int workingbuffersize;
            while ((workingbuffersize = sr.Read(DownloadedCharChunk, 0, DownloadedCharChunk.Length)) > 0)
            {
                // write the characters just read to the file on disk
                sw.Write(DownloadedCharChunk, 0, workingbuffersize);
            }
        }
    }
}


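private string buildPostString(FormPostCollection formPostCollec)
{
    StringBuilder sb = new StringBuilder();

    foreach (FormPostItem fpi in formPostCollec)
    {
        // separate pairs with "&" without leaving a trailing one
        if (sb.Length > 0)
            sb.Append("&");

        // URL-encode keys and values so that characters such as "&", "="
        // and spaces do not corrupt the post body
        sb.Append(Uri.EscapeDataString(fpi.Key));
        sb.Append("=");
        sb.Append(Uri.EscapeDataString(fpi.Value ?? ""));
    }

    return sb.ToString();
}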



private void postDataToHttpWebRequest(HttpWebRequest webRequest,
    Collections.FormPostCollection formPostCollec)
{
    if (null != formPostCollec)
    {
        // convert the post string to bytes
        ASCIIEncoding encoding = new ASCIIEncoding();
        byte[] data = encoding.GetBytes(this.buildPostString(formPostCollec));

        webRequest.Method = "POST";
        webRequest.ContentType = "application/x-www-form-urlencoded";
        //webRequest.ContentType = "text/xml";//Does Not Work

        webRequest.ContentLength = data.Length;

        // Send the data.
        using (Stream newStream = webRequest.GetRequestStream())
        {
            newStream.Write(data, 0, data.Length);
        }
    }
}
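
Since the original question is about scraping the confirmation page rather than saving it to disk, a variant that returns the response as a string may be handier. The following is a minimal sketch, not the poster's code: the FormPoster class and its BuildBody and Post methods are hypothetical names, it takes plain key/value pairs instead of a FormPostCollection, and it assumes the response can be read with StreamReader's default UTF-8 decoding. It stays with HttpWebRequest to match the code above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

public static class FormPoster
{
    // Builds an application/x-www-form-urlencoded body from name/value pairs.
    public static string BuildBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        StringBuilder sb = new StringBuilder();
        foreach (KeyValuePair<string, string> field in fields)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Uri.EscapeDataString(field.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(field.Value ?? ""));
        }
        return sb.ToString();
    }

    // Posts the fields to url and returns the response body as text,
    // ready to be searched for the confirmation message.
    public static string Post(string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        byte[] data = Encoding.UTF8.GetBytes(BuildBody(fields));

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = data.Length;

        // Write the encoded form fields into the request body.
        using (Stream body = request.GetRequestStream())
        {
            body.Write(data, 0, data.Length);
        }

        // Read the whole response; GetResponse throws WebException on failure.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The returned string can then be parsed the same way as a static page that was fetched with a GET.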
 

Swanand Mokashi

Not trying to do anything suspicious.
I have a client who wants to sync data with another web site. We have no
control over the other web site (to, say, write a web service or drop an XML
file and ask them to parse it). The data that needs to be synced is the same
as that submitted by the form on the other web site.

Not trying to create an auto-posting bot :)
 

Swanand Mokashi

OK, I have started playing with your code. I tried it with an ASP.NET page
and it did not seem to work -- probably not your problem. I have a check for

if (!Page.IsPostBack)
{
}

and when I post to this page with your code, Page.IsPostBack seems to return
false. That may not be a problem, as the form I ultimately want to post to
is not an ASP.NET form. I will test this with an ASP form and let you
know.

Thanks for all your help!

Swanand
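
A likely explanation for the Page.IsPostBack result: ASP.NET Web Forms only treats a request as a postback when the posted form carries its hidden state fields, such as __VIEWSTATE or __EVENTTARGET. A hand-built post that leaves them out looks like a first visit. One workaround is to GET the page first, pull those hidden values out of the HTML, and include them in the post. The sketch below shows only the extraction step; the HiddenFields class and Extract method are hypothetical names, and the pattern assumes the name attribute comes before value inside the tag.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HiddenFields
{
    // Returns the value attribute of the input with the given name,
    // or null if no such input is found in the HTML.
    // Assumes double-quoted attributes with name appearing before value.
    public static string Extract(string html, string name)
    {
        string pattern = "<input[^>]*name=\"" + Regex.Escape(name)
            + "\"[^>]*value=\"([^\"]*)\"";
        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

The extracted value would then be posted back under the same field name, alongside the visible form fields.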




Swanand Mokashi

Works with ASP !!
Thanks again
Swanand


alex_f_il

You can also try SWExplorerAutomation (SWEA)
(http://www.webunittesting.com). SWEA supports frames, DHTML (AJAX)
pages, windows and HTML dialogs, popup windows, and file downloads. SWEA
solutions can be run from ASP.NET pages or a Windows service.
 
