Get info from a page

James

Hi everyone,
The fragment below is from a table on a page I scrape information from.
It is one row of what is potentially several rows.

The items with id's I can get:
id="dgdBusSchedule__ctl1_lblDepartureDate" yields "06:11"
id="dgdBusSchedule__ctl1_lnkRouteNumber" yields "6"

The ids above change with each row (... _ctl2_..., ... _ctl3_...), so they
are easy to loop through.

The ids above are the major data items for my app. The last cell in the row
is the one I cannot seem to get. It is the text which reads "University",
the destination of route 6. This item seems to have no distinguishing id,
name or tag, and even the class is generic on this page.

What methods are used to iterate over a table and all of its rows/cells?

All opinions appreciated!



<td scope="row" class="gAltContentSection">
<span
id="dgdBusSchedule__ctl1_lblDepartureDate">06:11</span></td>
<td scope="row" class="gAltContentSection">
<a id="dgdBusSchedule__ctl1_lnkRouteNumber"
href="javascript:document.forms[&quot;frmBusStopScheduleResults&quot;][&quot;txtRouteId&quot;].value
=&quot;30&quot;;document.forms[&quot;frmBusStopScheduleResults&quot;][&quot;txtRouteDepartureTime&quot;].value
=&quot;371&quot;;gfnTransit_SwitchPostBackUrl(&quot;BusStopSchedule_Results.aspx|BusStopDetail.aspx&quot;,&quot;RouteSchedule_Results.aspx&quot;,&quot;frmBusStopScheduleResults&quot;,&quot;_blank&quot;,&quot;NOVIEWSTATE&quot;);RouteSchedulePostBack(&quot;&quot;,&quot;&quot;,&quot;frmBusStopScheduleResults&quot;,&quot;txtRouteNumber&quot;,&quot;6&quot;);">6</a></td>
<td scope="row" class="gAltContentSection">University</td>
 
Göran Andersson

James said:
What methods are used to iterate over a table and all of its rows/cells?

If you wanted to loop over the table, you would have to parse the
HTML code into some kind of object tree. I would suggest that you just
use a regular expression to get the data from the code.

Something like:

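// Needs "using System.Text.RegularExpressions;". Singleline lets .*? cross line breaks.
MatchCollection m = Regex.Matches(page,
@"<span[^>]+?id=""[^""]+?lblDepartureDate""[^>]*?>([^<]+?)</span>.*?<a[^>]+?id=""[^""]+?lnkRouteNumber""[^>]*?>(\d+)</a>",
RegexOptions.Singleline);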
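The pattern above stops at the route number, while the question was about the destination cell that follows it. One possible extension, sketched here in JavaScript so it can be tried in a browser console or Node, adds a third group for the next `<td>`. The names `page`, `rowPattern` and `parseRows` are made up for illustration, the sample row uses a shortened `href`, and the sketch assumes the destination cell always comes directly after the link cell.

```javascript
// Assumed sample: one row shaped like the fragment in the question (href shortened).
const page = [
  '<td scope="row" class="gAltContentSection">',
  '<span',
  'id="dgdBusSchedule__ctl1_lblDepartureDate">06:11</span></td>',
  '<td scope="row" class="gAltContentSection">',
  '<a id="dgdBusSchedule__ctl1_lnkRouteNumber"',
  'href="javascript:void(0);">6</a></td>',
  '<td scope="row" class="gAltContentSection">University</td>',
].join("\n");

// Same two groups as the pattern above, plus a third group for the cell
// right after the link. [\s\S]*? is used instead of .*? so the gap
// between the two cells may span line breaks.
const rowPattern = new RegExp(
  '<span[^>]+?id="[^"]+?lblDepartureDate"[^>]*?>([^<]+?)</span>' +
  '[\\s\\S]*?<a[^>]+?id="[^"]+?lnkRouteNumber"[^>]*?>(\\d+)</a>' +
  '\\s*</td>\\s*<td[^>]*>([^<]*)</td>',
  'g'
);

// Hypothetical helper: returns one object per matched row.
function parseRows(html) {
  const rows = [];
  for (const m of html.matchAll(rowPattern)) {
    rows.push({ time: m[1].trim(), route: m[2], destination: m[3].trim() });
  }
  return rows;
}

const rows = parseRows(page);
console.log(rows);
```

Matching by position like this is brittle: if the site inserts another column between the route and the destination, the third group will capture the wrong cell.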
 
James

Thanks Göran,

Regex seems quite powerful. Looking on MSDN I have found info on the Regex
methods, but I lack the knowledge to write a pattern myself. Is there a page
that teaches how to write regular expressions? Once I learn the syntax, I
agree that getting the info I need should be possible.

Thanks Göran!





James

Found a page! http://www.regular-expressions.info/reference.html

siccolo

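You can also use JavaScript to iterate over rows and cells:
<script>
// Walk a table's rows from last to first and visit every cell in each row.
function delete_all(table_element)
{
    for (var i = table_element.rows.length - 1; i > -1; i--)
    {
        var cells = table_element.rows[i].cells;
        for (var j = 0; j < cells.length; j++)
        {
            // ... check row content, e.g. cells[j].innerHTML ...
        }
    }
}
</script>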
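Building on the row loop above, the destination can be read by its position in the row rather than by id. This is only a sketch: `readSchedule` is a made-up name, and it assumes each data row has the time in the first cell, the route in the second and the destination in the last. `rows`, `cells` and `textContent` are standard DOM properties.

```javascript
// Hypothetical helper: collect time, route and destination from each row
// of a table element by cell position.
function readSchedule(table) {
  const result = [];
  for (let i = 0; i < table.rows.length; i++) {
    const cells = table.rows[i].cells;
    if (cells.length < 3) continue; // skip header or spacer rows
    result.push({
      time: cells[0].textContent.trim(),
      route: cells[1].textContent.trim(),
      // The destination has no id, so take the last cell in the row.
      destination: cells[cells.length - 1].textContent.trim(),
    });
  }
  return result;
}

// In a browser this could be called with a real table element, for example:
//   readSchedule(document.getElementById("dgdBusSchedule"));
// The id "dgdBusSchedule" is a guess based on the id prefix in the question.
```

This only works where the page is loaded into a DOM, such as a browser or a browser control; for raw HTML text fetched by the app, the regular expression approach still applies.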


... more at http://www.siccolo.com/articles.asp
 
