Parsing HTML using TreeBuilder - how to get the "next" tag?

Bruce Horrocks

I have a large (6 MB) HTML file, generated by a software application's
"document" function, which I am trying to parse using
HTML::TreeBuilder. It consists of lots of lines of the form:

<p> Text text text text text
<p> Text text text text text
....
<p> Text text text text text
<h1>Section Heading</h1>
<p> Blah blah blah blah
<p> Blah blah blah blah
<p> Blah blah blah blah
....

I can use $tree->look_down() to find the h1 heading, but then how do I
get the next line? All the examples assume that the thing you want is a
*child* of the heading, not the *next* tag.

This requirement seems so basic that I must be missing something,
but I can't see what. Perl is ActiveState 5.8.6 on Win32.

Thanks in advance
 
Bruce Horrocks

Bruce Horrocks said:
I can use $tree->look_down() to find the h1 heading but then, how do I
get the next line? All the examples assume that the thing you want is a
*child* of the heading, not the *next* tag.

Okay, found it (I think).
The right() method of HTML::Element looks to be what I'm after: called on
the h1 element, it returns that element's next sibling. Sorry for the noise.
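
For anyone finding this later, here is a minimal sketch of that approach. It assumes the parser closes each unclosed <p> so that the paragraphs end up as siblings of the h1 (worth confirming with $tree->dump on the real file). The helper name next_element_after and the sample markup are made up for illustration. Siblings can be plain text strings as well as elements, so the helper skips anything that is not a reference.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: return the first sibling to the right of $elem
# that is an element (a reference) rather than a plain text string.
sub next_element_after {
    my ($elem) = @_;
    # In list context, right() returns all right-hand siblings in order.
    my ($next) = grep { ref $_ } $elem->right;
    return $next;    # undef if there is no following element
}

# Sample markup in the same shape as the file described above.
my $html = <<'HTML';
<p> Text text text text text
<p> Text text text text text
<h1>Section Heading</h1>
<p> Blah blah blah blah
<p> Blah blah blah blah
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Find the first h1 anywhere in the tree, then step to its next sibling.
my $h1   = $tree->look_down(_tag => 'h1');
my $next = $h1 ? next_element_after($h1) : undef;

print $next->tag, ": ", $next->as_text, "\n" if $next;

# Free the tree explicitly; older HTML::Tree releases need this.
$tree->delete;
```

Calling right() in scalar context instead gives just the immediate sibling, which is enough when no stray text sits between the heading and the paragraph.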

Regards,
 
