Finding String Within Page


Seymore

Hi. I'm a Perl newbie trying to write a 'scraping' function that I can
call, and have it return a tiny value from a web page.

I'd like to pass it a URL, a startPoint string, and an endPoint string.
I'd like it to socket to the web page, read in the source html to the
FIRST occurrence of startPoint, then read from there to the first (next)
occurrence of endPoint that follows that startPoint. I'd like the result
returned as a string of text, too, which I can process more later.

Better still if startPoint and endPoint can contain wildcards or
somesuch. I guess a fancier version could keep going and return an
array of each matching glob, but I don't really need that now.

Can someone shove me in the right direction? Alternately, if someone
wants to write a fancy rich library of similar routines, I can pay
small money for the time savings.

Seymore R Seiswhittle, UCPS
 

A. Sinan Unur


Seymore

I don't see anything you have tried.
Instead, it looks like you are trying to get someone to write it for you.
Read the posting guidelines before proceeding.

Wow I didn't realize this was comp.lang.perl.hostile.

Someone in a friendlier group gave me the pointers I needed, actually,
but thanks for the rudeness anyway. Now you can go back to kicking
puppies or whatever you usually do.

SRS
 

Guest

: Wow I didn't realize this was comp.lang.perl.hostile.

: Someone in a friendlier group gave me the pointers I needed, actually,
: but thanks for the rudeness anyway. Now you can go back to kicking
: puppies or whatever you usually do.

C'mon, Seymore, don't be childish. The well-established practice here
is:

1. You try to write a piece of code or software which is supposed to
meet certain expectations, like the one you described.

2. Your code fails, either fundamentally because some concept of Perl
was not well understood, or it produces seemingly bizarre results
because of the gory details.

Both are absolutely natural and can hit anybody, at any stage of mastering
Perl (though erroneous understanding of Perl fundamentals should
decrease over time, at least slightly).

3. No matter which way your code failed, you take it and show it here.
Only when at least a skeleton (functional, that is) of code is visible,
you can expect any useful help from others.

Hint: _everybody_ started as a newcomer to Perl, except for those who
conceived it. As a beginner, writing dysfunctional code is as natural
as the first steps in speech, or in walking, or in life, in general.

4. With your code in front of their eyes, people can/will tell you whether
a) your approach to the underlying data should be changed, b) your
loop construct exits without executing anything, c) your regular expression
catches what it is told to catch, however the instructions were incorrect,
d) you forgot to use the "use warnings;" pragma (pick one or more).

5. Communication style here is terse, but: "terse" ne "hostile". There is
no need to resort to insult if somebody just tried to tell you what you
should do to get help.

6. Take your time before answering, and re-read your postings before hitting
the send key.

Oliver.
 

Tad McClellan

Seymore said:
I'm a perl newbie trying to write a 'scraping' function


If you show us your broken "try", we could help you fix it...

I'd like to pass it a URL, a startPoint string, and an endPoint string.
I'd like it to socket to the web page, read in the source html to the


perldoc -q HTML

How do I fetch an HTML file?

FIRST occurrence of startPoint, then read from there to the first (next)
occurrence of endPoint that follows that startPoint.


That isn't how the web works.

You must read in the entire response first, then you can trim
out what you don't need:

$html =~ s/.*?(startPoint.*?endPoint).*/$1/s;
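
The read-everything-then-trim idea can be wrapped in a small function. What follows is only a sketch under stated assumptions: the names extract_between and extract_all are hypothetical, both markers are treated as literal text rather than wildcards, and the page source is assumed to be already in a string (for example from get($url) in LWP::Simple, which returns undef on failure).

```perl
use strict;
use warnings;

# Hypothetical helper: return the text from the first occurrence of $start
# through the first occurrence of $end that follows it, or undef if the
# page contains no such span.
sub extract_between {
    my ($html, $start, $end) = @_;
    # \Q...\E quotes regex metacharacters so both markers match literally.
    # .*? is non-greedy, and /s lets "." match newlines as well.
    return $html =~ /(\Q$start\E.*?\Q$end\E)/s ? $1 : undef;
}

# Hypothetical variant: return every non-overlapping span as a list.
sub extract_all {
    my ($html, $start, $end) = @_;
    my @spans = $html =~ /(\Q$start\E.*?\Q$end\E)/sg;
    return @spans;
}

# Sample input standing in for a fetched page.
my $html = "<p>skip</p><b>first</b>\n<b>second</b>";

my $span = extract_between($html, '<b>', '</b>');
print defined $span ? "$span\n" : "no match\n";

print scalar(extract_all($html, '<b>', '</b>')), " spans found\n";
```

Dropping the \Q...\E quoting would let the caller pass regex patterns instead of literal strings, which is one way to get the "wildcards" mentioned in the original question.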
 

Tad McClellan

[attribution missing. Please compose followups properly.]



That is what I thought too.

Seems natural that if the OP had actually tried, there would be
some broken code to repair, but there was no code...

Wow I didn't realize this was comp.lang.perl.hostile.


It might accurately be called comp.lang.perl.help_me_write_my_program.

You seemed to be treating it as if it was
comp.lang.perl.write_my_program_for_me.

Someone in a friendlier group gave me the pointers I needed


It might have just been a less-astute group, where they were taken
in by your claims of having tried something already. Or, maybe it
is a group where they actually do just write programs to specification.

but thanks for the rudeness anyway.


It would be easy to think that you have not tried coding any of
this up yet, despite claiming that you had.

If that was the case, then it is rude to make false claims.

You reap what you sow.

Now you can go back to kicking
puppies or whatever you usually do.


And you can head off into perpetual invisibility.

So long!
 
