URL detection follow-up

"Dandy" Randy · Sep 10, 2003

Hi,

As per my previous posts, I am searching for a way to open a text file that
contains a few paragraphs of text, locate web URLs and replace them with
the needed HTML tags such as <a href & </a> etc. Most of the responses
suggested using a Perl module ... URI::Find and similar methods.
Unfortunately, the hosting company I run my scripts from does not have this
module installed, and they are not prepared to install it just for my needs.
I also cannot change hosting companies.

So ... is there ANY other way my goal can be accomplished given that I cannot
use URI::Find? Here is an example of theoretical code:

#!/usr/bin/perl

# get text file data
open (TEXT, "<data/data.txt") or die "Can't open file: $!";
@data=<TEXT>;
close(TEXT);

# find web URLs and replace occurrences with needed HTML tags
# (placeholder only, this is the step I need help with)
# scan @data > replace;

# write the changed data back to the text file
open (TEXT, ">data/data.txt") or die "Can't open file: $!";
print TEXT @data;
close(TEXT);

Please, it is very important to me to find this solution. If you have any
ideas, post back. Working code examples are very welcome. Thanks everyone!

Randy

A. Sinan Unur · Sep 10, 2003

"Dandy" Randy said:
Unfortunately, the hosting company I run my scripts from does
not have this module installed, and they are not prepared to install
it just for my needs. I also cannot change hosting companies.

perldoc -q lib
Found in C:\Perl\lib\pod\perlfaq8.pod

How do I keep my own module/library directory?

When you build modules, use the PREFIX option when generating
Makefiles:

perl Makefile.PL PREFIX=/u/mydir/perl

then either set the PERL5LIB environment variable before you run
scripts that use the modules/libraries (see perlrun) or say

use lib '/u/mydir/perl';

This is almost the same as

BEGIN {
    unshift(@INC, '/u/mydir/perl');
}

except that the lib module checks for machine-dependent
subdirectories. See Perl's lib for more information.
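
As a rough sketch of how that FAQ entry applies here: assuming URI::Find has been built or unpacked into a private directory, a script can add that directory to @INC and then check whether the module loads. The `lib` directory next to the script is a made-up location, and `$have_uri_find` is just an illustrative variable name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# FindBin is a core module; $Bin is the directory holding this script.
use FindBin qw($Bin);

# Hypothetical private module directory: <script dir>/lib.
# Change this to wherever the modules were actually installed.
use lib "$Bin/lib";

# Try to load URI::Find without dying if it is still missing.
my $have_uri_find = eval { require URI::Find; 1 } ? 1 : 0;

print $have_uri_find
    ? "URI::Find loaded\n"
    : "URI::Find not found in: @INC\n";
```

If the module loads, the rest of the script can use it as if the hosting company had installed it system-wide.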

Brian Wakem · Sep 10, 2003

If they are all like http://www.domain.com/dir/file.html then you could do
something like -

foreach (@data) {
    s!(http://.*?)(?:\s|$)!<a href="$1">$1</a>!gi;
}

Not perfect, but it'll get you started.
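
For anyone wanting to try the substitution in isolation before wiring it into the read/write script, a minimal sketch follows. The `linkify_urls` helper name and the sample lines are invented for illustration; in the real script @data would hold the lines read from data/data.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# linkify_urls is a made-up helper name. It applies the substitution
# above to a copy of each line and returns the modified lines.
sub linkify_urls {
    my @lines = @_;
    for (@lines) {
        s!(http://.*?)(?:\s|$)!<a href="$1">$1</a>!gi;
    }
    return @lines;
}

# Sample input standing in for the contents of data/data.txt.
my @data = (
    "See http://www.domain.com/dir/file.html for details\n",
    "No links on this line\n",
);

print linkify_urls(@data);
```

Working on a copy keeps the original lines around, which makes it easier to compare before and after while experimenting with the pattern.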

"Dandy" Randy · Sep 10, 2003

Awesome ... works great ... now ... can you formulate a replacement command
that will take an email address and add the <a href="mailto: commands so
that email addresses will also become linked? You've been a great help!

Randy

Brian Wakem · Sep 10, 2003

"Dandy" Randy said:
Awesome ... works great ... now ... can you formulate a replacement command
that will take an email address and add the <a href="mailto: commands so
that email addresses will also become linked? You've been a great help!

Randy

Nice example of top posting.

To match email addresses perfectly every time is probably impossible, but a
simple and effective way of matching 99%+ would be:-

foreach (@data) {
    s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;
}
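
One place this pattern is likely to misfire is an address at the end of a sentence: the character class includes `.`, so a trailing full stop ends up inside the link. A possible tweak, offered as a sketch rather than a complete fix, is to require the domain part to end in a word character. The `linkify_emails` name and the sample sentence are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# linkify_emails is a hypothetical helper. Compared with the pattern
# above, the domain part must end in a word character (\w), so a
# trailing "." or "-" is left outside the link.
sub linkify_emails {
    my ($text) = @_;
    $text =~ s!([-\w.]+\@[-\w.]*\w)!<a href="mailto:$1">$1</a>!g;
    return $text;
}

print linkify_emails("Write to bob\@example.com.\n");
```

This still accepts plenty of strings that are not valid addresses; it only trims one common source of bad links.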

"Dandy" Randy · Sep 10, 2003

Brian Wakem said:
To match email addresses perfectly every time is probably impossible, but a
simple and effective way of matching 99%+ would be:-

foreach (@data) {
    s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;
}

Brian, thanks again, that one worked too. Here is what I'm now using that
seems to work correctly:

$contents=~ s/http:\/\///g;
$contents=~ s!(www.*?)(?:\s|$)!<a href="http://$1">$1</a> !gi;
$contents=~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a> !g;

The first line strips the http:// in case the text contained a full URL;
I then adjusted your code to start looking for www. You may also notice a
deliberate space after the </a> tags ... this was needed as your code seemed
to kill the trailing space. Owe you one.

Randy

P.S. Sorry about the last top post.

Brian Wakem · Sep 10, 2003

"Dandy" Randy said:
Brian, thanks again, that one worked too. Here is what I'm now using that
seems to work correctly:

$contents=~ s/http:\/\///g;
$contents=~ s!(www.*?)(?:\s|$)!<a href="http://$1">$1</a> !gi;
$contents=~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a> !g;

The first line strips the http:// in case the text contained a full URL;
I then adjusted your code to start looking for www. You may also notice a
deliberate space after the </a> tags ... this was needed as your code seemed
to kill the trailing space. Owe you one.

Yes, it would have swallowed the space. Capturing the whitespace and putting
it back in the replacement should sort that out:

s!(http://.*?)(\s|$)!<a href="$1">$1</a>$2!gi;

I'm glad they worked for you, but it's important to understand why, in case
you need to alter something. It's also important to understand why those
regexes are not perfect and will not work for every URL or email address, and
why, from time to time, they could match things that aren't URLs or email
addresses.
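
To make a couple of those limits concrete, here is a sketch that combines the two substitutions with two small changes: `www\.` escapes the dot so that a word merely starting with "www" is not linked, and a lookahead `(?=\s|$)` checks for the whitespace without consuming it. The `linkify` name and the sample text are assumptions for illustration, and the pattern still ignores https:// and ftp:// links and will include trailing punctuation in a URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# linkify is a hypothetical helper combining the URL and email patterns.
sub linkify {
    my ($text) = @_;

    # Optional http:// prefix, then www followed by a literal dot.
    # The lookahead tests for whitespace or end of string but leaves
    # it in place, so nothing has to be added back afterwards.
    $text =~ s!(?:http://)?(www\..*?)(?=\s|$)!<a href="http://$1">$1</a>!gi;

    # Same email pattern as earlier in the thread.
    $text =~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;

    return $text;
}

print linkify("Visit http://www.example.com/page now or mail bob\@example.com\n");
print linkify("wwwhatever happened stays plain text\n");
```

Testing against a handful of lines like these, including ones that should not be linked, is a quick way to see what a change to either pattern actually does.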
