qr and subroutines

WRX

Hi all,
I am writing a program (code at bottom) to clean up bot posts on a web
forum.
To do this, I need to build various URLs with session IDs etc.
This is the first time I have actually used a Perl module, and it's all
really smooth sailing and good fun.

However, I'm really stuck on WHY I have had to write some code in a
particular way to achieve something.

In order to follow the methodology of re-using code, I want a
subroutine that I can pass a "regex" to, in order to build URLs from a
base URL.

Here it is.

my $input_url = 'http://xyz.com.au/forum/index.php';
my $test_url = &build_url( \'admin/index.php?admin=22&sid=' );


sub build_url {
    my $tmp_regex = shift;
    my $tmp_url = $input_url;
    $tmp_url =~ s/index\.php/$$tmp_regex/;
    $tmp_url .= $s_id;
}

OK, so I have created a hard reference to a string (I remember the
"Camel Book" saying one day you might need to). I then dereference it
in the actual regex function itself, and I get what I want. BUT, it
seems a bogus way to do it, and it's not even a regex I'm passing,
which is going to cost me flexibility down the road.

What I wanted to do was something like this.


my $input_url = 'http://xyz.com.au/forum/index.php';
my $test_url = &build_url( qr|index.php/admin/index.php?admin=22&sid=|
);

sub build_url {
    my $tmp_regex = shift;
    my $tmp_url = $input_url;
    $tmp_url =~ s/index\.php/$tmp_regex/;
    $tmp_url .= $s_id;
}

BUT, I get the "regex reference" being dereferenced and expressed as
the regex, including status of switches etc. As follows:

http://xyz.com.au/forum/(?-xism:admin/index.php?admin=22&sid=)21292773ce2e5de93dea126ac35784df

(so, probably I need to build the regex reference beforehand, and I'm
then writing more code which is kind of self defeating, unless the
subroutine was like 20 lines long - but it's not)
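
The behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the poster's program: the variable names `$pattern`, `$good` and `$bad` are made up for illustration. It shows that a `qr//` object keeps working as a pattern on the left side of `s///`, but is turned into its string form (flags included) when it lands in the replacement. Older perls print that form as `(?-xism:...)`, as in the output above; from Perl 5.14 on it appears as `(?^:...)`.

```perl
use strict;
use warnings;

my $url = 'http://xyz.com.au/forum/index.php';

# A compiled pattern used on the pattern side behaves as a regex.
my $pattern = qr/index\.php/;
( my $good = $url ) =~ s{$pattern}{admin/index.php?sid=};

# The replacement side is an ordinary double-quoted string, so a qr//
# object placed there is simply stringified, wrapper and all.
my $replacement = qr{admin/index\.php};
( my $bad = $url ) =~ s{index\.php}{$replacement};

print "$good\n";
print "$bad\n";    # contains "(?^:" or "(?-xism:" depending on the perl version
```

So the wrapper text is not an error in the subroutine call; it is what any compiled regex looks like when treated as a string.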


....AND, what I REALLY wanted to do was pass the ENTIRE "Regex" to the
function in one dirty big long swoop, somehow.

my $input_url = 'http://xyz.com.au/forum/index.php';
my $test_url = &build_url(
s/index\.php/index.php/admin/index.php?admin=22&sid=/ )

sub build_url {
    my $tmp_regex = shift;
    my $tmp_url = $input_url;
    $tmp_url =~ $tmp_regex;
    $tmp_url .= $s_id;
}

Obviously not correct, but why can't I do something like this?
OR can I? Is there a way?
Thanks people in advance.




#!/usr/bin/perl

use strict;
use warnings;

use WWW::Mechanize;
my $mech = WWW::Mechanize->new( autocheck => 1 );

use WWW::Mechanize::Frames;
my $mech_f = WWW::Mechanize::Frames->new();

my $input_url = 'http://xyz.com.au/forum/index.php';
my $name = 'username';
my $password = 'password';
my $button = 'login';

$mech->get( $input_url );

&authenticate;

my $raw_mech = $mech->content;

my $s_id = $raw_mech;
$s_id =~ tr/\n/ /;
$s_id =~ s/^.*admin\/index\.php\?sid=([0-9a-z]*)">.*$/$1/;

my $login_url = $input_url;
$login_url =~ s/index\.php/admin\/index\.php\?sid=/;
$login_url .= $s_id;

my $logout_url = $input_url;
$logout_url =~ s/index\.php/login\.php\?logout=true&sid=/;
$logout_url .= $s_id;

$mech->get( $login_url );

### Forum asks for authentication again to get into admin panel ###
&authenticate;

my $admin_url = $input_url;
$admin_url =~ s/index\.php/admin\/index\.php\?admin=1&sid=/;
$admin_url .= $s_id;

### admin panel is frames :( ###
$mech_f->get( $admin_url );
my @frames = $mech_f->get_frames();

$raw_mech = "";
$raw_mech = $mech->content;

### JUST TESTING RIGHT HERE ###
my $test_url = &build_url( \'admin/index.php?admin=22&sid=' );
print "$test_url\n";

#print $frames[0]->content;
#print $frames[1]->content;

&logout;
exit 0;


#==================================================================#

sub authenticate {
    $mech->set_fields( $name => '' );
    $mech->set_fields( $password => '' );
    $mech->click;
}

sub build_url {
    my $tmp_regex = shift;
    my $tmp_url = $input_url;
    $tmp_url =~ s/index\.php/$$tmp_regex/;
    $tmp_url .= $s_id;
}

sub logout {
    $mech->get( $logout_url );
}
 
Robert 'phaylon' Sedlacek

WRX said:
In order to follow the methodology of re-using code, I want a
subroutine that I can pass a "regex" to, in order to build URLs from a
base URL.

Here it is.

my $input_url = 'http://xyz.com.au/forum/index.php';
my $test_url = &build_url( \'admin/index.php?admin=22&sid=' );


sub build_url {
    my $tmp_regex = shift;
    my $tmp_url = $input_url;
    $tmp_url =~ s/index\.php/$$tmp_regex/;
    $tmp_url .= $s_id;
}

Wait a second. First, don't use & in front of build_url unless you
really know what it is doing. Second: Your $tmp_regex is not a regex,
it's a string. Third: In your function you don't use $tmp_regex as a
regex, but as the replacement.
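
To illustrate the first point, here is a small sketch of what the leading `&` changes; the subs `show_args` and `caller_sub` are hypothetical names used only for this demonstration. Called with `&` and no parentheses (as in `&authenticate;`), a sub receives the caller's current `@_` instead of an empty argument list. With parentheses, as in `&build_url( ... )`, the arguments are passed normally and the `&` only disables prototype checking.

```perl
use strict;
use warnings;

# Returns how many arguments it was given.
sub show_args { return scalar @_ }

sub caller_sub {
    my $with_parens = show_args();    # fresh, empty argument list
    my $with_amp    = &show_args;     # no parens: sees caller_sub's @_
    return ( $with_parens, $with_amp );
}

my ( $plain, $amp ) = caller_sub( 'a', 'b', 'c' );
print "show_args() saw $plain args, &show_args saw $amp args\n";
# prints: show_args() saw 0 args, &show_args saw 3 args
```
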

OK, so I have created a hard reference to a string (I remember the
"Camel Book" saying one day you might need to). I then dereference it
in the actual regex function itself, and I get what I want. BUT, it
seems a bogus way to do it, and it's not even a regex I'm passing,
which is going to cost me flexibility down the road.

What I wanted to do was something like this.


my $input_url = 'http://xyz.com.au/forum/index.php';
my $test_url = &build_url( qr|index.php/admin/index.php?admin=22&sid=|
);

sub build_url {
    my $tmp_regex = shift;
    my $tmp_url = $input_url;
    $tmp_url =~ s/index\.php/$tmp_regex/;
    $tmp_url .= $s_id;
}

BUT, I get the "regex reference" being dereferenced and expressed as
the regex, including status of switches etc. As follows:

http://xyz.com.au/forum/(?-xism:admin/index.php?admin=22&sid=)21292773ce2e5de93dea126ac35784df

Why do you even think you want a regular expression in the second part
of a s///?

...AND, what I REALLY wanted to do was pass the ENTIRE "Regex" to the
function in one dirty big long swoop, somehow.

my $input_url = 'http://xyz.com.au/forum/index.php';
my $test_url = &build_url(
s/index\.php/index.php/admin/index.php?admin=22&sid=/ )

sub build_url {
    my $tmp_regex = shift;
    my $tmp_url = $input_url;
    $tmp_url =~ $tmp_regex;
    $tmp_url .= $s_id;
}

Obviously not correct, but why can't I do something like this?
OR can I? Is there a way?
Thanks people in advance.

The solutions:
- Use a regex where you want a regex, use a string where you want a
string.
- You can pass more than one argument.


<untested>

my $search = qr/index\.php/;
my $replace = 'admin/index.php?admin=22&sid=';

my $new_uri
= rework_uri('http://example.com/index.php', $search, $replace);

sub rework_uri {
    my ($uri, $search, $replace) = @_;
    $uri =~ s/$search/$replace/;
    # $sid (the session id) is assumed to be declared in an enclosing scope
    $uri .= $sid;
    return $uri;
}

</untested>
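
For anyone who wants to run the idea above under `use strict`, here is a self-contained sketch. It is an assumption-laden variant rather than the code as posted: the session id is passed in as a fourth argument (named `$session` here) instead of being read from an outer variable, and the id value is just a placeholder.

```perl
use strict;
use warnings;

# Placeholder session id; in the thread it is scraped from the forum page.
my $sid = '21292773ce2e5de93dea126ac35784df';

sub rework_uri {
    my ( $uri, $search, $replace, $session ) = @_;
    $uri =~ s/$search/$replace/;    # $search is a regex, $replace a plain string
    return $uri . $session;
}

my $new_uri = rework_uri(
    'http://example.com/index.php',
    qr/index\.php/,
    'admin/index.php?admin=22&sid=',
    $sid,
);
print "$new_uri\n";
```

Passing the session id explicitly keeps the sub free of globals, at the cost of one more argument per call.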
 
WRX

On Fri, 12 Jan 2007 14:18:24 +0100, Robert 'phaylon' Sedlacek wrote:

[SNIP]
Wait a second. First, don't use & in front of build_url unless you
really know what it is doing. Second: Your $tmp_regex is not a regex,
it's a string. Third: In your function you don't use $tmp_regex as
regex, but as substitute.

OK, after reading up, I understand not to use "&" when calling a
function in this way, thanks.
I shouldn't have referred to the string as a "regex", but I was really
referring to the third example, which was "regex" and "string
replacement" all in one go. I agree, incorrect in definition.

[SNIP]


The solutions:
- Use a regex where you want a regex, use a string where you want a
string.
- You can pass more than one argument.


<untested>

my $search = qr/index\.php/;
my $replace = 'admin/index.php?admin=22&sid=';

my $new_uri
= rework_uri('http://example.com/index.php', $search, $replace);

sub rework_uri {
    my ($uri, $search, $replace) = @_;
    $uri =~ s/$search/$replace/;
    $uri .= $sid;
    return $uri;
}

</untested>

I agree, and I understand that I can do it this way, but it's STILL "3
lines of code", which is no less than what I had originally - so I
don't obtain an advantage of brevity in re-using code through a
subroutine. I just wanted to do something like that all in one
statement, that passes everything to the subroutine.

Mind you, there is an advantage in doing it that way - by passing
"regex" and "replacement" string, a degree of flexibility is obtained,
so that has to be considered a win. Thanks mate.
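
On the "all in one statement" wish above: `s///` is an operator that runs on the spot, not a value, so writing it directly in an argument list would apply it to `$_` and pass along its return value. One way to hand over a whole substitution as a single argument is to wrap it in an anonymous sub. The sketch below assumes that approach; the session id is a placeholder and this `build_url` is an illustration, not the version used in the program.

```perl
use strict;
use warnings;

my $input_url = 'http://xyz.com.au/forum/index.php';
my $s_id      = '21292773ce2e5de93dea126ac35784df';    # placeholder value

# Takes one argument: a code ref that edits $_ in place.
sub build_url {
    my ($edit) = @_;
    local $_ = $input_url;    # work on a copy, $input_url stays untouched
    $edit->();
    return $_ . $s_id;
}

my $test_url =
    build_url( sub { s{index\.php}{admin/index.php?admin=22&sid=} } );
print "$test_url\n";
```

The caller writes the complete substitution once, and the sub decides which string it is applied to.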


AND consequently, considering your advice, what about this:

my $input_url = 'http://xyz.com.au/forum/index.php';
my $tmp_regex = '';



$tmp_regex = qr/index\.php/;
my $login_url = build_url( $tmp_regex, 'admin/index.php?sid=', 1 );



sub build_url {
    my ( $tmp_regex, $replace_string, $sid_flag ) = @_;
    my $tmp_url = $input_url;
    $tmp_url =~ s/$tmp_regex/$replace_string/;
    $tmp_url .= $s_id if ( $sid_flag == 1 );
    return $tmp_url;
}


So, I get the code in the body of the program down to two lines, each
time I need to perform the action, and further, gain the flexibility
of supplying the qr/regex/ AND the replacement string, and a flag to
add the $s_id or not :)

Does it get any better than this?
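
One possible tightening, offered as a sketch rather than a definitive answer: since every call so far passes the same `qr/index\.php/`, the pattern can become a default, and named arguments let each call state only what differs. The argument names `path`, `match` and `add_sid`, the `faq.php` path and the session id value are all made up for this example.

```perl
use strict;
use warnings;

my $input_url = 'http://xyz.com.au/forum/index.php';
my $s_id      = '21292773ce2e5de93dea126ac35784df';    # placeholder value

# Named arguments; later pairs from @_ override the defaults listed first.
sub build_url {
    my %arg = ( match => qr/index\.php/, add_sid => 1, @_ );
    my ( $match, $path ) = @arg{qw(match path)};
    ( my $url = $input_url ) =~ s/$match/$path/;
    $url .= $s_id if $arg{add_sid};    # plain truth test, no "== 1" needed
    return $url;
}

my $login_url = build_url( path => 'admin/index.php?sid=' );
my $plain_url = build_url( path => 'faq.php', add_sid => 0 );
print "$login_url\n$plain_url\n";
```

Each URL is then built in a single statement, and the temporary `$tmp_regex` variable in the main program is no longer needed.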





#!/usr/bin/perl

use strict;
use warnings;

use WWW::Mechanize;
my $mech = WWW::Mechanize->new( autocheck => 1 );

use WWW::Mechanize::Frames;
my $mech_f = WWW::Mechanize::Frames->new();

my $input_url = 'http://xyz.com.au/forum/index.php';
my $name = 'username';
my $password = 'password';
my $button = 'login';
my $tmp_regex = '';

$mech->get( $input_url );

print_title();
authenticate();

my $raw_mech = $mech->content;

my $s_id = $raw_mech;
$s_id =~ tr/\n/ /;
$s_id =~ s/^.*admin\/index\.php\?sid=([0-9a-z]*)">.*$/$1/;

$tmp_regex = qr/index\.php/;
my $login_url = build_url( $tmp_regex, 'admin/index.php?sid=', 1 );

$tmp_regex = qr/index\.php/;
my $logout_url = build_url( $tmp_regex, 'login.php?logout=true&sid=',
1 );

$mech->get( $login_url );

print_title();
authenticate();

$tmp_regex = qr/index\.php/;
my $admin_url = build_url( $tmp_regex, 'admin/index.php?admin=1&sid=',
1 );

$mech_f->get( $admin_url );

print_title();

my @frames = $mech_f->get_frames();

$raw_mech = "";
$raw_mech = $mech->content;

print "$login_url\n";
print "$logout_url\n";
print "$admin_url\n";

#print $frames[0]->content;
#print $frames[1]->content;

logout();
exit 0;


#==================================================================#

sub authenticate {
    $mech->set_fields( $name => '' );
    $mech->set_fields( $password => '' );
    $mech->click;
}

sub print_title {
    my $raw_mech = $mech->content;
    my $page_title = $raw_mech;
    $page_title =~ tr/\n/ /;
    $page_title =~ s|^.*<title>(.*)</title>.*$|$1|i;
    $page_title =~ s/^/PAGE TITLE: /;
    print "\n$page_title\n";
}

sub build_url {
    my ( $tmp_regex, $replace_string, $sid_flag ) = @_;
    my $tmp_url = $input_url;
    $tmp_url =~ s/$tmp_regex/$replace_string/;
    $tmp_url .= $s_id if ( $sid_flag == 1 );
    return $tmp_url;
}

sub logout {
    $mech->get( $logout_url );
}
 
