Need help understanding someone else's script

falcon198198

Can someone help me understand part of this script? Personally I
would not have written the script this way, but it is someone else's
system and I need to understand the code. This script is called in a
printing system and renames 2 characters in the file name.

The line I am trying to understand is the translation part. (Again, I
did not write this, so please take it easy on me.)

# These are the translations
$trans{"MO"} = "ST";
$trans{"TU"} = "ST";
$trans{"WE"} = "ST";
$trans{"TH"} = "ST";
$trans{"FR"} = "ST";
$trans{"SA"} = "ST";
$trans{"SU"} = "ST";
$trans{"AD"} = "ST";
$trans{"PP"} = "ST";
$trans{"RL"} = "ST";
$trans{"00"} = "ST";

# Save the input filename
$orig = $_ = shift;

# Run the translation
if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

# Move the file
system("move $orig $_");

# Send to Oman
system("move $_ j:\ink");
 
xhoster

falcon198198 said:
Can someone help me understand part of this script? Personally I
would not have written the script this way, but it is someone else's
system and I need to understand the code. This script is called in a
printing system and renames 2 characters in the file name.

Can you point to the part you don't understand? The whole thing is really
nothing more than the sum of its parts.
# Run the translation
if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

The /g modifiers probably aren't going to do anything unless you have
multi-line strings (because of the .* at the end, there is nothing left
for another match).

For any string that starts with #PP and is at least 12 characters long,
the 11th and 12th characters will be replaced with ST iff that
two-character string is in the hash.

Otherwise, if the string is at least 9 characters long, the 8th and 9th
characters will be replaced with ST (as long as that two-character string
is in the hash).

Otherwise, nothing is changed.
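
The rules above can be sketched as a small function. This is only an illustration: the helper name translate and the sample file names are made up here, while the regexes and the hash keys are the ones from the original script.

```perl
use strict;
use warnings;

# Same keys as the original script; every key maps to "ST".
my %trans = map { $_ => 'ST' } qw(MO TU WE TH FR SA SU AD PP RL 00);

# Hypothetical helper: returns the translated name, or the name unchanged.
sub translate {
    my ($name) = @_;
    if ($name =~ /(#PP)?.{7}(..).*/i and exists $trans{$2}) {
        no warnings 'uninitialized';    # $1 is undef without a #PP prefix
        $name =~ s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/i;
    }
    return $name;
}

for my $name (
    '#PP1234567MO.ps',    # becomes #PP1234567ST.ps (11th and 12th characters)
    '1234567TU.ps',       # becomes 1234567ST.ps    (8th and 9th characters)
    '1234567XX.ps',       # unchanged: XX is not a key in the hash
    'short',              # unchanged: fewer than 9 characters
) {
    printf "%-18s -> %s\n", $name, translate($name);
}
```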

Xho
 
falcon198198

That makes sense based on what the application does, but I do not
understand a couple of parts here.

You mentioned that it starts with #PP, which makes sense, but how you
counted the 12 characters does not.
Which piece tells it to use the 11th and 12th characters if the hash
matches?

Since I did not understand the above part, it is even more confusing how
you got the "otherwise" piece as well.
Can you please break down how to read that statement?
 
xhoster

falcon198198 said:
That makes sense based on what the application does, but I do not
understand a couple of parts here.

Please quote some context when you reply.
You mentioned that it starts with #PP, which makes sense, but how you
counted the 12 characters does not.

if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

(#PP)? = 3 (if present and matchable)
.{7} = 7
(..) = 2
(.*) = 0 or more

minimum total: 3+7+2+0 = 12

If the #PP is not there, or it can't match because there are not enough
characters after the #PP to fill the remaining 7+2 positions, then the
optional part of the "?" kicks in and the match proceeds without it.
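
To see which characters end up in each capture group, a short sketch like the following may help. The helper name captures and the three sample names are invented for illustration; the regex is the first one from the script.

```perl
use strict;
use warnings;

# Hypothetical helper: report what the first regex captures for a name.
sub captures {
    my ($name) = @_;
    return unless $name =~ /(#PP)?.{7}(..).*/i;
    return ((defined($1) ? $1 : '(none)'), $2);
}

for my $name ('#PP1234567MO', '#PP123456MO', '1234567MO') {
    my ($prefix, $pair) = captures($name);
    print "$name: prefix=$prefix pair=$pair\n";
}
# #PP1234567MO: prefix=#PP pair=MO
# #PP123456MO: prefix=(none) pair=56    (only 11 characters, so #PP is given up)
# 1234567MO: prefix=(none) pair=MO
```

The middle case is the backtracking one: with #PP consumed there are only 8 characters left, one short of the 7+2 needed, so the regex falls back to matching without the prefix and the pair is taken from positions 8 and 9.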
Which piece tells it to use the 11th and 12th characters if the hash
matches?

In the first regex, $2 contains the 11th and 12th (or 8th and 9th, if #PP
doesn't apply) characters, because that is what (..) does. So that piece
would be:

and exists($trans{$2})
Since I did not understand the above part, it is even more confusing how
you got the "otherwise" piece as well.
Can you please break down how to read that statement?

If there is a specific part you don't understand, yes. If you don't
understand any of it, I doubt I could do a better job of explaining it than
"perldoc perlre" could do.

Xho
 
jgraber

falcon198198 said:
Can someone help me understand part of this script? Personally I
would not have written the script this way, but it is someone else's
system and I need to understand the code. This script is called in a
printing system and renames 2 characters in the file name.

The line I am trying to understand is the translation part. (Again, I
did not write this, so please take it easy on me.)

# These are the translations
$trans{"MO"} = "ST";
$trans{"TU"} = "ST";
$trans{"WE"} = "ST";
$trans{"TH"} = "ST";
$trans{"FR"} = "ST";
$trans{"SA"} = "ST";
$trans{"SU"} = "ST";
$trans{"AD"} = "ST";
$trans{"PP"} = "ST";
$trans{"RL"} = "ST";
$trans{"00"} = "ST";

# Save the input filename
$orig = $_ = shift;

# Run the translation
if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

These three lines above appear to do the same thing as this
shown in alternation format, if that is easier to understand..
s/^(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1ST/;

If the filename contains one of these two-character codes starting at
position 8 (or at position 11 when the first 3 characters are #PP, in any
case), then those matched characters are changed to "ST".

The g (global) switch is a little misleading, since the (.*) will use up
the rest of the line, so there can be only one s/// at most,
so I left it off.

The i (insensitive to case) switch is a little misleading,
since the test "and exists( $hash{ key })" will only pass if the
key exists, and the keys "MO", "TU" etc are all uppercase,
so I left it off. It does affect the PP, so I put in [pP][pP] to match that.

I almost wrote:
The (#PP)? is a little misleading, since it is optional,
so there are no lines that will be matched or not matched
based only on the (#PP)? criterion, so it might as well not exist.
But after testing, I see there is some interaction between the regexp
and the exists. The regexp matches at the first opportunity,
so there is an implied ^ binding it to the front of the line.
I couldn't find any test cases where the trans characters
were matched other than at positions 8 or 11.

If the name does not match the pattern,
then $orig is still same as $_

# Move the file, even onto itself, if it was not renamed.
system("move $orig $_");

Possibly moving the file ontop of itself,
if it did not match the pattern.

The posting guidelines suggest showing a short but complete script,
but don't give any examples. Here is an example of a test script
to tickle this transformation so you can see what it might match
or not match, and explore for yourself.

use warnings; use strict; no warnings "uninitialized";
my %trans = qw(
    MO ST TU ST WE ST TH ST FR ST SA ST SU ST AD ST PP ST RL ST 00 ST);
while (<DATA>) {
    chomp;
    my $orig = $_;    # = shift;
    if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
        s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
    }
    printf "%-25s%-25s %s\n", $orig, $_, ($orig eq $_ ? 'No' : 'Yes');
}
exit;
__DATA__
_NO____SUCHFILE.txt
1234567MO
1234567TU
1234567we
#PP1234567PP
#PP12345678PP.txt
PPPPPPPPPPP
123#PP1234567PP.txt
#pp1234567PP.txt
#PP#ppppppADSU

Output:
_NO____SUCHFILE.txt      _NO____STCHFILE.txt       Yes
1234567MO                1234567ST                 Yes
1234567TU                1234567ST                 Yes
1234567we                1234567we                 No
#PP1234567PP             #PP1234567ST              Yes
#PP12345678PP.txt        #PP12345678PP.txt         No
PPPPPPPPPPP              PPPPPPPSTPP               Yes
123#PP1234567PP.txt      123#PP1234567PP.txt       No
#pp1234567PP.txt         #pp1234567ST.txt          Yes
#PP#ppppppADSU           #PP#ppppppSTSU            Yes

If the no warnings "uninitialized"; line
is commented out, then this warning is printed
for every matching line that has no #PP leader, for some reason:
Use of uninitialized value in concatenation (.) or string at trans1.pl line 8, <DATA> line 7.
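
The warning most likely comes from $1: when the optional (#PP)? group matches nothing, $1 is left undefined, and the replacement string then interpolates it. A minimal sketch of one way around this, assuming it is acceptable to move the ? inside the capture so that $1 is always a defined (possibly empty) string. The helper name rename_pair is made up here, and the explicit ^ anchor is an added assumption.

```perl
use strict;
use warnings;

my %trans = map { $_ => 'ST' } qw(MO TU WE TH FR SA SU AD PP RL 00);

# Hypothetical helper, not part of the original script.
sub rename_pair {
    my ($name) = @_;
    # ((?:#PP)?) captures "#PP" or the empty string, so $1 is never undef.
    if ($name =~ /^((?:#PP)?).{7}(..)/i and exists $trans{$2}) {
        $name =~ s/^((?:#PP)?)(.{7})(..)/$1$2$trans{$3}/i;
    }
    return $name;
}

print rename_pair('1234567MO'), "\n";       # 1234567ST
print rename_pair('#PP1234567PP'), "\n";    # #PP1234567ST
```

With this form the uninitialized category of warnings does not need to be switched off.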
 
John W. Krahn

falcon198198 said:
# Run the translation
if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

These three lines above appear to do the same thing as this
shown in alternation format, if that is easier to understand..
s/^(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1ST/;

You just lost a part of the original string. And the OP's wasn't anchored so
it may not match correctly.

s/(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1$2ST/;


John
 
Tad McClellan

John W. Krahn said:
falcon198198 said:
# Run the translation
if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

These three lines above appear to do the same thing as this
shown in alternation format, if that is easier to understand..
s/^(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1ST/;

You just lost a part of the original string. And the OP's wasn't anchored so
it may not match correctly.

s/(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1$2ST/;


And now you must remember to update the %trans keys in 2 places
should that ever change.

my $keysRE = join '|', sort {$b cmp $a} keys %trans; # untested
s/(#[pP][pP])?(.{7})($keysRE)/$1$2ST/;
 
Joel Graber

John W. Krahn said:
falcon198198 said:
# Run the translation
if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

These three lines above appear to do the same thing as this
shown in alternation format, if that is easier to understand..
s/^(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1ST/;

You just lost a part of the original string.

s/(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1$2ST/;

Thanks for your correction. That's what I meant.
And the OP's wasn't anchored so it may not match correctly.

As explained by Xho and my additional analysis and examples,
I think they are the same. If you can provide a counter example,
I'm sure it would increase my understanding of regexps.
 
jgraber

Tad McClellan said:
John W. Krahn said:
# Run the translation
if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

These three lines above appear to do the same thing as this
shown in alternation format, if that is easier to understand..
s/^(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1ST/;

You just lost a part of the original string.
And the OP's wasn't anchored so it may not match correctly.
s/(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1$2ST/;

Additional analysis superseding my previous followup to John.
I agree I lost the $2; it should be included.
The OP's was effectively anchored, as shown by the example below,
so the anchor is required to match its behavior.

s/^(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1$2ST/;
And now you must remember to update the %trans keys in 2 places
should that ever change.

my $keysRE = join '|', sort {$b cmp $a} keys %trans; # untested
s/(#[pP][pP])?(.{7})($keysRE)/$1$2ST/;

With a single regexp and single replacement string,
there is no need for the %trans at all, so there is only one place.

If there is some other need for %trans,
then your suggestion is a fine idea, but
Why do you suggest sort?

Apparently I was not clear that the purpose of my regexp
and the volumes of surrounding text was NOT to suggest
a different way to do it, but to address the OP's question,
"Can someone help me understand ...",
by providing similar alternatives that reduce the obfuscation
of /g, /i and the missing anchor in the regexp.

The implied anchor was a surprise to me,
but the anchor is required to match the behavior,
as shown below.

use warnings; use strict; no warnings "uninitialized";
my %trans = qw(
    MO ST TU ST WE ST TH ST FR ST SA ST SU ST AD ST PP ST RL ST 00 ST);
while (<DATA>) {
    chomp;
    my $orig = $_;    # = shift;
    if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
        s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
    }
    my $reg_and_exists = $_;
    $_ = $orig;
    s/(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1$2ST/;
    my $reg_no_anchor = $_;
    $_ = $orig;
    s/^(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1$2ST/;
    my $reg_w_anchor = $_;
    printf "%-25s orig\n%-25s reg and exists\n"
         . "%-25s reg_no_anchor\n%-25s reg_w_anchor\n\n",
        $orig, $reg_and_exists, $reg_no_anchor, $reg_w_anchor;
}
exit;
__DATA__
#PP1234567PP
#PP12345678PP.txt

**** output ****
#PP1234567PP orig
#PP1234567ST reg and exists
#PP1234567ST reg_no_anchor
#PP1234567ST reg_w_anchor

#PP12345678PP.txt orig
#PP12345678PP.txt reg and exists
#PP12345678ST.txt reg_no_anchor *** oops
#PP12345678PP.txt reg_w_anchor
 
Tad McClellan

s/^(#[pP][pP])?(.{7})(MO|TU|WE|TH|FR|SA|SU|AD|PP|RL|00)/$1$2ST/;

And now you must remember to update the %trans keys in 2 places
should that ever change.

my $keysRE = join '|', sort {$b cmp $a} keys %trans; # untested
s/(#[pP][pP])?(.{7})($keysRE)/$1$2ST/;

With a single regexp and single replacement string,
there is no need for the %trans at all, so there is only one place.


Doh! I knee-jerked into thinking of my usual case, where the
hash values (replacement strings) are not all identical as they
are in this OP's case.

If there is some other need for %trans,
then your suggestion is a fine idea, but
Why do you suggest sort?


Once bitten, twice shy. IOW, it avoids a bug that has gotten me before.

The sort is to ensure that prefixes of other strings come
"to the right" of the longer strings with that same prefix.

ie.: so that we match against /Tuesday|Tues/ instead of
against /Tues|Tuesday/ (because you'll never match "Tuesday"
with that 2nd one).

With a literal pattern, you can just pay attention to the order
of the alternatives when you write the pattern, no sort()ing needed.
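
The ordering problem can be reproduced with a small sketch. The keys below are hypothetical, chosen so that one is a prefix of another, and the helper first_match is made up for the demonstration; quotemeta is added as a precaution in case a key ever contains a regex metacharacter.

```perl
use strict;
use warnings;

# Hypothetical keys, chosen so that one is a prefix of another.
my @keys = qw(Tues Tuesday Mon);

# Unsorted: "Tues" is tried first and wins, so "Tuesday" never matches whole.
my $naive = join '|', map { quotemeta($_) } @keys;

# Reverse string order puts "Tuesday" ahead of its prefix "Tues".
my $sorted = join '|', map { quotemeta($_) } sort { $b cmp $a } @keys;

# Hypothetical helper: the text matched by an alternation, or undef.
sub first_match {
    my ($alternation, $text) = @_;
    return $text =~ /($alternation)/ ? $1 : undef;
}

print "$naive matched ",  first_match($naive,  'Tuesday'), "\n";
print "$sorted matched ", first_match($sorted, 'Tuesday'), "\n";
# Tues|Tuesday|Mon matched Tues
# Tuesday|Tues|Mon matched Tuesday
```

For the two-character keys in the OP's hash no key is a prefix of another, so the sort changes nothing there; it only matters once keys of different lengths are mixed.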
 
Anno Siegel

falcon198198 said:
Can someone help me understand part of this script? Personally I
would not have written the script this way, but it is someone else's
system and I need to understand the code. This script is called in a
printing system and renames 2 characters in the file name.

The line I am trying to understand is the translation part. (Again, I
did not write this, so please take it easy on me.)

# These are the translations
$trans{"MO"} = "ST";
$trans{"TU"} = "ST";
$trans{"WE"} = "ST";
$trans{"TH"} = "ST";
$trans{"FR"} = "ST";
$trans{"SA"} = "ST";
$trans{"SU"} = "ST";
$trans{"AD"} = "ST";
$trans{"PP"} = "ST";
$trans{"RL"} = "ST";
$trans{"00"} = "ST";

There's nothing essentially wrong with this, but the hash setup shows your
author is not a very experienced Perl programmer. Besides the mostly
unnecessary quotes around the hash keys (only "00" needs them, because a
bare 00 would be read as the number 0), a small loop would have made it
shorter and less error-prone:

$trans{ $_} = 'ST' for qw( MO TU WE TH FR SA SU AD PP RL 00);
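
A short sketch of that loop next to an equivalent hash-slice form, plus the one case where the quotes in the original setup do matter. The variable names %slice and %bare are made up for the comparison.

```perl
use strict;
use warnings;

# The loop form: one statement instead of eleven assignments.
my %trans;
$trans{$_} = 'ST' for qw(MO TU WE TH FR SA SU AD PP RL 00);

# An equivalent hash-slice form.
my %slice;
@slice{qw(MO TU WE TH FR SA SU AD PP RL 00)} = ('ST') x 11;

# Caveat: an unquoted 00 in a subscript is a number, so the key becomes "0".
my %bare;
$bare{00} = 'ST';

print scalar(keys %trans), " keys, '00' present: ",
    (exists $trans{'00'} ? 'yes' : 'no'), "\n";    # 11 keys, '00' present: yes
print "bare key: ", join(',', keys %bare), "\n";   # bare key: 0
```

qw() always produces strings, so the loop and the slice both keep "00" intact.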

Anno
 
robic0

falcon198198 said:
Can someone help me understand part of this script? Personally I
would not have written the script this way, but it is someone else's
system and I need to understand the code. This script is called in a
printing system and renames 2 characters in the file name.

The line I am trying to understand is the translation part. (Again, I
did not write this, so please take it easy on me.)

# These are the translations
$trans{"MO"} = "ST";
$trans{"TU"} = "ST";
$trans{"WE"} = "ST";
$trans{"TH"} = "ST";
$trans{"FR"} = "ST";
$trans{"SA"} = "ST";
$trans{"SU"} = "ST";
$trans{"AD"} = "ST";
$trans{"PP"} = "ST";
$trans{"RL"} = "ST";
$trans{"00"} = "ST";

# Save the input filename
$orig = $_ = shift;

# Run the translation
if (/(#PP)?.{7}(..).*/ig and exists($trans{$2})) {
s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/ig;
}

# Move the file
system("move $orig $_");

# Send to Oman
system("move $_ j:\ink");

if (/(#PP.{7})(.{2})(.*)/i) {
$_ = $1$trans{$2}$3;
# Move the file
system("move $orig $_");

# Send to Oman
system("move $_ j:\ink");
} else {die "error: $_";}

There is no substitute for knowing what kind of input you expect
the regexp to parse. Without that, you have no legs to stand on...
 
robic0

if (/(#PP.{7})(.{2})(.*)/i) {
$_ = "$1$trans{$2}$3"; # correction: the earlier version was missing the quotes
# Move the file
system("move $orig $_");

# Send to Oman
system("move $_ j:\ink");
} else {die "error: $_";}

There is no substitute for knowing what kind of input you expect
the regexp to parse. Without that, you have no legs to stand on...
 
Dave Weaver

robic0 said:
$_ = "$1$trans{$2}$3";

Your code doesn't do the same as the original.

1. "#PP" isn't optional in your code

2. Your code will throw away all characters before the #PP,
the original preserves them

3. Your code does the translation regardless of whether the
translation table contains a suitable replacement.

Not what I'd call an improvement.
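
The three differences can be seen side by side with a small comparison. This is a sketch under a few assumptions: the sub names original and proposed and the sample names are made up, and proposed returns undef where the posted code would die, so that the loop can keep going.

```perl
use strict;
use warnings;

my %trans = map { $_ => 'ST' } qw(MO TU WE TH FR SA SU AD PP RL 00);

# The original script's logic.
sub original {
    my ($name) = @_;
    if ($name =~ /(#PP)?.{7}(..).*/i and exists $trans{$2}) {
        no warnings 'uninitialized';    # $1 is undef without a #PP prefix
        $name =~ s/(#PP)?(.{7})(..)(.*)/$1$2$trans{$3}$4/i;
    }
    return $name;
}

# The proposed replacement; returns undef where the posted code would die.
sub proposed {
    my ($name) = @_;
    return unless $name =~ /(#PP.{7})(.{2})(.*)/i;
    no warnings 'uninitialized';        # $trans{$2} may not exist
    return "$1$trans{$2}$3";
}

for my $name ('1234567MO', 'x#PP1234567MO.ps', '#PP1234567XX.ps') {
    my $p = proposed($name);
    printf "%-18s original=%-18s proposed=%s\n",
        $name, original($name), (defined($p) ? $p : '(would die)');
}
# 1234567MO          original=1234567ST          proposed=(would die)
# x#PP1234567MO.ps   original=x#PP1234567MO.ps   proposed=#PP1234567ST.ps
# #PP1234567XX.ps    original=#PP1234567XX.ps    proposed=#PP1234567.ps
```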
 
falcon198198

Thanks everyone, all of your examples and suggested reading have helped
me get ahold of this.
I had never used regexes except for simple examples, but now I am seeing
the power. Thanks once again for the help.
 
