Can't it be done?

Steve

Hi,

I have a niggly regex question. I refuse to accept that this can't
be done with a single regex, but I can't see a way at the moment.

Let's say a temperature reading and its date and time are stored
numerically as follows:

13112 22003 31143 42331 53651

Here, the first digit of each "block" always represents the "block
number", and between each block there is always a single whitespace
character.

So this value would represent:

Block 1: 31/12 (dd/mm)
Block 2: 2003 (yyyy)
Block 3: 11:43 (hh:mm)
Block 4: 2331 microseconds (MMMM)
Block 5: 36.51 degrees (TTTT, in units of 0.01 degrees)

So the general form of the string to parse would be:

1ddmm 2yyyy 3hhmm 4MMMM 5TTTT

As a first bash, the following would work OK:

/^
1(\d\d)(\d\d)\s
2(\d\d\d\d)\s
3(\d\d)(\d\d)\s
4(\d\d\d\d)\s
5(\d\d\d\d)
$/x

We could then use:

($dd, $mm, $yyyy, $hh, $min, $MMMM, $TTTT) = ($1, $2, $3, $4, $5, $6, $7);

to assign the values correctly.

Ok, here's the question. In Perl, is it possible to write a single
regular expression that will still match and still maintain the
variable assignments but allow for the absence of 1 or more of the
blocks?

So, for example we may (instead of our original string) get the
following.

13112 22003 31143 53651

or

13112 31143 42331 53651

or even

42331 53651

The problem I see is the fact that we need to use ()'s for quantifier
grouping AND for variable assignment in the same expression. I may
well be missing something BIG though!

Thanks in advance for your help.

Steve
 
Bob Walton

Steve wrote:

....

Try:

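use strict;
use warnings;

while (<DATA>) {
    # Each optional block sits in a non-capturing (?:...)? group, so the
    # capture numbers stay the same whichever blocks are present.
    my ($dd, $mm, $yyyy, $hh, $min, $MMMM, $deg, $degf) = /^
        (?:1(\d\d)(\d\d)\s)?
        (?:2(\d{4})\s)?
        (?:3(\d\d)(\d\d)\s)?
        (?:4(\d{4})\s)?
        (?:5(\d\d)(\d\d))
    $/x;
    {
        # Blocks that were absent leave their variables undef.
        no warnings 'uninitialized';
        print "dd=$dd, mm=$mm, yyyy=$yyyy, hh=$hh, min=$min, ",
              "MMMM=$MMMM, deg=$deg.$degf\n";
    }
}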
__END__
13112 22003 31143 42331 53651
13112 22003 31143 53651
13112 31143 42331 53651
42331 53651
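
A possible variant, not from the thread: the pattern above still requires block 5, and the positional list of variables is easy to get out of step with the captures. The sketch below assumes Perl 5.10 or later (for named captures and `%+`) and makes every block optional. The helper name `parse_blocks` and the field names `mon`, `min`, `usec` and `temp` are illustrative choices only.

```perl
use strict;
use warnings;

# Every block is optional; (?:\s|\z) accepts either the single
# separator or the end of the string after a block.
my $BLOCKS = qr/
    \A
    (?: 1 (?<dd>\d\d) (?<mon>\d\d) (?:\s|\z) )?
    (?: 2 (?<yyyy>\d{4})           (?:\s|\z) )?
    (?: 3 (?<hh>\d\d) (?<min>\d\d) (?:\s|\z) )?
    (?: 4 (?<usec>\d{4})           (?:\s|\z) )?
    (?: 5 (?<temp>\d{4})                     )?
    \z
/x;

# Hypothetical helper: returns a hash ref holding only the fields that
# were present, or undef if the line does not fit the format.
sub parse_blocks {
    my ($line) = @_;
    chomp $line;
    return undef unless $line =~ $BLOCKS;
    my %f = %+;                            # only names that actually captured
    $f{temp} /= 100 if exists $f{temp};    # TTTT is in units of 0.01 degrees
    return \%f;
}

for my $line ('13112 22003 31143 42331 53651', '42331 53651') {
    my $f = parse_blocks($line) or next;
    print join(', ', map { "$_=$f->{$_}" } sort keys %$f), "\n";
}
```

Note that with every block optional an empty line also matches and yields an empty hash, so a caller may want to reject that case separately.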
 
Chief Squawtendrawpet

Steve said:
I have a niggly regex question. I refuse to accept that this can't
be done with a single regex, but I can't see a way at the moment.

Maybe not what you had in mind, but how about a single regex taking
advantage of the /g option?

use strict;
use warnings;
my @block;

for my $datetime (<DATA>) {
    @block = ();
    # Each /g match yields a block number ($1) and its four-digit payload ($2).
    $block[$1] = $2 while $datetime =~ /(\d)(\d{4})/g;

    # Check results.
    print $datetime;
    for (1..5) {
        print " $_ = $block[$_]\n" if defined $block[$_];
    }
}

__END__
13112 22003 31143 42331 53651
13112 22003 31143 53651
13112 31143 42331
42331 53651


Here's the output:

13112 22003 31143 42331 53651
 1 = 3112
 2 = 2003
 3 = 1143
 4 = 2331
 5 = 3651
13112 22003 31143 53651
 1 = 3112
 2 = 2003
 3 = 1143
 5 = 3651
13112 31143 42331
 1 = 3112
 3 = 1143
 4 = 2331
42331 53651
 4 = 2331
 5 = 3651
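
One caveat with the unanchored /g loop is that it silently skips anything it does not recognise. As a sketch only (not part of the original post), the scan can be tied down with `\G` and the `c` modifier so that malformed lines are rejected. The helper name `scan_blocks` and the rule that block numbers must be ascending are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an array ref indexed by block number
# (missing blocks stay undef), or undef if the line is malformed.
sub scan_blocks {
    my ($line) = @_;
    chomp $line;
    my @block;
    my $last = 0;
    pos($line) = 0;
    # \G forces each match to start exactly where the previous one ended.
    while ($line =~ /\G([1-5])(\d{4})(?:\s|\z)/gc) {
        return undef if $1 <= $last;    # repeated or out-of-order block number
        $last = $1;
        $block[$1] = $2;
    }
    # With the c modifier, pos() survives the final failed match,
    # so it shows how far the scan got.
    return undef unless pos($line) == length $line;
    return \@block;
}

my $b = scan_blocks("13112 31143 42331\n");
if ($b) {
    for (1..5) {
        print " $_ = $b->[$_]\n" if defined $b->[$_];
    }
}
```

An empty line passes this check and returns an empty array, which may or may not be what a caller wants.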
 
John Bokma

Chief said:
Steve said:
I have a niggly regex question. I refuse to accept that this can't
be done with a single regex, but I can't see a way at the moment.

Maybe not what you had in mind, but how about a single regex taking
advantage of the /g option?

$^W = 1;
use strict;
my (@block);

for my $datetime (<DATA>){
    @block = ();
    $block[$1] = $2 while $datetime =~ /(\d)(\d{4})/g;

    # Alternative: use a hash instead of the array.
    my %hash = ();
    $hash{$1} = $2 while ...

    # Check results.
    print $datetime;
    for (1..5){
        print " $_ = $block[$_]\n" if defined $block[$_];
    }

    # Hash version of the check above: only the blocks that were seen have keys.
    foreach my $key (sort { $a <=> $b } keys %hash) {
        print " $key = $hash{$key}\n";
    }
}

Not tested.
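
For completeness, here is one way the hash idea might be filled out into something runnable, and taken a step further to recover the named fields the original question asked for. This is a sketch under assumptions: the helper name `decode_line`, the `%fields` table and the field names are all invented for illustration, and the regex is the /g pattern from earlier in the thread narrowed to block numbers 1 to 5.

```perl
use strict;
use warnings;

# Field names per block number; the names are illustrative only.
my %fields = (
    1 => [qw(dd mon)],
    2 => [qw(yyyy)],
    3 => [qw(hh min)],
    4 => [qw(usec)],
    5 => [qw(temp)],
);

# Hypothetical helper: returns a hash ref of named fields for the
# blocks that were present in the line.
sub decode_line {
    my ($line) = @_;
    my %hash;
    $hash{$1} = $2 while $line =~ /([1-5])(\d{4})/g;

    my %out;
    for my $n (sort { $a <=> $b } keys %hash) {
        my @names = @{ $fields{$n} };
        # Two names: split the payload into two 2-digit halves.
        # One name: keep all four digits.
        my @parts = @names == 2 ? unpack('A2 A2', $hash{$n}) : ($hash{$n});
        @out{@names} = @parts;
    }
    return \%out;
}

my $f = decode_line("13112 31143 42331 53651");
print join(', ', map { "$_=$f->{$_}" } sort keys %$f), "\n";
```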
 
