File::Stream confusion

Paul Lalli

I am attempting to use File::Stream to set $/ to a regular expression
that will match integers or floating point numbers. (This is a
learning exercise only - I am well aware of the various Regexp::Common
abilities to more robustly match numbers). For reasons I have not been
able to diagnose, the stream is not terminating on any integer value -
only on floating points.

Below is my short-but-complete script. I am using the DATA filehandle
below, but have seen the same results when used with an actual
filehandle.

#!/usr/bin/perl
use strict;
use warnings;
use File::Stream;

my $stream = File::Stream->new(*DATA);
local $/ = qr/\d+(?:\.\d+)?/;
#local $/ = qr/\d+\.?\d*/;

while (<$stream>){
    print "Line: '$_'\n";
}

__DATA__
foo 35.3 bar 35 baz

Output as is:
Line: 'foo 35.3'
Line: ' bar 35 baz
'

If I switch the commented line, so that the regular expression is the
less-correct /\d+\.?\d*/, these are the results:
Line: 'foo 35.3'
Line: ' bar 35'
Line: ' baz
'

Can someone help me to understand why the first version does not also
terminate a readline on the 35?

Much obliged,
Paul Lalli
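
One way to check that the separator pattern itself is not at fault is to apply it to the same data as an ordinary string, without File::Stream. The sketch below only approximates record-at-a-time reading: split_records is a hypothetical helper (not part of File::Stream), and it works on an in-memory string rather than a stream.

```perl
use strict;
use warnings;

# Hypothetical helper: cut $data into records that each end with a match
# of $sep, roughly what reading with a regex-valued $/ is meant to do.
sub split_records {
    my ($data, $sep) = @_;
    my @records;
    # \G resumes where the previous match ended; /c keeps pos() when the
    # final attempt fails; /s lets . cross newlines.
    while ($data =~ /\G(.*?$sep)/gcs) {
        push @records, $1;
    }
    # Whatever follows the last separator is the final record.
    my $rest = substr($data, pos($data) || 0);
    push @records, $rest if length $rest;
    return @records;
}

my $sep = qr/\d+(?:\.\d+)?/;
print "Line: '$_'\n" for split_records("foo 35.3 bar 35 baz\n", $sep);
```

With plain regex matching, the pattern ends a record on both 35.3 and 35, so the regex as written is not what stops the stream from terminating on integers.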
 
it_says_BALLS_on_your forehead

Not really sure...what happens if you change the first one to:
local $/ = qr/\d+(?:\.\d+?)/;
 
Matt Garrish

Paul Lalli said:
I am attempting to use File::Stream to set $/ to a regular expression
that will match integers or floating point numbers. (This is a
learning exercise only - I am well aware of the various Regexp::Common
abilities to more robustly match numbers). For reasons I have not been
able to diagnose, the stream is not terminating on any integer value -
only on floating points.

Below is my short-but-complete script. I am using the DATA filehandle
below, but have seen the same results when used with an actual
filehandle.

#!/usr/bin/perl
use strict;
use warnings;
use File::Stream;

my $stream = File::Stream->new(*DATA);
local $/ = qr/\d+(?:\.\d+)?/;
#local $/ = qr/\d+\.?\d*/;

while (<$stream>){
    print "Line: '$_'\n";
}

__DATA__
foo 35.3 bar 35 baz

Output as is:
Line: 'foo 35.3'
Line: ' bar 35 baz
'

I get the same output above regardless of which I use (WinXP, AS Build 813,
File-Stream v. 1.10).

Matt
 
Paul Lalli

Matt said:
I get the same output above regardless of which I use (WinXP, AS Build 813,
File-Stream v. 1.10).

Interesting. The results I'm seeing are for v5.8.7 built for
sun4-solaris-thread-multi, File::Stream 1.11. (Sorry for not
mentioning that in the original post)
 
Paul Lalli

it_says_BALLS_on_your forehead said:
Not really sure...what happens if you change the first one to:
local $/ = qr/\d+(?:\.\d+?)/;

The same un-desired results, which would actually make sense, as it's
now forced to match a period, unlike the original broken example.

Paul Lalli
 
Matt Garrish

Paul Lalli said:
Interesting. The results I'm seeing are for v5.8.7 built for
sun4-solaris-thread-multi, File::Stream 1.11. (Sorry for not
mentioning that in the original post)

Something evidently got broken in the newest release. First output is v.
1.10 and second is 1.11 using your code above:

E:\scripts>perl fstream.pl
Line: 'foo 35.3'
Line: ' bar 35'
Line: ' baz
'

E:\scripts>perl fstream.pl
Line: 'foo 35.3'
Line: ' bar 35 baz
'

Matt
 
A. Sinan Unur

Paul Lalli said:
Interesting. The results I'm seeing are for v5.8.7 built for
sun4-solaris-thread-multi, File::Stream 1.11. (Sorry for not
mentioning that in the original post)

Ditto for AS Perl 5.8.7 with File::Stream 1.11 on Windows.

There has got to be something funny going on in File::Stream->find, but
I cannot see it now.

Sinan
 
smueller

Hi Matt, hi Paul,

Matt said:
Something evidently got broken in the newest release. First output is v.
1.10 and second is 1.11 using your code above:

E:\scripts>perl fstream.pl
Line: 'foo 35.3'
Line: ' bar 35'
Line: ' baz
'

E:\scripts>perl fstream.pl
Line: 'foo 35.3'
Line: ' bar 35 baz
'

I'm the author of the File::Stream module. (I got Paul's RT ticket.
Thanks, by the way.)

I have run the example code with both 1.11 and 1.10 and for me, they
both turn out wrong as demonstrated by Paul. That is not surprising
since the difference between the two releases is marginal: I just added
a check for anchors (^ and $) which don't make sense in the context of
streams. I'll include a diff snippet for reference:

diff -r -u File-Stream-1.10/lib/File/Stream.pm File-Stream-1.11/lib/File/Stream.pm
--- File-Stream-1.10/lib/File/Stream.pm 2003-10-12 14:18:34.000000000 +0200
+++ File-Stream-1.11/lib/File/Stream.pm 2004-03-02 13:50:58.000000000 +0100
[...]
@@ -160,7 +160,13 @@
     my $str = '';
     my $token;
     while ($token = $yp->next()) {
-        $str .= $token->string() .
+        my $tstr = $token->string();
+        if ($tstr eq '^' or $tstr eq '$') {
+            croak "Invalid use of anchors (here: '$tstr') in a ",
+                  "regular expression that will be\n",
+                  "applied to a stream";
+        }
+        $str .= $tstr .
         '(?:\z(?{$End_Of_String++})(?!)|)';
     }
     qr/$str/;
[...]

The code should behave exactly the same unless the if applies. In that
case, we'd be seeing a fatal error.

Hence, I boldly suspect that the problem lies outside of the
File::Stream module. Not in the user code either which is too simple to
be the cause. It might be the dependencies of the module or the Perl
version.

Matt: Can you post the exact module and perl versions you used for the
successful and unsuccessful tests? In particular, I'm interested in the
version of YAPE::Regex and of the perl interpreter and platform you're
using.

For me, it's YAPE::Regex version 3.01 and perl 5.8.6 on linux/x86-64.
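
For anyone else comparing setups, a small script along these lines can report the interpreter and module versions in one go. It is a minimal sketch: module_version is a hypothetical helper, and the two module names are simply the ones discussed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module by name and return its version,
# or a placeholder string if it is missing or declares no version.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return 'not installed' unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

# $^V with the %vd format prints the interpreter version as e.g. 5.8.7.
printf "perl %vd on %s\n", $^V, $^O;
printf "%-12s %s\n", $_, module_version($_) for qw(File::Stream YAPE::Regex);
```

Running it against each installation being compared gives directly comparable lines to paste into a reply.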

Thanks for the help. I'll try to keep an eye on the thread, but please
drop me a copy of any answer to (e-mail address removed)

Best regards,
Steffen Müller
 
smueller

Hi Paul,

since I sent the last comment on the issue in a reply to Matt Garrish,
I have investigated the problem. As suspected, it lies within
YAPE::Regex which I use to parse and modify the regexes. (File::Stream
does some *really* fancy things to make stuff work as expected.)

YAPE::Regex doesn't find the ? after the regex group. Don't ask me why.
Expect a File::Stream release which corrects the problem one way or
another within a week at the most.
I'm currently resolving the issue with the author of YAPE::Regex and
Regexp::Parser (which is likely replacing YAPE::Regex), Jeff Pinyan.

Again, thanks for the bug report.

Steffen Müller
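
To see why a lost ? would produce exactly the reported symptom, the two patterns can be compared directly in plain Perl. This only illustrates the effect, assuming the rebuilt regex behaves as if the quantifier on the group were missing; it does not reproduce what YAPE::Regex does internally, and terminators is a made-up helper name.

```perl
use strict;
use warnings;

# Return every substring of $text that $re would treat as a record terminator.
sub terminators {
    my ($re, $text) = @_;
    return $text =~ /($re)/g;
}

my $text  = "foo 35.3 bar 35 baz";
my @cases = (
    [ 'with ?'    => qr/\d+(?:\.\d+)?/ ],   # the pattern as written
    [ 'without ?' => qr/\d+(?:\.\d+)/  ],   # the same pattern, quantifier dropped
);

for my $case (@cases) {
    my ($label, $re) = @$case;
    printf "%-10s %s\n", $label, join ', ', terminators($re, $text);
}
```

Without the ?, the fractional part becomes mandatory, so only 35.3 can end a record and the integer 35 is skipped.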
 
Paul Lalli

smueller said:
since I sent the last comment on the issue in a reply to Matt Garrish,
I have investigated the problem. As suspected, it lies within
YAPE::Regex which I use to parse and modify the regexes. (File::Stream
does some *really* fancy things to make stuff work as expected.)

YAPE::Regex doesn't find the ? after the regex group. Don't ask me why.
Expect a File::Stream release which corrects the problem one way or
another within a week at the most.
I'm currently resolving the issue with the author of YAPE::Regex and
Regexp::Parser (which is likely replacing YAPE::Regex), Jeff Pinyan.

Steffen,

Thanks very much for the incredibly timely responses. Much
appreciated.

Paul Lalli
 
Matt Garrish

smueller said:
Matt: Can you post the exact module and perl versions you used for the
successful and unsuccessful tests? In particular, I'm interested in the
version of YAPE::Regex and of the perl interpreter and platform you're
using.

Here you go:

<perl>
This is perl, v5.8.7 built for MSWin32-x86-multi-thread
(with 7 registered patches, see perl -V for more detail)

Copyright 1987-2005, Larry Wall

Binary build 813 [148120] provided by ActiveState http://www.ActiveState.com
ActiveState is a division of Sophos.
Built Jun 6 2005 13:36:37
</perl>

<YAPE>
ppm> query YAPE
Querying target 1 (ActivePerl 5.8.7.813)
1. YAPE-Regex [3.01] Yet Another Parser/Extractor for Regular
Expressions
</YAPE>

The successful run was done with v. 1.10 and the unsuccessful with v. 1.11
of File-Stream. The version of Perl and YAPE-Regex was the same for both
tests. I grabbed the source for v. 1.11 directly from CPAN as it wasn't
available using ppm, so no modules were updated or changed between runs
except for yours.

Matt
 
A. Sinan Unur

(e-mail address removed) wrote:
since I sent the last comment on the issue in a reply to Matt Garrish,
I have investigated the problem. As suspected, it lies within
YAPE::Regex which I use to parse and modify the regexes. (File::Stream
does some *really* fancy things to make stuff work as expected.)

YAPE::Regex doesn't find the ? after the regex group. Don't ask me
why. Expect a File::Stream release which corrects the problem one way
or another within a week at the most.
I'm currently resolving the issue with the author of YAPE::Regex and
Regexp::Parser (which is likely replacing YAPE::Regex), Jeff Pinyan.

With File::Stream 2.00 and YAPE::Regex 3.02 (both released today), I get
the expected results with AS Perl 5.8.7 on Windows XP.

So, it looks like the bug has been fixed :)

Sinan
 
it_says_BALLS_on_your_forehead

I have just released a new version of File::Stream. It is available at
http://steffen-mueller.net/modules/File-Stream or soon also on CPAN.
The new version (2.00) requires a new version of YAPE::Regex, which
Jeff Pinyan will release really soon. (Or he might have done so
already.)

Steffen

today i had to employ a regular expression as an input record
separator. i had to download and install several modules, including the
ones mentioned above. just wanted to express my gratitude to Paul ( i
stole your example and used it for my own nefarious purposes ) and to
the folks at CPAN for their incredible dedication to Perl and the
community. i'm extremely impressed by the rapidity of response. thanks!
 
Eric J. Roode

today i had to employ a regular expression as an input record
separator. i had to download and install several modules, including the
ones mentioned above. just wanted to express my gratitude to Paul ( i
stole your example and used it for my own nefarious purposes ) and to
the folks at CPAN for their incredible dedication to Perl and the
community. i'm extremely impressed by the rapidity of response. thanks!

...and they say Open Source code is "unprofessional".

--
Eric
`$=`;$_=\%!;($_)=/(.)/;$==++$|;($.,$/,$,,$\,$",$;,$^,$#,$~,$*,$:,@%)=(
$!=~/(.)(.).(.)(.)(.)(.)..(.)(.)(.)..(.)......(.)/,$"),$=++;$.++;$.++;
$_++;$_++;($_,$\,$,)=($~.$"."$;$/$%[$?]$_$\$,$:$%[$?]",$"&$~,$#,);$,++
;$,++;$^|=$";`$_$\$,$/$:$;$~$*$%[$?]$.$~$*${#}$%[$?]$;$\$"$^$~$*.>&$=`
 
