eof and nested while (<$fh>) {...}

Greg Bacon · Jun 24, 2004

I was writing code to scan an assembly-language definition of
operational data and produce a report and ended up writing code
that gave me the "there has to be a better way" feeling.

Single parameters are easy to spot, e.g.,

label1 .word 1234ABCDh
label2 .float 3.14159

Most arrays are trivial too:

label3 .word 1, 2, 3

Array specifications can span multiple lines, however. For example:

label4 .float 0.0, 0.5, 1.0
       .float 1.5, 2.0, 2.5

At first, I used a regular expression to feed individual values into
a sub that kept track of the last label grabbed and determined whether
the current value was a new parameter or a continuation of an array.

The code -- and the approach, really -- was unsatisfying, so I
considered a two-pass scan: grab and decompose the chunks and then
coalesce the arrays in a second pass. I made a start in that
direction but didn't like the way it was playing out.
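
For readers who want to see the two-pass idea spelled out, here is a minimal sketch of it. It is not the code from the post: the helper names grab_chunks and coalesce are hypothetical, and it assumes a continuation line is simply a directive line with no label.

```perl
use strict;
use warnings;

# Pass 1 (hypothetical helper): one [label, type, data] chunk per directive
# line. The label is undef when the line has no label (a continuation).
sub grab_chunks {
    my @chunks;
    for my $line (@_) {
        next unless $line =~ /^(?:(\w+)\s+|\s*)\.(word|float)\s+(.+?),?\s*$/;
        push @chunks, [$1, $2, $3];
    }
    return @chunks;
}

# Pass 2 (hypothetical helper): fold unlabeled chunks into the parameter
# that precedes them.
sub coalesce {
    my @params;
    for my $chunk (@_) {
        my ($label, $type, $data) = @$chunk;
        if (defined $label) {
            push @params, [$label, $type, $data];
        }
        elsif (@params) {
            $params[-1][2] .= ", $data";
        }
    }
    return @params;
}

# Usage on a small in-memory sample.
my @sample_lines = (
    "label3 .word 1, 2, 3\n",
    "label4 .float 0.0, 0.5, 1.0\n",
    "       .float 1.5, 2.0, 2.5\n",
);
printf "%s (%s): %s\n", @$_ for coalesce(grab_chunks(@sample_lines));
```

The cost of this shape is that every chunk is held in memory between the passes, which is part of why a single-pass scan is attractive.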

I saw that scanning an entire array would be straightforward too.
I could safely look ahead. Lines without labels continued the
current array, and I could pretend the values were on one line by
appending to the end of what I'd already recognized.

If the lookahead line had a label, I could process what I had and then
C<redo> to process the lookahead line that's already in $_.

Here's a sketch of the code:

while (<$fh>) {
    next unless /^(\w+)\s+\.(word|float)\s+(.+?),?\s*$/;
    my($label,$type,$data) = ($1,$2,$3);

    # look for continued spec
    my $needredo = 0;
    while (<$fh>) {
        if (/^\s*\.(word|float)\s+(.+?),?\s*$/) {
            $data .= ", $2";
        }
        else {
            # lookahead line starts something new; stop and reprocess it
            $needredo = 1;
            last;
        }
    }

    # now $label, $type, and $data comprise an
    # entire parameter
    ...;

    redo if $needredo;
}

That's already kind of klunky, but I also saw that the inner while loop
will exhaust the input, which the outer loop implicitly tests for too. I
tested for C<eof $fh> at each iteration of the inner loop and reset
$needredo if I needed to C<last> out of the inner loop.
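
The eof-guarded version described above might look roughly like the sketch below. This is a reconstruction under assumptions, not the code from the post: the sub name scan_params is hypothetical, the results are collected into a list instead of being processed in place, and the lookahead read is written as an explicit assignment to $_ so that C<eof> is checked before every read.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one [label, type, data] triple per parameter.
sub scan_params {
    my ($fh) = @_;
    my @params;
    local $_;

    LINE: while (<$fh>) {
        next unless /^(\w+)\s+\.(word|float)\s+(.+?),?\s*$/;
        my ($label, $type, $data) = ($1, $2, $3);

        # Look ahead for continuation lines. Testing eof first means the
        # inner read never runs off the end of the input.
        my $needredo = 0;
        until (eof($fh)) {
            $_ = <$fh>;
            if (/^\s*\.(?:word|float)\s+(.+?),?\s*$/) {
                $data .= ", $1";
            }
            else {
                $needredo = 1;    # $_ holds a line the outer loop must see
                last;
            }
        }

        push @params, [$label, $type, $data];
        redo LINE if $needredo;   # re-run the body without reading a new line
    }
    return @params;
}

# Usage with an in-memory filehandle standing in for the real file.
my $sample = join '',
    "label3 .word 1, 2, 3\n",
    "label4 .float 0.0, 0.5, 1.0\n",
    "       .float 1.5, 2.0, 2.5\n";
open my $in, '<', \$sample or die "open: $!";
printf "%s (%s): %s\n", @$_ for scan_params($in);
```

Because C<redo> restarts the loop body without re-evaluating the condition, the lookahead line left in $_ is matched against the label pattern on the next trip through.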

The code now feels very klunky. Is there a more elegant way to code
this scan?

Greg

Steven Kuo · Jun 24, 2004

Greg Bacon said:
[snip]
The code now feels very klunky. Is there a more elegant way to code
this scan?

I don't think nested loops are needed. How about:

#!/usr/local/bin/perl
use strict;
use warnings;

my ($label, $type, $data);

while (<DATA>) {
    if (/^\s*\.(word|float)\s+(.+?),?\s*$/) {
        # continuation line: extend the current array
        $data .= ", $2" if defined $label;
    } elsif (/^(\w+)\s+\.(word|float)\s+(.+?),?\s*$/) {
        do_stuff($label, $type, $data) if defined $label; # previously found label
        ($label, $type, $data) = ($1, $2, $3);
    }
}

do_stuff($label, $type, $data) if defined $label; # flush the last parameter

sub do_stuff {
    my ($label, $type, $data) = @_;
    print "$label, $type, $data\n";
}

__DATA__

label1 .word 1234ABCDh
label2 .float 3.14159

label3 .word 1, 2, 3

label4 .float 0.0, 0.5, 1.0
       .float 1.5, 2.0, 2.5

Anno Siegel · Jun 29, 2004

Greg Bacon said:
I was writing code to scan an assembly-language definition of
operational data and produce a report and ended up writing code
that gave me the "there has to be a better way" feeling.

Single parameters are easy to spot, e.g.,

label1 .word 1234ABCDh
label2 .float 3.14159

Most arrays are trivial too:

label3 .word 1, 2, 3

Array specifications can span multiple lines, however. For example:

label4 .float 0.0, 0.5, 1.0
       .float 1.5, 2.0, 2.5

[snip]

Ah, ye olde continuation line problem. Subtype 2, where you know if a
line *is* a continuation but not if a line *has* a continuation.

Here is one way of doing that. A continuation line is one that starts
with 10 blanks.

my $coll = '';
while ( <DATA> ) {
    chomp;
    if ( substr( $_, 0, 10) =~ /\S/ ) {
        print "$coll\n" if length $coll;
        $coll = $_;
    } else {
        $coll .= $_;
    }
}
print "$coll\n" if length $coll;

This only collects continued lines into one. It would be simple
to add further processing to the loop so that it spits out ready-
to-use records.
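
As one possible reading of "further processing", the sketch below keeps the same collecting loop but turns each collected line into a record before handing it back. The sub names parse_record and read_records and the hash layout are hypothetical, and the 10-blank continuation rule is carried over unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one collected line into a record, or return
# nothing if the line is not a .word/.float definition.
sub parse_record {
    my ($line) = @_;
    my ($label, $type, $rest) = $line =~ /^(\w+)\s+\.(word|float)\s+(.*)$/
        or return;
    # A directive repeated by a continuation line acts as a separator.
    $rest =~ s/\.(?:word|float)\b/,/g;
    my @values = grep { length } split /[\s,]+/, $rest;
    return { label => $label, type => $type, values => \@values };
}

# Same collecting loop as above, emitting records instead of printing.
sub read_records {
    my ($fh) = @_;
    my @records;
    my $coll = '';
    local $_;
    while (<$fh>) {
        chomp;
        if (substr($_, 0, 10) =~ /\S/) {    # not a continuation line
            push @records, parse_record($coll) if length $coll;
            $coll = $_;
        }
        else {
            $coll .= $_;
        }
    }
    push @records, parse_record($coll) if length $coll;
    return @records;
}

# Usage with an in-memory filehandle; continuation lines start with 10 blanks.
my $sample = join '',
    "label3    .word 1, 2, 3\n",
    "label4    .float 0.0, 0.5, 1.0\n",
    "          .float 1.5, 2.0, 2.5\n";
open my $in, '<', \$sample or die "open: $!";
for my $rec (read_records($in)) {
    print "$rec->{label} ($rec->{type}): @{ $rec->{values} }\n";
}
```

Splitting the values at this stage means the caller gets an array per label and never has to look at the joined text again.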

Anno
