building generators in Perl

R

Rainer Weikusat

NB: This text is partially motivated by the fairly usesless (in this
particular respect) coroutine, closure and generator articles in
Wikipedia ("Perl supports this *if* you download the following 10E24
modules ...").

In case someone's not familiar with the concept, a 'generator' is
something which, when invoked, returns the 'next value' from some
ordered set of values. An extremly simple-minded example would be
(using the 'state' feature)

sub powers_of_2
{
state $next = 1;
my $cur;

$cur = $next;
$next *= 2;

return $cur;
}

which will return all members of the set of 'powers of 2' from first
to last provided it is called an infinite number of times (or any
finite subset of this set for a finite number of calls). This is
obviously similar to a lazily evaluated sequence (a term I've read
here and there).

This is, of course, not very useful, as contrived examples are wont to
be (and it is also less efficient than calculating the values in a
loop).

The most interesting feature of generators is that they can be
used to arrange independent subroutines in a 'processing pipeline' in
order to perform a multi-step transformation of some set of 'input
data items' to a set of 'output data items' while isolating the code
for each step in its own subroutine. A real-word example I had to deal
with yesterday was in some code supposed to preprocess a set of 'port
numbers' (as in 'TCP' or 'UDP') including port ranges such that the
output set is composed of a set of distinct ports or port ranges which
is equivalent (contains the same ports) to the input set. Further,
while the input set (produced by some other, intermediate processing
step) represented individual ports as 'ranges' with an identical start
and end value, the output set should contain bare numbers for these
cases and 'ranges' (represented as anonymous arrays with two elements)
only when the corresponding input element was actually a range
encompassing more than one port number.

The first implementation of that looked like this:

sub dedupe_ports
{
my ($cur, @in, @out);

@in = sort { $a->[0] <=> $b->[0] } @{$_[0]};
$cur = shift(@in);
for (@in) {
if ($_->[0] <= $cur->[1] + 1) {
$cur->[1] = $_->[1] if $_->[1] > $cur->[1];
next;
}

push(@out, $cur->[1] > $cur->[0] ? $cur : $cur->[1]);

$cur = $_;
}

push(@out, $cur->[1] > $cur->[0] ? $cur : $cur->[1]);
return \@out;
}

But I really didn't like the way the two separate processing steps of
'merging overlapping ranges' and 'collapsing single-port ranges' where
intermingled here, especially as I originally forgot the collapsing step
for the duplicated final push.

One way to deal with that is to use a generator to create a sequence
of merged ranges and apply the 'collapsing' step to the set of output
values produced by that. The subroutine included below creates a
closure which is such a generator:

sub range_merger
{
my (@in, $cur);

@in = sort { $a->[0] <=> $b->[0] } @{$_[0]};
$cur = shift(@in);
return sub {
my ($next, $res);

$next->[1] > $cur->[1] and $cur->[1] = $next->[1]
while ($next = shift(@in)) && $next->[0] <= $cur->[1] + 1;

$res = $cur;
$cur = $next;

return $res;
};
}

Whenever the returned closure is invoked, it will return the next
isolated, distinct range (my English feels somewhat lacking here)
aggregated from the elements of the anonymous array passed as input to
the ranger_merger subroutine, until all elements of that have been
processed. Afterwards, it will return an undefined value. This
generator can now be used by the collapsing subroutine,

sub dedupe_ports
{
my (@out, $merge_ranges, $range);

$merge_ranges = &range_merger;

push(@out, $range->[1] > $range->[0] ? $range : $range->[0])
while $range = $merge_ranges->();
return \@out;
}

in order to calculate the desired set of output values without code
duplication/ special-casing the final element case.
 
R

Rainer Weikusat

bugbear said:

This is a slightly different way to represent a lazily evaluated
infinite sequence (which was not the point of my text) and actually, I
have 'Higher Order Perl' (I assume you'll understand the reference)
sitting on a bookshelf behind me. But that's another book full of
contrived examples and usually, badly written ones as well. Also, at
least judgeing from the index, it doesn't refer to generators at all.
Lastly, I strongly suspect that most people neither own the book nor
know what happened to be published in the Perl Journal sixteen years
ago (taking Wikipedia as measure of common knowledge, this seems to be
a certainty).

Apart from that, what's the point of your posting? This stuff was by
no means new by the time Mr Dominus got paid for writing articles
about it.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,968
Messages
2,570,153
Members
46,699
Latest member
AnneRosen

Latest Threads

Top