File::Find calls "wanted" before calling "preprocess"

yary

File::Find always calls "wanted" with "." before calling preprocess
for the first directory. I can work around it, but it doesn't seem
right: the docs say "Your preprocessing function is called after
readdir(), but before the loop that calls the wanted() function."

Or is there a subtle reason why find calls "wanted" with ".", before
letting preprocess change the list of entries, which I should be
appreciating?

A simple program illustrating the order of calls-

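#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Each callback prints one line, so the order of the calls is visible.
sub wanted { print " I found entry $_\n" }
sub pre    { print " Entering $File::Find::dir\n"; @_ }  # entries pass through unchanged
sub post   { print " leaving $File::Find::dir\n" }

find({ wanted => \&wanted, preprocess => \&pre, postprocess => \&post }, '.');
__END__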

in an empty dir, produces:
I found entry .
Entering .
leaving .

I expected "Entering ." to be the first line. Tried this on different
machines with perl 5.8 and 5.10.0 with the same results.

thanks for any elucidation

-y
 
J. Gleixner

You do have access to the source code to see what's going on.

It shows:
...
@filenames = readdir DIR;
closedir(DIR);
@filenames = $pre_process->(@filenames) if $pre_process;

That should make it clear that 'pre_process' can change
@filenames before it does anything with the file names.
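
As a hedged illustration of that point, the sketch below hands find() a
preprocess callback that drops and reorders entries before the "wanted"
loop sees them. The scratch directory, the file names, and the ".bak"
rule are assumptions made up for this example; only find() and its
documented wanted/preprocess options come from File::Find.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch directory holding three files.
my $top = tempdir(CLEANUP => 1);
for my $file (qw(b.txt a.txt junk.bak)) {
    open my $fh, '>', "$top/$file" or die "cannot create $top/$file: $!";
    close $fh;
}

my @seen;   # plain files, in the order 'wanted' receives them

find({
    # Record plain files only; the starting directory arrives as '.'.
    wanted     => sub { push @seen, $_ if -f $_ },
    # Gets the readdir() list for the current directory and returns
    # the filtered, sorted list that File::Find will then walk.
    preprocess => sub {
        my @kept = grep { !/\.bak\z/ } @_;
        return sort @kept;
    },
}, $top);

print "$_\n" for @seen;   # a.txt, then b.txt
```

junk.bak never reaches "wanted" because it is filtered out of the list
before the loop starts.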
 
yary

That part of the source isn't relevant to my question. Thanks for your
attention though. I have looked through the code, and suppose the
question could be re-worded "why does _find_opt call (via _find_dir)
$wanted_callback before calling $pre_process? That contradicts its
documentation for the preprocess argument."
 
Ben Morrow

Quoth yary said:
File::Find always calls "wanted" with "." before calling preprocess
for the first directory. I can work around it, but it doesn't seem
right: the docs say "Your preprocessing function is called after
readdir(), but before the loop that calls the wanted() function."

In this case '.' is not obtained from readdir, but from the argument
list, so you don't get to preprocess it at all. If you wanted to, you
should have done it first :).

A simple program illustrating the order of calls-

#!/usr/bin/perl
use File::Find;
sub wanted { print " I found entry $_\n" }
sub pre { print " Entering $File::Find::dir\n"; @_}
sub post {print " leaving $File::Find::dir\n";}
find ({wanted => \&wanted, preprocess => \&pre, postprocess =>
\&post },'.');
__END__

in an empty dir, produces:

Calls in brackets:

[stat '.']
I found entry .
[readdir '.']
Entering .
[wanted for every entry in '.', that is, none]
leaving .

So preprocess is called between readdir and the 'wanted' loop, as the
docs say.
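
The same order can be checked in a directory that is not empty. This is
only a sketch: the temporary directory and the file a.txt are made up
for the example, and the callbacks record their names in an array
instead of printing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch directory with one file, so that the
# 'wanted' loop has exactly one entry to visit.
my $top = tempdir(CLEANUP => 1);
open my $fh, '>', "$top/a.txt" or die "cannot create $top/a.txt: $!";
close $fh;

my @events;   # callback names in the order File::Find invokes them

find({
    wanted      => sub { push @events, "wanted $_" },
    preprocess  => sub { push @events, 'preprocess'; return @_ },
    postprocess => sub { push @events, 'postprocess' },
}, $top);

print "$_\n" for @events;
# wanted .        <- the argument itself, not a readdir() entry
# preprocess
# wanted a.txt    <- the loop over the entries
# postprocess
```

The first "wanted" call is for the directory named on the argument
list; every later one comes from the list that preprocess returned.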

Ben
 
Randal L. Schwartz

yary> That part of the source isn't relevant to my question. Thanks for your
yary> attention though. I have looked through the code, and suppose the
yary> question could be re-worded "why does _find_opt call (via _find_dir)
yary> $wanted_callback before calling $pre_process? That contradicts its
yary> documentation for the preprocess argument."

"preprocess"
The value should be a code reference. This code refer-
ence is used to preprocess the current directory. The
name of the currently processed directory is in
$File::Find::dir. Your preprocessing function is called
after "readdir()", but before the loop that calls the
"wanted()" function. It is called with a list of
strings (actually file/directory names) and is expected
to return a list of strings. The code can be used to
sort the file/directory names alphabetically, numeri-
cally, or to filter out directory entries based on
their name alone. When follow or follow_fast are in
effect, "preprocess" is a no-op.

Note. Called *after* readdir(). As noted in another thread,
no call to readdir() is necessary to obtain your ".".

Docs trump your understanding. Time to realign your understanding. :)

print "Just another Perl hacker,"; # the original
 
Michele Dondi

yary said:
Or is there a subtle reason why find calls "wanted" with ".", before
letting preprocess change the list of entries, which I should be
appreciating?

Because '.' is passed verbatim by you, not retrieved with readdir().
One way to get what you want would be to patch File::Find so that it
has a getentries() routine, after which preprocess is called.
getentries() would return the list of supplied arguments at the top
level, and readdir()'s output otherwise. Personally I don't see much
need for this additional level of indirection.

Actually, I have a tiny duplicate-file removal utility of mine in which
I use preprocess() to sort the entries, because... well, I like it like
that, BUT not at the top level, so that if I say

rmdups more_important less_important

then files in more_important are given some priority over
less_important's, if you get what I mean...
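
A minimal sketch of that behaviour, assuming a made-up directory layout
(this is not the rmdups utility itself): preprocess sorts what readdir()
returns inside each directory, while the directories named on the
argument list are visited in the order given.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical layout: two top-level directories with two files each.
my $base = tempdir(CLEANUP => 1);
for my $path (qw(more_important/b more_important/a
                 less_important/d less_important/c)) {
    my ($dir) = split m{/}, $path;
    mkdir "$base/$dir" unless -d "$base/$dir";
    open my $fh, '>', "$base/$path" or die "cannot create $base/$path: $!";
    close $fh;
}

my @seen;   # files in visiting order, as paths relative to $base

find({
    wanted     => sub {
        push @seen, substr($File::Find::name, length($base) + 1) if -f $_;
    },
    preprocess => sub { return sort @_ },   # sorts readdir() output only
}, "$base/more_important", "$base/less_important");

print "$_\n" for @seen;
# more_important/a
# more_important/b
# less_important/c
# less_important/d
```

more_important still comes first even though it sorts after
less_important, because the argument list never goes through
preprocess.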


Michele
 
