Common file operations

  • Thread starter Shmuel (Seymour J.) Metz

Shmuel (Seymour J.) Metz

on 10/27/2004 said:
I happen to know that the Win32 modules allow you to retrieve the
current drive and current dir separately;

Looking through the source code for several modules I found
Cwd::abs_path; what are the tradeoffs between that and
File::Spec->rel2abs?
Which File::Spec does for you.

I will add a few lines to my test code to verify that it does what I
expect.
In that case you want to use glob *after* rel2abs:
OK.

You may also be interested in File::DosGlob,

It doesn't work with strict.

Thanks.

Whoops: rel2abs has a bug, so I'll use abs_path:

rel2abs(g:tsm\x)=H:/comm/tsm/x

--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to (e-mail address removed)
 

Shmuel (Seymour J.) Metz

on 10/27/2004 said:
Hmmm, maybe I should apologize... but I think you misunderstood the
sense of my cmt: it was intended ironic, sarcastic maybe, but
certainly not offensive. If, inadvertently, it was, then I
apologize.

And I apologize for getting your sex wrong.
What I meant is that, to all intents and purposes, what you (not too
clearly) described seemed a good task for glob().

"the complete path" seems clear enough.
To be precise there's no reason why it shouldn't work.

The documentation doesn't claim that it returns a complete[1] path,
and an experiment shows that it doesn't. Have you seen code that
causes glob to return a complete path?
If for any reason it didn't work for me,
then I'd suspect that I had done something wrong, however
deeply hidden the error may be, rather than blaming glob().

I wasn't "blaming" glob, simply noting that it didn't have the
functionality that I wanted.
minimal example

You've already commented on my sample code in a later article.
D'Oh! I'm sorry to inform you that your perl installation is
broken...

So it would seem, but since I rely on documentation[1] built from the
same POD, it hasn't been an issue for me.
Sorry, but I think that you didn't clearly say what it is that you
want.

1. If I have a partial file name, how do I get the complete path?

The answer seems to be Cwd::abs_path

2. If I have a directory name and a file specification, how do I find
all files in that directory matching the specification. File::Find
and issuing an ls command seem like overkill. I could use readdir
if I don't need a recursive search, but I was hoping for an
equivalent of SysFileTree in OS/2.

There doesn't seem to be a single service that will handle recursion.
File::Find as documented doesn't do the matching. What I'm looking for
is a directory tree-walking function that applies matching criteria
and only returns the files meeting those criteria. If I drop the
requirement for recursion then calling glob with the output of
abs_path would work.
Care to expand?

The term "dead tree" is a slang term for hard copy (printed)
documentation. Much of the information in the online documentation is
available in "Programming Perl" and "Perl in a Nutshell" from
O'Reilly.

[1] Assuming that the input didn't have a complete path.

[2] Specifically, a .inf file for the OS/2 view command.


Shmuel (Seymour J.) Metz

on 10/27/2004 said:
BTW: do you really need -T?

Probably not; I just assumed that it was prudent to make a habit of
it.
Also, with modern perls it's better to
use warnings;

Do you mean instead of the taint flag?
Huh?!? Don't do this! Well, if you really really like...

A. Sinan Unur already commented on it; if it's considered poor Perl
style then I won't do it.

Hmmm, then I'd rather do (something like):
@ARGV == 2 or die "Usage: $0 <wildcard> <file>\n";

ITYM in addition to the existing code.
Not a real issue, but are you sure you want to print all this info?

The entire file is strictly for investigating the actual behavior of
various Perl functions; those data tell me whether the functions do
what I expect. I won't be putting anything like that in production
code, except possibly under the control of a debugging flag.
Hey, and you said you had never heard about regexen...

Correct. Check the spelling ;-)
However... you're just removing possible C<'>s at ^ or $,
right?

Yes. I need to be able to pass command arguments containing * without
the shell interpreting the * as a wildcard. The only other technique
that I came up with was keying the argument with \* instead of *,
which I would have found to be more of a nuisance than the
apostrophes.
Then I suggest you do something like
s/^'//, s/'$// for $dir;

Thanks. In this case efficiency is irrelevant, but if I need to do
something similar inside a loop, is the clear version as fast as the
other?
Oh, this one too, then why not do them both in one run?

It was incremental Q&D code.
Also, if you really *do* still want to print all that info, then
for clarity reasons you may consider a HERE doc instead.
EXPN?

This is not required
Thanks.

You may have used
@dirs=grep -d, glob $dir;

Thanks. Although I'd still need to test that the count was 1.


Bradd W. Szonye

Shmuel said:
[2] OS/2 uses \ as a separator rather than /.

Michele said:
Well, I don't know OS/2, and so I cannot tell about Perl under OS/2,
but under DOS/Windows the directory separator is \ too, however Perl
lets you use / anyway. I *guess* that it may be the same for OS/2.

I suspect that OS/2 uses the same rules as MS-DOS and Windows: While
command-line utilities conventionally use backslashes as path
separators, system calls accept both slashes and backslashes.
 

Michele Dondi

If he doesn't believe me, then why would he believe that the test
output came from my computer? He didn't simply say that he wanted to
see the test code and output, he called me a liar.

It-was-a-joke! I have already explained that my cmt was not intended
to be offensive, and I even apologized in case it inadvertently was.
Would you mind reading the other posts too?!?

I didn't call you a liar. But your claim as it was stated is
unsupported and would require more input, precisely in the form of a
(possibly minimal) example exhibiting the problem.

You said that I made up my mind. It seems to me that the one who has
made up his mind is just you, precisely on the alleged fact that "I
don't believe you". To me you're nothing more than a name for now,
"Shmuel (Seymour J.) Metz", and certainly I don't judge you and mark
you with a label once and forever by a single claim of yours.

Oh, and I apologize for having tried to help you too...


Michele
 

John W. Kennedy

Bradd said:
I suspect that OS/2 uses the same rules as MS-DOS and Windows: While
command-line utilities conventionally use backslashes as path
separators, system calls accept both slashes and backslashes.

Yes. In these matters, OS/2 is pretty much identical to MS-DOS and Windows.

(Part of the confusion appears to be that this discussion is being done
in terms of REXX, IBM's scripting language.)
 

Ben Morrow

Quoth "Shmuel (Seymour J.) Metz said:
There doesn't seem to be a single service that will handle recursion.
File::Find as documented doesn't do the matching. What I'm looking for
is a directory tree-walking function that applies matching criteria
and only returns the files meeting those criteria.

File::Find::Rule.

Ben
 

Ben Morrow

Quoth "Shmuel (Seymour J.) Metz said:
Probably not; I just assumed that it was prudent to make a habit of
it.

Well, it certainly won't do any harm :).
Do you mean instead of the taint flag?

No, instead of -w. warnings is a replacement, that allows fine-grained
lexically-scoped control of warnings.
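
As a rough illustration of what "lexically-scoped" means here, the following is a minimal sketch rather than anything posted in the thread; the variable names and the `__WARN__` handler used to count warnings are assumptions made for the demo.

```perl
#!/usr/bin/perl
use strict;
use warnings;    # replaces -w; applies only to the enclosing lexical scope

my @seen;        # warnings captured for the demo
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $undef;
{
    no warnings 'uninitialized';        # switch off one category, this block only
    my $quiet = "value: " . $undef;     # no warning inside the block
}
my $noisy = "value: " . $undef;         # warns: use of uninitialized value

print scalar(@seen), " warning(s) captured\n";
```

With -w the switch is global, so there is no way to silence a single category for a single block like this.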
ITYM in addition to the existing code.

Yup, he does.
Correct. Check the spelling ;-)

regex is the usual spelling in this group, and regexen the usual plural.
Yes. I need to be able to pass command arguments containing * without
the shell interpreting the * as a wildcard. The only other technique
that I came up with was keying the argument with \* instead of *,
which I would have found to be more of a nuisance than the
apostrophes.

....and your shell doesn't do proper quoting and remove the 's for you,
right?
Thanks. In this case efficiency is irrelevant, but if I need to do
something similar inside a loop, is the clear version as fast as the
other?

Which are you calling the clear version? I (and most Perl programmers)
would call Michele's clearer than yours.

See <<EOF under 'Regexp Quote-like Operators' in perlop.

Ben
 

A. Sinan Unur

1. If I have a partial file name, how do I get the complete path?

The answer seems to be Cwd::abs_path

It seems like you are using "partial name" to mean something other than I
would naturally expect it to mean. abs_path has nothing to do with partial
filenames. It converts a potentially relative path to an absolute path.
However, the file in question is fully identified.

On the other hand, when you say partial, one thinks of a file specification
such as te??.p* or even c:\dir\path\more\te??.p* Both of those patterns are
partial file names.
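
To make the distinction concrete, here is a small sketch using only core modules. It is an illustration under assumptions, not code from the thread: the path '.' and the '*' pattern are arbitrary, and bsd_glob from File::Glob is used in place of the built-in glob only because it does not split its argument on whitespace.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Spec;
use File::Glob qw(bsd_glob);

my $relative = '.';    # a relative path that certainly exists

# rel2abs works on the text of the path: it prepends the current directory.
my $textual = File::Spec->rel2abs($relative);

# abs_path asks the file system, resolving symlinks and ".." on the way.
my $resolved = abs_path($relative);

print "rel2abs:  $textual\n";
print "abs_path: $resolved\n";

# A wildcard specification is a different thing: globbing expands it to names.
my @matches = bsd_glob(File::Spec->catfile($textual, '*'));
print scalar(@matches), " entries match\n";
```

Both of the first two calls start from a name that already identifies one file or directory; only the glob call deals with a pattern.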
2. If I have a directory name and a file specification, how do I find
all files in that directory matching the specification. File::Find
and issuing an ls command seem like overkill. I could use readdir
if I don't need a recursive search, but I was hoping for an
equivalent of SysFileTree in OS/2.

There doesn't seem to be a single service that will handle recursion.
File::Find as documented doesn't do the matching.

See File::Find::Rule

Feel free to search CPAN next time.

Sinan
 

Arndt Jonasson

Ben Morrow said:
regex is the usual spelling in this group, and regexen the usual plural.

[totally off-topic:]

"Regexen" with the same stress pattern as "oxen"? Or with the stress
on the first 'e'? (Of course, "in this group" we only write, not talk,
but one has to pronounce these things occasionally, I suppose.)
 

Michele Dondi

1. If I have a partial file name, how do I get the complete path?

It depends on what you mean with "partial file name"...
The answer seems to be Cwd::abs_path

Indeed, since you want to get the "complete path", this could well be
the right tool. Only it's not clear to me if it *is* what you were
looking for, as you still write "seems".
and issuing an ls command seem like overkill. I could use readdir
if I don't need a recursive search, but I was hoping for an
equivalent of SysFileTree in OS/2.

Unfortunately I'm not very familiar with SysFileTree (I guess it's a
system call, or more probably a system library call, isn't it?) and
neither, I guess, are most other people here. What does it return?
There doesn't seem to be a single service that will handle recursion.
File::Find as documented doesn't do the matching. What I'm looking for

File::Find as documented lets *you* do whatever you like, including
matches. This is why in my first post I asked you if you knew about
regexen:

#!/usr/bin/perl -l

use strict;
use warnings;
use File::Find;

@ARGV=grep { -d or !warn "`$_': no such directory\n" } @ARGV;
die "Usage: $0 <dir> [<dirs>]\n" unless @ARGV;

find sub {
    return unless /metz/i and -f;
    print $File::Find::name;
}, @ARGV;

__END__
is a directory tree-walking function that applies matching criteria
and only returns the files meeting those criteria. If I drop the
requirement for recursion then calling glob with the output of
abs_path would work.

Again, File::Find seems to be the right tool for this. It will let you
walk through one or more directory trees and you can check for
yourself which files match your criteria, as per the simplistic
example above, but probably in a more elaborate way.
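
If what is wanted is a function that returns the matching files instead of printing them, the callback can simply collect names. The following is a hedged sketch: matching_files is a hypothetical helper name, and the regex passed to it is only an example of a matching criterion.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: walk $top recursively and return the plain files
# whose base name matches $pattern (a compiled regex).
sub matching_files {
    my ($top, $pattern) = @_;
    my @found;
    find(sub {
        # Inside the callback $_ is the base name, and File::Find has
        # already changed into the directory that contains it.
        push @found, $File::Find::name if -f $_ and $_ =~ $pattern;
    }, $top);
    return @found;
}

# Example call: every .pl or .pm file below the current directory.
print "$_\n" for matching_files('.', qr/\.p[lm]\z/i);
```

The names come back with $top prepended, so passing an absolute directory yields absolute paths.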
[2] Specifically, a .inf file for the OS/2 view command.
^^^
^^^

What does this footnote refer to?


Michele
 

Michele Dondi

Do you mean instead of the taint flag?

Instead of -w.
ITYM in addition to the existing code.

Yes, but also instead of the cmt explaining what should be in @ARGV.
Thanks. In this case efficiency is irrelevant, but if I need to do
something similar inside a loop, is the clear version as fast as the
other?

IMHO even in a loop efficiency issues related to this kind of thing
would most probably be irrelevant. Experience shows that big
efficiency gains rarely stem from such micro-optimizations.

However I *think* that as the regexen are much simpler, it should be
actually faster. I don't know if the C<for> statement modifier, used
only for topicalization here, adds any overhead.

Well, it's not quite the same (hope it's fair of me to point this out),
but we can at least benchmark the two substitution solutions:


#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw/:all/;

cmpthese 500_000, {
    single => sub {
        my @a=qw/foo 'bar baz' 'foobarbaz'/;
        s/'?([^\']*)'$/$1/ for @a;
    },

    double => sub {
        my @a=qw/foo 'bar baz' 'foobarbaz'/;
        s/^'//, s/'$// for @a;
    }
};

__END__


           Rate single double
single  85763/s     --   -44%
double 154321/s    80%     --


Indicating that "my" solution *may* be 80% faster than "yours". Also,
search 'WARNING' in perldoc perlre. You may be interested in reading
that.
It was incremental Q&D code.

BTW: "Q&D"? (Sorry, I'm not a native English speaker.)

perldoc perlop

However, wrt the code you posted, something like:

print <<"EOF";
BEFORE:
\$dir = $dir
\$file = $file

AFTER:
\$dir = @{[ rmquotes $dir ]}
\$file = @{[ rmquotes $file ]}
EOF

sub rmquotes {
    s/^'//, s/'$// for @_;
    @_;
}


C:\TEMP>perl metz.pl 'foo' 'bar'
BEFORE:
$dir = 'foo'
$file = 'bar'

AFTER:
$dir = foo
$file = bar


Michele
 

Michele Dondi

Which are you calling the clear version? I (and most Perl programmers)
would call Michele's clearer than yours.

I think it's clear enough he means "mine". Only he's concerned about
possible efficiency issues.


Michele
 

Shmuel (Seymour J.) Metz

on 10/29/2004 said:
Yes, but also instead of the cmt explaining what should be in @ARGV.

Actually, that was code that was commented out.
Indicating that "my" solution *may* be 80% faster than "yours".

Okay, I've changed the code to

my $dir=shift;
s/^'//, s/'$// for $dir;
print '$dir =',"$dir\n";
print '@ARGV=(',join(',',@ARGV),")\n";
my @files=@ARGV;
my @dirs=grep -d, glob abs_path($dir);
if $dirs==0 {
die "$dir dosn't match any directory names:\n";
}
elsif $dir>1 {
print "$dir matches multiple directory names:\n";
die join " \n" @dirs, "\n";
}
Also, search 'WARNING' in perldoc perlre.

Are you referring to the issue of $1 versus \1?
BTW: "Q&D"? (Sorry, I'm not a native English speaker.)

Quick and dirty, IOW, a hack.
perldoc perlop

Thanks. It turned out to be easier to find it in the dead tree. The
code in question was strictly for debugging, and the messages are
terse because that's all I need for my purposes; I certainly won't put
them into production code. Is it considered good style to use a
here-doc when the output is only a line or two?
However, wrt the code you posted, something like:

If I were adding snapshot code to a production routine then I'd
probably do something like that, but what I've posted so far was
intended to be thrown out once I had everything nailed down. I agree
that were I to add more verbose diagnostics then the here-doc would be
cleaner. And, in fact, I've used that elsewhere; I just didn't
remember the nomenclature.


Shmuel (Seymour J.) Metz

on 10/28/2004 said:
On the other hand, when you say partial, one thinks of a file
specification such as te??.p* or even c:\dir\path\more\te??.p* Both
of those patterns are partial file names.

That wasn't my intent.
See File::Find::Rule
Thanks.

Feel free to search CPAN next time.

The problem is knowing what to search for.


Shmuel (Seymour J.) Metz

on 10/28/2004 said:
regex is the usual spelling in this group, and regexen the usual
plural.

Ah, so! I should have guessed that by analogy with boxen and vaxen :-(
...and your shell doesn't do proper quoting and remove the 's for
you, right?

Perl on OS/2 doesn't use my shell; it uses one that is more
Unix®-like. Otherwise the user wouldn't have to quote specifications
containing "*". Note that in this case allowing the shell to handle
the file globbing is not an option, because it would do so in the
wrong directories.


Uri Guttman

SM> my $dir=shift;
SM> s/^'//, s/'$// for $dir;
SM> print '$dir =',"$dir\n";
SM> print '@ARGV=(',join(',',@ARGV),")\n";
SM> my @files=@ARGV;
SM> my @dirs=grep -d, glob abs_path($dir);
SM> if $dirs==0 {

what is $dirs? did you even run this under strict? please post real code
and not some made up random text. maybe you meant it to be @dirs which
make sense but why post code that you never ran even to check for syntax
and strictness?

FYI @dirs and $dirs have no relationship

also use a little white space. it won't hurt you a bit and it's free!

uri
 

Michele Dondi

Okay, I've changed the code to

my $dir=shift;
s/^'//, s/'$// for $dir;
print '$dir =',"$dir\n";
print '@ARGV=(',join(',',@ARGV),")\n";

Not that I despise join(), but IMHO it would be more terse having it
take place under the curtain, a la (e.g.)

{
    local $"=',';    # $" is the separator used when an array is interpolated
    print "@ARGV\n";
}
my @files=@ARGV;
my @dirs=grep -d, glob abs_path($dir);
if $dirs==0 {
die "$dir dosn't match any directory names:\n";
}

Huh?!? Are you running Perl6? If not then this has to be:

if ($dirs==0) {
#  ^        ^
    die "$dir dosn't match any directory names:\n";
}

But then it's also @dirs, not $dirs.
elsif $dir>1 {

Still running Perl6, I see... ;-)

Also... well, here $dir at least does exist. But @dirs is again what
you really want, isn't it?
print "$dir matches multiple directory names:\n";
die join " \n" @dirs, "\n";

Is there any good reason you are mixing print() and die() statements?
Are you aware you will be printing to two different fd's?

Also, the last line has an error: join() wants comma separated args,
but in that case you must also add parentheses to avoid the last "\n"
being considered as another argument to join().

Moreover, out of curiosity: why " \n" instead of "\n"?

All in all I'd rewrite the whole fragment as:

@dirs or die "`$dir' doesn't match any directory name\n";
die "`$dir' matches multiple directory names:\n",
map "$_\n", @dirs if @dirs>1;
Are you referring to the issue of $1 versus \1?

No, I suggested that you search for 'WARNING' in perldoc perlre. This
turns out to be:

| WARNING: Once Perl sees that you need one of $&, $`, or $' anywhere in
| the program, it has to provide them for every pattern match. This may
| substantially slow your program. Perl uses the same mechanism to produce
| $1, $2, etc, so you also pay a price for each pattern that contains
| capturing parentheses. (To avoid this cost while retaining the grouping
| behaviour, use the extended regular expression "(?: ... )" instead.) But
| if you never use $&, $` or $', then patterns *without* capturing
| parentheses will not be penalized. So avoid $&, $', and $` if you can,
| but if you can't (and some algorithms really appreciate them), once
| you've used them once, use them at will, because you've already paid the
| price. As of 5.005, $& is not so costly as the other two.
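
For what it's worth, the "(?: ... )" construct mentioned in that passage looks like this in practice. This is just a minimal sketch with a made-up file name; it shows the syntax only and makes no claim about how large the speed difference is.

```perl
use strict;
use warnings;

my $name = "report.PL";    # made-up sample input

# Capturing group: Perl records the matched alternative in $1.
my $capturing = $name =~ /\.(pl|pm)\z/i;

# Non-capturing group: same grouping for the alternation, nothing saved.
my $grouping_only = $name =~ /\.(?:pl|pm)\z/i;

print "capturing matched: ",     ($capturing     ? "yes" : "no"), "\n";
print "non-capturing matched: ", ($grouping_only ? "yes" : "no"), "\n";
```

Both patterns accept the same strings; the second one just doesn't set $1.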
them into production code. Is it considered good style to use a
here-doc when the output is only a line or two?

One or two lines... hmmm, I would say: no. Just a few more and then:
yes. In both cases this is largely a matter of personal tastes.


Michele
 

Shmuel (Seymour J.) Metz

on 11/02/2004 said:
what is $dirs?

Typo. That should have been

if (@dirs==0) {


Shmuel (Seymour J.) Metz

on 11/02/2004 said:
Not that I despise join(), but IMHO it would be more terse having it
take place under the curtain, a la (e.g.)
{
    local $"=',';
    print "@ARGV\n";
}

But would it be as clear?
Huh?!? Are you running Perl6?

You're right; I need parentheses around the comparisons.
Is there any good reason you are mixing print() and die()
statements? Are you aware you will be printing to two different
fd's?

You're right; I need to use a single die().
Also, the last line has an error: join() wants comma separated args,
but in that case you must also add parenthesis to avoid the last
"\n" being considered as another argument to join().
Yes.

Moreover, out of curiosity: why " \n" instead of "\n"?

Typo. I meant

elsif (@dirs>1) {
die "$dir matches multiple directory names:\n",
' ', join("\n ",@dirs),"\n";

Which indents for readability.
All in all I'd rewrite the whole fragment as:
@dirs or die "`$dir' doesn't match any directory name\n";
die "`$dir' matches multiple directory names:\n",
map "$_\n", @dirs if @dirs>1;

What is the reason for mixing styles here, as opposed to putting the
tests on the same ends?

@dirs or die "`$dir' doesn't match any directory name\n";
@dirs<=1 or die "`$dir' matches multiple directory names:\n",
    map " $_\n", @dirs;
No, I suggested you to search 'WARNING' in perldoc perlre.

That's how I got the text I quoted, in

=head1 DESCRIPTION
=head2 Warning on \1 vs $1

It seemed more relevant than the text in

=head1 DESCRIPTION
=head2 Regular Expressions

since I wasn't using $&, $' or $`.
