convention regarding lexical filehandles

Paul Lalli

Prior to the invention of lexical file handles, I'd always seen the
convention that filehandles should be given in all capital letters:
open FILE, 'file.txt' or die "...";

Now that we have lexical file handles, I'm wondering if there is a
consensus on this convention. Which of the following looks more
'right':

open $file, 'file.txt' or die "...";
or
open $FILE, 'file.txt' or die "...";

Obviously, I know that both work, but I'd like to get opinions on the
Best Practice way to do things.

Thanks for your input,
Paul Lalli
 
phaylon

Paul said:
Obviously, I know that both work, but I'd like to get opinions on the Best
Practice way to do things.

That, in my opinion, depends largely on what you want to do and how your
code looks (like almost everything, except strict, warnings & co.). I used
the uppercase convention in the first place because syntax highlighting for
barewords is often missing. With lexical filehandles (which I like much
more, just to say) it "fits in" better.

Sure, if I'm working on a method with a large number of filehandles to
handle, I prefix them ($fh_csvin, or by category: $csv_infh).

Hope that gives you some ideas.


p
 
Brian McCauley

Paul said:
Prior to the invention of lexical file handles, I'd always seen the
convention that filehandles should be given in all capital letters:
open FILE, 'file.txt' or die "...";

Now that we have lexical file handles, I'm wondering if there is a
consensus on this convention. Which of the following looks more
'right':

open $file, 'file.txt' or die "...";
or
open $FILE, 'file.txt' or die "...";

Neither looks right to me.

open my $file, '<', 'file.txt' or die "...";

You should always declare all variables in the smallest applicable
scope. In the case of a lexical filehandle this means the my()
declaration should be inside the open() statement.
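
A minimal sketch of the scoping point above, assuming a scratch file created with File::Temp (the file and its contents are made up for illustration): the handle declared inside the open() call exists only within the enclosing block, and Perl closes it when the block ends.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, so the example is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "first line\nsecond line\n";
close $tmp or die "Can't close $path: $!";

my $first_line;
{
    # $file is declared right in the open() call and lives only in this block.
    open my $file, '<', $path or die "Can't open $path: $!";
    $first_line = <$file>;
    chomp $first_line;
}   # $file goes out of scope here, and Perl closes the handle implicitly

print "$first_line\n";
```

Outside the block, $file no longer exists, so it cannot be reused by accident further down.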

Conceptually open() takes three arguments (or more/less in pipe mode).

The legacy two argument form of open() that combines the mode and
filename into one argument is a wart on the language and using it is a
dangerous habit that will burn you one day.

As for whether the variable containing the file handle reference should
be uppercase or lowercase I have almost no preference. I, myself,
usually use lowercase.
 
Paul Lalli

Brian McCauley said:
Neither looks right to me.

open my $file, '<', 'file.txt' or die "...";

You should always declare all variables in the smallest applicable
scope. In the case of a lexical filehandle this means the my()
declaration should be inside the open() statement.

Okay, point. Admittedly, I was typing fast and neglected to include the
'my'. Thanks for catching that.

Conceptually open() takes three arguments (or more/less in pipe mode).

The legacy two argument form of open() that combines the mode and
filename into one argument is a wart on the language and using it is a
dangerous habit that will burn you one day.

This one I've never really understood. I'm going to go take a look at
perlopentut, but if you'd care to explain in what way the two-argument
form is dangerous, I'd appreciate it.

Thank you,
Paul Lalli
 
Sherm Pendley

Paul said:
This one I've never really understood. I'm going to go take a look at
perlopentut, but if you'd care to explain in what way the two-argument
form is dangerous, I'd appreciate it.

Think about what might happen if one of your users names a file ">file.txt",
and your app uses "open HANDLE, $filename" to open $filename for reading.
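
A minimal sketch of that scenario, assuming a scratch directory from File::Temp and a made-up file called file.txt. It compares the sizes of the existing file before and after each form of open(); the variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a scratch directory so nothing real gets clobbered.
my $dir    = tempdir(CLEANUP => 1);
my $target = "$dir/file.txt";

# An existing file whose contents we care about.
open my $out, '>', $target or die "Can't create $target: $!";
print {$out} "important data\n";
close $out or die "Can't close $target: $!";
my $size_before = (stat $target)[7];

# The user-supplied name; the leading '>' is meant as part of the name.
my $filename = ">$target";

# Three-argument open: the mode is fixed, so the name is taken literally.
# No file with that literal name exists here, so the open simply fails.
my $three_arg_ok = open(my $in, '<', $filename) ? 1 : 0;
my $size_after_three_arg = (stat $target)[7];

# Two-argument open: Perl reads the leading '>' as a mode and truncates
# the existing file, although the program only wanted to read.
open my $magic, $filename or die "Can't open $filename: $!";
close $magic or die "Can't close $filename: $!";
my $size_after_two_arg = (stat $target)[7];

print "before: $size_before, after 3-arg: $size_after_three_arg, ",
      "after 2-arg: $size_after_two_arg\n";
```

With the three-argument form the worst case is a failed open; with the two-argument form the existing file is emptied.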

sherm--
 
Sherm Pendley

Abigail said:
Well, so what? The user loses his file. Big deal. Had he typed

$ cat >file.txt

he would have lost his file too.

Yes, but that's due to a mistake on the user's part, not cat's. Had the user
correctly typed

$ cat ">file.txt"

instead, "cat" would have behaved correctly. In the case of a Perl script
that uses "open FILE, $filename", there would be *no* correct way for the
user to run the script. The file gets overwritten even if the user
correctly quotes it when he runs the script.

sherm--
 
Tad McClellan

Paul Lalli said:
Prior to the invention of lexical file handles, I'd always seen the
convention that filehandles should be given in all capital letters:
open FILE, 'file.txt' or die "...";

Now that we have lexical file handles, I'm wondering if there is a
consensus on this convention.


IMO, the reason for the convention is to avoid collisions
with built-in functions:

open file, 'file.txt' or die "...";

works fine today, but will break when you update your perl version
and it introduces a brand-new function named file().

In other words, the convention is a consequence of filehandles
being sigil-less, hence confusing the parser when the FH is
a function call rather than a bareword.

Or, maybe it would just confuse the programmer that his untouched
program stopped working when he upgraded perl...



So, you only need upper case to get the parser to apply the
semantic that you intend with filehandles, you don't need
to resort to that trickery with lexical filehandles.
 
Michele Dondi

This one I've never really understood. I'm going to go take a look at
perlopentut, but if you'd care to explain in what way the two-argument
form is dangerous, I'd appreciate it.

If the "name" of the file is under the control of (possibly
untrusted) users, then they can have arbitrary code executed. Not too
long ago a user posted his CGI code here asking how someone had managed
to hack his page: do you wonder how they did it?


Michele
 
Michele Dondi

I find the two arg form of 'open' useful. And no, I don't create file
names that start with '|', '<' or '>', nor filenames that end with a
'|'. Nor do I include leading or trailing spaces in my filenames.

Please note that I'm not one of those '"Perl" eq "CGI"' kinda guys: to
be fair I've hardly done any CGI et similia at all, but what if a CGI
script (is not run in taint mode and) accepts a parameter to be
interpreted as a filename to be opened and someone passes

%7Cecho%20'%3CHTML%3E%0A%3CHEAD%3E%0A%3CTITLE%3EYou%20have%20been%20hacked!%3C%2
FTITLE%3E%0A%3C%2FHEAD%3E%0A%3CBODY%3E%3CH1%3EYou%20have%20been%20hacked!%3C%2FH
1%3E%3C%2FBODY%3E%0A%3C%2FHTML%3E'%20%3Eindex.html%0A

(beware of line wraps)

to it?
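
To see what that parameter amounts to, here is a minimal sketch that only decodes it and inspects the first character; nothing is opened or executed. The variable names are illustrative, and the hand-rolled percent-decoding stands in for whatever the CGI library would do.

```perl
use strict;
use warnings;

# The parameter from the post above, with the line wraps removed.
my $param = join '',
    q{%7Cecho%20'%3CHTML%3E%0A%3CHEAD%3E%0A%3CTITLE%3EYou%20have%20been%20hacked!%3C%2},
    q{FTITLE%3E%0A%3C%2FHEAD%3E%0A%3CBODY%3E%3CH1%3EYou%20have%20been%20hacked!%3C%2FH},
    q{1%3E%3C%2FBODY%3E%0A%3C%2FHTML%3E'%20%3Eindex.html%0A};

# Undo the URL percent-encoding, roughly as a CGI library would before
# handing the value to the script.
(my $decoded = $param) =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;

# A leading '|' is what makes two-argument open() treat the rest as a command.
my $starts_with_pipe = $decoded =~ /^\|/ ? 1 : 0;

print $decoded;
print "leading pipe: ", ($starts_with_pipe ? "yes" : "no"), "\n";
```

The decoded value is a shell command that redirects echo's output into index.html. Passed to a two-argument open() it would be run as a command; passed as the third argument of open(), it is only an odd file name that does not exist.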


Michele
 
Sherm Pendley

Abigail said:
I remain with my opinion that if you put a '>' in your file name,
you're responsible it doesn't get clobbered.

The fact is that '>' is a legal character to use in most file systems. If
your programs can't handle it correctly, your programs are broken.

It's trivially simple to write a program that *does* handle it, and blaming
the user for your own refusal to do so is elitism at its worst. It's the
kind of crap that gives rise to the popular stereotype of the "geek know it
all", and I'll have no part of it.

sherm--
 
Sherm Pendley

Abigail said:
And a - is a valid character in most file systems as well. Yet 'rm -r'
doesn't remove the file. Is rm broken?

No, the user typed the wrong command. "rm -- -r" is the correct command to
use in that instance. That option is specifically mentioned in "man rm",
for the express purpose of supporting files that begin with -.

Why do you keep bringing up mistyped or incorrect commands, when that's not
even relevant? The issue is a program that breaks *when it's being used
correctly*.

I don't understand your reluctance on this issue, honestly. It's not as if
you need to jump through hoops or anything - the difference in programmer
effort between "open FOO, $bar" and "open FOO, '<', $bar" is negligible.

In my view, you've got it completely backwards. The real question is what
possible benefit would you get from *not* using 3-argument open()?

sherm--
 
Sherm Pendley

Abigail said:
Why is it that you accept passing filenames in ways other than
exactly typing the filename to cat or rm, but when you can't do that
with a Perl program, you balk?

I was balking at what I perceived as your attitude. Everything you've said -
until this message, that is - sounded to me like "the user chose a file
name I think is stupid, so I'll punish him for it."

I think I understand your point better now. You're not saying "use magic
open() all the time, users be damned". I think what you're saying is that
it shouldn't be ruled out as a matter of dogma. Well, I'm not suggesting it
should be. I'm simply pointing out that there are potential traps for the
unwary, so it shouldn't be used blindly.

Say you have a diff utility written in Perl. Magical open allows you to
get the difference from the output of two commands:

mydiff 'prog1 |' 'prog2 |'

Okay, point well taken. That's not something I'd thought of - chalk it up to
a failure of imagination. Now that I see it, I have to admit it would be a
*damn* useful thing for an advanced user to have, and magic open() makes it
trivially easy to do.

But by the same token, that same simplicity is also what makes it easy for a
newbie programmer to make some serious mistakes if they use magic open()
without being fully aware of the magic. I rank it in the same class as
calling subs with '&', symbolic references and string evals - indispensable
in some situations, but not something that newbies should get into the
habit of using by default, all the time.

Your audience needs to be considered too. Your example assumes trustworthy
skilled users who can be counted upon to choose appropriate file names and
quote them as needed. In my experience - which I'll freely admit might be
very different from yours - that's asking entirely too much of a typical
end user.

I often write GUI apps aimed at a not very advanced audience. These are the
kind of users who would name a file with their monthly budget in it
"January '05 $$". They'll assign it a leading character of ">" for no other
reason than to make their GUI list it first.

Now, if that user navigates to that file and opens it with an "open file"
dialog, there's no sane reason why the contents of the file should just
vanish - which is what would happen if I simply passed the file as-is to a
magic open().

sherm--
 
Michele Dondi

Are you saying that if you use three arg open, and blindly accept
filenames from third parties to open, you're safe?

Please do not confuse a necessary condition with a sufficient one.
Concisely

A ==> B

does not imply

B ==> A

(it is true, though, that

!B ==> !A

but there are delicate issues in the realm of logic even with this...)

I don't buy this argument. Blindly accepting filenames to open which

Don't buy it, but I've never tried to sell it to you...

are handed to you over CGI is a dumb idea to start with. You will have
to do more than just disabling magic open.

Granted!

Using the two args form of open() leaves a big security hole. Using
the three args form closes _that_ hole. I've never meant to say that
it closes _any_ hole and that one should be content with that and
avoid validating whatever he's passed.

But if you have fixed all other things, you probably still don't want
to use magic open. (But people were able to write safe programs before

This is indeed what I meant.


Michele
 
Michele Dondi

\\ Using the two args form of open() leaves a big security hole.

No, it doesn't have to. People were able to write Perl 5.005 programs
without having security holes.

This is not in contradiction with what I wrote. In the kind of
situation I mentioned the two args form of open() _does_ leave a
security hole[1]. This security hole _must_ be closed _if_ one cares
about security. One way to close this security hole (and a "cheap"
one[2] IMHO) is to use the three args form of open() and _another_ one
is to use some sort of data validation or whatever technique that
those people used to write Perl 5.005 programs with no security holes.

As a side note I would argue that if one duly takes care of data
validation so as to avoid potential risks of the kind we're talking
about, then he's most probably _not_ interested in the advantages that
"magic open" can offer, as tersely shown by you in another post.

As a side note to the side note I would argue that one can restore the
advantages of magic open by some sort of input validation as well. At
which point you may and probably would argue that it's stupid to
reinvent the wheel to do something that Perl can do out of the box.

Still, _logically_ the open() mode is a separate entity from the
filename of the file to be open()ed, and I feel more satisfied if I'm
certain that e.g. a handle that is supposed to be read is open()ed on
something that can be read.

So I can have e.g. (no claim of completeness/reliability/etc.):

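my $mode = '<';
# A trailing '|' means "read from this command". The pipe has to be
# escaped, otherwise it is regex alternation and always matches.
$mode = '-|' if $filename =~ s/\s*\|\s*$//;
open my $fh, $mode, $filename or
die "Can't open/run `$filename': $!\n";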

and be sure that if someone tries this on '>file', then it won't
(successfully, most probably) open() 'file' for writing.
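
The mode selection can also be looked at on its own. A minimal sketch, assuming a hypothetical helper named split_mode() (not part of Perl or of the snippet above) that returns the mode and the cleaned-up target without opening anything:

```perl
use strict;
use warnings;

# Hypothetical helper: decide the open() mode from a user-supplied string.
# Only a trailing '|' is treated as special; everything else is a plain
# file name to be opened for reading.
sub split_mode {
    my ($name) = @_;
    return ('-|', $name) if $name =~ s/\s*\|\s*\z//;
    return ('<',  $name);
}

for my $input ('prog1 |', '>file', 'notes.txt') {
    my ($mode, $target) = split_mode($input);
    print "'$input' -> mode '$mode', target '$target'\n";
}
```

The returned pair would then go straight into a three-argument open(), so '>file' can only ever be looked up as a file of that literal name, for reading.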


[1] In fact you wrote yourself "no, it doesn't have to", rather than
"no, it doesn't". There's a substantial difference.

[2] In the sense that newbies can be safely taught to do so, and with
no other effort a huge source of insecurity (for the cases in which
this may matter) is already done away with. With no implication
whatsoever that they should not care about other possible ones...


Michele
 
