File names / portability

john

Hi

I am trying to write a portable C program.

However I have this problem. I need to get the user to input a filename
to save the results. But if he shall input a filename not in the form 8.3
then this will not be portable to DOS.

My question is: as DOS is a common target for C, wouldn't it be better if
fopen() only allowed strings of length <11 and in the format 8.3 for
maximum compatibility, and had an undefined behavior if it was given a
string in another format?

Thanks.
 
Seebs

I am trying to write a portable C program.

Good luck with that!

However I have this problem. I need to get the user to input a filename
to save the results. But if he shall input a filename not in the form 8.3
then this will not be portable to DOS.

That's fine.

My question is: as DOS is a common target for C, wouldn't it be better if
fopen() only allowed strings of length <11 and in the format 8.3 for
maximum compatibility, and had an undefined behavior if it was given a
string in another format?

No, that would be stupid. There are other filesystems with other, different,
restrictions on what names may be.

The correct solution is that names are not expected to be portable, but the
behavior of fopen() is -- it will open the file or return a null pointer so
that you know it failed.

Consider how crippling it would be trying to use C if you could never open any
file with a name that was impermissible on any piece of hardware anywhere;
keep in mind that colons can be part of file names on some machines and not
others, some machines allow multiple dots in names, some distinguish case
and some don't... There are no names that are completely portable, but
fopen() isn't built around whether the specific name provided would work
everywhere, but whether a higher level description of its functionality is
available.

-s
 
jacob navia

john wrote:
Hi

I am trying to write a portable C program.

However I have this problem. I need to get the user to input a filename
to save the results. But if he shall input a filename not in the form 8.3
then this will not be portable to DOS.

My question is: as DOS is a common target for C, wouldn't it be better if
fopen() only allowed strings of length <11 and in the format 8.3 for
maximum compatibility, and had an undefined behavior if it was given a
string in another format?

Thanks.

DOS is an obsolete system that is no longer maintained by the company
that created it.

Yes, there are maybe some users but I would like to know what data do
you have that says that DOS is a "common target for C".

So, you propose that all other systems destroy their file systems to
please some die hard DOS users?

So, I can't name a file with:

ApplicationCore.c

but I have to name it

APPLICAT.C

so that it collides with

ApplicationOutput.c

that ALSO will be truncated to APPLICAT.C

The fact that for you this is natural and a matter of course shows only
that you have never really left DOS.

IT IS OK. Stay in there. But PLEEEEZ do not come out of your hole with
proposals like this.

Thanks in advance for your understanding.

jacob
 
Eric Sosman

Hi

I am trying to write a portable C program.

However I have this problem. I need to get the user to input a filename
to save the results. But if he shall input a filename not in the form 8.3
then this will not be portable to DOS.

Why is that a problem? If he runs your program on DOS and
supplies a file name DOS doesn't accept, fopen() will return NULL
and you'll say "Sorry: can't use that file name." On the other
hand, if he runs your program on Linux why should he be limited
to using only the names DOS would accept, and be walled off from
the names Linux can use? Also, if he runs your program on Linux
and provides CON.DAT as the file name, should Linux bend over
backwards to send the output to a console (some console, somewhere)?

*Your* name is non-portable; will you stop using it?

My question is: as DOS is a common target for C, wouldn't it be better if
fopen() only allowed strings of length<11 and in the format 8.3 for
maximum compatibility, and had an undefined behavior if it was given a
string in another format?

No.
 
John Bode

Hi

I am trying to write a portable C program.

However I have this problem. I need to get the user to input a filename
to save the results. But if he shall input a filename not in the form 8.3
then this will not be portable to DOS.

Sure it will. If the user insists on entering a string that isn't a
valid DOS file name, then that's the *user's* problem, not your code's
(modulo properly escaping backslashes in the path name and other
sanity checks; i.e. if the user enters "\a\path\name.txt", you need to
change it into "\\a\\path\\name.txt"). All your code has to do is
pass the string to fopen() and check the result. If the user entered
a bad filename, fopen() should return NULL, and you can then prompt
the user to try again (or cancel the operation).

My question is: as DOS is a common target for C, wouldn't it be better if
fopen() only allowed strings of length <11 and in the format 8.3 for
maximum compatibility, and had an undefined behavior if it was given a
string in another format?

Thanks.

Tying C stdio functions to a specific and obsolete file naming
convention is *not* going to maximize portability. If anything, such
a requirement would only marginalize C on many other platforms; most
users (including programmers) aren't going to willingly use a less-
expressive naming convention if they don't have to, and most
programmers aren't going to be enthusiastic about having to map longer
filenames to the 8.3 format and back.

The C language *shouldn't* care about file naming conventions; all it
cares about is passing a string to the underlying file system and
getting a file handle back. It's up to the underlying system to
determine whether the string represents a valid file name or not, and
to return the appropriate value to fopen().
 
Keith Thompson

John Bode said:
Sure it will. If the user insists on entering a string that isn't a
valid DOS file name, then that's the *user's* problem, not your code's
(modulo properly escaping backslashes in the path name and other
sanity checks; i.e. if the user enters "\a\path\name.txt", you need to
change it into "\\a\\path\\name.txt"). All your code has to do is
pass the string to fopen() and check the result. If the user entered
a bad filename, fopen() should return NULL, and you can then prompt
the user to try again (or cancel the operation).

No, you don't need to escape backslashes in user-entered strings.
Backslashes need to be doubled only in string literals and character
constants.

If you read a string with fgets() (and drop the terminating
'\n'), you can pass that string directly to fopen(). If you try
to double the backslashes first, fopen() will see a string with
doubled backslashes.

As for portability, there are contexts in which you might want to
limit file names to the least common denominator, which might be
the DOS-style 8.3 format. For example, if you're distributing data
files, avoiding longer names can make your files more portable to a
wider variety of systems. More realistically, making sure you don't
have distinct file names that differ in case can be beneficial for
files that might be ported from, say, Unix to Windows.

But restricting the behavior of fopen() itself is not the way
to do this. It would have the very small benefit of warning you
about file names that might not be portable, but the huge drawback
of making it impossible to deal with any such files even when you
don't care whether DOS can support them.
 
Peter Nilsson

john said:
My question is: as DOS is a common target for C, wouldn't
it be better if fopen() only allowed strings of length <11
and in the format 8.3 for maximum compatibility, and had
an undefined behavior if it was given a string in another
format?

No, for reasons cited elsethread.

In any case, DOS has had long file name support since 1994.

% type CON > "this is a long file name.txt"
This is a sentence in a file with a very long name.
^Z

% type lfn.c
#include <stdio.h>

/* Print the contents of each file named on the command line. */
int main(int argc, char **argv)
{
    if (argc)
        while (argv++, --argc)
        {
            static char line[1024];
            FILE *fi = fopen(*argv, "r");
            if (!fi) continue;   /* skip names that cannot be opened */
            while (fgets(line, sizeof line, fi))
                fputs(line, stdout);
            fclose(fi);
        }
    return 0;
}

% acc lfn.c -o lfn.exe

% lfn "this is a long file name.txt"
This is a sentence in a file with a very long name.

% dir /B *.txt
this is a long file name.txt

%

If your C implementation can't handle that, then upgrade,
e.g. DJGPP.
 
Uno

Keith Thompson wrote:

[filenames and dos stuff snipped]

Keith,

I finally have a gcc install on windows that works almost like I want it
to, but I'm missing one detail.

What is that group where they talk about this?

ms-dos.win32.programmer ?? Nothing I try works, and similar looking
groups are abandoned.

I have no place for this forum's name in my head. I apologize for
interrupting the continuity of this thread.

To OP, the C standard says nothing about filenames, so you might want to
find this group as well.
 
robertwessel2

Nevertheless, it is still used by real companies for real commercial
purposes. No doubt the number of installations of MS-DOS is decreasing
over time, but it's still > 0. This is not an argument for bowing C's
knee to DOS file formats, of course - that would be silly for the
reasons Seebs has already pointed out - but it is most certainly an
argument against knee-jerk "DOS is dead" responses. When a programmer,
through no fault of his own, is forced by circumstance to maintain a
system that uses an antiquated OS because of external constraints over
which he has no control, it is no comfort to him to be told "use a
proper OS, you fool" when that's precisely what he wants to do and
precisely what he's prevented from doing in his particular situation.


FWIW, reading between the lines a bit, I kind of got the impression
that the OP was more interested in the many, many storage devices that
are formatted as non-LFN FAT, rather than DOS, per-se. As a practical
matter FAT (particularly in non-LFN form) does define what might be
considered a lowest common denominator data exchange medium.

Of course that doesn't change the fact that the way the volume and
directories are accessed will be different between systems. Or that
this is an issue outside the C standard.
 
Francois Grieu

On 11/05/2010 07:40, Richard Heathfield wrote:
When you transfer a file between two OSen that use
incompatible filename conventions, you will almost certainly use a
utility program to do so, and no doubt the author of that program will
have provided a way for the file to be renamed during the transfer.

In my experience moving files back and forth between OSes, the above is
wildly overoptimistic. Truth is, the author of the transfer program has
declared the job done when transfer of "textfile.txt" and "database.bin"
appeared to work.

In particular, under MacOS (the operating system of the Macintosh before
OS X), a filename can contain any of 255 characters, including '/' '\\'
'*' '?' '\0' '"' '\n' (which is 13 rather than the usual 10), with the
exception of ':' reserved for what other OSes do with '/' or '\\', and a
unique mapping of codes to accented characters, and a limit of 31
chars (less for directories in volumes that support that), and
case-insensitivity for A-Z a-z; that's leaving aside the "file version"
(some silly extension of the file name that never caught on). Simply said,
no transfer utility handles it properly. Further, a file comprises its
data, an optional "resource fork" (think mini-database alongside the
file), and a handful of indispensable attributes, making the whole idea
of transferring such a file an accident waiting to happen.


François Grieu
 
Nobody

I am trying to write a portable C program.

Good Luck With That.

However I have this problem. I need to get the user to input a filename to
save the results. But if he shall input a filename not in the form 8.3
then this will not be portable to DOS.

My question is: as DOS is a common target for C, wouldn't it be better if
fopen() only allowed strings of length <11 and in the format 8.3 for
maximum compatibility, and had an undefined behavior if it was given a
string in another format?

No, of course not. Why would you want the program to refuse to accept a
perfectly valid filename?

The program should simply pass the filename provided by the user
directly to fopen(), which will typically pass it directly to the OS.

Also, requiring strings of length <11 would mean that you couldn't pass a
path, which is probably more common than a simple filename.
 
Uno

Richard said:
MS-DOS: comp.os.msdos.programmer
Win32: comp.os.ms-windows.programmer.win32

The latter is still very active. I am not sure about the former.

Thx, Richard, I posted in the win32 one, which is a great ng.

I'm always interested in what translators sound like when they have to
render text back into the original language:

http://i42.tinypic.com/1zzs1op.png

It would seem that Poles don't know what to do with the word "ghastly."
That's a word we really don't use on this side of the pond either,
despite its descriptive power. Maybe we'll use it to describe an
Atlantic full of crude, dispersants and dying biosphere.

Cheers,
 
BGB / cr88192

jacob navia said:
john wrote:

DOS is an obsolete system that is no longer maintained by the company that
created it.

but, there is still FreeDOS and others...

there might be some merit for using it for certain tasks, but I wouldn't
expect it to be used much for much else.

I suspect the "FitLinx" system (used in the YMCA, at least here...) also
uses this, as I have noted seeing the FreeDOS name (followed by a lot of
typical DOS-looking stuff) in cases where the machines have had to be
rebooted (usually because the position sensors on the exercise machines go
out of calibration).

(basically, the machines are little embedded panels attached to the
exercise machines, powered I think by a power-cube / "AC adapter", and also
generally with an ethernet connection to the wall). when booting one sees
usual BIOS POST stuff, followed by what looks like FreeDOS booting up
(absent the "@echo off" trick in "autoexec.bat", ...).

there are I suspect many other places where DOS (in one form or another), is
still in use, despite it being long-dead as a consumer OS...

Yes, there are maybe some users but I would like to know what data do you
have that says that DOS is a "common target for C".

so is ARM, apparently...

So, you propose that all other systems destroy their file systems to
please some die hard DOS users?

this would suck...

So, I can't name a file with:

ApplicationCore.c

but I have to name it

APPLICAT.C

so that it collides with

ApplicationOutput.c

that ALSO will be truncated to APPLICAT.C

The fact that for you this is natural and a matter of course shows only
that you have never really left DOS.


actually there is an easy solution:
return to good old alphanumeric-soup naming conventions...

"APPLCR1A.C"
"APPLOT2B.C"

there was also the convention used by windows, but it was well advised to
stay clear of this for one's own files...

IT IS OK. Stay in there. But PLEEEEZ do not come out of your hole with
proposals like this.

around in this group, it almost seems reasonable...

"but, someone, somewhere, might be using a system with such-and-such
arbitrary issue or limitation...".

but, to be entirely portable, there is not really a whole lot one can do.
damn near anything all that much more advanced than "Hello World!" risks
running into potential portability issues in one place or another...


or one can even write a potentially non-portable hello world, such as
violating the all important "main does not return void" rule...
 
john

jacob said:
john wrote:

DOS is an obsolete system that is no longer maintained by the company
that created it.

Yes, there are maybe some users but I would like to know what data do
you have that says that DOS is a "common target for C".

So, you propose that all other systems destroy their file systems to
please some die hard DOS users?

So, I can't name a file with:

ApplicationCore.c

but I have to name it

APPLICAT.C

so that it collides with

ApplicationOutput.c

that ALSO will be truncated to APPLICAT.C

I believe you are wrong here - I would say that DOS has never been more
widely used.

Most Linux distributions come with the DOSBox emulator. And Windows
(except Windows NT) is still built on DOS: just select Run from the start
menu and type command.com! Also autoexec.bat and config.sys are still
there from DOS.

The filenames will not have the same truncation, instead you can have
APPLIC~1.C and APPLIC~2.C, a very nice solution available on advanced
FAT32 filesystems that support long filenames.

Personally even when long filenames are available I tend to stick to the
8.3 convention to make transferring files between systems easier, and
also because with long filenames directory listings can become very hard
to read.
 
Keith Thompson

john said:
I believe you are wrong here - I would say that DOS has never been more
widely used.

I'm skeptical of that claim, but I have no data to support my skepticism.

Most Linux distributions come with the DOSBox emulator. And Windows
(except Windows NT) is still built on DOS: just select Run from the start
menu and type command.com! Also autoexec.bat and config.sys are still
there from DOS.

As I understand it, all modern Windows operating systems are based
on Windows NT; that includes 2000, XP, Vista, and 7. command.com on
such systems is a DOS emulator. I'm not sure how faithful it is, but
I was just able to create a file named "verylongname.txt" using it.

(Yes, Windows is off-topic, but this is relevant to the question of
whether C should impose an 8.3 file name limit.)

The filenames will not have the same truncation, instead you can have
APPLIC~1.C and APPLIC~2.C, a very nice solution available on advanced
FAT32 filesystems that support long filenames.

Personally even when long filenames are available I tend to stick to the
8.3 convention to make transferring files between systems easier, and
also because with long filenames directory listings can become very hard
to read.

If you want to give your files short names, that's fine. But I really
hope you're not still seriously suggesting that C implementations should
be unable to open files with longer names.

And what about systems with even more restrictive file names than DOS?
What about path names?

The syntax of the file name string passed to fopen() is determined
entirely by the underlying operating system. (Well, the C
implementation could play some games with it, but I know of none that do
so.)

Suppose I want my C program, running on a Linux system, to open a file
called "longfilename.foobar.data". You're proposing that it should
be unable to do so.

No.
 
Lew Pitcher

john said:
I believe you are wrong here - I would say that DOS has never been more
widely used.

I'm skeptical of that claim, but I have no data to support my skepticism.
Most Linux distributions come with the DOSBox emulator. And Windows
(except Windows NT) is still built on DOS: just select Run from the start
menu and type command.com! Also autoexec.bat and config.sys are still
there from DOS.
[snip]
Suppose I want my C program, running on a Linux system, to open a file
called "longfilename.foobar.data". You're proposing that it should
be unable to do so.

Codifying C to require MSDOS-compatible filenames would eliminate platforms
that currently support C environments. Some environments /cannot/ work with
MSDOS-like filenames, and others would be severely curtailed.

In zOS (and other IBM MVS-born systems), file names consist of multiple
1-to-8-character uppercase alphanumeric sequences, separated by
single 'period' characters. I suppose, if we threw out 40+ years of MVS
application and system development, we /could/ have MVS files which use the
MSDOS 8+3 format. But, that ain't gonna happen. SYS1.MACLIB and
APLP.C.P.BASE0.CPYLIB will still exist, as will
CDAT.A.A2995777.DATA.G0010V00. And I want my C program to be able to read
and write CDAT.A.A2995777.DATA.G0010V00.

Add to this the fact that, in some OS environments, the name of the file (as
recorded on media) is *not* readily accessible to the program, either to
be set or to be tested. My zOS C program will /not/
fopen("CDAT.A.A2995777.DATA.G0010V00","r"), because the OS has no
underlying hooks (at least, none readily available to a
high-level language) to permit the program to specify the file name.
Instead, zOS programs must go through an intermediary name, a "DDNAME" that
binds the program to the file through a language called "JCL" (for "Job
Control Language"). Instead of fopen("CDAT.A.A2995777.DATA.G0010V00","r"),
the OS will only permit me to fopen("DD:CDADATA","r"), with the
DDNAME "CDADATA" redirecting to the
filename "CDAT.A.A2995777.DATA.G0010V00" at a later stage in the
processing.

So, limiting filenames to "8+3" eliminates /this/ environment. (An
environment which has /never/ seen MSDOS and is more powerful than /any/
Microsoft OS).

I agree. No.
 
robertwessel2

I am afraid you are incorrect. ALL Windows systems since Windows NT
3.1 have been based on the NT kernel, not DOS. The last
Windows-over-DOS system was Windows 98SE.


Actually Windows ME, but DOS was pretty well hidden there, but it was
still the basic Win9x architecture.

The key word here is DOS _emulation_. Whether you use cmd.exe or
command.com makes very little difference. The system is still using a
DOS virtual machine. To prove it, invoke command windows via
command.com or cmd.exe and type "ver" you will get the same version
from both systems. Typing "mem" will get you the same virtual memory
profile. All MS-DOS functionality is in the DVM. If not for this
MS-DOS programs executing under Windows would not have access to the
NTFS disks.

As a matter of fact, command.com shells call autoexec.nt and
config.nt, not autoexec.bat and config.sys.


Yep. And the DOS emulation has been in NT since at least 3.51 (I have
no way to check the support in 3.5 or 3.1). It’s only been removed in
the 64 bit versions of Windows (but it’s still present in the 32 bit
version of Win7).

And many people confuse the command prompt with "DOS". The command
prompt supplied by the usual cmd.exe is most definitely not DOS. And
while command.com exists, it only exists as something that's loaded by
the DOS emulator (aka NTVDM).

Truncation only occurs if you have turned on the optional "8.3
filename support" for FAT32 and NTFS partitions. I believe the default
is "on" but it's been a while since I last checked.


Probably more directed at the PP: long file name support is orthogonal
to the version of FAT - you can have long file names on FAT12, FAT16
*and* FAT32. In fact, the typical 3.5 inch (1.44MB) floppy is FAT12,
and it can obviously hold long file names.
 
john

Keith said:
If you want to give your files short names, that's fine. But I really
hope you're not still seriously suggesting that C implementations should
be unable to open files with longer names.

And what about systems with even more restrictive file names than DOS?
What about path names?

The syntax of the file name string passed to fopen() is determined
entirely by the underlying operating system. (Well, the C
implementation could play some games with it, but I know of none that do
so.)

Suppose I want my C program, running on a Linux system, to open a file
called "longfilename.foobar.data". You're proposing that it should be
unable to do so.

No.

I think you are being deliberately obtuse here.

I am proposing no such thing. What I am proposing is that passing a
string to fopen() that is not in 8.3 format should invoke an undefined
behavior (actually, implementation defined behavior would be even better).

A Linux implementation would be free to choose to open any filename you
threw at it. A DOS implementation obviously couldn't do this in the
absence of a LFN support layer.
 
Ben Bacarisse

I think you are being deliberately obtuse here.

That seems very unlikely. For one thing, he is not the only person who
is now confused by your proposal:

I am proposing no such thing. What I am proposing is that passing a
string to fopen() that is not in 8.3 format should invoke an undefined
behavior (actually, implementation defined behavior would be even better).

A Linux implementation would be free to choose to open any filename you
threw at it. A DOS implementation obviously couldn't do this in the
absence of a LFN support layer.

How is that different to what happens now? 7.19.3 p8: "The rules for
composing valid file names are implementation-defined".
 
Keith Thompson

john said:
I think you are being deliberately obtuse here.

It looks like you just didn't express your proposal clearly enough.
A lot of smart people here drew the same conclusion I did about
what you're proposing.

I am proposing no such thing. What I am proposing is that passing a
string to fopen() that is not in 8.3 format should invoke an undefined
behavior (actually, implementation defined behavior would be even better).

Ok, here's what you wrote upthread (emphasis added):

My question is: as DOS is a common target for C, wouldn't it be
better if fopen() *only allowed* strings of length <11 and in
the format 8.3 for maximum compatibility, and had an *undefined
behavior* if it was given a string in another format?

(And I think you meant "<=12" rather than "<11"; the '.' is part of the
file name.)

A Linux implementation would be free to choose to open any filename you
threw at it. A DOS implementation obviously couldn't do this in the
absence of a LFN support layer.

Yes, implementation-defined behavior makes much more sense than the
undefined behavior you originally proposed.

And how exactly does your proposal differ from the current situation?

C99 7.19.5.3p2:
The fopen function opens the file whose name is the string
pointed to by filename, and associates a stream with it.

and p8:
The fopen function returns a pointer to the object controlling
the stream. If the open operation fails, fopen returns a
null pointer.

And C99 7.19.3p8:

Functions that open additional (nontemporary) files require
a file name, which is a string. The rules for composing
valid file names are implementation-defined.

So the C standard already says that valid file names are
implementation-defined. As far as I can tell, the only thing your
proposal would change is to give some special status to names that
fit within the MS-DOS 8.3 limits.

Oh, wait, it would also make the behavior of fopen() when given a
non-8.3 file name implementation-defined, rather than (as in the
current standard) requiring it either to succeed or to return a
null pointer to indicate failure.

Since there's nothing particularly special about MS-DOS with respect
to the C standard (some systems impose looser restrictions on file
names, some impose tighter ones), I'll say again that your proposal,
as you've re-explained it, is either already covered by the standard
or a bad idea.

Under your proposal, if an OS only supports 7.3 file names, or
doesn't permit '.' in file names, would a C implementation
on that system be required to support DOS-style 8.3 file names?
What would be the benefit of such a requirement?
 
