file object, details of modes and some issues.

simon place

Is the code below meant to produce rubbish? I had expected an exception.

f=file('readme.txt','w')
f.write(' ')
f.read()

( PythonWin 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on
win32. )

I got this while experimenting, trying to figure out the file object's modes,
which even on very careful reading of the documentation left me completely in
the dark. Below is a summary of that experimentation; some is for reference
and some is more of a warning.

'r' is (r)ead mode, so you can't write to the file; you get
'IOError: (0, 'Error')' if you try, which isn't a particularly helpful error
description. read() reads the whole file and read(x) reads x bytes, unless
there are fewer than x bytes left, in which case it reads as much as possible.
So getting back fewer bytes than you asked for indicates the end of the file.
I think an exception when a read is attempted at the end of the file might be
better, like iterators. If you try to open a non-existent file in 'r' mode you
get 'IOError: [Errno 2] No such file or directory: "filename"', which makes
sense.
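
The short-read behaviour described above can be sketched as follows. This is only an illustration in modern Python 3, not the Python 2.3 `file()` object used in this post, so the exception names differ; the scratch path and the 4-byte chunk size are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")
with open(path, "wb") as f:
    f.write(b"abcdefghij")  # 10 bytes

chunks = []
with open(path, "rb") as f:
    while True:
        chunk = f.read(4)   # ask for 4 bytes at a time
        if not chunk:       # an empty result means end of file
            break
        chunks.append(chunk)

# Opening a missing file for reading raises FileNotFoundError (errno 2)
# in Python 3; Python 2 reported the same condition as IOError.
try:
    open(path + ".missing", "rb")
    missing_errno = None
except FileNotFoundError as exc:
    missing_errno = exc.errno
```

The last chunk comes back shorter than requested, and only the read after that returns an empty result, so testing for an empty result is the usual end-of-file check.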

'w' is (w)rite mode, so you can't read from the file. (Any existing file is
erased or a new file created, and bear in mind that anything you write to the
file can't be read back directly on this object.) You get 'IOError: [Errno 9]
Bad file descriptor' if you try reading, which is an awful error description.
BUT this only happens at the beginning of the file. When at the end of the
file, as is the case when you have just written something (without a backward
seek, see below), you don't get an exception but lots of rubbish data (see
the example at the beginning). This mode allows you to seek backward and
rewrite data, but if you try a read somewhere between the first character and
the end, you get a different exception, 'IOError: (0, 'Error')'.
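
For comparison, here is a minimal sketch of the same write-then-read sequence in modern Python 3. It is an assumption that you are on Python 3, where `open()` goes through the `io` module rather than C stdio, so it does not reproduce the rubbish data seen above; the scratch path is hypothetical.

```python
import io
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "readme.txt")

with open(path, "w") as f:
    f.write(" ")
    can_read = f.readable()   # lets you check before attempting a read
    try:
        f.read()
        outcome = "read returned data"
    except io.UnsupportedOperation:
        # Python 3 refuses the read outright, wherever the position is.
        outcome = "UnsupportedOperation"
```

`io.UnsupportedOperation` is a subclass of `OSError`, so code that catches `IOError`/`OSError` still catches it.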

'a' is (a)ppend mode; you can only add to the file, so it is basically write
mode (with the same problems) plus a seek to the end. Obviously append doesn't
erase an existing file, and it also ignores file seeks, so all writes pile up
at the end. tell() gives the correct location in the file after a write (so
it actually always gives the length of the file), but if you seek() you don't
get an exception and tell() returns the new value, while writes still go to
the end of the file. So if you use tell() to find out where writes are going,
in this mode it might not always be right.
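
The "seeks are ignored for writes" point can be checked with a small sketch like this one. It assumes modern Python 3 and a hypothetical scratch file; the exact value of tell() after the write is left out because it is the part most likely to vary between platforms.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "log.bin")
with open(path, "wb") as f:
    f.write(b"12345")

with open(path, "ab") as f:
    f.seek(0)                    # no exception is raised here
    pos_after_seek = f.tell()    # tell() reports the new position
    f.write(b"XY")               # but the write still lands at the end

with open(path, "rb") as f:
    contents = f.read()
```

The original five bytes are untouched and the new data is appended, even though tell() claimed the position was at the start.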

'r+' is (r)ead (+) update mode, which means read and write access, but
don't read after a write without a backward seek in between, because it will
then read a lot of garbage (the rest of the disk fragment/buffer, I guess?).
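
A minimal sketch of the safe pattern for update mode, with a seek every time you switch between writing and reading. It is written for modern Python 3 with a hypothetical scratch file; the habit of seeking between the two kinds of operation is the part that carries over to stdio-based file objects.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "r+b") as f:
    f.write(b"AB")            # overwrite the first two bytes
    f.seek(0)                 # reposition before switching to reading
    after_write = f.read()
    f.seek(4)                 # reposition again before switching to writing
    f.write(b"__")

with open(path, "rb") as f:
    final = f.read()
```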

'w+' is (w)rite (+) update mode, which means read and write access,
(like 'r+' but on a new or erased file).

'a+' is (a)ppend (+) update mode, which also means read and write, but
file seeks are ignored, so any reads seem a bit pointless since they always
read past the end of the file, returning garbage. Worse, the read does extend
the file, so this garbage becomes incorporated in the file!! (Yes, really.)
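
For comparison, a sketch of 'a+' in modern Python 3 (an assumption: this is not the Python 2.3 file object tested above, and the scratch path is hypothetical). Here a seek does affect where reads happen, while writes are still forced to the end.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "notes.bin")
with open(path, "wb") as f:
    f.write(b"first\n")

with open(path, "a+b") as f:
    f.seek(0)                  # honoured for reading
    existing = f.read()
    f.seek(0)                  # not honoured for writing in append mode
    f.write(b"second\n")

with open(path, "rb") as f:
    final = f.read()
```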

'b': all modes can have a 'b' appended to indicate binary mode. I think this
is something of a throw-back to serial comms (serial comms being bundled into
the same handlers as files because when these things were developed, 20+ years
ago, nothing better was around). Binary mode turns off the 'clever' handling
of line ends and (depending on use and OS) other functional characters (tabs
expanded to spaces etc.). The normal mode is already binary on Windows, so
binary makes no difference on win32 files. But since it may do on other
OSes (or when actually using the file object for serial comms), I think
you should actually ALWAYS use the binary version of the mode, and handle the
line ends etc. yourself. (Then of course you'll have to deal with the
different line end types!)
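
To see what the line-end handling actually does to the bytes on disk, here is a small sketch for modern Python 3. The `newline="\r\n"` argument is chosen explicitly so the result does not depend on the platform default; the scratch path is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")

# Text mode with an explicit newline policy: each "\n" is written as "\r\n".
with open(path, "w", newline="\r\n") as f:
    f.write("one\ntwo\n")

# Binary mode shows the bytes exactly as stored, with no translation.
with open(path, "rb") as f:
    raw = f.read()

# Default text mode translates "\r\n" back to "\n" on reading.
with open(path, "r") as f:
    text = f.read()
```

Comparing `raw` with `text` shows the translation that binary mode switches off.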

I'm a bit surprised that the file object doesn't do ANY access control:
multiple file objects on the same actual file can ALL write to it!! And other
software can edit files opened for writing by the file object. However, a
write lock on a file made by other software causes an 'IOError: [Errno 13]
Permission denied' when the file is opened by Python with write access. I
guess you need os.access to test file locks and os.chmod to change them, but
I haven't gone into this. It's a shame that there doesn't appear to be a nice
simple file object subclass that does all this! Writes to the file object
actually get done when flush() (or seek()) is called.
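
The buffering point, and the fact that a second file object can open the same file unhindered, can be sketched like this. It assumes modern Python 3 with default buffering and a hypothetical scratch file; a write larger than the buffer would reach the disk without an explicit flush().

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "shared.bin")

writer = open(path, "wb")
writer.write(b"hello")            # small write, held in the buffer

# A second file object on the same file opens without complaint.
with open(path, "rb") as reader:
    before_flush = reader.read()  # nothing has reached the file yet

writer.flush()                    # push the buffered data out

with open(path, "rb") as reader:
    after_flush = reader.read()

writer.close()
```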

Suffice to say, I wasn't entirely impressed with the Python file object. Then
I remembered the cross-platform problems it's dealing with and all the code
that works OK with it, and thought I'd knock up this post of my findings to
try to elicit some discussion / get it improved / stop others making
mistakes.
 
Jeff Epler

Here's what I get on my system:
Traceback (most recent call last):
File "<stdin>", line 1, in ?
IOError: [Errno 9] Bad file descriptor

Python relies fairly directly on the C standard library for correct
behavior when it comes to file objects. I suspect that the following C
program will also "succeed" on your system:

#include <stdio.h>
int main(void) {
    FILE *f = fopen("xyzzy", "w");   /* write-only stream */
    char buf[2];
    char *res;
    fputs(" ", f);
    res = fgets(buf, 2, f);          /* attempt to read from it */
    if(!res) {
        perror("fgets");
        return 1;
    }
    return 0;
}

On my system, it does give an error:
$ gcc simon.c
$ ./a.out
fgets: Bad file descriptor
$ echo $?
1
If the C program prints an error message like above, but Python does not
raise an exception on the mentioned code, then there's a Python bug.
Otherwise, if the C program executes on your system without printing an
error and returns the 0 (success) exit code, then the problem is the
poor quality of your platform's stdio implementation.

Jeff
PS relevant text from the fgets manpage on my system:
gets() and fgets() return s on success, and NULL on error or when
end of file occurs while no characters have been read.
and from fopen:
w Truncate file to zero length or create text file for writing.
The stream is positioned at the beginning of the file.
 
Christos TZOTZIOY Georgiou

I will open a bug report if no one else does,
but first I would like to know if it's the Windows stdio to blame or
not.

I didn't wait that long; it's bug 795550 in SF.
 
Michael Hudson

simon place said:
is the code below meant to produce rubbish?, i had expected an exception.

Python uses C's stdio. According to the C standard:

f=file('readme.txt','w')
f.write(' ')
f.read()

engages in undefined behaviour (i.e. is perfectly entitled to make
demons fly out of your nose). You can apparently trigger hair-raising
crashes on Win98 by playing along these lines. There's not a lot that
Python can do about this except include its own implementation of a
stdio-a-like, and indeed some future version of Python may do just
this.

Cheers,
mwh
 
Christos TZOTZIOY Georgiou

engages in undefined behaviour (i.e. is perfectly entitled to make
demons fly out of your nose).

OK, then, let's close the 795550 bug (I saw your reply there after
posting the second comment).
 
Michael Hudson

Jeff Epler said:
If it's true that stdio doesn't guarantee an error return from fwrite() on
a file opened for reading, then the Python documentation should be
changed (it claims an exception is raised, but this depends on the
return value being different from the number of items written
(presumably 0))

I may be getting confused. The undefined behaviour I was on about was
interleaving reads & writes without an intervening seek.

Jeff Epler said:
It's my feeling that this is intended to be an error condition, not
undefined behavior. But I can't prove it. Here are some relevant pages
from the SUS spec, which intends to follow ISO C:
http://www.opengroup.org/onlinepubs/007904975/functions/fopen.html
http://www.opengroup.org/onlinepubs/007904975/functions/fwrite.html

The EBADF error seems to be marked as an extension to ISO C, but I
don't know what that signifies.
Hm, and there's a bug even on Linux:

That might well not even call a C library routine at all (I don't
know).

Cheers,
mwh
 
