lfs confusion

John Hunter

I am a bit confused about LFS in python. In the olden days, I used
the following to test whether python and my kernel supported large
files
0L

If 0L was returned, LFS was enabled, if 0 was returned, LFS was not
enabled.

I built python with LFS

CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" ./configure
make
make install

But fd.tell() returned 0. I wrote a little script to create a large
file >4GB

[root@crcdocs tmp]# cat lfs_write.py
fd = file('test2.dat', 'w')

SIZE = 4096
STOP = 2**32
total = 0
s = 'a'*SIZE
while total<STOP:
    fd.write(s)
    total += SIZE
print total

print fd.tell()


and it ran without incident, reporting at the end 4294967296 (no L).
So I can at least write files past the 2GB limit (and I wrote a script
to verify that I could read the whole file too).

I am installing a zope server and will need to create a Data.fs that
exceeds the 2GB limit.

Is the fd.tell() 0L trick no longer valid? What is the right way to
test for LFS support in python?

JDH
 
John Hunter

Andrew> You might want to try doing a seek to a large int to see
Andrew> what happens.

Andrew> lfs_enabled = True
Andrew> try:
Andrew>     fd.tell(sys.maxint * 3L)
Andrew> except OverflowError:
Andrew>     lfs_enabled = False

Do you mean fd.seek?

I get
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: tell() takes no arguments (1 given)

fd.seek works as advertised though. On box 1 where fd.tell() returns
0L I can do fd.seek(sys.maxint * 3L).

On box2 "crcdocs" where fd.tell() returns 0, I get
Traceback (most recent call last):
File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to int

I did manage to read and write a 4GB file on crcdocs using the script I
posted above. Perhaps I don't understand what the "large" in large
file really means. I assumed it was 2**31 approx equal 2GB. Has the
default, non LFS limit, increased?

I did do the normal incantation with the CFLAGS when compiling python
for LFS on crcdocs.

CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" ./configure
make
make install


I now wonder if there is a problem with my kernel or glibc.

crcdocs:~> ls /usr/lib/libglib-2.0.so.0*
/usr/lib/libglib-2.0.so.0 /usr/lib/libglib-2.0.so.0.400.0

crcdocs:~> uname -a
Linux crcdocs.bsd.uchicago.edu 2.6.5-1.358 #1 Sat May 8 09:01:26 EDT 2004 x86_64 x86_64 x86_64 GNU/Linux


I did create a 4GB file with a simple C++ program on crcdocs to test
whether my kernel/glibc supported large files. Since I was able to
create the 4GB file successfully with this program, I assumed my
system was LFS enabled. Now I am not so sure this is the right test
for the system.

Do you have any suggestions or special tricks to test the system for
LFS, and do you do something different when compiling python for LFS?

JDH
 
Andrew Dalke

John said:
I am a bit confused about LFS in python. In the olden days, I used
the following to test whether python and my kernel supported large
files

0L

If 0L was returned, LFS was enabled, if 0 was returned, LFS was not
enabled. ..
Is the fd.tell() 0L trick no longer valid. What is the right way to
test for LFS support in python?

Python is undergoing a process of int/long unification.
str(long_number) as of 2.0 no longer puts a "L" at the end
of string.

If you want to use your method, try using 'repr' instead
of 'str'. That probably won't work in the long term
though (3.0 and greater?).

You might want to try doing a seek to a large int
to see what happens.

lfs_enabled = True
try:
    fd.tell(sys.maxint * 3L)
except OverflowError:
    lfs_enabled = False

I've tried various Python versions I have, but they
are all enabled with large file support so I don't
know if this works.

Andrew
(e-mail address removed)
 
Peter Hansen

Andrew said:
Python is undergoing a process of int/long unification.
str(long_number) as of 2.0 no longer puts a "L" at the end
of string.

If you want to use your method, try using 'repr' instead
of 'str'. That probably won't work in the long term
though (3.0 and greater?).

Wouldn't doing type(fd.tell()) be better than coupling
the logic to what might be a changing representation of
the data?

-Peter
 
Andrew Dalke

John said:
Do you mean fd.seek?

D'oh! Yeah.

John said:
I did manage to read and write a 4GB file on crcdocs using the script I
posted above. Perhaps I don't understand what the "large" in large
file really means. I assumed it was 2**31 approx equal 2GB. Has the
default, non LFS limit, increased?

I think the answer is "it depends."

By default Python checks for large file support. From
'configure'

if test "$have_long_long" = yes -a \
        "$ac_cv_sizeof_off_t" -gt "$ac_cv_sizeof_long" -a \
        "$ac_cv_sizeof_long_long" -ge "$ac_cv_sizeof_off_t"; then

cat >>confdefs.h <<\_ACEOF
#define HAVE_LARGEFILE_SUPPORT 1
_ACEOF

In other words, the default is to support large files
but only when the system supports >32 bit off_t. CVS
says that was added in 1999.


The fileobject.c code for seek has

#if !defined(HAVE_LARGEFILE_SUPPORT)
    offset = PyInt_AsLong(offobj);
#else
    offset = PyLong_Check(offobj) ?
        PyLong_AsLongLong(offobj) : PyInt_AsLong(offobj);
#endif
    if (PyErr_Occurred())
        return NULL;

so my (corrected) statement about seeking to >2**31
should work. Though I'm not sure now that sys.maxint
is the proper test since it might be 2**63 - 1 on
a 64 bit machine. Hardcoding the value should work.
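
A minimal sketch of the hardcoded-offset probe described above, written for current Python 3 rather than the Python 2 used in this thread. The helper name `can_seek_past_2gb` and the use of a temporary file are assumptions made for illustration. Nothing is written at the large offset, so the probe should not consume disk space.

```python
import tempfile

# Hardcoded offset just past 2**31, so the probe does not depend on the
# platform-specific value of sys.maxint (sys.maxsize in Python 3).
LARGE_OFFSET = 2**31 + 1


def can_seek_past_2gb():
    """Return True if a file object accepts a seek beyond 2**31 bytes."""
    # Hypothetical scratch file; it is removed when the block exits.
    with tempfile.TemporaryFile() as f:
        try:
            f.seek(LARGE_OFFSET)
        except (OSError, OverflowError, ValueError):
            return False
        # tell() should report the position that was requested.
        return f.tell() == LARGE_OFFSET


result = can_seek_past_2gb()
print(result)
```

Seeking past the end of a file without writing is allowed, which is why the probe can stay cheap.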

I did do the normal incantation with the CFLAGS when compiling python
for LFS on crcdocs.

You're beyond my knowledge there. I thought that
Python did the check automatically and didn't need
the CFLAGS= ...

I compile from CVS source without special commands
and it Just Works.

Andrew
(e-mail address removed)
 
Andrew Dalke

Peter said:
Wouldn't doing type(fd.tell()) be better than coupling
the logic to what might be a changing representation of
the data?

I suggested using repr instead of str because that
would make the smallest impact on the OP's code.

Using type(fd.tell()) == long or rather
isinstance(fd.tell(), long) would be better,
but I can see a few possible problems.

- What does tell() return on a 64 bit
machine? A Python integer or long?

- When unification is finished, will it be
that isinstance(0, long) and/or
isinstance(2**35, int) (on a 32 bit machine)?
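
For what it is worth, the unification question can be checked directly on a modern interpreter. The sketch below assumes Python 3, where the separate `long` type no longer exists, so the type of `tell()` carries no information about large file support. The in-memory buffer is only there to keep the example self-contained.

```python
import io

# In Python 3 there is a single integer type, so values beyond 2**31
# (or 2**63) are still plain int.
small_is_int = isinstance(0, int)
big_is_int = isinstance(2**35, int)
huge_is_int = isinstance(2**70, int)

# tell() therefore always returns int, whatever the offset size.
buf = io.BytesIO(b"abc")
position = buf.tell()

print(small_is_int, big_is_int, huge_is_int)
print(type(position).__name__, position)
```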

Since the question is "can I seek to positions
> 2**31" it seems easier to just try to seek
to something that high up. The OP pointed out
that /dev/null supports seeks, making it the
easiest one to test on a Unix system.

I looked in the available configuration information
(distutils.sysconfig and grepping
/usr/local/lib/python2.4/config ) but didn't
see any mention of HAVE_LARGEFILE_SUPPORT so I
don't think it's possible for the runtime Python
to figure out that information except by testing
the function calls.
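
A sketch of what such a lookup could look like today, assuming a current Python 3 where the standard `sysconfig` module exposes values recorded at build time. Whether `HAVE_LARGEFILE_SUPPORT`, `SIZEOF_OFF_T`, or `SIZEOF_LONG` are present depends on the platform and build, so each is treated as possibly missing. The helper name `build_time_lfs_info` is made up for this example.

```python
import sysconfig


def build_time_lfs_info():
    """Collect build-time hints about file offset size, if available."""
    names = ("HAVE_LARGEFILE_SUPPORT", "SIZEOF_OFF_T", "SIZEOF_LONG")
    # get_config_var returns None for names the build did not record.
    return {name: sysconfig.get_config_var(name) for name in names}


info = build_time_lfs_info()
for name in sorted(info):
    print(name, "=", info[name])
```

Even where these values are available, trying the seek remains the more direct check, since it exercises the same code path the application will use.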

Andrew
(e-mail address removed)
 
John Hunter

Andrew> You're beyond my knowledge there. I thought that Python
Andrew> did the check automatically and didn't need the CFLAGS= ...

Andrew> I compile from CVS source without special commands and it
Andrew> Just Works.

Well, the reason I have embarked on this path is to find out if my
zope has LFS support or not. Your post gave me the inspiration to
just dig through the zope configure script and find out what they were
doing (I've hit the 2GB limit one too many times with the ZODB to let
myself do it again).

In inst/configure.py, I found

def test_largefile():
    OK=0
    f = open(sys.argv[0], 'r')
    try:
        # 2**31 == 2147483648
        f.seek(2147483649L)
        f.close()
        OK=1
    except (IOError, OverflowError):
        f.close()
    if OK:
        return
    ...else raise an error message...


I pass the 2**31 test but fail your sys.maxint * 3L test:
Traceback (most recent call last):
File "<stdin>", line 1, in ?
OverflowError: long int too large

It appears that, as far as zope is concerned, I can exceed the 2GB
limit and so I should be safe, hopefully up to
9223372036.8547764

Should cover me for a while <wink>

Thanks for your help,
JDH
 
John Hunter

Andrew> Huh. I wonder what's going on there. Perhaps the Windows
Andrew> API used handles up to 2**32?

No, I can do 2**32 and much higher. Apparently, the limit on my
system is sys.maxint (which is *really big*)
Traceback (most recent call last):
0

I suddenly realize what is going on. The CPU, the AMD Opteron 250, is
a 64 bit processor. I just got this machine and didn't know what kind
of CPU it had.
9223372036854775808L

Apparently, on 64bit systems, tell returns an int because an int can
address 2**63 bytes. So the acid test in the 64 bit era for LFS is to
try and seek to 2**31, not to check the return type of tell.
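
To illustrate the 64-bit observation, the sketch below reports the native word sizes seen by the running interpreter. It assumes Python 3, where `sys.maxsize` takes the place of the `sys.maxint` used in this thread. The function name `describe_platform` is made up for the example.

```python
import struct
import sys


def describe_platform():
    """Report native C long and pointer widths for this interpreter."""
    long_bits = struct.calcsize("l") * 8      # width of a native C long
    pointer_bits = struct.calcsize("P") * 8   # width of a native pointer
    return {
        "long_bits": long_bits,
        "pointer_bits": pointer_bits,
        # True when an offset of 2**31 fits in a signed C long.
        "long_holds_2gb": 2**31 <= 2**(long_bits - 1) - 1,
        "maxsize": sys.maxsize,
    }


info = describe_platform()
print(info)
```

On 64-bit Linux a C long is 64 bits wide, which matches the behaviour seen here: a Python 2 int could already hold any file offset, so tell() had no reason to return a long.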

JDH
 
