Speeding up Python's exit

Steven D'Aprano

I just quit an interactive session using Python 2.7 on Linux. It took in
excess of twelve minutes to exit, with the load average going well past 9
for much of that time.

I think the reason it took so long was that Python was garbage-collecting
a giant dict with 10 million entries, each one containing a list of the
form [1, [2, 3], 4]. But still, that's terribly slow -- ironically, it
took longer to dispose of the dict (12+ minutes) than it took to create
it in the first place (approx 3 minutes, with a maximum load of 4).

Can anyone explain why this was so painfully slow, and what (if anything)
I can do to avoid it in the future?

I know there is a function os._exit which effectively kills the Python
interpreter dead immediately, without doing any cleanup. What are the
consequences of doing this? I assume that the memory used by the Python
process will be reclaimed by the operating system, but other resources
such as opened files may not be.
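For anyone who wants to try this, here is a minimal sketch of how the build and teardown cost of such a dict could be timed. It is not the original poster's code: the helper names build_dict and time_build_and_teardown are made up for illustration, it is written for Python 3 rather than 2.7, and it measures an explicit del rather than interpreter exit, so the numbers will only be roughly comparable.

```python
import gc
import time


def build_dict(n):
    """Build a dict of n entries, each a fresh nested list [1, [2, 3], 4]."""
    return {i: [1, [2, 3], 4] for i in range(n)}


def time_build_and_teardown(n):
    """Return (build_seconds, teardown_seconds) for a dict of n entries."""
    start = time.perf_counter()
    data = build_dict(n)
    build_seconds = time.perf_counter() - start

    start = time.perf_counter()
    del data        # drops the last reference, so CPython frees every entry here
    gc.collect()    # also run one full pass of the cyclic collector
    teardown_seconds = time.perf_counter() - start
    return build_seconds, teardown_seconds


if __name__ == "__main__":
    # Hypothetical size; the original post used 10 million entries.
    build, teardown = time_build_and_teardown(1000000)
    print("build: %.2fs, teardown: %.2fs" % (build, teardown))
```

If teardown only becomes disproportionately slow at sizes close to available RAM, that would point at swapping rather than at the collector itself, but that is a guess to be checked, not a conclusion.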
 
Neil Cerutti

Can anyone explain why this was so painfully slow, and what (if
anything) I can do to avoid it in the future?

I think your explanation makes sense. Maybe the nested nature of
the lists was causing it to churn looking for circular
references?

Disabling gc before exiting might do the trick, assuming you're
assiduously managing other resources with context managers.

import gc
gc.disable()  # turns off the cyclic collector only
exit()
 
Chris Angelico

I think the reason it took so long was that Python was garbage-collecting
a giant dict with 10 million entries, each one containing a list of the
form [1, [2, 3], 4]. But still, that's terribly slow -- ironically, it
took longer to dispose of the dict (12+ minutes) than it took to create
it in the first place (approx 3 minutes, with a maximum load of 4).

Leaving aside the question of just *why* you have so much in your
dict... but anyway.

Is it any different if you create a deliberate reference loop and then
stuff it into some module somewhere? That would force it to be kept
until interpreter shutdown, and then a cyclic garbage collection after
that, which quite probably would never be run. A stupid trick,
perhaps, but it might work; I tested it with a dummy class with a
__del__ method and it wasn't called. Putting it into some other module
may not be necessary, but I don't know what happens with the
interactive interpreter and what gets freed up when.

ChrisA
 
Devin Jeanpierre

Is it any different if you create a deliberate reference loop and then
stuff it into some module somewhere? That would force it to be kept
until interpreter shutdown, and then a cyclic garbage collection after
that, which quite probably would never be run. A stupid trick,
perhaps, but it might work; I tested it with a dummy class with a
__del__ method and it wasn't called. Putting it into some other module
may not be necessary, but I don't know what happens with the
interactive interpreter and what gets freed up when.

__del__ is never called for cyclic references.

-- Devin
 
Devin Jeanpierre

__del__ is never called for cyclic references.

Sorry, I posted too early. Not only is __del__ never called, but
__del__ is the reason the cycles aren't collected. I don't know if
your trick will work without __del__.

-- Devin
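To make the distinction concrete, here is a small sketch showing that a reference cycle of plain objects (no __del__) is freed by the cyclic collector. The Node class and the cycle_is_collected helper are invented for this example. In Python 2.7, the version discussed in this thread, cycles whose objects define __del__ are not freed and are left in gc.garbage instead; from Python 3.4 onward (PEP 442) that restriction no longer applies.

```python
import gc
import weakref


class Node(object):
    """Plain object with no __del__ method."""
    pass


def cycle_is_collected():
    """Build a two-object reference cycle and report whether gc frees it."""
    a, b = Node(), Node()
    a.other, b.other = b, a    # a -> b -> a: a reference cycle
    probe = weakref.ref(a)     # weak reference; does not keep a alive
    del a, b                   # refcounts stay above zero because of the cycle
    gc.collect()               # the cycle detector finds and frees the pair
    return probe() is None


if __name__ == "__main__":
    print(cycle_is_collected())
```

This says nothing about what happens at interpreter shutdown, which is the open question in the thread; it only shows the collector's behaviour while the interpreter is running.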
 
Chris Angelico

__del__ is never called for cyclic references.

D'oh. Test is flawed, then. But is the theory plausible? That the
cycle detector won't be called on exit after other modules get freed?

ChrisA
 
Grant Edwards

I know there is a function os._exit which effectively kills the
Python interpreter dead immediately, without doing any cleanup. What
are the consequences of doing this?

You lose any data you haven't saved to disk.

I assume that the memory used by the Python process will be reclaimed
by the operating system, but other resources such as opened files may
not be.

All open files (including sockets, pipes, serial ports, etc) will be
flushed (from an OS standpoint) and closed. If you've closed all the
files you've written to, there should be no danger in just pulling the
plug.
 
Antoine Pitrou

Steven D'Aprano said:
I just quit an interactive session using Python 2.7 on Linux. It took in
excess of twelve minutes to exit, with the load average going well past 9
for much of that time.

I think the reason it took so long was that Python was garbage-collecting
a giant dict with 10 million entries, each one containing a list of the
form [1, [2, 3], 4]. But still, that's terribly slow -- ironically, it
took longer to dispose of the dict (12+ minutes) than it took to create
it in the first place (approx 3 minutes, with a maximum load of 4).

Can anyone explain why this was so painfully slow, and what (if anything)
I can do to avoid it in the future?

You are basically asking people to guess where your performance problem
comes from, without even providing a snippet so that people can reproduce ;)

I know there is a function os._exit which effectively kills the Python
interpreter dead immediately, without doing any cleanup. What are the
consequences of doing this? I assume that the memory used by the Python
process will be reclaimed by the operating system, but other resources
such as opened files may not be.

The OS always disposes of per-process resources when the process terminates
(except if the OS is buggy ;-)). However, file buffers will not be flushed,
atexit handlers and other destructors will not be called, database
transactions will be abandoned (rolled back), etc.

Regards

Antoine.
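The difference can be checked with a short experiment. The sketch below is an illustration only, written for Python 3.5+ (it uses subprocess.run); the CHILD script and the run_child helper are names made up for this example. It runs a child interpreter that writes to a file without flushing and registers an atexit handler, then exits either normally or through os._exit().

```python
import os
import subprocess
import sys
import tempfile

# Source of the child program (hypothetical, for illustration only).
CHILD = """
import atexit, os, sys
path, mode = sys.argv[1], sys.argv[2]
atexit.register(lambda: sys.stderr.write("atexit ran\\n"))
f = open(path, "w")
f.write("unsaved data")      # sits in Python's user-space buffer
if mode == "hard":
    os._exit(0)              # no flush, no atexit handlers
sys.exit(0)                  # normal shutdown flushes and runs handlers
"""


def run_child(mode):
    """Run CHILD in a fresh interpreter; return (file contents, stderr)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD, path, mode],
            stderr=subprocess.PIPE, universal_newlines=True,
        )
        with open(path) as f:
            return f.read(), proc.stderr
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(run_child("hard"))
    print(run_child("normal"))
```

With "hard" the file should come back empty and the handler's message should be missing; with "normal" both should be present. The data written here is small enough to stay entirely in Python's buffer, which is what makes the loss visible.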
 
Antoine Pitrou

Grant Edwards said:
All open files (including sockets, pipes, serial ports, etc) will be
flushed (from an OS standpoint) and closed.

According to POSIX, no, open files will not be flushed:

“The _Exit() and _exit() functions shall not call functions registered with
atexit() nor any registered signal handlers. Open streams shall not be flushed.
Whether open streams are closed (without flushing) is implementation-defined.”

http://pubs.opengroup.org/onlinepubs/9699919799/functions/_exit.html

(under the hood, os._exit() calls C _exit())

Regards

Antoine.
 
Dave Angel

According to POSIX, no, open files will not be flushed:

“The _Exit() and _exit() functions shall not call functions registered with
atexit() nor any registered signal handlers. Open streams shall not be flushed.
Whether open streams are closed (without flushing) is implementation-defined.”

Note he didn't say the python buffers would be flushed. It's the OS
buffers that are flushed.
 
Antoine Pitrou

Dave Angel said:
Note he didn't say the python buffers would be flushed. It's the OS
buffers that are flushed.

Now please read my message again. The OS buffers are *not* flushed according
to POSIX.
 
Jason Swails

Now please read my message again. The OS buffers are *not* flushed according
to POSIX.

I have observed this behavior on some Linux systems with a Fortran program
that terminated abnormally (via a kill signal). Other Linux systems I've
used appear to flush their file buffers to disk in the event of a kill
signal; it really depends on the system.

If a file object's destructor is not called when the Python interpreter
exits and it's up to the OS to flush the file buffers to disk, you can't be
sure that it will do so. And as Antoine pointed out, the POSIX standard
doesn't require that they do.

All the best,
Jason
 
Ross Ridge

Antoine Pitrou said:
Now please read my message again. The OS buffers are *not* flushed according
to POSIX.

POSIX says open *streams* might not be flushed. POSIX streams are C
FILE * streams and generally aren't regarded as being part of the OS.

When you call os._exit() in a Python program any unwritten data still
in Python's own file buffers will be lost. Any unwritten data still
in the C library's FILE * buffers will be lost. Any data successfully
written through a POSIX file descriptor (e.g. using the write() function)
will not be lost because os._exit() was used.

Note that this doesn't mean that OS buffers will be flushed when os._exit()
is called. Data that hasn't yet been physically written to disk, hasn't
been successfully transmitted over the network, or otherwise hasn't been
fully committed could still be lost. However, exiting Python normally
doesn't change this. Only the Python process's own internal buffers are
flushed; the OS doesn't change its handling of its buffers. If you want
written data to be fully committed before exiting you need to use other
OS services that guarantee this.

Ross Ridge
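On POSIX systems one such service is fsync(), which Python exposes as os.fsync(). The following is a minimal sketch of the usual two-step pattern for a buffered file object; the write_durably name is made up for this example, and how strong the guarantee is still depends on the filesystem and the hardware.

```python
import os
import tempfile


def write_durably(path, text):
    """Write text to path and ask the OS to commit it to the storage device."""
    with open(path, "w") as f:
        f.write(text)
        f.flush()               # Python's buffer -> OS (kernel) buffers
        os.fsync(f.fileno())    # OS buffers -> storage device


if __name__ == "__main__":
    # Hypothetical usage with a throwaway file.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    write_durably(path, "important result\n")
    with open(path) as f:
        print(f.read())
    os.remove(path)
```

The flush() call has to come first, because os.fsync() only knows about data that has already reached the file descriptor.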
 
