RotatingFileHandler bugs/errors and a general logging question.

  • Thread starter nicholas.petrella

nicholas.petrella

I am currently trying to use the python logging system as a core
enterprise level logging solution for our development and production
environments.

The rotating file handler seems to be what I am looking for as I want
the ability to have control over the number and size of log files that
are written out for each of our tools. I have noticed a few problems
with this handler and wanted to post here to get your impressions and
possibly some ideas about whether these issues can be resolved.


The first issue is with multiple copies of the same tool trying to log
to the same location. This should not be an issue, as the libraries are
supposed to be thread-safe and therefore should also be safe for
multiple instances of a tool. I have run into two problems with
this...


1.
When a log file is rolled over, occasionally we see the following
traceback in the other instance or instances of the tool:

Traceback (most recent call last):
  File "/usr/local/lib/python2.4/logging/handlers.py", line 62, in emit
    if self.shouldRollover(record):
  File "/usr/local/lib/python2.4/logging/handlers.py", line 132, in shouldRollover
    self.stream.seek(0, 2)  #due to non-posix-compliant Windows feature
ValueError: I/O operation on closed file

As best I can tell, this is caused by instance A closing the log file
and rolling it over while instance B is still trying to use its file
handle to that log file, which A has replaced during rollover. A
likely solution would be to handle the exception and reopen the log
file. It seems that the newer WatchedFileHandler
(http://www.trentm.com/python/dailyhtml/lib/node414.html) provides the
functionality that is needed, but I think it would be helpful to have
that functionality included in RotatingFileHandler to prevent these
errors.
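For what it's worth, the reopen-on-failure idea can be sketched as a
small subclass. This is only a sketch against a modern Python 3
logging module (the class name is mine, and it leans on the private
FileHandler._open() helper, which did not exist in 2.4):

```python
import logging
import logging.handlers
import os

class ReopeningRotatingFileHandler(logging.handlers.RotatingFileHandler):
    # Hypothetical sketch: before emitting, detect that another process
    # has rolled the file over (our handle is closed, or it no longer
    # points at the inode the path names) and reopen the base file.
    def emit(self, record):
        try:
            if self.stream is None or self.stream.closed:
                self.stream = self._open()
            elif os.fstat(self.stream.fileno()).st_ino != \
                    os.stat(self.baseFilename).st_ino:
                # The path now names a different file: reopen it.
                self.stream.close()
                self.stream = self._open()
        except (ValueError, OSError):
            # Stale or vanished file (e.g. renamed mid-rollover):
            # just open the base filename again.
            self.stream = self._open()
        logging.handlers.RotatingFileHandler.emit(self, record)
```

This avoids the ValueError above, but on its own it does nothing about
the double-rollover race described next.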

2.
I am seeing that at times, when two instances of a tool are logging,
the log will be rotated twice. It seems that as app.log approaches the
size limit (10 MB in my case), the rollover is triggered in both
instances of the application, causing a small log file to be created.

-rw-rw-rw- 1 petrella user 10485641 May 8 16:23 app.log
-rw-rw-rw- 1 petrella user  2758383 May 8 16:22 app.log.1  <---- Small log
-rw-rw-rw- 1 petrella user 10485903 May 8 16:22 app.log.2
-rw-rw-rw- 1 petrella user  2436167 May 8 16:21 app.log.3

It seems that the rollover should also be protected so that the log
file is not rolled twice.
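One way to protect the rollover across processes is an OS-level
advisory lock. A POSIX-only sketch (the class name and the .lock file
path are my own, and it again uses the private Python 3
FileHandler._open() helper):

```python
import fcntl
import logging.handlers
import os

class LockedRotatingFileHandler(logging.handlers.RotatingFileHandler):
    # Hypothetical sketch: take an exclusive fcntl lock around the
    # rollover, and after acquiring it check whether another process
    # already rotated the file while we were waiting. If the inode
    # under our open handle no longer matches the path, someone else
    # rolled over first, so just reopen instead of rotating again.
    def doRollover(self):
        with open(self.baseFilename + ".lock", "w") as lock:
            fcntl.flock(lock, fcntl.LOCK_EX)
            if os.fstat(self.stream.fileno()).st_ino != \
                    os.stat(self.baseFilename).st_ino:
                self.stream.close()
                self.stream = self._open()  # another process rolled over
            else:
                logging.handlers.RotatingFileHandler.doRollover(self)
```

The lock serializes the rename/reopen sequence, and the inode check is
what prevents the second process from producing the small app.log.1
shown above.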




I also wanted to ask for anyone's thoughts on maybe a better way to
implement python logging to meet our needs.

The infrastructure in which I work needs the ability to have log
files written to from multiple instances of the same script, and
potentially from hundreds or more different machines.

I know that the documentation suggests using a network logging server
but I wanted to know if anyone had any other solutions to allow us to
build off of the current python logging packages.

Thanks in advance for any of your responses.

-Nick
 

Dennis Lee Bieber

The first issue is with multiple copies of the same tool trying to log
to the same location. This should not be an issue, as the libraries are
supposed to be thread-safe and therefore should also be safe for
multiple instances of a tool. I have run into two problems with
this...
I wouldn't be so sure of that. The logger may perform an internal
lock call before attempting output, so threads within one program are
safe. But those locks may not be system-wide, and a second instance
of the program does not see them.
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 

Vinay Sajip

The infrastructure in which I am work needs the ability to have log
files written to from multiple instances of the same script and
potentially from hundreds or more different machines.

I know that the documentation suggests using a network logging server
but I wanted to know if anyone had any other solutions to allow us to
build off of the current python logging packages.
Dennis is right - the logging system is thread-safe but not safe
against multiple processes (separate Python instances) writing to the
same file. It certainly sounds like you need a scalable solution - and
having each script send the events to a network logging server seems a
good way of handling the scalability requirement. The logger name used
can include the script instance and machine name, e.g. by starting
with hostname.scriptname.scriptpid... The socket server which receives
the events can demultiplex them based on this information and write
them to a central repository in any arrangement you care to implement
(e.g. into one file or several).
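The naming scheme described above might look like this on the client
side. This is only a sketch: the script name, server host, and port
are placeholders, not values from the thread.

```python
import logging
import logging.handlers
import os
import socket

# Logger name encodes hostname.scriptname.scriptpid so a central
# receiver can demultiplex records from many instances and machines.
name = "%s.%s.%d" % (socket.gethostname(), "myscript", os.getpid())
logger = logging.getLogger(name)
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.SocketHandler(
    "logserver.example.com", logging.handlers.DEFAULT_TCP_LOGGING_PORT))
# logger.info("tool started")  # would pickle the record to the receiver
```

Each record carries the logger name with it, so the server needs no
extra protocol to know which host and instance it came from.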

Given that the example in the docs is a (basic) working example, is
there any particular reason why you don't want to follow the suggested
approach?

Regards,

Vinay Sajip
 

nicholas.petrella

On May 9, 12:52 am, (e-mail address removed) wrote:
> The infrastructure in which I work needs the ability to have log


Dennis is right - the logging system is thread-safe but not safe
against multiple processes (separate Python instances) writing to the
same file. It certainly sounds like you need a scalable solution - and
having each script send the events to a network logging server seems a
good way of handling the scalability requirement. The logger name used
can include the script instance and machine name, e.g. by starting
with hostname.scriptname.scriptpid... The socket server which receives
the events can demultiplex them based on this information and write
them to a central repository in any arrangement you care to implement
(e.g. into one file or several).

Given that the example in the docs is a (basic) working example, is
there any particular reason why you don't want to follow the suggested
approach?

Regards,

Vinay Sajip


Our biggest concerns with the network solution are having a single
point of failure and the need for scalability. We could have
potentially thousands of machines doing logging across multiple
geographic sites. Our fear with the network solution is overwhelming
the server or group of servers, as well as having a single point of
failure for the logging interface. In addition, using a server or
servers would require added support for the logging server machines.
Our NFS infrastructure is very well supported and can already handle
the load generated by these machines (a load many times greater than
what the logging would generate), which is why we would like to log
directly to the file system without going through a separate server.
Adding a logging server also introduces one more level where we could
potentially have failure. We would like to keep the infrastructure for
our logging as simple as possible, as we rely on log files to give us
critical information when troubleshooting issues.

It sounds like my only option may be using a server in order to handle
the logging from different hosts. That or possibly having individual
log files for each host.
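The per-host-file idea can be sketched directly. Embedding the pid as
well would also remove contention between instances on the same host;
the function name and filename pattern below are my own, not anything
from the logging package:

```python
import logging.handlers
import os
import socket

def per_host_handler(directory, max_bytes=10 * 1024 * 1024, backups=5):
    # One log file per host *and* per process: no two writers ever
    # share a file, so the cross-process rollover races disappear.
    # The cost is more files to collect and manage afterwards.
    filename = os.path.join(
        directory, "app.%s.%d.log" % (socket.gethostname(), os.getpid()))
    return logging.handlers.RotatingFileHandler(
        filename, maxBytes=max_bytes, backupCount=backups)
```

The directory could then be a per-host area on the NFS mount, with the
files merged later for reporting.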

Thanks for your input. It is much appreciated.

-Nick
 

Vinay Sajip

Our biggest concerns with the network solution are having a single
point of failure and the need for scalability. We could have
potentially thousands of machines doing logging across multiple
geographic sites. Our fear with the network solution is overwhelming
the server or group of servers, as well as having a single point of
failure for the logging interface. In addition, using a server or
servers would require added support for the logging server machines.
Our NFS infrastructure is very well supported and can already handle
the load generated by these machines (a load many times greater than
what the logging would generate), which is why we would like to log
directly to the file system without going through a separate server.
Adding a logging server also introduces one more level where we could
potentially have failure. We would like to keep the infrastructure for
our logging as simple as possible, as we rely on log files to give us
critical information when troubleshooting issues.

It sounds like my only option may be using a server in order to handle
the logging from different hosts. That or possibly having individual
log files for each host.

There are other options - you don't necessarily need a separate
logging server. For example, you could

(a) Have a single network receiver process per host which writes to
disk and avoids the problem of contention for the file. Although this
process could be a point of failure, it's a pretty simple piece of
software and it should be possible to manage the risks.
(b) If you wanted to centralize log information, you could move the
log files from each host onto a central NFS disk using standard tools
such as rsync, and then manipulate them for reporting purposes
however you want.
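Option (a) can be built almost entirely from pieces already in the
standard library. The sketch below is modeled on the socket receiver
example in the logging documentation; the class and function names are
mine, and a real deployment would add error handling and a fixed port:

```python
import logging
import pickle
import socketserver
import struct

class LogRecordHandler(socketserver.StreamRequestHandler):
    # Read length-prefixed pickled LogRecords (the wire format that
    # logging.handlers.SocketHandler sends) and replay them through
    # the local logging machinery, so only this one process per host
    # ever touches the log file on disk.
    def handle(self):
        while True:
            header = self.rfile.read(4)
            if len(header) < 4:
                break
            size = struct.unpack(">L", header)[0]
            record = logging.makeLogRecord(
                pickle.loads(self.rfile.read(size)))
            # Route on record.name, which can encode host/script/pid
            # as described earlier in the thread.
            logging.getLogger(record.name).handle(record)

def make_receiver(host="localhost", port=0):
    # port=0 asks the OS for a free port; a real deployment would pin
    # one, e.g. logging.handlers.DEFAULT_TCP_LOGGING_PORT.
    return socketserver.ThreadingTCPServer((host, port), LogRecordHandler)
```

Whatever handlers are configured on the receiving side (a plain
FileHandler, a RotatingFileHandler, one file per source logger, etc.)
then write without any cross-process contention.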

Regards,

Vinay Sajip
 
