problem - logging for mt programs

A

ara.t.howard

anyone out there have good ways of dealing with multithreaded logging? my
current approach uses a meta programming hack/mutex/etc so that, if the
calling thread is the main one, a tid is not written, otherwise it is. thus,

I, [2006-09-14T15:41:58.394355 #14588] INFO -- : aquired lock.

I, [2006-09-14T15:41:59.225737 #14588] INFO -- : t[-609956564] command <DDB_GEN_DIR=/dev/shm/fod/gen DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out DDB_LOG_DIR=/dev/shm/fod/log DDB_TMP_DIR=/de v/shm/fod/tmp DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt -0.420875420875421 -S fod 2000.F12_0010909_DS.DAT 2000.F12_0011055_DS.DAT>


shows the first line being logged in the main thread, while the next line is
logged by tid -609956564. this is ok. but, obviously, log lines become
interleaved at a certian point leading to sections like

I, [2006-09-14T15:43:03.486280 #14588] INFO -- : t[-610194194] command <DDB_GEN_DIR=/dev/shm/foa/gen DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/foa/out DDB_LOG_DIR=/dev/shm/foa/log DDB_TMP_DIR=/dev/shm/foa/tmp DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt -0.420875420875421 -S foa 2000.F12_0011232_DS.DAT 2000.F12_0011414_DS.DAT>
I, [2006-09-14T15:43:03.487003 #14588] INFO -- : t[-610194894] tmp_dir </raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/tmp>
I, [2006-09-14T15:43:03.489989 #14588] INFO -- : t[-610194894] command <DDB_GEN_DIR=/dev/shm/fod/gen DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out DDB_LOG_DIR=/dev/shm/fod/log DDB_TMP_DIR=/dev/shm/fod/tmp DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt -0.420875420875421 -S fod 2000.F12_0011232_DS.DAT 2000.F12_0011414_DS.DAT>
I, [2006-09-14T15:44:18. 45782 #14588] INFO -- : t[-610194894] status <0>
I, [2006-09-14T15:44:18. 46468 #14588] INFO -- : t[-610297924] command <empty_ols /raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001010754.OIS>
I, [2006-09-14T15:44:18. 47187 #14588] INFO -- : t[-610299344] command <empty_ols /raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001010936.OIS>
I, [2006-09-14T15:44:18. 50663 #14588] INFO -- : t[-610298484] command <empty_ols /raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001011118.OIS>
I, [2006-09-14T15:44:18. 53356 #14588] INFO -- : t[-610301254] command <empty_ols /raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001011300.OIS>
I, [2006-09-14T15:44:25.361170 #14588] INFO -- : t[-610194194] status <0>
I, [2006-09-14T15:44:18. 50663 #14588] INFO -- : t[-610297924] status <0>


here, we can certainly track the status of commands back the the line/thread
that issued them, but it's pretty painful to to by hand. of course, one could
easily post-process in order to analyze...

so - has anyone else solved this problem neatly?

regards.

-a
 
D

Dan Bensen

anyone out there have good ways of dealing with multithreaded logging?

You mean other than assigning each thread a separate file?
You probably have to either buffer messages more heavily, so they're
interleaved more coarsely, or if you want to write even more code, walk
through the log file each time and insert the message in the appropriate
place.
 
A

Alex Young

anyone out there have good ways of dealing with multithreaded logging? my
current approach uses a meta programming hack/mutex/etc so that, if the
calling thread is the main one, a tid is not written, otherwise it is.
thus,

I, [2006-09-14T15:41:58.394355 #14588] INFO -- : aquired lock.

I, [2006-09-14T15:41:59.225737 #14588] INFO -- : t[-609956564] command
<DDB_GEN_DIR=/dev/shm/fod/gen
DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out
DDB_LOG_DIR=/dev/shm/fod/log DDB_TMP_DIR=/de v/shm/fod/tmp
DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d
DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false
ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt
-0.420875420875421 -S fod 2000.F12_0010909_DS.DAT
2000.F12_0011055_DS.DAT>


shows the first line being logged in the main thread, while the next
line is
logged by tid -609956564. this is ok. but, obviously, log lines become
interleaved at a certian point leading to sections like

I, [2006-09-14T15:43:03.486280 #14588] INFO -- : t[-610194194] command
<DDB_GEN_DIR=/dev/shm/foa/gen
DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/foa/out
DDB_LOG_DIR=/dev/shm/foa/log DDB_TMP_DIR=/dev/shm/foa/tmp
DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d
DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false
ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt
-0.420875420875421 -S foa 2000.F12_0011232_DS.DAT 2000.F12_0011414_DS.DAT>
I, [2006-09-14T15:43:03.487003 #14588] INFO -- : t[-610194894] tmp_dir
</raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/tmp>
I, [2006-09-14T15:43:03.489989 #14588] INFO -- : t[-610194894] command
<DDB_GEN_DIR=/dev/shm/fod/gen
DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out
DDB_LOG_DIR=/dev/shm/fod/log DDB_TMP_DIR=/dev/shm/fod/tmp
DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d
DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false
ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt
-0.420875420875421 -S fod 2000.F12_0011232_DS.DAT
2000.F12_0011414_DS.DAT>
I, [2006-09-14T15:44:18. 45782 #14588] INFO -- : t[-610194894] status <0>
I, [2006-09-14T15:44:18. 46468 #14588] INFO -- : t[-610297924] command
<empty_ols
/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001010754.OIS>
I, [2006-09-14T15:44:18. 47187 #14588] INFO -- : t[-610299344] command
<empty_ols
/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001010936.OIS>
I, [2006-09-14T15:44:18. 50663 #14588] INFO -- : t[-610298484] command
<empty_ols
/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001011118.OIS>
I, [2006-09-14T15:44:18. 53356 #14588] INFO -- : t[-610301254] command
<empty_ols
/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001011300.OIS>
I, [2006-09-14T15:44:25.361170 #14588] INFO -- : t[-610194194] status <0>
I, [2006-09-14T15:44:18. 50663 #14588] INFO -- : t[-610297924] status <0>


here, we can certainly track the status of commands back the the
line/thread
that issued them, but it's pretty painful to to by hand. of course, one
could
easily post-process in order to analyze...

so - has anyone else solved this problem neatly?
I handle this by having each thread *always* drop its thread id into the
logs, whether it's the main thread or not. However, I don't do it as a
number - I write the entry at /usr/share/dict/words corresponding to the
thread id modulo the length of the file. That way, the threads are
identified by a memorable word rather than an unmemorable number, and
grep does the rest. It might not seem ground-breaking, but it makes the
whole thing much more friendly.

I've toyed in the past with assigning each thread a sequential ID, and
altering the log output so that it's actually a CSV file with the log
output for each thread appearing in its own column, but that ended up
being more trouble than it was worth at the time. I can dig the code up
when I'm in the office tomorrow if you're interested, but I'm sure
you're more than capable of replicating it without my help :)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,212
Messages
2,571,101
Members
47,695
Latest member
KayleneBee

Latest Threads

Top