Stream thread-safety

Yang Zhang

I know that std streams are not thread-safe (i.e., output can end up
becoming interleaved), but do they share mutable state? (At least in
libstdc++ 4.2 on Linux kernel 2.6.x?)

I just spent a long time tracking down a bug in my program where,
given enough threads, eventually all cout<<int would print in hex
rather than dec, even though I never used stream formatters. couts in
this program are called from multiple threads, and some couts around
the middle of my program ended up interleaved - couts before it are
fine, and later ones end up in hex. After removing the contentious
couts in the middle, the problem went away - no more hex.

Could this have been an effect of using the streams from multiple
threads? I've been searching a long time for memory bugs, but haven't
come up with anything. Thanks in advance for any hints.
 
James Kanze

> I know that std streams are not thread-safe (i.e., output can
> end up becoming interleaved), but do they share mutable state?
> (At least in libstdc++ 4.2 on Linux kernel 2.6.x?)

That's very implementation defined. They certainly depend at
least a little on shared mutable state, since they use dynamic
memory.

It shouldn't matter, however. Unless there's an error in the
implementation, if the implementation is given as being usable
in a multithreaded environment (which is what thread-safe
means), it's not your problem. You only need to synchronize if
several threads are accessing the same stream.

> I just spent a long time tracking down a bug in my program where,
> given enough threads, eventually all cout<<int would print in hex
> rather than dec, even though I never used stream formatters. couts in
> this program are called from multiple threads, and some couts around
> the middle of my program ended up interleaved - couts before it are
> fine, and later ones end up in hex. After removing the contentious
> couts in the middle, the problem went away - no more hex.

Obviously, all accesses to cout must be synchronized externally
if cout is used by more than one thread. Otherwise, it's your
program which isn't thread safe, not cout. And just as
obviously, it is the responsibility of whoever modifies a
formatting flag to restore it after they've finished---and
before they free up any synchronization.

> Could this have been an effect of using the streams from
> multiple threads?

Not if the threads are using the object correctly. If each
thread acts as if it were the only thread using the resource,
however, anything could happen.
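
As a rough illustration of the two obligations described above (lock around every use of the shared stream, and restore any formatting flag before releasing the lock), here is a minimal sketch. The names `g_out_mutex`, `FlagSaver`, `print_id` and `demo` are hypothetical, and a `std::ostringstream` stands in for `std::cout` so the result can be inspected:

```cpp
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

std::mutex g_out_mutex;  // hypothetical: one lock for every use of the stream

// Hypothetical RAII helper: remembers the format flags and puts them back.
class FlagSaver {
public:
    explicit FlagSaver(std::ios_base& s) : stream_(s), flags_(s.flags()) {}
    ~FlagSaver() { stream_.flags(flags_); }
    FlagSaver(const FlagSaver&) = delete;
    FlagSaver& operator=(const FlagSaver&) = delete;
private:
    std::ios_base& stream_;
    std::ios_base::fmtflags flags_;
};

void print_id(std::ostream& out, int id) {
    std::lock_guard<std::mutex> lock(g_out_mutex);
    FlagSaver saver(out);  // destroyed before 'lock', so flags are restored first
    out << "id " << id << " = 0x" << std::hex << id << '\n';
}

std::string demo() {
    std::ostringstream out;  // stands in for std::cout
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i)
        threads.emplace_back([&out] { print_id(out, 255); });
    for (auto& t : threads) t.join();
    out << 255 << '\n';  // decimal again, because the flags were restored
    return out.str();
}
```

Because `saver` is declared after `lock`, it is destroyed first, so no other thread can observe the stream while `std::hex` is still set.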
 
Chris Thomasson

I know that std streams are not thread-safe (i.e., output can
end up becoming interleaved), but do they share mutable state?
(At least in libstdc++ 4.2 on Linux kernel 2.6.x?)

That's very implementation defined. They certainly depend at
least a little on shared mutable state, since they use dynamic
memory.

It shouldn't matter, however. Unless there's an error in the
implementation, if the implementation is given as being usable
in a multithreaded environment (which is what thread-safe
means), [...]

Thread-Safe in what context? basic or strong?
 
Chris Thomasson

DAMN,SHI%$###!

The quoted text is totally screwed! Let me try and fix that nonsense bastar$
garbage:


Chris Thomasson said:
I know that std streams are not thread-safe (i.e., output can
end up becoming interleaved), but do they share mutable state?
(At least in libstdc++ 4.2 on Linux kernel 2.6.x?)

That's very implementation defined. They certainly depend at
least a little on shared mutable state, since they use dynamic
memory.

It shouldn't matter, however. Unless there's an error in the
implementation, if the implementation is given as being usable
in a multithreaded environment (which is what thread-safe
means), [...]

Original question by CT in response to JK:

'Thread-Safe in what context? basic or strong?'


I am so VERY sorry for the horrible syntax contained within my first response
to James Kanze within this thread!


;^(...
 
James Kanze

> [...]
> Thread-Safe in what context? basic or strong?

I'm not familiar with those terms. Thread safe means,
basically, specifying how objects of the class must behave in a
multithreaded environment. If a class has a contract which
specifies what the user must do in a multithreaded environment,
it is thread safe. If it doesn't it isn't.

Obviously, since there is a contract, the user must uphold his
end of it as well, or you have undefined behavior. But "thread
safety" is largely a question of specification and
documentation, not of any particular code behavior. (Although I
suppose one could argue that if the contract said that all use
of the class must be synchronized externally, including use of
two different instances, that the class wasn't thread safe.)
 
Chris Thomasson

> [...]
> I'm not familiar with those terms. Thread safe means,
> basically, specifying how objects of the class must behave in a
> multithreaded environment.
> [...]

Darn... I was thinking of something else.


To the OP: in order to sync stream objects, you need a level of coarse-grain
external sync... *Or* here is a possible scheme that can increase
granularity and scalability:


http://groups.google.com/group/comp.programming.threads/msg/3f0362ba3da48d0b






That being said:
__________________________________________________________________

http://www.boost.org/libs/smart_ptr/shared_ptr.htm#ThreadSafety
(the "Any other simultaneous accesses result in undefined behavior." clause
contained therein means that shared_ptr, __as-is__, only follows the
"basic/normal" thread-safety level)


Anyway, here is a fairly lengthy discussion on the difference between
basic/normal and strong thread-safety levels:


http://groups.google.com/group/comp.programming.threads/browse_frm/thread/e5167941d32340c6
(this thread substitutes the thread-safety level term 'normal' with 'basic',
and vice versa, as they are one and the same...)




For some working examples, the following code can handle both basic _and_
strong models:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/3c763698da537d8f

http://home.comcast.net/~appcore/vzoom/refcount

http://atomic-ptr-plus.sourceforge.net

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/177609a5eff4a466
(POSIX based version)




Does that help clear some things up? If not, perhaps we can enlighten some
of the people who lurk around this group...


Any questions?
 
James Kanze

James Kanze said:
>> [...]
>>> Thread-Safe in what context? basic or strong?
>> I'm not familiar with those terms. Thread safe means,
>> basically, specifying how objects of the class must behave in a
>> multithreaded environment.
>
> Darn... I was thinking of something else.
> To the OP: in order to sync stream objects, you need a level
> of coarse-grain external sync... *Or* here is a possible
> scheme that can increase granularity and scalability:

That's one solution. For logging, I generally use a temporary
instance of a wrapper class---the constructor acquires the lock
(and also sets up various other things, like ensuring a time
stamp and other standard information appears at the start of the
record), and the final destructor frees it.

This has the advantage that the tests whether logging is active
at the desired level occur immediately---since they only access
read-only data, no lock is necessary, and if logging is not
active, no lock is acquired, nor is much of anything else done.
Which means that you can throw in debug logging statements all
over the place without noticeably slowing the application down.

> That being said:
> __________________________________________________________________
> http://www.boost.org/libs/smart_ptr/shared_ptr.htm#ThreadSafety
> (the "Any other simultaneous accesses result in undefined
> behavior." clause contained therein means that shared_ptr,
> __as-is__, only follows the "basic/normal" thread-safety level)

Note that shared_ptr pose a particular problem, because part of
the state is shared between different instances, and the client
code may not be aware of this. So you have to define your
guarantees very carefully.

> Anyway, here is a fairly lengthy discussion on the difference
> between basic/normal and strong thread-safety levels:
> http://groups.google.com/group/comp.programming.threads/browse_frm/th...
> (this thread substitutes the thread-safety level term 'normal'
> with 'basic', and vice versa, as they are one and the same...)
> For some working examples, the following code can handle both
> basic _and_ strong models:
>
> http://groups.google.com/group/comp.programming.threads/browse_frm/th...
> (POSIX based version)
> Does that help clear some things up? If not, perhaps we can
> enlighten some of the people who lurk around this group...
> Any questions?

My main point was that whatever level you decide to implement
(and there are definitely more than two possible levels), what
makes code "thread safe" is the contract. Correctly documented
and implemented in consequence, even something like localtime()
is thread safe. (Posix, of course, chose the option to NOT make
any thread related contract apply here. A reasonable decision:
given the interface, any possible contract would be extremely
constraining on the user, and offering a new function, with a
contract more in line with the general principles of Posix, is
probably a better solution.)
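
For illustration, the POSIX function with the more usual contract is `localtime_r`, which writes into a buffer supplied by the caller instead of a shared static one. A small sketch follows; the helper name `format_local` is hypothetical:

```cpp
#include <time.h>   // POSIX: localtime_r, strftime
#include <string>

// Hypothetical helper: formats a time_t using only caller-owned storage,
// so concurrent calls from different threads do not share a result buffer.
std::string format_local(time_t when) {
    struct tm parts;  // per-call buffer, not shared between threads
    if (localtime_r(&when, &parts) == nullptr) return "";
    char buf[32];
    size_t n = strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", &parts);
    return std::string(buf, n);
}
```

With plain `localtime()` the returned pointer may refer to static storage, which is why no useful multithreaded contract fits its interface.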
 
Chris Thomasson

> On Feb 13, 6:51 am, "Chris Thomasson" <[email protected]> wrote:
>> [...]
>>>> Thread-Safe in what context? basic or strong?
>>> I'm not familiar with those terms. Thread safe means,
>>> basically, specifying how objects of the class must behave in a
>>> multithreaded environment.
>>> [...]
>> Darn... I was thinking of something else.
>> To the OP: in order to sync stream objects, you need a level
>> of coarse-grain external sync... *Or* here is a possible
>> scheme that can increase granularity and scalability:
>> http://groups.google.com/group/comp.programming.threads/msg/3f0362ba3...
>
> That's one solution. For logging, I generally use a temporary
> instance of a wrapper class---the constructor acquires the lock
> (and also sets up various other things, like ensuring a time
> stamp and other standard information appears at the start of the
> record), and the final destructor frees it.

Yup. You are referring to an on-stack logger instance. I also want an on-stack
log-producer, with a log-consumer thread on the other side. I don't want log
threads to execute final output commands. I want to use a shared monotonic
counter to assign log-entries an ordered timestamp (e.g., think Lamport). The
log-consumer will be able to query its registered threads for log info;
order any results; then output. The advantage is that you have low-overhead
log-producers, and totally ordered log output wrt the dedicated log-consumer
logic.


> This has the advantage that the tests whether logging is active
> at the desired level occur immediately---since they only access
> read-only data, no lock is necessary, and if logging is not
> active, no lock is acquired, nor is much of anything else done.
> Which means that you can throw in debug logging statements all
> over the place without noticeably slowing the application down.


BTW, can your read-only data wrt logging rules mutate over the application
lifetime? The solution above has per-thread rules, and a GUI application can
issue writes into the PDR reader-pattern based distributed log-info (e.g.,
per-thread info that is)...

The log-level validation information can be per-thread, or in a global
lock-free PDR protected data-structure (e.g., think RCU for a moment) such
that log-threads are readers and log-level mutation operations would execute
on writers. This would allow for a GUI application to dynamically set
logging rules on a per-thread basis at runtime. Search
comp.programming.threads for PDR for more info...
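
For the simplest per-thread case, where the rules fit in a single word, a plain atomic is enough and no PDR machinery is needed. The sketch below shows only that reduced case; `ThreadRules`, `kBasic` and `kImportant` are hypothetical names, and a full PDR/RCU-protected structure would be needed for rules larger than one word:

```cpp
#include <atomic>

// Hypothetical level bits, in the bit-mask style used for log levels.
constexpr unsigned kBasic = 0x1;
constexpr unsigned kImportant = 0x2;

// Hypothetical per-thread rules word: a writer (e.g. a GUI thread) stores
// a new mask, and the owning thread loads it before each log call.
struct ThreadRules {
    std::atomic<unsigned> mask{kBasic | kImportant};

    // Reader side: a single atomic load, no lock.
    bool enabled(unsigned level) const {
        return (mask.load(std::memory_order_acquire) & level) != 0;
    }
    // Writer side: publish a new mask for this thread.
    void set(unsigned new_mask) {
        mask.store(new_mask, std::memory_order_release);
    }
};
```

A logging thread would call `rules.enabled(kBasic)` before formatting anything, so a disabled level costs one load and one branch.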



Hmm. Well, I would expect something very much like:
___________________________________________________________________
struct per_thread_log {
  atomicword_rules level;
  per_thread_log* next;
  per_thread_log* prev;
};


struct per_thread {
  per_thread_log* next;
  atomicword_rules level;
  single_producer_consumer_queue logq;
};


struct log_thread {
  per_thread_logging rules;
  per_thread_log* head;
  per_thread_log* tail;
};


/* assumed: one logging entry point, named to match the calls below */
extern void log(int level, const char* format, ...);



#define LOG_LEVEL_BASIC() 0x1
#define LOG_LEVEL_IMPORTANT() 0x2
#define LOG_LEVEL_FATAL() 0x4



void* some_thread(void* state) {
  log(LOG_LEVEL_BASIC(), "%s\n", "Started...");
  if (foo(state) == ERROR_666()) {
    log(LOG_LEVEL_FATAL(),
        "%s\n", "Stopped...");
  }
  log(LOG_LEVEL_BASIC(), "%s\n", "Stopped...");
  return 0;
}
___________________________________________________________________



[...]
 
Chris Thomasson

I accidentally sent that too early!


[...]

Fixing:
Hmm. Well, I would expect something very much like:
___________________________________________________________________
struct per_thread_log {
  atomicword_rules level;
  per_thread_log* next;
  per_thread_log* prev;
};


struct per_thread {
  per_thread_log* next;
  ^^^^^^^^^^^^^^^^^^^^^^
  rename per_thread::next -> per_thread::cache;

  atomicword_rules level;
  single_producer_consumer_queue logq;
};


struct log_thread {
  per_thread_logging rules;
  per_thread_log* head;
  per_thread_log* tail;
};
^^^^^^^^^^^^^^^^^^^^^^
the struct above needs to be removed.

[...]


An example implementation of my wait-free single-producer/consumer
per_thread queue which can hold per_thread_log objects in the example (e.g.,
'per_thread::logq' member) can be found here:

http://appcore.home.comcast.net



Any comments on this approach?
 
James Kanze

"James Kanze" <[email protected]> wrote in message
>> On Feb 13, 10:49 am, "Chris Thomasson" <[email protected]>
>> wrote:
>>> [...]
>>>>> Thread-Safe in what context? basic or strong?
>>>> I'm not familiar with those terms. Thread safe means,
>>>> basically, specifying how objects of the class must behave in a
>>>> multithreaded environment.
>>>> [...]
>>> Darn... I was thinking of something else.
>>> To the OP: in order to sync stream objects, you need a level
>>> of coarse-grain external sync... *Or* here is a possible
>>> scheme that can increase granularity and scalability:
>>> http://groups.google.com/group/comp.programming.threads/msg/3f0362ba3....
>> That's one solution. For logging, I generally use a temporary
>> instance of a wrapper class---the constructor acquires the lock
>> (and also sets up various other things, like ensuring a time
>> stamp and other standard information appears at the start of the
>> record), and the final destructor frees it.
>
> Yup. You are referring to an on-stack logger instance. I also want an on-stack
> log-producer, with a log-consumer thread on the other side. I don't want log
> threads to execute final output commands. I want to use a shared monotonic
> counter to assign log-entries an ordered timestamp (e.g., think Lamport). The
> log-consumer will be able to query its registered threads for log info;
> order any results; then output. The advantage is that you have low-overhead
> log-producers, and totally ordered log output wrt the dedicated log-consumer
> logic.

I've used both solutions at different times. With everything
done in the thread generating the log, you hold the lock a lot
longer, which can significantly affect latency. On the other
hand, in case of a crash or such, any logs triggered before the
crash in the thread which crashed are guaranteed to have been
output. It's a bit more reliable for debugging. (In one case,
the logging thread was in another process, communicating via
shared memory, precisely to avoid this problem while still using
the log-consumer thread model.)

> BTW, can your read-only data wrt logging rules mutate over the application
> lifetime? The solution above has per-thread rules, and a GUI application can
> issue writes into the PDR reader-pattern based distributed log-info (e.g.,
> per-thread info that is)...

To date, I've not had to handle the case where the configuration
data could be changed while running AND we had multi-threading.
I've thought about it, though---the only solution that came to
mind immediately was to use a rwlock, held for read by all of
the threads, except when they are going to block; the thread
which will modify the configuration acquires it for write.

I don't want to use a mutex around the check whether a specific
level of logging is active; that check should be as fast as
possible.
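
A rough sketch of the read/write-lock idea, assuming C++17's `std::shared_mutex`; the names `LogConfig`, `g_config_lock`, `count_active` and `set_threshold` are hypothetical. The read lock is held across a whole unit of work, so the individual level checks inside it need no locking of their own:

```cpp
#include <mutex>
#include <shared_mutex>

struct LogConfig { int threshold = 1; };  // hypothetical configuration

std::shared_mutex g_config_lock;
LogConfig g_config;

// Worker side: hold the read lock across a whole unit of work so that
// each level check inside it is a plain read.
int count_active(const int* levels, int n) {
    std::shared_lock<std::shared_mutex> read(g_config_lock);
    int active = 0;
    for (int i = 0; i < n; ++i)
        if (levels[i] >= g_config.threshold) ++active;  // no per-check lock
    return active;
}

// Configuration side: the write lock waits until no reader holds the lock.
void set_threshold(int threshold) {
    std::unique_lock<std::shared_mutex> write(g_config_lock);
    g_config.threshold = threshold;
}
```

The cost is that a configuration change is delayed until every worker reaches a point where it releases the read lock, for example just before it blocks.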

> The log-level validation information can be per-thread, or in a global
> lock-free PDR protected data-structure (e.g., think RCU for a moment) such
> that log-threads are readers and log-level mutation operations would execute
> on writers. This would allow for a GUI application to dynamically set
> logging rules on a per-thread basis at runtime. Search
> comp.programming.threads for PDR for more info...

If the need arises, I'll certainly research the topic; I'm sure
that there are better solutions than the one off the top of my
head. (There are a lot of issues that I've not investigated,
simply because I've not needed them yet.)
 
