STL & multithreading


Uenal Mutlu

Yes, but still it is application-level locking; at least that is what I call it. What do
you consider application-level locking?

Application-level locking is the opposite of locking on a per-object (atomic) level.
Usually in application-level locking one locks a whole object which can contain
instances of other classes, or one uses a flag etc. But this is inefficient
because it usually locks too much. The less you lock, the better it is for
the performance of the application.
In some cases application-level locking can be the same as atomic object-level
locking, if the object doesn't contain instances of other objects; ie. then both
are doing the same level of locking. But mostly application-level locking
locks unnecessarily much, and therefore it is slower.
BTW, a CriticalSection is a so-called recursive-locking method: a thread which
already holds the lock will not deadlock itself by acquiring a second lock on
the same object.
As non-application-level locking I consider construct-level locking like that provided
by the OpenMP standard. An example I have provided in this thread is this:

#include <vector>

int main()
{
using namespace std;
vector<int> vec(100);

// ...

#pragma omp for
for(vector<int>::size_type i=0; i<vec.size(); ++i)
vec[i] = i;
}


Do you know what happens if you delegate this to multiple threads
created by the application, ie. by the programmer himself
(that is: "vec" shall be shared by the threads)?
void ThreadProc()
{
#pragma omp for
for(vector<int>::size_type i=0; i<vec.size(); ++i)
vec[i] = i;
}

Ie. if you are modifying the vector from more than one application thread.
I would say you will get garbage because there is no synchronization (ie. locking).
What you show us is distributing the job to all available processors or their
logical units. This is of course a good feature too, but it has little to do with the
problem we are discussing in this thread.
 

Uenal Mutlu

No, that's not what application-level locking means. It means designing
the lock strategy with the entire application in mind. For example:

void log(const char *msg)
{
logfile << msg << '\n';
cout << msg << '\n';
}

Now, if you want to insure that both streams get all of their messages
in the same order, using object-level locks you have to first lock the
logfile, then lock cout, then write the text, then unlock the locks. If
you're not married to the notion of object locks, though, it's clear
that simply locking a lock on entry to the function and unlocking it on
exit does the job, and with only one lock instead of two.

As said: it all depends on the business logic.
This is a trivial case. Here, one usually would solve this by using
a lock which is valid for the duration of the log() function since the
requirement is: "insure that both streams get all of their messages
in the same order".
If this requirement weren't in the business logic then object level locking
would be faster because while thread x writes to logfile thread y could
write to cout at the same time.
The bottom line is: object-level locking can be applied to any problem,
whereas application-level locking is obviously too
application-dependent.
 

Pete Becker

Uenal said:
The bottom line is: object-level locking can be applied to any problem

Maybe, although the discussion here hasn't shown that. But assuming that
that's true, the overhead is often excessive. If your application is too
slow all the object-oriented purity in the world won't make your
customers buy it.
 

Axter

Andre said:
Thing is... it's not a new step. Look up the concept of RAII or Resource
Acquisition Is Initialization. The management of any resource (not just
mutexes/critical sections) should be wrapped into a class of some sort
(generally speaking). This goes for files (see fstream), memory (see
std::auto_ptr or boost::shared_ptr), or mutexes (see ACE_Guard<>, if
you're using the ACE libraries, other threading platforms probably have
similar constructs).

Heck.. I recall an article in CUJ a couple of years ago that described a
std::auto_handle which templated against any type to do all of this.

What you're referring to are resource wrapper classes that create a
resource handle in the constructor and close the resource handle in
the destructor.
Part of the ThreadSafeObject wrapper class is similar to that, in that
it creates a lock handle in the constructor, and deletes it in the
destructor.

But that's where the similarities end.

The rest of the logic is far removed from those type of generic
resource wrapper classes.
 

Uenal Mutlu

Maybe, although the discussion here hasn't shown that. But assuming that
that's true, the overhead is often excessive. If your application is too
slow all the object-oriented purity in the world won't make your
customers buy it.

Sorry, this is not true.
If you have multiple threads which must work on the same
container, then I don't know how else you would solve the problem
except by locking on a per-object level for the smallest possible duration.
Here, application-level locking would do nothing but the same,
if not worse (slower).
Access to the object must be done "interlocked", ie. one
thread locks the object, does its operation, and releases the lock.
The next thread does likewise. All compete for the same object at the
same time.
How else would you solve this concurrency problem?
Let me guess: lock the container, perform all operations in a thread,
then release the lock. This is not efficient, not parallel processing,
and not multithreading, since other threads are blocked from access
to the object while the current thread does all its operations.
There is a faster way, and requirements mostly ask for this, esp.
for realtime data processing.
The solution is "interlocked" operations on the object by multiple threads.
And this is usually best achieved using an object lock held for
the smallest duration necessary, ie. for the duration of a method call.
Nothing prevents the user from keeping the lock as long as necessary.
That is: object-level locking allows finer locking granularity, and with it
the same is possible as with application-level locking.
But the opposite is usually not true.
 

Pete Becker

Uenal said:
Sorry, this is not true.
If you have multiple threads which must work on the same
container then I don't know how else you will solve the problem
except locking on a per object level for the smallest possible duration.

And how does this template contribute to locking for the smallest
possible duration? It locks the entire call to a member function, which
is usually not the shortest possible duration. That's why, as I pointed
out elsewhere, Java's synchronized classes are rarely used any more.
They simply are the wrong level of granularity.

There's no simple solution. You have to think.
 

Axter

Pete said:
duration.

And how does this template contribute to locking for the smallest
possible duration? It locks the entire call to a member function, which
is usually not the shortest possible duration. That's why, as I pointed
out elsewhere, Java's synchronized classes are rarely used any more.
They simply are the wrong level of granularity.

There's no simple solution. You have to think.

There's never a simple solution when you think everything is
impossible.

My class is a simple solution, and if you can't see that, it's because
you don't want to.
 

Ioannis Vranos

Uenal said:
Application-level locking is the opposite of locking on a per-object (atomic) level.
Usually in application-level locking one locks a whole object which can contain
instances of other classes, or one uses a flag etc. But this is inefficient
because it usually locks too much. The less you lock, the better it is for
the performance of the application.
In some cases application-level locking can be the same as atomic object-level
locking, if the object doesn't contain instances of other objects; ie. then both
are doing the same level of locking. But mostly application-level locking
locks unnecessarily much, and therefore it is slower.
BTW, a CriticalSection is a so-called recursive-locking method: a thread which
already holds the lock will not deadlock itself by acquiring a second lock on
the same object.


OK, thank you for the clarification.

As non-application-level locking I consider construct-level locking like that provided
by the OpenMP standard. An example I have provided in this thread is this:

#include <vector>

int main()
{
using namespace std;
vector<int> vec(100);

// ...

#pragma omp for
for(vector<int>::size_type i=0; i<vec.size(); ++i)
vec[i] = i;
}



Do you know what happens if you delegate this to multiple threads
created by the application, ie. by the programmer himself
(that is: "vec" shall be shared by the threads)?
void ThreadProc()
{
#pragma omp for
for(vector<int>::size_type i=0; i<vec.size(); ++i)
vec[i] = i;
}

Ie. if you are modifying the vector from more than one application thread.



Well, in this case one should acquire the lock inside this function (in a system that
supports lock-based multithreading). OpenMP is independent of (and lower-level than) a
system's multithreading facilities. Still, although inside a thread, the assignments themselves
will be executed in new, separate threads, taking advantage of the presence of more than one
processor.

OpenMP does not replace a system's multithreading mechanism, it supplements it.


I would say you will get garbage because there is no synchronization (ie. locking).
What you show us is distributing the job to all available processors or logical
units of it. This is of course a good feature too. But it has less to do with the
problem we are discussing in this thread.


Well, if the vector is wrapped (although it is thread-safe) with the proposed wrapper, then
when operator[] was used, the wrapper would acquire the lock, making the rest of the
OpenMP-created threads wait (in the best-case scenario).
 

Uenal Mutlu

Uenal said:
Do you know what happens if you delegate this to multiple threads
created by the application, ie. by the programmer himself
(that is: "vec" shall be shared by the threads)?
void ThreadProc()
{
#pragma omp for
for(vector<int>::size_type i=0; i<vec.size(); ++i)
vec[i] = i;
}

Ie. if you are modifying the vector from more than one application thread.


Well in this case one should acquire the lock inside this function (in a system that
supports lock-based multithreading).


No :)
Inside the loop is the correct answer :)
I would say you will get garbage because there is no synchronization (ie. locking).
What you show us is distributing the job to all available processors or their
logical units. This is of course a good feature too, but it has little to do with the
problem we are discussing in this thread.

Well, if the vector is wrapped (although it is thread-safe) with the proposed wrapper, then
when operator[] was used, the wrapper would acquire the lock, making the rest of the
OpenMP-created threads wait (in the best-case scenario).

Sorry, no. Yes, the class itself is thread-safe, but using it is not thread-safe
per se if it is used from multiple threads concurrently.
You can do on-demand locking for element-wise access.
Then each consumer would lock the object, access an element or call
a single method of the object, and release the lock when
the method finishes (all locking & unlocking done automatically);
so would the next thread. All threads can access the object in an
interlocked fashion. You only need to lock the object inside the
loop with each iteration, not outside the loop.
 

Ioannis Vranos

Uenal said:
void ThreadProc()
{
#pragma omp for
for(vector<int>::size_type i=0; i<vec.size(); ++i)
vec[i] = i;
}

Ie. if you are modifying the vector from more than one application thread.


Well in this case one should acquire the lock inside this function (in a system that
supports lock-based multithreading).



No :)
Inside the loop is the correct answer :)
?

Well, if the vector is wrapped (although it is thread-safe) with the proposed wrapper, then
when operator[] was used, the wrapper would acquire the lock, making the rest of the
OpenMP-created threads wait (in the best-case scenario).


Sorry, no. Yes, the class itself is thread-safe, but using it is not thread-safe
per se if it is used from multiple threads concurrently.



Why? In VC++:

http://msdn.microsoft.com/library/d.../html/vclrfThreadSafetyInStandardCLibrary.asp


I assume this applies here:

"For writes to different objects of the same class, the object is thread safe for writing:

* From one thread when no readers on other threads.
* From many threads."



You can do on-demand locking for element-wise access.
Then each consumer would lock the object, access an element or call
a single method of the object, and release the lock when
the method finishes (all locking & unlocking done automatically);
so would the next thread. All threads can access the object in an
interlocked fashion. You only need to lock the object inside the
loop with each iteration, not outside the loop.


Yes, but the function is performing only this loop, and acquiring and releasing the
lock is somewhat expensive. In addition, an OpenMP directive is used here with which the
programmer guarantees that each element access is independent of the others
(#pragma omp for).


With OpenMP aside, since we modify the entire container it makes more sense to acquire the
lock before the entire loop execution and release it afterwards (well it always depends on
context, it is a logical issue in reality).


Leaving this aside, given the thread-safety guarantee in the above URL for that compiler,
why is it not safe to be used by many threads concurrently without a lock, when each
thread's operations do not affect the other threads, when all threads are reading different
parts of a vector for example? I am talking always about the specific compiler with the
aforementioned thread-safety guarantee.
 

Stephen Howe

Saying it a third time doesn't make it true.
Container locks do work.

They only work (and they are only worthwhile) if you execute one statement
that changes the internals of a container.
As soon as you have multiple executable statements that change the container
and need to be done as one unit to preserve integrity, the container locks
DO NOT work.
If you have a container lock, then that makes the application lock
redundant.
Using an application lock is more of a procedural based logic
mentality.

It does the right thing, and that is all that matters.
If you know you need to apply a lock over several statements, and it cannot
be unlocked before all those statements are executed, and must not be
unlocked between statements - what does the container lock do for this?
Nothing!!!
The application lock here is the "cheapest" lock - which is what is wanted.
This is fine for languages like C, but a more OO-style language should
lean toward a more object-oriented approach.

By setting up the code so the object itself has the lock, you make the
code more OO, and safer than if you used the application-level lock
approach.

Well

(i) It does not always work.
(ii) Find me one vendor that is supplying thread-safe containers.
Among Dinkumware, Gnu, Metrowerks, STLPort, I don't know of any.
- If your approach is so brilliant, why isn't it used anymore?
(iii) It is expensive. You provide more locking - even when it is not
needed.

Stephen Howe
 

Axter

Stephen said:
They only work (and they are only worthwhile) if you execute one statement
that changes the internals of a container.
As soon as you have multiple executable statements that change the container
and need to be done as one unit to preserve integrity, the container locks
DO NOT work.

That is completely incorrect. I've provided an example project that
shows the lock is able to work over multiple types of access that need
integrity preserved between accesses.
Try the code out, and you'll see.
If you still think it's not functional, please post specific code
example where it will break.

It does the right thing, and that is all that matters.

If that's all that mattered, C++ would never have come along.
Maintenance matters too.
If you know you need to apply a lock over several statements, and it cannot
be unlocked before all those statements are executed, and must not be
unlocked between statements - what does the container lock do for this?
Nothing!!!

Again, incorrect. That RefLock can do exactly this. Please read the
code examples provided.

It's clear to me from your above remarks that you either have not
looked at the examples, or you just don't understand the code.
Either way, you're not in a good position to make a judgement call on
it.
 

Uenal Mutlu

Here is my commandline test program which demonstrates how to
use the ThreadSafeObject in an application.
It can be compiled verbatim, or split into its .h and .cpp parts.
Requires the file ThreadSafeObject.h from http://code.axter.com/ThreadSafeObject.h

/*
TestCase1 v1.00

Testing 'ThreadSafeObject' of David Maisonave (http://code.axter.com/ThreadSafeObject.h)

Author  : U.Mutlu (uenal.mutlu at t-online.de)
Date : 050424Su
Compiler: VC++6 and later versions
AppType : Console application

Requirements:
- Get the file ThreadSafeObject.h from http://code.axter.com/ThreadSafeObject.h
- You can move the header part of the source below to TestCase1.h, but it's
not necessary for the purpose of this testing.

Compile and Link:
CL /GX /W3 /MD /Od TestCase1.cpp

What it does:
It populates a vector with 5 million items (each about 8 bytes) by 25
simultaneously running threads (all do the same job, ie. adding items
at the end of the shared vector). The items are ascendingly numbered
so in the end we can check for consistency of the data (see code).
All threads do wait for a signal from the main thread before they
start their jobs.
After the collective job is done the main thread checks for consistency
and prints a report.
Locking of the vector happens "interlocked", ie. only for the duration
of adding a single item. By this method each thread has an equal chance
to access the shared vector.
In the end the report shows how many items each thread has created
and whether the data is consistent or not.

In main() you can change nThreads and nItems.

*/

//--------------------------------------------------------------------
// this should go to TestCase1.h

#ifndef TestCase1_h
#define TestCase1_h

#include <windows.h>
#include "ThreadSafeObject.h" // see http://code.axter.com/ThreadSafeObject.h
#include <cassert>
#include <vector>
#include <iostream>
#include <conio.h> // for kbhit(), getch()


struct TSMyItem
{
int iVal;
int iThr; // creator of the element; 0 to nMaxThreads-1
};

class TCTestCase1
{
public:
TCTestCase1(int AnMaxItems, int AnMaxThreads);
~TCTestCase1();

void Start();
void Stop();
bool IsRunning() { return vectThr.GetLockedObject()->size() > 0; }

int CheckConsistency(bool AfDump = true); // dbg func

private:
const int nMaxThreads;
const int nMaxItems;
volatile bool fBegin, fQuit;

ThreadSafeObject<std::vector<TSMyItem> > vect;
ThreadSafeObject<std::vector<int> > vectThr; // 0 to nMaxThreads-1

static DWORD WINAPI ThreadProc(void* ApPar);
void ThreadProc_sub(int AiThr);

// no need to lock this b/c it is not accessed by multiple threads
std::vector<int> vectItemsPerThread;
};

#endif


//--------------------------------------------------------------------
// this should go to TestCase1.cpp:

// #include "TestCase1.h"


struct TSThrPar
{
int iThr;
void* pThis;
volatile bool fStarted;
TSThrPar(int AiThr, void* ApThis) : iThr(AiThr), pThis(ApThis), fStarted(false) {}
};


TCTestCase1::TCTestCase1(int AnMaxItems, int AnMaxThreads)
: nMaxItems(AnMaxItems),
nMaxThreads(AnMaxThreads),
fBegin(false),
fQuit(false),
vect(new std::vector<TSMyItem>),
vectThr(new std::vector<int>),
vectItemsPerThread(nMaxThreads)
{}

TCTestCase1::~TCTestCase1()
{
Stop();
}

void TCTestCase1::Start()
{
fQuit = false;
fBegin = false;
for (int i = 0;i < nMaxThreads; i++)
{
TSThrPar SPar(i, this);
DWORD dwThreadId;
HANDLE hThread = CreateThread(NULL, 0, ThreadProc, &SPar, 0, &dwThreadId);
while (!SPar.fStarted)
Sleep(1); // wait till thread sets fStarted
CloseHandle(hThread); // the handle is not needed further; the thread keeps running
}
fBegin = true; // threads begin their work after this signal
}

void TCTestCase1::Stop()
{
fQuit = true;
while (vectThr.GetLockedObject()->size())
Sleep(100);
}

void TCTestCase1::ThreadProc_sub(int AiThr)
{
while (!fQuit)
{
// on demand locking on element-wise access need
ThreadSafeObject<std::vector<TSMyItem> >::RefLock lockedVect = vect.GetLockedObject();

TSMyItem S;
S.iVal = lockedVect->size() + 1; // this method helps us to check consistency at the end
S.iThr = AiThr;

if (S.iVal > nMaxItems)
break;

lockedVect->push_back(S);
}
}
DWORD WINAPI TCTestCase1::ThreadProc(void* ApPar)
{
TSThrPar* ApSPar = (TSThrPar*) ApPar;
TCTestCase1& rThis = *(TCTestCase1*) ApSPar->pThis;
int iThr = ApSPar->iThr;
rThis.vectThr.GetLockedObject()->push_back(iThr);
ApSPar->fStarted = true; // tell caller to continue; after this ApSPar becomes invalid
while (!rThis.fBegin)
Sleep(10);
rThis.ThreadProc_sub(iThr);
rThis.vectThr.GetLockedObject()->pop_back(); // only the size() is important...
return 0;
}

int TCTestCase1::CheckConsistency(bool AfDump) // dbg func
{ /* rc: 0=ok, -1=data consistency failed, -2=data was produced by 1 thread only,
-3=test still running (threads not finished)
the vector must be filled by at least 2 threads;
otherwise increase nMaxItems and/or nMaxThreads and try again
*/

if (IsRunning())
return -3; // Stop() must have been invoked before calling this

// clear vector:
size_t i;
for (i = 0; i < vectItemsPerThread.size(); i++)
vectItemsPerThread[i] = 0;

ThreadSafeObject<std::vector<TSMyItem> >::RefLock lockedVect = vect.GetLockedObject();
bool fBad = false;
vectItemsPerThread[lockedVect->at(0).iThr] += 1;
for (i = 1; i < lockedVect->size(); i++)
{
TSMyItem& rS = lockedVect->at(i);
TSMyItem& rSprev = lockedVect->at(i - 1);

// count items created by each thread:
vectItemsPerThread[rS.iThr] += 1;

// checking consistency:
// 1) all items must be in ascending order,
// 2) must not have any gaps,
// 3) and also no dupes
if ((rS.iVal - rSprev.iVal) != 1)
{
fBad = true;
if (!AfDump)
break;
}
}

// dump num items created by each thread:
if (AfDump)
for (i = 0; i < vectItemsPerThread.size(); i++)
std::cout << "Thr" << i << ": " << vectItemsPerThread[i] << std::endl;

if (fBad)
return -1;

// check whether the data was created by more than 1 thread
int cProducer = 0;
for (i = 0; i < vectItemsPerThread.size(); i++)
if (vectItemsPerThread[i])
cProducer++;

return cProducer < 2 ? -2 : 0;
}

//--------------------------------------------------------------------
int main(int argc, char* argv[])
{
const int nItems = 5000000; // about 38 MB
const int nThreads = 25;

std::cout << "Testing 'ThreadSafeObject' of David Maisonave "
"(http://code.axter.com/ThreadSafeObject.h)" << std::endl;

std::cout << "Creating " << nItems << " items using "
<< nThreads << " threads." << std::endl;

TCTestCase1 TC(nItems, nThreads);
TC.Start();

// it will Stop() automatically after nItems were added
// but we can also manually break by pressing the Esc key:
while (TC.IsRunning())
{
if (kbhit() && (getch() == 27))
break;
Sleep(1000);
}

TC.Stop();

int rc = TC.CheckConsistency(true);
if (rc == -1)
std::cout << "Data consistency failed!" << std::endl;
else if (rc == -2)
std::cout << "Data was produced by 1 thread only. "
"Increase nMaxItems and nMaxThreads and try again" << std::endl;
else if (rc == -3)
std::cout << "Some threads still running." << std::endl; // cannot happen
else if (rc == 0)
std::cout << "All tests passed successfully." << std::endl;
else
std::cout << "Unknown return code " << rc << std::endl;

return 0;
}

//--------------------------------------------------------------------
 

Uenal Mutlu

Uenal said:
void ThreadProc()
{
#pragma omp for
for(vector<int>::size_type i=0; i<vec.size(); ++i)
vec[i] = i;
}

Ie. if you are modifying the vector from more than one application thread.

Well in this case one should acquire the lock inside this function (in a system that
supports lock-based multithreading).



No :)
Inside the loop is the correct answer :)


?


Ok, the example code is not well suited for this.
I would recommend taking a look at my TestCase1 posting.
There the locking is done inside the loop (in the thread proc).
By doing so each thread has an equal chance to get access to
the object, and the method is simple and "transparent".
Well, if vector is wrapped (although it is thread-safe) with the proposed wrapper, then
when operator[] was used, the wrapper would acquire the lock, making the rest of the
OpenMP created threads to wait (in the best case scenario).

Sorry, I haven't worked with OpenMP yet, so I cannot comment on this.
Why? In VC++:

Sorry, I really don't understand why this seems to be so hard to understand.
IMO it is so obvious that any object needs to be protected from
being corrupted by multiple threads. This is independent of the STL.
It is a false belief if some vendors say their STL implementation
is thread-safe; saying so is nonsense. It is obvious that the inner workings
of each class must be thread-safe; that is all they mean, and this
means nearly nothing for practical usage. Nobody but the
programmer knows (or has to know) where locking is necessary, and the STL
implementer cannot know what my application logic is; therefore
it is of little use to know that their STL is thread-safe.

Axter has posted a link to his test application. Therein is an option to
disable his locking mechanism, ie. using the STL directly. But then the
application crashes, as expected, since there is no thread-safety.

For example this case:
"For writes to the same object, the object is thread safe for writing from one thread
when no readers on other threads"

It means: you can write only if nobody else is accessing the object.
In practice all threads try to access it, so the consequence is: not thread-safe,
meaning: you need to synchronize access to the object, meaning you need locking.
I assume this applies here:

"For writes to different objects of the same class, the object is thread safe for writing:

* From one thread when no readers on other threads.
* From many threads."

This is for a class object with other objects in it (not container items).
For example a class with two std::vector objects.
Actually you can forget what MS writes. They are lulling you with nonsense,
with such technical-sounding, hard-to-understand terms, only to give people
the false illusion of thread-safety.
Yes, but the function is performing only this loop, and acquiring and releasing the
lock is somewhat expensive.

What other alternative do you have? If you need multiple threads to
have equal access to a shared resource, then you need to lock it.
If you lock before the loop, then the object is locked until the loop
finishes, and thereby all other threads are blocked until the current
lock holder finishes its loop. I would put the lock inside the loop,
so each thread has the chance to continue its job. And by this the
overall performance is IMO better than blocking all the other threads.
With OpenMP aside, since we modify the entire container it makes more sense to acquire the
lock before the entire loop execution and release it afterwards (well it always depends on
context, it is a logical issue in reality).

Yes, true, it depends on the logical issue.
Leaving this aside, given the thread-safety guarantee in the above URL for that compiler,
why is it not safe to be used by many threads concurrently without a lock, when each
thread's operations do not affect the other threads, when all threads are reading different
parts of a vector for example? I am talking always about the specific compiler with the
aforementioned thread-safety guarantee.

If they read different parts then there should be no problem. But in practice
it is hard to do, ie. you have to use partitioning etc.
 

mihai

Very confusing.

So if I want to write a library I must become "paranoiac" and lock
everything or let the user make the correct logical lock for his
application. This applies to STL too; if I must secure my application
then the STL lock is redundant (if it has some). No?
 

Uenal Mutlu

Very confusing.

So if I want to write a library I must become "paranoiac" and lock
everything or let the user make the correct logical lock for his
application. This applies to STL too; if I must secure my application
then the STL lock is redundant (if it has some). No?

It depends on your project. But generally, if there is the possibility
that your data can be modified by more than one thread at the
same time, then you need to take appropriate precautions.

But you can simply say that, for your library to work in a multithreading
environment, the user of the library (the application writer) has to do
the appropriate synchronization himself. The library writer cannot
solve this for him because it is nearly impossible to know what
his business logic is, ie. which user data need to be protected etc.

Normally it is the application writer who wants to take the benefits
of multithreading, so it is his job to make his app mt-safe.
If he does it well then nearly any library can be used safely.
 

Axter

mihai said:
Very confusing.

So if I want to write a library I must become "paranoiac" and lock
everything or let the user make the correct logical lock for his
application. This applies to STL too; if I must secure my application
then the STL lock is redundant (if it has some). No?

No, it's not redundant.

A few of you are missing the distinction between a thread-safe class
and a thread-safe object.
1. Thread Safe Class
2. Thread Safe Object

Thread Safe Class
Let me try to explain what is a thread safe CLASS, and why it's
needed.
Many of you may already know that most implementations for std::string
use reference counters.
The reference counter can be shared between multiple copies of the same
object.
Example:
string text1 = "Hello World";
string text2 = text1;
string text3(text1);

If you ran the above code, on most implementations you'll find that
all of the above variables are pointing to the same buffer, instead of
each one creating its own copy of the string.
If you have this type of std::string implementation, and it is not a
thread-safe class, you can run into problems if you passed text1 to
thread1, text2 to thread2 and text3 to thread3.
You can run into problems if these three different threads try to
simultaneously modify these three different strings at the same time,
since they're sharing reference counters and the same data pointer.
So in order to be able to use three DIFFERENT strings in different
threads, the std::string class needs to be thread safe.

Thread Safe Object
Now lets look at a different scenario.
Say you only have ONE object.
string text1 = "Hello World";

Now let's say you want to use this single one-and-only string in three
different threads, and all three threads need to modify the string and
read from it.
In order to do this in a thread safe way, you need to either wrap this
object in a thread safe wrapper, or you need to add code to each thread
so that it creates a lock before accessing the object, and an unlock
when it's done accessing.
In either case, you're adding code that will make the instance of the
class thread safe. You're not making the std::string class itself
thread safe, you're making the usage of text1 thread safe.

So having a thread safe class is not the same as having a thread safe
instance of that class. A thread safe object wrapper class can not do
the job of a thread safe class, and a thread safe class can not do the
job of a thread safe object wrapper class.
These are two different things, and they're not redundant.

Our main debate here is whether it's better to have code for each
thread that would do the lock and unlock synchronization logic, or just
put the object in a wrapper class that would allow the object itself to
be accessed in a thread-safe way.
IMHO, it would require more maintenance to add the code to each thread,
or to add the code at the application level. Also IMHO, this method
would lead to more bugs.

IMHO, by using the Thread Safe Wrapper class you would have less
maintenance and less bugs.

Also by using the wrapper class method, you encapsulate the code that
is being used to lock and unlock the object. That means if you want to
port your code to another platform, all you have to do is modify the
wrapper class, instead of having to modify each thread, or the entire
application base thread logic.
 

Ioannis Vranos

Uenal said:
Sorry, I haven't worked with OpenMP yet, so I cannot comment on this.


http://www.openmp.org. You can download the standard for free.


Upcoming VC++ 2005 supports OpenMP 2, and the current Intel C++ compiler supports it too. It
is a multiplatform, portable standard and has nothing to do with application logic; it
is structure-based.

Sorry, I really don't understand why this seems to be so hard to understand.


Perhaps because I do not know Win32/MFC. But I will provide .NET examples below. :)


For example this case:
"For writes to the same object, the object is thread safe for writing from one thread
when no readers on other threads"

It means: you can write only if there is nobody else accessing the object.


Actually it is case by case. The "From many threads" for writing below is a separate case.
I will provide .NET code demonstrating that below.


In practice all threads try to access it, so the consequence is: not thread safe,
meaning: you need to synchronize access to the object, meaning you need locking.




This is for a class object with other objects in it (not container items).


Actually the page mentions containers in the beginning.

For example a class with two std::vector objects.
Actually you can forget what MS writes. They are lulling you with nonsense,
with technical-sounding, hard-to-understand terms, only to give people
the false illusion of thread-safety.


Here is .NET code writing to a vector with 5 separate threads. Each thread writes to a
separate block of the vector and no thread locks are used. The output looks OK, so the


"For writes to different objects of the same class, the object is thread safe for writing:

* From one thread when no readers on other threads.
==> * From many threads."


looks like it applies here.



#using <mscorlib.dll>

#include <vector>
#include <iostream>


__gc class SomeClass
{
    std::vector<int> *pvec;

public:
    SomeClass()
    {
        pvec= new std::vector<int>(1000);
    }

    ~SomeClass()
    {
        delete pvec;
    }

    void Write1()
    {
        for(std::vector<int>::size_type i=0; i<200; ++i)
            (*pvec)[i]= i;
    }

    void Write2()
    {
        for(std::vector<int>::size_type i=200; i<400; ++i)
            (*pvec)[i]= i;
    }

    void Write3()
    {
        for(std::vector<int>::size_type i=400; i<600; ++i)
            (*pvec)[i]= i;
    }

    void Write4()
    {
        for(std::vector<int>::size_type i=600; i<800; ++i)
            (*pvec)[i]= i;
    }

    void Write5()
    {
        for(std::vector<int>::size_type i=800; i<1000; ++i)
            (*pvec)[i]= i;
    }

    void DisplayValues()
    {
        using namespace std;

        for(vector<int>::iterator p= pvec->begin(); p!= pvec->end(); ++p)
            cout<<*p<<"\t";
    }
};


int main()
{
    using namespace System;
    using namespace System::Threading;
    using namespace std;


    SomeClass *pSomeClass= __gc new SomeClass;

    Thread *pthread1= __gc new Thread(__gc new ThreadStart(pSomeClass, &SomeClass::Write1));
    Thread *pthread2= __gc new Thread(__gc new ThreadStart(pSomeClass, &SomeClass::Write2));
    Thread *pthread3= __gc new Thread(__gc new ThreadStart(pSomeClass, &SomeClass::Write3));
    Thread *pthread4= __gc new Thread(__gc new ThreadStart(pSomeClass, &SomeClass::Write4));
    Thread *pthread5= __gc new Thread(__gc new ThreadStart(pSomeClass, &SomeClass::Write5));


    pthread1->Start();
    pthread2->Start();
    pthread3->Start();
    pthread4->Start();
    pthread5->Start();


    // Main thread waits for some time to let the other threads finish
    Thread::Sleep(5000);

    pSomeClass->DisplayValues();
}


C:\c>temp
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
[... every element equals its index; output continues in order through ...]
990 991 992 993 994 995 996 997 998 999

C:\c>


What other alternative do you have? If you need that multiple threads
have equal access to a shared resource then you need to lock it.


Yes, if it is one object; however, the story is different when we are dealing with separate
elements of a container, which is our subject (STL).
 

Axter

Ioannis said:
[snip - full quoted post, including the .NET code and its output, unchanged from above]


Of course no thread locking is required for this type of code, because
your threads are not accessing the same data.
Each thread has its own block of data.
Thread synchronization is needed when you're accessing and modifying the
same chunk of data.
This is not an example of such a requirement, and therefore there is no
need for a ThreadSafeObject wrapper class, nor for an application-level
lock.
 

Ioannis Vranos

Axter said:
[snip - .NET code quoted unchanged from above]

Of course no thread locking is required for this type of code, because
your threads are not accessing the same data.
Each thread has its own block of data.
Thread synchronization is needed when you're accessing and modifying the
same chunk of data.
This is not an example of such a requirement, and therefore there is no
need for a ThreadSafeObject wrapper class, nor for an application-level
lock.




Exactly. Still, the vector, along with the rest of the standard library in the specific
compiler, is "thread-safe" in the sense that it allows multithreading operations on it.

If the vector wasn't "thread-safe", then the above assignments would not be well defined.


In the case of accessing the same data, of course thread locking is needed, but that
belongs in the application's logic.


Wrapping the entire vector, however, and acquiring the lock on each access to a different
element would be inefficient, wouldn't it?
 
