Implementation of shared_ptr

fungus · Jan 29, 2009

This is a question about the implementation of shared_ptr:

Does shared_ptr allocate a little control block to hold the reference
count?

If so, this means double the number of memory allocations
(one for the object, one for the pointer).

If not, how does it work?

Bob Hairgrove · Jan 29, 2009

fungus said:
This is a question about the implementation of shared_ptr:

Does shared_ptr allocate a little control block to hold the reference
count?

If so, this means double the number of memory allocations
(one for the object, one for the pointer).

If not, how does it work?

Did you look at the source code? The Boost libraries are open source, AFAIK.

fungus · Jan 29, 2009

Did you look at the source code?

I thought somebody would have a simple yes/no answer.

FTB.

Kai-Uwe Bux · Jan 29, 2009

fungus said:
This is a question about the implementation of shared_ptr:

Does shared_ptr allocate a little control block to hold the reference
count?

If so, this means double the number of memory allocations
(one for the object, one for the pointer).

That is an obvious way to implement non-intrusive reference counting. Note,
however, that it's not as bad as you make it sound. You use a shared_ptr
when you have multiple pointers to the same pointee. Now, if n shared_ptr
objects share the same pointee, you still have just 2 memory blocks for the
whole thing. This overhead is usually acceptable when you need shared
ownership. If you don't, it could be that you are misusing shared_ptr
anyway.

If not, how does it work?

There are variants of how to implement shared_ptr. A common way is to have a
little control block. This can hold the reference count, the deleter, and
the pointer to the pointee (which means dereferencing causes double
indirection), or just the reference count and the custom deleter, in which
case the pointer itself is another member of shared_ptr (doubling its
size).

An alternative (ignoring the custom deleter) is something like this:

template < typename T >
class linked_ptr {
T * pointer;
mutable linked_ptr * next;
public:
...
};

Here the shared_ptr objects for a given pointee form a circular linked
singly list (maybe, a variant with a doubly linked list is easier). Of
course, getting the refcount requires counting and is expensive, but
detecting whether an element is last and therefore deletion of the_pointer
is required would be cheap. Note that the next field does not require
allocation. However, you have now 2n pointers for n linked_ptr objects but
only a single dynamic memory allocation. I don't know whether this is used
in shared_ptr implementations, though.

Best

Kai-Uwe Bux

Kai-Uwe Bux · Jan 29, 2009

Kai-Uwe Bux said:
fungus said:

This is a question about the implementation of shared_ptr:

Does shared_ptr allocate a little control block to hold the reference
count?

Click to expand...

[snip]
An alternative (ignoring the custom deleter) is something like this:

template < typename T >
class linked_ptr {
T * pointer;
mutable linked_ptr * next;
public:
...
};

Here the shared_ptr objects for a given pointee form a circular linked
singly list (maybe, a variant with a doubly linked list is easier).

[snip]

Thinking about the assignment operator now, a doubly linked list is
definitely easier.

Best

Kai-Uwe Bux

fungus · Jan 29, 2009

Thinking about the assignment operator now, a doubly linked list is
definitely easier.

I was also thinking of schemes like this, there's many
ways of doing it.

At a practical level though, I was more interested in what
current implementations do. If this is going to be the
"standard" reference counted pointer then this choice
is very important.

FTB.

Kai-Uwe Bux · Jan 29, 2009

fungus said:
I was also thinking of schemes like this, there's many
ways of doing it.

At a practical level though, I was more interested in what
current implementations do.

Just a data point: I had a look at tr1 shipping with g++. It allocated a
small control block. Since this is based upon Boost, I think that Boost
uses this technique, too.

If this is going to be the "standard" reference counted pointer then this
choice is very important.

I am not sure whether there will be a uniform implementation. The standard
clearly leaves some room for vendor specific implementations; and vendors
might use this to compete.

Best

Kai-Uwe Bux

fungus · Jan 29, 2009

That is an obvious way to implement non-intrusive reference counting.

On a related note: Is there a proposal for intrusive
pointers on C++0x? I just did a quick google and
couldn't see one.

PS: I found a performance comparison of different schemes here:

http://www.boost.org/doc/libs/1_37_0/libs/smart_ptr/smarttests.htm

Reading between the lines it seems that the "allocate a little
control block" method is favored.

FTB.

Marcel Müller · Jan 29, 2009

Hi!

This is a question about the implementation of shared_ptr:

Does shared_ptr allocate a little control block to hold the reference
count?
Yes.

If so, this means double the number of memory allocations
(one for the object, one for the pointer).

If you want to come around this you should have a look at intrusive_ptr.
It looks a bit complicated to implement the free functions
intrusive_ptr_add_ref and intrusive_ptr_release. But there is a very
simple and generic solution: write a simple abstract base class for the
reference count.

Something like that:

class ref_count
{ friend void intrusive_ptr_add_ref(ref_count*);
friend void intrusive_ptr_release(ref_count*);
private:
unsigned count;
protected:
ref_count : count(0) {}
virtual ~ref_count() {}
public:
bool ref_is_managed() { return ref_count != 0; }
bool ref_is_unique() { return ref_count == 1; }
// Only a return value of true is thread-safe.
};

void intrusive_ptr_add_ref(ref_count* ref)
{ interlocked_increment(&ref->count);
}

void intrusive_ptr_release(ref_count* ref)
{ interlocked_decrement(&ref->count);
if (!ref->is_managed())
delete ref;
}

Now all you have to do for an intrusive reference count is to inherit
from ref_count. This will add the size of the reference count to your
data object, and the intrusive pointer instances are binary identical to
ordinary C pointers. Of course, they are semantically different.
Whether the virtual table pointer of ref_count takes additional space
depends. If your object is polymorphic anyway and you derive from
ref_count without multiple inheritance, there is usually no additional
space required.

If you do not want ref_count to have to have virtual functions all you
have to change is to move the delete operation in a template function
that is aware of the real type of your object. In this case you can
safely downcast ref_count* to your type and delete the result. This will
allow you to use non-polymorphic objects with intrusive_ptr. Of course,
the destructor of ref_count has to be non-virtual in this case.

Note that the above implementation is thread-safe, as long as you
provide appropriate interlocked increment and decrement functions.

Marcel

Slot · Jan 29, 2009

This is a question about the implementation of shared_ptr:

Does shared_ptr allocate a little control block to hold the reference
count?

If so, this means double the number of memory allocations
(one for the object, one for the pointer).

If not, how does it work?

If you have the book, "The C++ standard library, a tutorial and
reference", look at page 222. There is an implementation of a
"counted_ptr" that reserves a little block for reference counting.

Juha Nieminen · Jan 29, 2009

Marcel said:
Yes.

In fact, boost::shared_ptr allocates a block which is rather larger
than one reference count (because it has to support, among other things,
a deleter function and weak pointers, and it must be thread-safe, so it
requires a mutex).

Something like that:

class ref_count
{ friend void intrusive_ptr_add_ref(ref_count*);
friend void intrusive_ptr_release(ref_count*);
private:
unsigned count;
protected:
ref_count : count(0) {}
virtual ~ref_count() {}
public:
bool ref_is_managed() { return ref_count != 0; }
bool ref_is_unique() { return ref_count == 1; }
// Only a return value of true is thread-safe.
};

void intrusive_ptr_add_ref(ref_count* ref)
{ interlocked_increment(&ref->count);
}

void intrusive_ptr_release(ref_count* ref)
{ interlocked_decrement(&ref->count);
if (!ref->is_managed())
delete ref;
}

Now all you have to do for an intrusive reference count is to inherit
from ref_count.

Your ref_count class lacks a proper copy constructor and assignment
operator, which means that it will fail spectacularly if objects
inherited from it are copied/assigned around (if the reference count in
the source and target are different).

This is an extremely typical mistake with intrusive reference
counting. It's trivial to fix, though.

Igor R. · Jan 29, 2009

Does shared_ptr allocate a little control block to hold the reference

count?

If so, this means double the number of memory allocations
(one for the object, one for the pointer).

IIRC, boost::make_shared performs single allocation for the both.

Marcel Müller · Jan 29, 2009

Juha said:
Your ref_count class lacks a proper copy constructor and assignment
operator, which means that it will fail spectacularly if objects
inherited from it are copied/assigned around (if the reference count in
the source and target are different).

You are right. The objects that I protect by a class like this have an
application wide primary key. So they are non-copyable anyway, because
if I copy them the key would no longer be unique.

This is an extremely typical mistake with intrusive reference
counting. It's trivial to fix, though.

class ref_count
{ friend void intrusive_ptr_add_ref(ref_count*);
friend void intrusive_ptr_release(ref_count*);
private:
unsigned count;
protected:
ref_count() : count(0) {}
ref_count(const ref_count&) : count(0) {}
virtual ~ref_count() {}
ref_count& operator=(const ref_count&) { return *this; }
public:
bool ref_is_managed() { return ref_count != 0; }
bool ref_is_unique() { return ref_count == 1; }
// Only a return value of true is thread-safe.
};

But maybe it makes more sense to derive from boost::non_copyable
instead, because copying reference counted objects is likely to be not
what you intended. A derived class may still implement copy and
assignment semantics explicitly.

But intrusive reference counting is still a extremely lightweight method
for moderate requirements. The runtime overhead is quite small. OK the
interlocked access to the reference counter does not scale that good on
x86/x64 SMP machines, but this is more related to x86/x64 than to the
method.

Marcel

Emil Dotchevski · Jan 30, 2009

This is a question about the implementation of shared_ptr:

Does shared_ptr allocate a little control block to hold the reference
count?

If so, this means double the number of memory allocations
(one for the object, one for the pointer).

If not, how does it work?

The correct attitude is that you shouldn't care if it allocates one or
two memory blocks, unless you have evidence (proof) that some
shared_ptr instances allocations are causing problems for you. At that
time, you can resort to custom allocators and deleters to fine tune
exactly how memory is allocated, including not allocating dynamic
memory at all.

Emil Dotchevski
http://www.revergestudios.com/reblog/index.php?n=ReCode

fungus · Jan 30, 2009

But maybe it makes more sense to derive from boost::non_copyable
instead, because copying reference counted objects is likely to be not
what you intended. A derived class may still implement copy and
assignment semantics explicitly.

In my own code I've been using intrusive reference counting
and inheritance to sort out the details. My classes look
like this:

class foo : public refcountable {
....
};

"refcountable" takes care of business for me. For new
code this works very well.

fungus · Jan 30, 2009

intrusive reference counting is still a extremely lightweight method
for moderate requirements.

You also have a lot more freedom to reset pointers to
point to any object. With shared_ptr you can only
copy other shared pointers because you need to know
where the reference count is.

James Kanze · Jan 30, 2009

You are right. The objects that I protect by a class like this
have an application wide primary key. So they are non-copyable
anyway, because if I copy them the key would no longer be
unique.

class ref_count
{ friend void intrusive_ptr_add_ref(ref_count*);
friend void intrusive_ptr_release(ref_count*);
private:
unsigned count;
protected:
ref_count() : count(0) {}
ref_count(const ref_count&) : count(0) {}
virtual ~ref_count() {}
ref_count& operator=(const ref_count&) { return *this; }
public:
bool ref_is_managed() { return ref_count != 0; }
bool ref_is_unique() { return ref_count == 1; }
// Only a return value of true is thread-safe.
};

In general, if you're allocating objects dynamically, it's
because they have identity, and aren't copiable. (There are
doubtlessly exceptions, but they aren't that common.) So just
ban copy and assignment. The client code can always reactivate
it for his classes, if it makes sense.

But maybe it makes more sense to derive from
boost::non_copyable instead, because copying reference counted
objects is likely to be not what you intended. A derived class
may still implement copy and assignment semantics explicitly.
Exactly.

But intrusive reference counting is still a extremely
lightweight method for moderate requirements. The runtime
overhead is quite small.

That's one reason to use intrusive reference counting, but it's
not the only one, nor even the most important one.
Non-intrusive reference counting is extremely brittle; with
boost::shared_ptr, for example, you need to take extrodinary
precautions to ensure that you don't end up with two distinct
counters. (The Boost documentation suggests making all of the
pointers to the object boost::shared_ptr. Which is fine, but
the compilers I use don't collaborate---this isn't a
boost::shared_ptr.)

That's for the cases where reference counting is appropriate, of
course. which aren't all that frequent to begin with.

OK the interlocked access to the reference counter does not
scale that good on x86/x64 SMP machines, but this is more
related to x86/x64 than to the method.

The interlocked access is necessary regardless of the strategy.
On the other hand, I find that when an object is being passed
between threads, auto_ptr is more appropriate: once the second
thread has access, you don't want to allow access from the first
thread.

Marcel Müller · Jan 30, 2009

James said:
That's one reason to use intrusive reference counting, but it's
not the only one, nor even the most important one.
Non-intrusive reference counting is extremely brittle; with
boost::shared_ptr, for example, you need to take extrodinary
precautions to ensure that you don't end up with two distinct
counters.

Well, I did not have this problem so far, but you are right. You have to
be very careful with non-const pointers to reference counted objects.
Intrusive reference counts are more fault-tolerant. Usually it is not
even required to pass intrusive pointers as function arguments, because
it is quite difficult to pass a pointer to a reference counted object to
a function without holding the ownership at the caller at least until
the end of the statement. But function return values have to be
intrusive pointers in general, because otherwise the reference count can
go temporarily to zero. So you have to be careful too.

(The Boost documentation suggests making all of the
pointers to the object boost::shared_ptr. Which is fine, but
the compilers I use don't collaborate---this isn't a
boost::shared_ptr.)

What's wrong with your compiler?

That's for the cases where reference counting is appropriate, of
course. which aren't all that frequent to begin with.
Hmm.

The interlocked access is necessary regardless of the strategy.

Of course, but on x86 any interlocked instruction is always a full
membar. This is not required in this case.

On the other hand, I find that when an object is being passed
between threads, auto_ptr is more appropriate: once the second
thread has access, you don't want to allow access from the first
thread.

I would like to say that the most frequent use for reference counts are
thread-safe objects. Because as long as the object is thread local or
dedicated to exactly one thread at one time, there is simply no need for
reference counts. But if the object has (some) thread-safe methods, it
makes sense, that more than one thread holds an active reference to the
object.

Of course, there may be single threaded applications with reference
counting too. E.g. if you have worker queues with items that refer to
some business objects. (Maybe simply the GUI message queue).

I often use reference counting in conjunction with a global, type
specific instance repository (in fact a template class with some static
members). The repository keeps track of all objects of a certain type
(or base type) with respect to their key. However, to cleanup the
repository intrusive reference counting is helpful.

Marcel

fungus · Jan 30, 2009

Of course, there may be single threaded applications with reference
counting too.

How do you manage memory without them?

I think the most common usage for shared_ptr
(or equivalent) is memory management - nothing
to do with threading.

James Kanze · Jan 30, 2009

On 2009-01-30 06:23:34 -0500, James Kanze <[email protected]> said:

Which says, sort of, that the interlocked access is not
necessary. That is, there's a good argument that shared_point
does not need to be thread-safe.

That's more or less my fealing, at least at present. YMMV, as
they say; I think that there are valid arguments both ways.

Given the cost of thread safety on a Sparc (my usual platform),
I'd probably make it thread safe if I were implementing it
today. My own (invasive) RefCntPtr goes back to 1993, however,
when there weren't threads, and it's never caused a problem, so
I've never bothered to change it.

cyclic references, shared_ptr, dangling pointer	2	Nov 5, 2012
shared_ptr and unique_ptr related question	1	Feb 22, 2014
shared_ptr cycles	4	May 7, 2009
shared_ptr implementation in the upcoming C++ standard	5	Aug 22, 2007
question regarding the shared_ptr use_count	6	Feb 3, 2014
shared_ptr; derived classes; ambiguitity in overloaded functions	5	Aug 26, 2008
cycles of shared_ptr avoid using weak_ptr.	4	Jul 28, 2008
Getting to grips with boost::shared_ptr	1	Mar 24, 2010

Implementation of shared_ptr

fungus

Bob Hairgrove

fungus

Kai-Uwe Bux

Kai-Uwe Bux

fungus

Kai-Uwe Bux

fungus

Marcel Müller

Slot

Juha Nieminen

Igor R.

Marcel Müller

Emil Dotchevski

fungus

fungus

James Kanze

Marcel Müller

fungus

James Kanze

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads