Initialising map member without copy


Paul Brettschneider

Hello,

I have a use-case where I have a number of objects of a complex class.
Copying is very expensive and should be avoided. I want to access those
objects by ids (std::string objects). A std::map<std::string, MyClass>
object suggests itself but, at least under gcc, it does two gratuitous
copies:

#include <iostream>
#include <map>
#include <string>

class Test {
    int x;
public:
    Test() : x(0) { std::cout << "Default!\n"; }
    Test(const Test &t) : x(t.x) { std::cout << "Copy!\n"; }
    void test() { std::cout << x << "\n"; }
};

int main()
{
    std::map<std::string, Test> test_map;
    // operator[] creates the value for a missing key, then returns a reference to it
    test_map["hello"].test();
    return 0;
}

outputs:

Default!
Copy!
Copy!
0

I suppose the first copy happens at the initialisation of a std::pair<key,
value>, the second one when the pair is moved into the container proper.

For me it's not obvious why the object couldn't just be created in-place.
Does the standard say anything about this, or is this a QOI issue?

And what is the common way of avoiding this kind of problem?
I've pondered about either using "smart" pointers (i.e. refcounting), or
making the copy constructor aware of the fact that it is copying a freshly
created instance, and use init() function on the object once it is in
place. Both methods seem unnecessarily complicated. :(

Thanks,
Paul
 
Alf P. Steinbach

* Paul Brettschneider:
And what is the common way of avoiding this kind of problem?
I've pondered about either using "smart" pointers (i.e. refcounting), or
making the copy constructor aware of the fact that it is copying a freshly
created instance, and use init() function on the object once it is in
place. Both methods seem unnecessarily complicated. :(

Store smart-pointers, e.g. boost::shared_ptr.

Or just raw pointers if the map isn't the owner but just referring to instances.

By the way, it might be a good idea to declare private copy constructor and copy
assignment operator (no need to implement).


Cheers, & hth.,

- Alf
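
A minimal sketch of the two suggestions above, under some assumptions: it uses std::shared_ptr from C++11 in place of boost::shared_ptr, and "= delete" in place of a private, unimplemented copy constructor. The names Expensive, PtrMap and get_or_create are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the expensive class; copying is disabled outright.
class Expensive {
public:
    static int constructed;                            // number of objects ever built
    Expensive() : x(0) { ++constructed; }
    Expensive(const Expensive&) = delete;              // C++11 spelling of a private,
    Expensive& operator=(const Expensive&) = delete;   // unimplemented copy
    int value() const { return x; }
private:
    int x;
};
int Expensive::constructed = 0;

typedef std::map<std::string, std::shared_ptr<Expensive> > PtrMap;

// Returns the object stored under id, creating it on first access.
Expensive& get_or_create(PtrMap& m, const std::string& id) {
    std::shared_ptr<Expensive>& slot = m[id];   // inserts only an empty pointer
    if (!slot)
        slot = std::make_shared<Expensive>();   // the object itself is built once, on the heap
    assert(slot);
    return *slot;
}
```

Whatever copying the map does internally now touches only the pointer, so the expensive object is constructed exactly once per id.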
 
Roland Pibinger

Alf said:
Store smart-pointers, e.g. boost::shared_ptr.
Or just raw pointers if the map isn't the owner but just referring to instances.

Using "smart pointers" in containers is always the wrong solution. I
can hardly imagine a situation where you'd need one resource-manager
per object.
 
Paul Brettschneider

Alf said:
* Paul Brettschneider:
And what is the common way of avoiding this kind of problem?
I've pondered about either using "smart" pointers (i.e. refcounting), or
making the copy constructor aware of the fact that it is copying a
freshly created instance, and use init() function on the object once it
is in place. Both methods seem unnecessarily complicated. :(

Store smart-pointers, e.g. boost::shared_ptr.

Up to now I tried to avoid the dependency on external libraries like boost
as far as possible. But it does seem to have the class that is needed for
this case: boost::ptr_map, so maybe it's time to bite the bullet...

Thanks for your help,
Paul
 
Alf P. Steinbach

* Paul Brettschneider:
Up to now I tried to avoid the dependency on external libraries like boost
as far as possible. But it does seem to have the class that is needed for
this case: boost::ptr_map, so maybe it's time to bite the bullet...

Ah, didn't know about ptr_map, but then, never needed it.

Amazing what exists!

:)


Cheers,

- Alf
 
Alf P. Steinbach

* Roland Pibinger:
Using "smart pointers" in containers is always the wrong solution.

Rubbish (it does sound like very premature optimization).

I can hardly imagine a situation where you'd need one resource-manager
per object.

That sentence is difficult to parse, but it seems you're saying that you can
hardly imagine any situation where you'd use smart pointers.

Or do you think of a container as a common resource-manager? It can be but
doesn't need to be. In this case I believe the container is primarily used for
access.


Cheers, & hth.,

- Alf
 
Ian Collins

Roland said:
Using "smart pointers" in containers is always the wrong solution. I
can hardly imagine a situation where you'd need one resource-manager
per object.

The first sentence is complete nonsense.

The use of smart pointers is an idiom preferred by some (including me)
and deplored by others (including you). There is no clear cut rule.

As for the second, using a smart pointer to manage the lifetime of an
object can be a very convenient labour and bug saving device.
 
Jerry Coffin

Roland said:
Using "smart pointers" in containers is always the wrong solution. I
can hardly imagine a situation where you'd need one resource-manager
per object.

Presumably that was intended to read "more than one resource-manager per
object." As written, the statement doesn't even make sense (let alone
come close to being correct).

Even with that correction, however, I think it's still wrong -- or,
depending on your viewpoint, inapplicable.

From one viewpoint, a container of smart pointers makes sense because
you're really dealing with two separate kinds of objects: the container
contains smart pointers, and the smart pointers own the objects we want
to avoid copying. We have two separate objects, and precisely one resource
manager for each.

From the other viewpoint the container of smart pointers makes sense
because you're using normal engineering practice: resource management is
relatively complex. Given a complex problem, normal engineering practice
is to break that problem down into manageable sub-problems. That's what
we're doing here: we have one object to manage some aspects of resource
management, and another to manage other aspects.

Of course, you could combine the two as well. If you decide to do so
(e.g. Boost ptr_map) you've got a couple of choices: either you give up
some of the flexibility of using separate classes, or else you really do
keep them separate, but move the dividing line a bit (e.g. use a policy
class to specify part of how to manage the resources of the stored
object).

Neither viewpoint, however, changes the fundamental fact that a
container of smart pointers is a perfectly reasonable solution to some
problems. _You_ may be unable to imagine a use for it, but if so that
indicates the limits of your imagination, rather than of the technique.
 
Paul Brettschneider

Jerry said:
Presumably that was intended to read "more than one resource-manager per
object." As written, the statement doesn't even make sense (let alone
come close to being correct).

Even with that correction, however, I think it's still wrong -- or,
depending on your viewpoint, inapplicable.

From one viewpoint, a container of smart pointers makes sense because
you're really dealing with two separate kinds of objects: the container
contains smart pointers, and the smart pointers own the objects we want
to avoid copying. We have two separate objects, and precisely one resource
manager for each.

From the other viewpoint the container of smart pointers makes sense
because you're using normal engineering practice: resource management is
relatively complex. Given a complex problem, normal engineering practice
is to break that problem down into manageable sub-problems. That's what
we're doing here: we have one object to manage some aspects of resource
management, and another to manage other aspects.

Of course there are uses for smart pointers in containers; I don't really
understand where Roland is coming from either. An obvious example is having
your objects stored in multiple containers.

But in my specific case, there is no complex resource management to speak
of: an object is part of the map and that's all. The use of smart pointers
is just a work-around for the fact that the gcc implementation of std::map
does two gratuitous copy operations when inserting via operator[]. Not a big
problem here: my objects are tens to hundreds of kB, therefore the overhead
of a smart pointer is negligible. But in general I don't see why you would have
to use smart pointers when there are no smarts required, or why you have to
implement a copy constructor when all you do is just insert into and erase
from a map, but never copy the map or anything like that. Keeping copy
constructors up-to-date is tedious and error prone after all.
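
The "multiple containers" case mentioned above might look roughly like this. It is only a sketch, assuming C++11's std::shared_ptr; Item, Registry and change_is_shared are made-up names, and the sketch assumes every name is added once.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical payload type.
struct Item {
    std::string name;
    int hits;
};

// The same objects are reachable by name and in insertion order.
struct Registry {
    std::map<std::string, std::shared_ptr<Item> > by_name;
    std::vector<std::shared_ptr<Item> > in_order;

    void add(const std::string& name) {
        assert(by_name.find(name) == by_name.end());  // sketch assumes unique names
        std::shared_ptr<Item> p = std::make_shared<Item>();
        p->name = name;
        p->hits = 0;
        by_name[name] = p;      // both containers share ownership
        in_order.push_back(p);  // of the one Item
    }
};

// Returns true if a change made through the map is visible through the vector.
bool change_is_shared() {
    Registry r;
    r.add("a");
    r.by_name["a"]->hits = 5;
    return r.in_order[0]->hits == 5 && r.in_order[0].use_count() == 2;
}
```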
 
Roland Pibinger

Paul said:
Of course there are uses for smart pointers in containers, I don't really
understand where Roland is coming from either.

shared_ptr is a resource manager which destructs one dynamically
allocated object (in the normal case). It's not a lightweight resource
manager because each shared_ptr internally needs a dynamically
allocated reference counter.
Using shared_ptr for a container is like using an oven in each room
instead of central heating.

Paul said:
An obvious example is having
your objects stored in multiple containers.

This is not at all 'obvious', quite the contrary.

Paul said:
But in my specific case, there is no complex resource management to speak
of: an object is part of the map and that's all. The use of smart pointers
is just a work-around for the fact that the gcc implementation of std::map
does two gratuitous copy operations when inserting via operator[].

This is not a gcc problem but the basic STL design philosophy. STL
works with values, only with values.

Paul said:
Not a big
problem here: my objects are tens to hundreds of kB, therefore the use of a
smart pointer is negligible. But in general I don't see why you would have
to use smart pointers when there are no smarts required, or why you have to
implement a copy constructor when all you do is just insert into and erase
from a map, but never copy the map or anything like that. Keeping copy
constructors up-to-date is tedious and error prone after all.

Your 'objects' probably are objects in the sense of OO, not values.
That's the problem. Try to figure out the difference between copyable
values and non-copyable objects.
 
Roland Pibinger

Up to now I tried to avoid the dependency on external libraries like boost
as far as possible. But it does seem to have the class that is needed for
this case: boost::ptr_map, so maybe it's time to bite the bullet...

You need to come to terms with the 'Clonable concept'. Good luck ;-)
 
Paul Brettschneider

Roland said:
shared_ptr is a resource manager which destructs one dynamically
allocated object (in the normal case). It's not a lightweight resource
manager because each shared_ptr internally needs a dynamically
allocated reference counter.
Using shared_ptr for a container is like using an oven in each room
instead of central heating.

The overhead is negligible in many (most?) cases.

Paul said:
But in my specific case, there is no complex resource management to speak
of: an object is part of the map and that's all. The use of smart pointers
is just a work-around for the fact that the gcc implementation of std::map
does two gratuitous copy operations when inserting via operator[].

Roland said:
This is not a gcc problem but the basic STL design philosophy. STL
works with values, only with values.

Oh, really? Does it say anywhere that insertion via operator[] must or
should do two copies? If not, then IMHO it's a QOI rather than a design
issue.

It's easy to initialise a std::pair in place without copy (ignore the
alignment issues):

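#include <iostream>
#include <map>
#include <new>      // placement new
#include <string>

class Test {
    std::string x;
private:
    // Copying is forbidden: assignment is declared but never defined,
    // and the copy constructor throws if it is ever reached.
    Test &operator=(const Test &t);
    Test(const Test &t) : x(t.x) { throw "Copy!\n"; }
public:
    Test() : x("Some content") { std::cout << "Default!\n"; }
    ~Test() { std::cout << "Destroy!\n"; }
    void test() { std::cout << x << "\n"; }
};

typedef std::pair<std::string, Test> pair_t;

int main()
{
    char mem[sizeof(pair_t)];              // raw storage; alignment ignored, as noted above
    pair_t &test = *(new(mem) pair_t());   // construct the pair directly in the buffer
    test.second.test();
    test.~pair_t();                        // placement new needs an explicit destructor call
    return 0;
}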

So I have to wonder why the gcc libraries aren't doing something like this.
Especially considering that std::pair is internal to the library, so it can
play all kinds of dirty tricks not available to the application.

Roland said:
Your 'objects' probably are objects in the sense of OO, not values.
That's the problem. Try to figure out the difference between copyable
values and non-copyable objects.

I understand that for full use of STL you need copyable values. But I don't
see why, when using only a subset, in my case insertion into and deletion
from a map, it shouldn't work with non-copyable objects. As it is, I have
to use boost::ptr_map which does much more than needed for my simple
use-case.
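
For the alignment issue that the placement-new example in this post sets aside, one possible fix is sketched below. It assumes a C++11 compiler (alignas and alignof did not exist when this was posted), uses an int in place of the Test class to stay short, and construct_in_aligned_buffer is a made-up name.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <string>
#include <utility>

typedef std::pair<std::string, int> pair_t;   // int stands in for the Test class

// Builds a pair in a correctly aligned stack buffer, checks it, and destroys it.
bool construct_in_aligned_buffer() {
    // alignas makes the buffer suitable for a pair_t, which a plain char array
    // is not guaranteed to be.
    alignas(pair_t) unsigned char mem[sizeof(pair_t)];

    pair_t* p = new (mem) pair_t("key", 42);   // constructed in place, no copy of the pair
    bool aligned = reinterpret_cast<std::uintptr_t>(p) % alignof(pair_t) == 0;
    bool ok = aligned && p->first == "key" && p->second == 42;
    assert(ok);

    p->~pair_t();   // placement new requires an explicit destructor call
    return ok;
}
```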
 
James Kanze

Roland said:
Using "smart pointers" in containers is always the wrong solution.

NO. It's often suggested as a panacea, a solution for all
problems, which it isn't, but if the actual problem is simply to
avoid copying, as is the case here, it is a very appropriate
solution.

Roland said:
I can hardly imagine a situation where you'd need one
resource-manager per object.

When the objects are conceptually values, but you want to avoid
unnecessary copies for performance reasons. (It's a case where
garbage collection would be the ideal solution, but the context
is such that shared_ptr is almost as good, albeit probably a
little more expensive in terms of runtime---not to mention time
spent typing.)
 
James Kanze

Roland Pibinger wrote: [...]
Using "smart pointers" in containers is always the wrong solution. I
can hardly imagine a situation where you'd need one resource-manager
per object.

Ian said:
The first sentence is complete nonsense.

It's a dogma. Dogmas are almost always wrong.

Ian said:
The use of smart pointers is an idiom preferred by some
(including me) and deplored by others (including you). There
is no clear cut rule.

As soon as you say "preferred" or "deplored", I become
sceptical. Smart pointers are a tool. They can be a very
useful tool in some cases, but like any tool, they can be
abused. I'm sceptical of telling beginners to use them, because
they tend to cause subtle problems (as opposed to the not so
subtle problems due to dangling pointers), but if your analysis
shows them to be the appropriate tool, it would be an error to
not use them.

Ian said:
As for the second, using a smart pointer to manage the
lifetime of an object can be a very convenient labour and bug
saving device.

Exactly. In this case, the preferred solution is probably to
use the Boehm collector, and let garbage collection take care of
it. But that depends somewhat on the implementation of
std::map, and how you are using it---if std::map is not garbage
collection aware, and you are often erasing elements from the
map, you could very well end up leaking memory with the Boehm
collector. (One of the problems with the Boehm collector is
that you often do have to modify the implementations of the
containers in the standard library in order to use it
effectively.)
 
James Kanze

On 2 Mar, 13:09, Paul Brettschneider wrote:

[...]
Oh, really? Does it say anywhere that insertion via operator[]
must or should do two copies?

The standard does say that insert takes an std::pair as an
operand. That's one copy. And that it must copy this object
into the map, that's the second one.
If not, then IMHO it's a QOI rather than a design issue.
It's easy to initialise a std::pair in place without copy
(ignore the alignment issues):
[...]
So I have to wonder why the gcc libraries aren't doing
something like this.

They probably are. But it doesn't help. The problem is that
the first std::pair must be constructed before you've found the
"place" where the second is to be constructed, since you need it
for the lookup.
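
For readers with a C++11 or later compiler, the following sketch shows one way around that ordering problem. It assumes std::map::emplace together with std::piecewise_construct, both added in C++11 and so not available when this was written. The element is built directly as the map's own pair, so no separate pair has to be copied in afterwards. The standard allows emplace to construct the element even when the key already exists, so the sketch only inserts fresh keys. Pinned and emplace_default are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    static int defaults;                      // counts default constructions
    Pinned() { ++defaults; }
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
    int x = 0;
};
int Pinned::defaults = 0;

// Inserts a default-constructed Pinned under id and returns a reference to it.
Pinned& emplace_default(std::map<std::string, Pinned>& m, const std::string& id) {
    auto result = m.emplace(std::piecewise_construct,
                            std::forward_as_tuple(id),   // arguments for the key
                            std::forward_as_tuple());    // no arguments: Pinned()
    assert(result.second);   // sketch assumes the id was not present yet
    return result.first->second;
}
```

Since Pinned has no usable copy or move constructor, the code would not compile if the map tried to copy the value on insertion.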
Especially considering that std::pair is internal to the
library, so it can play all kind of dirty tricks not available
to the application.
I understand that for full use of STL you need copyable values. But I don't
see why, when using only a subset, in my case insertion into and deletion
from a map, it shouldn't work with non-copyable objects. As it is, I have
to use boost::ptr_map which does much more than needed for my simple
use-case.

This is a recognized problem. The next version of the standard
implements something called move-semantics, which can be used to
implement a shallow copy when the compiler determines that the
source will immediately cease to exist. (The destructor of a
"moved" object will not be called.)
 
Ian Collins

Roland said:
This is not a gcc problem but the basic STL design philosophy. STL
works with values, only with values.

No, they work with objects. Smart pointers are objects, therefore STL
containers work with smart pointers.
 
Thomas J. Gritzan

James said:
This is a recognized problem. The next version of the standard
implements something called move-semantics, which can be used to
implement a shallow copy when the compiler determines that the
source will immediately cease to exist. (The destructor of a
"moved" object will not be called.)

Don't start rumours. The destructor will be called. You have to leave the
'moved' object in a self-consistent state (null pointers, for example), so
that you don't run into undefined behaviour when the destructor is called.

(Search for "self consistent state" in N1377.)
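
The point can be checked with a small sketch, assuming move semantics as they were eventually standardised in C++11 (rvalue references and std::move). Buffer and destructor_calls_after_move are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Hypothetical resource owner with a move constructor.
struct Buffer {
    static int destroyed;   // counts destructor calls
    int* data;

    Buffer() : data(new int[4]()) {}
    // Steal the pointer and leave the source in a self-consistent (null) state.
    Buffer(Buffer&& other) noexcept : data(other.data) { other.data = nullptr; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    // Runs for moved-from objects too; delete[] on a null pointer is a no-op.
    ~Buffer() { ++destroyed; delete[] data; }
};
int Buffer::destroyed = 0;

// Moves one Buffer into another and reports how many destructors ran.
int destructor_calls_after_move() {
    Buffer::destroyed = 0;
    {
        Buffer a;
        Buffer b(std::move(a));
        assert(a.data == nullptr);   // the moved-from object owns nothing
        assert(b.data != nullptr);   // the new object owns the array
    }                                // both a and b are destroyed here
    return Buffer::destroyed;
}
```

Both objects are destroyed at the end of the scope, which is why the move constructor has to null out the source pointer.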
 
Kai-Uwe Bux

Paul said:
Hello,

I have a use-case where I have a number of objects of a complex class.
Copying is very expensive and should be avoided. [snip]

One thing you could consider is implementing the class using copy-on-write
and reference counting under the hood. That will make copy-construction and
assignment much cheaper (constant time) at the cost of adding one level of
indirection for the other operations. Of course, when a non-const method
needs to create a unique copy of the underlying data, the cost for actual
copy-construction still hits you; but all those intermediate temporaries
that C++ compilers may or may not generate will become very cheap, which
relieves you from thinking about ways of optimizing them away.

Very often, you can reimplement a class to use copy-on-write and reference
counting without changing its interface at all (a prominent exception being
methods that return references to internals, which is probably questionable
design anyway).


Best

Kai-Uwe Bux
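
A rough sketch of that idea, with several assumptions: it uses C++11's std::shared_ptr for the reference count, a std::vector<int> as a stand-in for the expensive data, and it is meant for single-threaded use only, since the use_count() check is not a safe test under concurrent access. CowData and its members are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical copy-on-write wrapper around some expensive data.
class CowData {
public:
    explicit CowData(std::size_t n)
        : impl_(std::make_shared<std::vector<int> >(n, 0)) {}

    // The implicit copy constructor and assignment copy only the shared_ptr,
    // so copying a CowData is constant time.

    int get(std::size_t i) const {
        assert(i < impl_->size());
        return (*impl_)[i];
    }

    void set(std::size_t i, int v) {
        assert(i < impl_->size());
        detach();                 // pay for the real copy only when writing
        (*impl_)[i] = v;
    }

    bool shares_with(const CowData& other) const { return impl_ == other.impl_; }

private:
    // Gives this object its own copy of the data if it is currently shared.
    void detach() {
        if (impl_.use_count() > 1)
            impl_ = std::make_shared<std::vector<int> >(*impl_);
    }

    std::shared_ptr<std::vector<int> > impl_;
};
```

The public interface exposes no references to the internals, in line with the caveat above about methods that return them.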
 
