Ben Rudiak-Gould
Background: I have some structs containing std::strings and std::vectors of
other structs containing std::strings and std::vectors of .... I'd like to
make a std::vector of these. Unfortunately the overhead of the useless
copies made each time the vector is resized is too large for me to ignore. I
know that rvalue references fix this problem, but I don't think they'll be
widely available for years and I need something that works now. There's no
sensible value I can pass to reserve(). vector<shared_ptr> is faster, but
the per-item allocation overhead is still significant, and the interface is
different. I could wrap it with the interface I wanted, but that might be
error-prone since it would violate std::vector's memory layout guarantee.
I've seriously thought about writing a linear_vector which moves instead of
copying its elements, perhaps calling
    relocate(T* dst, T& src)
which could be overloaded for each type. But writing relocate() for every
type I care about is a hassle; worse, there's no good place to put the
definitions. Putting them in the header with the struct itself feels like an
abstraction violation (why should classes need to anticipate that they might
be put in a linear_vector?), and putting it anywhere else is out of the
question since it will inevitably get out of sync with the struct, leading
to nasty bugs.
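To make the idea concrete, here is a rough sketch of what such a container might look like. Everything in it is an assumption for illustration: the linear_vector class shown is a hypothetical minimal version, the generic relocate() simply copies and destroys, and the std::string overload stands in for the kind of per-type definition described above. It is not exception-safe and it ignores aliasing in push_back.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Generic fallback: copy-construct into raw storage at dst, then destroy src.
template<class T>
void relocate(T* dst, T& src) {
    new (static_cast<void*>(dst)) T(src);
    src.~T();
}

// Hand-written overload for std::string: default-construct (cheap), swap the
// contents across, then destroy the now-empty source.
inline void relocate(std::string* dst, std::string& src) {
    typedef std::string S;
    new (static_cast<void*>(dst)) S();
    dst->swap(src);
    src.~S();
}

// Hypothetical minimal linear_vector: growth relocates instead of copying.
// Not exception-safe, and push_back(v[i]) on a full vector is not handled.
template<class T>
class linear_vector {
    T* data_;
    std::size_t size_, cap_;
    linear_vector(const linear_vector&);            // non-copyable in this sketch
    linear_vector& operator=(const linear_vector&);
    void grow() {
        std::size_t new_cap = cap_ ? cap_ * 2 : 4;
        T* fresh = static_cast<T*>(operator new(new_cap * sizeof(T)));
        for (std::size_t i = 0; i < size_; ++i)
            relocate(fresh + i, data_[i]);          // old slot is dead afterwards
        operator delete(data_);
        data_ = fresh;
        cap_ = new_cap;
    }
public:
    linear_vector() : data_(0), size_(0), cap_(0) {}
    ~linear_vector() {
        for (std::size_t i = 0; i < size_; ++i) data_[i].~T();
        operator delete(data_);
    }
    void push_back(const T& x) {
        if (size_ == cap_) grow();
        new (static_cast<void*>(data_ + size_)) T(x);
        ++size_;
    }
    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) { return data_[i]; }
};

// Pushes 100 strings (forcing several regrowths) and checks they survived.
bool demo_linear_vector() {
    linear_vector<std::string> v;
    for (std::size_t i = 0; i < 100; ++i)
        v.push_back(std::string(i + 1, 'x'));
    assert(v.size() == 100);
    for (std::size_t i = 0; i < 100; ++i)
        if (v[i].size() != i + 1) return false;
    return true;
}
```

Note that the relocate() overloads have to be declared before linear_vector so that the call inside grow() can see them, which is one more reason the "where do the definitions go" question is awkward.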
I would feel happier about writing boilerplate in each class if it were more
widely applicable. I thought about piggybacking on Boost's serialize()
functions, which often look like this:
    struct Foo {
        int a, b, c;
        std::vector< std::map<std::string,int> > d;
        // ...
        template<class Archive>
        void serialize(Archive& ar, unsigned version) {
            ar & a & b & c & d;
        }
    };
but since you can only call serialize on one instance at a time, you'd have to
make separate passes over the source struct and the destination memory. This
is unpleasant at the least, and I'm not sure it could be made to work at
all. But a static method supplying pointers-to-members avoids this problem:
    struct Foo {
        // ...
        template<class T>
        static void enum_children(T& has) {
            has (&Foo::a) (&Foo::b) (&Foo::c) (&Foo::d);
        }
    };
Then I could write something like:
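    template<class T, class X>
    void relocate_member(T* dst, T& src, X T::* mbr) {
        // unary & binds tighter than ->*, so the parentheses are needed
        relocate(&(dst->*mbr), src.*mbr);
    }
    // SFINAE magic omitted
    template<class T>
    void relocate(T* dst, T& src) {
        // pseudocode: relocate_member is a template, so a real version
        // would pass a function object with a templated operator()
        T::enum_children(bind(&relocate_member, dst, src));
    }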
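A version of this that compiles might look roughly like the following. It is only a sketch under several assumptions: the bind call is replaced by a hand-written function object (member_relocator, a made-up name), the omitted SFINAE is stood in for by a hypothetical opt-in marker typedef called has_enum_children, and leaf types fall back to copy-then-destroy. It also forms member references into not-yet-constructed storage, which is exactly the lack of a standard guarantee raised in the addendum, so treat it as "works on common compilers" at best.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <vector>

template<class T> void relocate(T* dst, T& src);   // defined below

// Function object passed to enum_children. operator() is a member template,
// so it accepts a pointer-to-member of any type X, and it returns *this so
// that calls chain as has(&Foo::a)(&Foo::b).
template<class T>
struct member_relocator {
    T* dst;
    T* src;
    member_relocator(T* d, T* s) : dst(d), src(s) {}
    template<class X>
    member_relocator& operator()(X T::* mbr) {
        relocate(&(dst->*mbr), src->*mbr);
        return *this;
    }
};

// Preferred when T declares the marker typedef (an assumption of this sketch).
template<class T>
void relocate_impl(T* dst, T& src, typename T::has_enum_children*) {
    member_relocator<T> has(dst, &src);
    T::enum_children(has);
}

// Fallback for all other types: copy-construct, then destroy the source.
template<class T>
void relocate_impl(T* dst, T& src, ...) {
    new (static_cast<void*>(dst)) T(src);
    src.~T();
}

template<class T>
void relocate(T* dst, T& src) {
    relocate_impl(dst, src, 0);   // 0 converts to a pointer if the marker exists
}

struct Foo {
    int a, b, c;
    std::vector<std::string> d;
    typedef void has_enum_children;   // hypothetical opt-in marker
    template<class V>
    static void enum_children(V& has) {
        has(&Foo::a)(&Foo::b)(&Foo::c)(&Foo::d);
    }
};

// Builds a Foo in raw storage, relocates it, and checks the destination.
bool demo_relocate() {
    void* raw_src = operator new(sizeof(Foo));
    void* raw_dst = operator new(sizeof(Foo));
    Foo* src = new (raw_src) Foo();
    src->a = 1; src->b = 2; src->c = 3;
    src->d.push_back("hello");
    Foo* dst = static_cast<Foo*>(raw_dst);
    relocate(dst, *src);              // *src must not be used or destroyed now
    assert(dst->d.size() == 1);
    bool ok = dst->a == 1 && dst->b == 2 && dst->c == 3 && dst->d[0] == "hello";
    dst->~Foo();
    operator delete(raw_dst);
    operator delete(raw_src);
    return ok;
}
```

The fallback gains nothing for std::vector or std::string by itself; the point is only that Foo needs no relocate() of its own. Cheaper overloads for the standard containers would have to be declared ahead of member_relocator to be picked up.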
and furthermore I could write generic implementations of a lot of other
useful functions, like
* a better default swap() than the standard library's
* componentwise operator==() and lexicographical operator<()
* Boost serialize()
* iostream inserters and extractors using (e.g.) a notation
resembling initializer lists
* the usual (deep) visitor pattern
all of which are pretty commonly needed and annoying to write by hand. A
similar idea is implemented in Haskell and described in the "scrap your
boilerplate" papers [1], which have additional motivating examples.
[1] http://research.microsoft.com/~simonpj/papers/hmap/
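As an illustration of the comparison items in that list, here is a minimal sketch of componentwise equality and lexicographical ordering built on enum_children. The visitor names (member_equal, member_less), the helper functions, and the Version struct are all invented for this example; the only thing taken from above is the enum_children convention, with operator() returning the visitor so that calls chain.

```cpp
#include <cassert>
#include <string>

// Visitor for componentwise equality; stops comparing after a mismatch.
template<class T>
struct member_equal {
    const T* lhs;
    const T* rhs;
    bool equal;
    member_equal(const T& l, const T& r) : lhs(&l), rhs(&r), equal(true) {}
    template<class X>
    member_equal& operator()(X T::* mbr) {
        if (equal && !(lhs->*mbr == rhs->*mbr)) equal = false;
        return *this;
    }
};

// Visitor for lexicographical ordering in declaration order.
// order is -1 (lhs first), 0 (equal so far), or +1 (rhs first).
template<class T>
struct member_less {
    const T* lhs;
    const T* rhs;
    int order;
    member_less(const T& l, const T& r) : lhs(&l), rhs(&r), order(0) {}
    template<class X>
    member_less& operator()(X T::* mbr) {
        if (order == 0) {
            if (lhs->*mbr < rhs->*mbr) order = -1;
            else if (rhs->*mbr < lhs->*mbr) order = 1;
        }
        return *this;
    }
};

template<class T>
bool children_equal(const T& l, const T& r) {
    member_equal<T> has(l, r);
    T::enum_children(has);
    return has.equal;
}

template<class T>
bool children_less(const T& l, const T& r) {
    member_less<T> has(l, r);
    T::enum_children(has);
    return has.order < 0;
}

struct Version {   // hypothetical example type
    int major_no, minor_no;
    std::string tag;
    template<class V>
    static void enum_children(V& has) {
        has(&Version::major_no)(&Version::minor_no)(&Version::tag);
    }
};

inline bool operator==(const Version& l, const Version& r) { return children_equal(l, r); }
inline bool operator<(const Version& l, const Version& r) { return children_less(l, r); }

// Exercises both operators on equal and unequal values.
bool demo_compare() {
    Version a = { 1, 2, "beta" };
    Version b = { 1, 2, "beta" };
    Version c = { 1, 3, "alpha" };
    assert(a == b);
    return !(a == c) && a < c && !(c < a) && !(a < b);
}
```

Because the pointers-to-members are supplied statically, the same enum_children serves visitors that walk one object, two objects, or an object and raw memory, which is what the serialize()-based approach could not do in a single pass.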
Of course, I don't want to invent my own private version of this technique;
I want to use a standardized library. I can't be the first person to propose
this for C++, but I can't find anything like it in Boost, to my surprise. Am
I missing something?
Addendum: you probably noticed (it took me annoyingly long to notice) that
my relocate() interface is not a very good one, since there is, to my
knowledge, no guarantee in the standard that any non-POD type can be
correctly relocated by relocating its members. I don't think the proposed
C++0x changes fix this. Couldn't there be a guarantee that a struct like
    struct X { A a; B b; };
can be placement-constructed by placement-constructing its members,
regardless of the POD-ness of those members? This is a kind of "shallow POD"
as distinguished from the usual "deep" POD. Has this been discussed?
-- Ben