std::vector<>::clear semantics

Stefan Höhne

Hi,

[This is a repost of a similar question I asked in
alt.comp.lang.learn.c-c++ recently.]

As far as I can tell, the semantics of std::vector::clear() changed between
MS VC++ 6.0 and MS's .NET compiler.

In the 6.0 version the capacity() of the vector did not change with the
call to clear(); in the .NET version the capacity() is reduced to 0.

I relied on capacity() not changing when calling clear(), because
I do not want any reallocations in the following push_back() calls:
I use iterators that have to stay valid. There is an example below.

As I discovered, some people claim that clear() is not required to keep
capacity() unchanged (as I thought it was). So I probably cannot
blame M$ (but I do, nevertheless, because they changed the semantics
silently, and because of tradition) and need a workaround.

I know that I can use clear() in combination with reserve(). But this
would add the overhead of deallocating and reallocating huge amounts
of memory, and I want to avoid that.

I also want to avoid giving the elements of the std::vector a default
constructor. At the moment I use resize(0) instead of clear(), but this
needs a default constructor in MS VC++'s library.

Is there a (guaranteed to work) way to shrink a std::vector's size
without
- reducing its capacity()
- requiring the vector's elements to be default-constructible
?

Does anybody have a clue?

Stefan.


Example:
---
std::vector<int> v;
v.reserve(...);
while (...) {
    v.clear();
    // fill vector:
    for (...)
        v.push_back(...);

    for (std::vector<int>::iterator i = v.begin(); i != v.end(); /*nothing*/) {
        if (...)
            v.push_back(...); // i should survive this because of reserve()
        if (...)
            ++i;
        else
            i = v.erase(i);
    }

    /* some further use of v */
}
---
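
For readers who want to try the pattern, here is a compilable sketch of the loop above. The fill data and both conditions are hypothetical stand-ins for the elided "..." parts (an element divisible by 4 appends a new value, even elements are erased), and the names one_pass and demo are made up for illustration. It assumes clear() leaves capacity() alone, which is exactly the property under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One pass of the loop body. Assumed precondition, as in the post:
// v.capacity() is large enough that no push_back below reallocates.
void one_pass(std::vector<int>& v) {
    v.clear();                        // relies on clear() keeping capacity()
    for (int n = 1; n <= 8; ++n)      // fill vector (hypothetical data)
        v.push_back(n);

    const std::size_t cap = v.capacity();
    for (std::vector<int>::iterator i = v.begin(); i != v.end(); /*nothing*/) {
        if (*i % 4 == 0)              // hypothetical first condition
            v.push_back(*i + 1);      // i survives only without reallocation
        assert(v.capacity() == cap);  // fires if the vector had to grow
        if (*i % 2 != 0)              // hypothetical second condition
            ++i;
        else
            i = v.erase(i);
    }
}

std::vector<int> demo() {
    std::vector<int> v;
    v.reserve(16);
    one_pass(v);
    one_pass(v);  // second iteration: clear() now acts on a non-empty vector
    return v;     // holds 1 3 5 7 5 9 with the conditions chosen above
}
```

If clear() dropped the storage, the first push_back inside the iterator loop could reallocate and leave i dangling, which is the failure described in the post.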
 
David B. Held

Stefan Höhne said:
[...]
Is there a (guaranteed to work) way to shrink a std::vector's
size without
- reducing its capacity()
- requiring the vector's elements to be default-constructible
[...]

An extreme solution would be to write a custom allocator that
pre-allocates a large chunk and never frees it until you make
an explicit call to a function of the allocator. That is, it would
honor deallocate() calls, but wouldn't deallocate the actual
storage it reserved.

Dave
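
A rough sketch of what such an allocator might look like, assuming a C++11 compiler. Arena, StickyAllocator and block_is_reused are hypothetical names, not part of any library. The arena keeps a single block until it is destroyed and falls back to the heap for any request it cannot serve. A standard library that makes extra allocations through the allocator could take the block for something other than the element storage, so treat this as an illustration rather than a drop-in fix.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical arena: owns one block and frees it only in its destructor.
class Arena {
public:
    explicit Arena(std::size_t bytes)
        : bytes_(bytes), block_(::operator new(bytes)), in_use_(false) {}
    ~Arena() { ::operator delete(block_); }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* acquire(std::size_t n) {
        if (!in_use_ && n <= bytes_) {  // hand out the retained block
            in_use_ = true;
            return block_;
        }
        return ::operator new(n);       // fallback for anything else
    }
    void release(void* p) noexcept {
        if (p == block_) in_use_ = false;  // honored, but storage is kept
        else ::operator delete(p);
    }
    const void* block() const noexcept { return block_; }

private:
    std::size_t bytes_;
    void* block_;
    bool in_use_;
};

// Minimal allocator that forwards to an Arena.
template <class T>
struct StickyAllocator {
    typedef T value_type;
    Arena* arena;

    explicit StickyAllocator(Arena& a) noexcept : arena(&a) {}
    template <class U>
    StickyAllocator(const StickyAllocator<U>& o) noexcept : arena(o.arena) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->acquire(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { arena->release(p); }
};

template <class T, class U>
bool operator==(const StickyAllocator<T>& a, const StickyAllocator<U>& b) noexcept {
    return a.arena == b.arena;
}
template <class T, class U>
bool operator!=(const StickyAllocator<T>& a, const StickyAllocator<U>& b) noexcept {
    return !(a == b);
}

// Returns true if the vector gets the same block back after giving it up.
bool block_is_reused() {
    const std::size_t count = 1000;
    Arena arena(count * sizeof(int));
    StickyAllocator<int> alloc(arena);
    std::vector<int, StickyAllocator<int> > v(alloc);

    v.reserve(count);
    const void* first = v.data();

    // Make the vector hand its storage back, as a capacity-reducing clear() would.
    std::vector<int, StickyAllocator<int> >(alloc).swap(v);
    v.reserve(count);  // served from the arena again, not from the heap

    return first == arena.block() && v.data() == arena.block();
}
```

This removes the cost of the repeated heap allocation, but it does not keep iterators valid across a capacity-reducing clear(); the vector still believes it reallocated.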
 
Guest

Hi,

[This is a repost of a similar question I asked in
alt.comp.lang.learn.c-c++ recently.]

As far as I can tell, the semantics of std::vector::clear() changed between
MS VC++ 6.0 and MS's .NET compiler.

This is off topic here; I mean, this is not the place for MS documentation.
In the 6.0 version the capacity() of the vector did not change with the
call to clear(); in the .NET version the capacity() is reduced to 0.

That's kind of funny.
I relied on capacity() not changing when calling clear(), because
I do not want any reallocations in the following push_back() calls:
I use iterators that have to stay valid. There is an example below.

Trying to keep iterators valid through clear() is another nifty trick,
nice.
As I discovered, some people claim that clear() is not required to keep
capacity() unchanged (as I thought it was). So I probably cannot
blame M$ (but I do, nevertheless, because they changed the semantics
silently, and because of tradition) and need a workaround.

I know that I can use clear() in combination with reserve(). But this
would add the overhead of deallocating and reallocating huge amounts
of memory, and I want to avoid that.

Why?
I also want to avoid giving the elements of the std::vector a default
constructor. At the moment I use resize(0) instead of clear(), but this
needs a default constructor in MS VC++'s library.

How about using resize(0, const value&)?
Is there a (guaranteed to work) way to shrink a std::vector's size
without
- reducing its capacity()

Maybe erase() or pop_back()?
- requiring the vector's elements to be default-constructible
?

Does anybody have a clue?

Stefan.


Example:

empty vector
v.reserve(...);

still empty
while (...) {
v.clear();

clearing empty vector ?
//fill vector:
for (...)
v.push_back(...);

for(std::vector<int>::iterator i=v.begin(); i!=v.end();/*nothing*/) {
if(...)
v.push_back(...); // i should survive this because of reserve()

push_back will call reserve if needed
if(...)
++i;
else
i=v.erase(i);
}

/* some further use of v */
}

Good luck.
 
Stefan Höhne

Hi,

Hi,

[This is a repost of a similar question I asked in
alt.comp.lang.learn.c-c++ recently.]

As far as I can tell, the semantics of std::vector::clear() changed between
MS VC++ 6.0 and MS's .NET compiler.

This is off topic here; I mean, this is not the place for MS documentation.

My actual question is not about MS-specific stuff; it's a pure
C++ question. I thought it would be good to give some
background information. It's not an academic thought of mine;
it's a real problem.
That's kind of funny.

It is what's happening.
Trying to keep iterators valid through clear() is another nifty trick,
nice.

I'm not sure whether my explanation above is clear enough, but
I'm pretty sure that the example below should be.

I want an iterator to stay valid through subsequent push_back() calls.
For this, it's necessary that capacity() does not change.

I suppose that if capacity() shrinks to zero, this is a clear indicator
that memory is being freed (depending on the behaviour of the default
allocator of std::vector) during the clear(). A subsequent call to
reserve() would clearly allocate memory. Is this the explanation you
wanted?
How about using resize(0, const value&)?

Yep, that would do it, because the objects I collect in the vector are
assignable. But I'm no longer sure whether it is
guaranteed that this will not affect capacity().
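
A small sketch of the two-argument resize() on an element type without a default constructor. Sample, shrink_with_resize and capacity_kept are hypothetical names. Whether capacity() is retained is the open question here; the check below passes on the implementations I would expect to meet today, but it is a test, not a proof.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type without a default constructor.
struct Sample {
    explicit Sample(int v) : value(v) {}
    int value;
};

// Empties v through the two-argument resize(). The second argument only
// serves as a fill value for growth; no element is added when shrinking to 0.
std::size_t shrink_with_resize(std::vector<Sample>& v) {
    v.resize(0, Sample(0));
    return v.capacity();
}

// Returns true if the vector is empty afterwards and kept its capacity.
bool capacity_kept() {
    std::vector<Sample> v;
    v.reserve(100);
    for (int n = 0; n < 10; ++n)
        v.push_back(Sample(n));
    const std::size_t before = v.capacity();
    return shrink_with_resize(v) == before && v.empty();
}
```

The single-argument v.resize(0) would not compile for Sample on a C++11 library, because that overload needs to be able to default-construct elements.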
Maybe erase() or pop_back()?

erase() is just another possible workaround. People in
alt.comp.lang.learn.c-c++ believe the behaviour of erase(begin(), end())
should be the same as that of clear().

pop_back() is no alternative: it takes linear time. The object within the
vector is a POD, therefore clear() should take no time. Yes, this will
matter; it's a huge vector.
empty vector


still empty


clearing empty vector ?

That does no harm in such an example. It will not be empty
in the second iteration.

I'm not sure why your quote is unindented. Reading
my own post in Google Groups I can see the
indentation. Where is the problem?


push_back will call reserve if needed

And this will make i invalid. So I don't want clear(), or whatever I use
instead, to shrink the vector's capacity(). I'm sure that the vector will
never grow bigger than the amount I initially pass to reserve().
Good luck.

Thanks,
Stefan.
 
David B. Held

Stefan Höhne said:
[...]
erase() is just another possible workaround. People in
alt.comp.lang.learn.c-c++ believe the behaviour of
erase(begin(), end()) should be the same as that of clear().

Clearly that's wrong (pun intended).
pop_back() is no alternative: it takes linear time.
[...]

Really??? I can believe that erase() takes linear time,
but would you like to explain how pop_back() takes
linear time?

Maybe what you want to do is look at the reverse iterators,
and use std::erase() with them instead.

Dave
 
tom_usenet

erase() is just another possible workaround. People in
alt.comp.lang.learn.c-c++ believe the behaviour of erase(begin(), end())
should be the same as that of clear().

That isn't quite true. What is true is that a clear() implementation
must behave according to the semantics of erase(begin(), end()) in the
standard. That doesn't mean it has to behave the same as the
implementation of erase(begin(), end()) on the same platform.

In fact, erase(begin(), end()) leaves the capacity unchanged on
MSVC.NET, whereas clear() sets it to 0. So, for your particular
platform dependent problem, erase(begin(), end()) is a platform
dependent solution.
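
The erase(begin(), end()) workaround can be wrapped in a small helper, sketched below. clear_keep_capacity, Sample and erase_keeps_capacity are hypothetical names, and the capacity check describes what this particular run observes rather than something the sketch can guarantee for every library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type without a default constructor.
struct Sample {
    explicit Sample(int v) : value(v) {}
    int value;
};

// Removes all elements via the range erase() instead of clear().
template <class T>
void clear_keep_capacity(std::vector<T>& v) {
    v.erase(v.begin(), v.end());
}

// Returns true if the vector is empty afterwards and capacity() is unchanged.
bool erase_keeps_capacity() {
    std::vector<Sample> v;
    v.reserve(100);
    for (int n = 0; n < 10; ++n)
        v.push_back(Sample(n));
    const std::size_t before = v.capacity();
    clear_keep_capacity(v);
    return v.empty() && v.capacity() == before;
}
```

The helper needs the element type to be assignable, which the original poster says his objects are, but it does not need a default constructor.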

The only standard solution is not to let the size of the vector hit 0,
because, as soon as it does, the memory can be deallocated (according
to some, anyway).

Personally, I think any implementation that frees storage on a call to
clear() (and without documenting the fact!) is poor, since most users
expect capacity to be "sticky". I can only assume that it is an
accidental change to Dinkumware's std::vector.

Tom
 
Howard Hinnant

tom_usenet said:
The only standard solution is not to let the size of the vector hit 0,
because, as soon as it does, the memory can be deallocated (according
to some, anyway).

Imho, the standard solution is to use clear().

23.2.4.2/5:
It is guaranteed that no reallocation takes place during insertions that
happen after a call to reserve() until the time when an insertion would make
the size of the vector greater than the size specified in the most recent
call to reserve().

So:

vector<int> v;
v.reserve(2);
v.clear();
v.push_back(1);
int& i = v.front();
v.push_back(2);

After the second push_back, 23.2.4.2/5 guarantees that the reference "i"
will still be valid, since the most recent call to reserve specified that
there be room for at least two ints.
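
The snippet can be turned into a check that is easy to run against a given library. reference_survives is a hypothetical name; the function compares the address of the first element before and after the second push_back, and it only reads through "i" once it knows no reallocation happened.

```cpp
#include <cassert>
#include <vector>

// Returns true if the reference taken after clear() is still valid
// after the second push_back, that is, if no reallocation took place.
bool reference_survives() {
    std::vector<int> v;
    v.reserve(2);
    v.clear();
    v.push_back(1);
    int& i = v.front();
    const int* before = &i;
    v.push_back(2);
    // Short-circuit: i is only read if the storage did not move.
    return before == &v.front() && i == 1;
}
```

On a library whose clear() releases the storage, the function is expected to return false, because the second push_back may then have to reallocate.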

To be fair, 23.2.4.2/5 is being changed by DR 329:

http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html#329

to read:
Reallocation invalidates all the references, pointers, and iterators
referring to the elements in the sequence. It is guaranteed that no
reallocation takes place during insertions that happen after a call to
reserve() until the time when an insertion would make the size of the vector
greater than the value of capacity().

But I do not believe this new wording provides any flexibility for
clear() to reduce capacity. It merely clarifies the situation described
in the DR:

vec.reserve(23);
vec.reserve(0);

// capacity() still >= 23 here

-Howard
 
tom_usenet

Imho, the standard solution is to use clear().

23.2.4.2/5:


So:

vector<int> v;
v.reserve(2);
v.clear();
v.push_back(1);
int& i = v.front();
v.push_back(2);

After the second push_back, 23.2.4.2/5 guarantees that the reference "i"
will still be valid, since the most recent call to reserve specified that
there be room for at least two ints.

How about:

vector<int> v(2); //obviously capacity() >= 2
v.clear();
v.push_back(1);
int& i = v.front();
v.push_back(2);

The standard is underspecified here in a way that makes it hard to
write efficient (or even correct) code.
To be fair, 23.2.4.2/5 is being changed by DR 329:

http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html#329

to read:


But I do not believe this new wording provides any flexibility for
clear() to reduce capacity. It merely clarifies the situation described
in the DR:

vec.reserve(23);
vec.reserve(0);

// capacity() still >= 23 here

I have roughly the same reasoning (I'd studied the defect reports a
couple of days ago, along with old threads in std.c++), but Dinkumware
seem to disagree. But I think it must be a bug/oversight in their
implementation.

In any case, the standard is ridiculously unclear at the moment. It
needs to say that capacity will never decrease except during a call to
swap. I have no idea why it doesn't say that, and why none of the
defect reports have suggested that.

Tom
 
Howard Hinnant

So:

vector<int> v;
v.reserve(2);
v.clear();
v.push_back(1);
int& i = v.front();
v.push_back(2);

After the second push_back, 23.2.4.2/5 guarantees that the reference "i"
will still be valid, since the most recent call to reserve specified that
there be room for at least two ints.

How about:

vector<int> v(2); //obviously capacity() >= 2
v.clear();
v.push_back(1);
int& i = v.front();
v.push_back(2);

I think the standard may allow this, but I'm not positive. However the
vendor would have to go to extra expense (both storage and speed) to
pull it off. The vendor would not only need to store capacity, but also
how it was set (by an explicit call to reserve or not). So as long as
the previous semantics are respected (with reserve), this latter concern
might be academic as I can not see any motivation for a vendor to add
such an expense.
I have roughly the same reasoning (I'd studied the defect reports a
couple of days ago, along with old threads in std.c++), but Dinkumware
seem to disagree. But I think it must be a bug/oversight in their
implementation.

In any case, the standard is ridiculously unclear at the moment. It
needs to say that capacity will never decrease except during a call to
swap. I have no idea why it doesn't say that, and why none of the
defect reports have suggested that.

Sounds like you're the guy to write the next defect report! :)

-Howard
 
P.J. Plauger

I have roughly the same reasoning (I'd studied the defect reports a
couple of days ago, along with old threads in std.c++), but Dinkumware
seem to disagree. But I think it must be a bug/oversight in their
implementation.

In any case, the standard is ridiculously unclear at the moment. It
needs to say that capacity will never decrease except during a call to
swap. I have no idea why it doesn't say that, and why none of the
defect reports have suggested that.

At the time we prepared that particular version, the C++ Standard was
at least as unclear. We intentionally made clear reduce capacity, in
response to customer demands for some mechanism to do so. Consensus
seems to be building against that approach, however, so future versions
will no longer reduce capacity.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
tom_usenet

At the time we prepared that particular version, the C++ Standard was
at least as unclear. We intentionally made clear reduce capacity, in
response to customer demands for some mechanism to do so.

You could just have in the documentation for the clear function:

clear() will not reduce the capacity of a vector. To reduce the
capacity call vector<T>().swap(v);

or similar.
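
For reference, a sketch of the swap idiom mentioned above, wrapped in a helper. release_storage and swap_reduces_capacity are hypothetical names. The sketch assumes that a default-constructed vector holds no storage, which is the case on the common implementations but is stated here as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Swaps v with an empty temporary; the temporary then takes the old
// storage with it when it is destroyed at the end of the statement.
template <class T>
void release_storage(std::vector<T>& v) {
    std::vector<T>().swap(v);
}

// Returns true if the swap left v empty with a smaller capacity than before.
bool swap_reduces_capacity() {
    std::vector<int> v;
    v.reserve(1000);
    v.push_back(42);
    const std::size_t before = v.capacity();  // at least 1000
    release_storage(v);
    return v.empty() && v.capacity() < before;
}
```

Because the release is spelled out explicitly, code that wants a "sticky" capacity can keep calling clear() and only pay for deallocation where it asks for it.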

Consensus
seems to be building against that approach, however, so future versions
will no longer reduce capacity.

It's the silent (and undocumented as far as I could find) change in
the meaning of code between versions that's the main problem...

Tom
 
