reserve, resize, how about something in between

Jim Langston · Sep 28, 2010

I am working with triangle lists and clases dealing with them to create 3d
objets. One of the things I prefer to do is store my data in vectors. If I
am going to be iterating over all the elements in the vector (which I would
be while loading it) I want to be able to resize the vector and be done with
it. However. I understand that resize not only sets aside the memory but
initializes the memory to 0. Looking over the definitions seems to confirm
this.

This is where I hope my understanding is wrong, because I really don't want
to add the overhead of initializing the memory when I will be iterating over
and setting the memory in the next few lines of code. But I seem to be
missing that third option, resize_dont_clear() or whatever it would be.

Is this just something I'm not going to get without writing my own container
class or am I missing something. Or the 3rd choice, I shouldn't worry about
the time required to clear memory.

Rui Maciel · Sep 28, 2010

Jim said:
I am working with triangle lists and clases dealing with them to create 3d
objets. One of the things I prefer to do is store my data in vectors. If
I am going to be iterating over all the elements in the vector (which I
would be while loading it) I want to be able to resize the vector and be
done with it. However. I understand that resize not only sets aside the
memory but initializes the memory to 0. Looking over the definitions
seems to confirm this.

This is where I hope my understanding is wrong, because I really don't
want to add the overhead of initializing the memory when I will be
iterating over and setting the memory in the next few lines of code. But
I seem to be missing that third option, resize_dont_clear() or whatever it
would be.

In that scenario is it possible to use a std::list instead of a std::vector?

Rui Maciel

Kai-Uwe Bux · Sep 28, 2010

Jim said:
I am working with triangle lists and clases dealing with them to create 3d
objets. One of the things I prefer to do is store my data in vectors. If
I am going to be iterating over all the elements in the vector (which I
would be while loading it) I want to be able to resize the vector and be
done with
it. However. I understand that resize not only sets aside the memory but
initializes the memory to 0. Looking over the definitions seems to
confirm this.

This is where I hope my understanding is wrong, because I really don't
want to add the overhead of initializing the memory when I will be
iterating over
and setting the memory in the next few lines of code. But I seem to be
missing that third option, resize_dont_clear() or whatever it would be.

From your explanation, I don't understand what is wrong with reserve().

Is this just something I'm not going to get without writing my own
container
class or am I missing something. Or the 3rd choice, I shouldn't worry
about the time required to clear memory.

First, it is not clear to me, that the memory will be cleared at all. A
compiler might very well be smart enough to see that you are "setting the
memory in the next few lines of code".

Best

Kai-Uwe Bux

Juha Nieminen · Sep 28, 2010

Jim Langston said:
I am working with triangle lists and clases dealing with them to create 3d
objets. One of the things I prefer to do is store my data in vectors. If I
am going to be iterating over all the elements in the vector (which I would
be while loading it) I want to be able to resize the vector and be done with
it. However. I understand that resize not only sets aside the memory but
initializes the memory to 0. Looking over the definitions seems to confirm
this.

This is where I hope my understanding is wrong, because I really don't want
to add the overhead of initializing the memory when I will be iterating over
and setting the memory in the next few lines of code.

So what's wrong with reserve() and then push_back()?

But I seem to be
missing that third option, resize_dont_clear() or whatever it would be.

Using uninitialized objects is safe only for POD structs (and primitive
types). You certainly wouldn't want to have uninitialized class objects
in your vector.

If your triangle is a POD struct, and reserve() + push_back() is
completely out of question and you hence must use resize() instead,
have you actually tried how much of a speed penalty there is? Because
there actually might be next to nothing.

Goran · Sep 28, 2010

I am working with triangle lists and clases dealing with them to create 3d
objets. One of the things I prefer to do is store my data in vectors. If I
am going to be iterating over all the elements in the vector (which I would
be while loading it) I want to be able to resize the vector and be done with
it. However. I understand that resize not only sets aside the memory but
initializes the memory to 0. Looking over the definitions seems to confirm
this.

When resize() constructs newly-added objects, it initializes them with
their default value (which ends up being 0 for e.g. primitive types).
It's not zero-fill, it's actually calling default ctors for each
element.

This is where I hope my understanding is wrong, because I really don't want
to add the overhead of initializing the memory when I will be iterating over
and setting the memory in the next few lines of code. But I seem to be
missing that third option, resize_dont_clear() or whatever it would be.

That's what reserve() is for. If you know the number of elements up-
front, you do:

vec.reserve(7);
for (wha'ever)
{
TYPE val(params);
vec.push_back(t);
}

Now, here's a potential problem: you have to create your value that's
to be put in the vector yourself, and, push_back will copy-construct
it into newly prepared element. You can't avoid that with a vector.
Essentially, you have one object creation more, which is something you
could avoid if you could just construct the object in-place.

But note that even if you used a raw array, you are not gaining much.
With an array, you still have default construction (comes with e.g.
new TYPE[size]), and then you have to put actual data into each
object. With an array, however, you avoid creating an actual instance
of TYPE, but you're still copying from somewhere into your objects.
How much of an overhead that is in your case, I don't know (did you
measure, BTW? It's not very smart asking if you don't already know
that you need to go faster ;-) ).

Is this just something I'm not going to get without writing my own container

Not gonna get. You need to write
placement_new_inside_uninitialized_memory(params) function ;-).
Clearly, vector (or any other container) can't write such function
simply because it doesn't know what parameters it should pass to said
placement new call for you.

Goran.

Öö Tiib · Sep 28, 2010

I am working with triangle lists and clases dealing with them to create 3d
objets. One of the things I prefer to do is store my data in vectors. If I
am going to be iterating over all the elements in the vector (which I would
be while loading it) I want to be able to resize the vector and be done with
it. However. I understand that resize not only sets aside the memory but
initializes the memory to 0. Looking over the definitions seems to confirm
this.

This is where I hope my understanding is wrong, because I really don't want
to add the overhead of initializing the memory when I will be iterating over
and setting the memory in the next few lines of code. But I seem to be
missing that third option, resize_dont_clear() or whatever it would be.

Is this just something I'm not going to get without writing my own container
class or am I missing something. Or the 3rd choice, I shouldn't worry about
the time required to clear memory.

Clarification: vector default-constructs the objects in resized space,
not zeroes them (constructing POD's means zeroing them). Very same
thing happens with raw array of Triangles. Do you have reported
performance issue there? If the profiling did not indicate that your
vectors construction takes some noticable percentage of running it
then you should not worry.

Lets assume that you have a performance issue. Then sure there are
techniques how to turn things quicker. For example if your Triangle
default constructor does not initialize the data members in it then
you get uninitialized Triangles in resized vector. You should try if
compiler optimizes calls to such (inline empty) constructor out or
not.

Anyway ... recommended is to avoid performance related works before
there are performance related issues ... otherwise you waste time to
deal with wrong issues.

Bart van Ingen Schenau · Sep 28, 2010

I am working with triangle lists and clases dealing with them to create 3d
objets. One of the things I prefer to do is store my data in vectors. If I
am going to be iterating over all the elements in the vector (which I would
be while loading it) I want to be able to resize the vector and be done with
it. However. I understand that resize not only sets aside the memory but
initializes the memory to 0. Looking over the definitions seems to confirm
this.

This is where I hope my understanding is wrong, because I really don't want
to add the overhead of initializing the memory when I will be iterating over
and setting the memory in the next few lines of code. But I seem to be
missing that third option, resize_dont_clear() or whatever it would be.

Is this just something I'm not going to get without writing my own container
class or am I missing something. Or the 3rd choice, I shouldn't worry about
the time required to clear memory.

All standard containers have the property that they can only contain
properly constructed objects. Otherwise it becomes impossible for the
container's destructor to tell which elements have to be cleaned up
and which are just raw memory.
This means that your understanding of vector<T>::resize() is correct
and you won't ever get a resize_no_construct function.

That said, not all is lost.
If you have a huge number of objects or objects that are expensive to
construct, you can start with a call to reserve() to pre-allocate the
required memory (this avoids expensive reallocations and possible
memory fragmentation) and then add all the objects to the container
with push_back.
Compared to resizing and then overwriting the elements, this saves you
a default-constructor call and costs you an integer increment (as the
size gets adjusted on every push_back call).

And, zero-initialisation is one of the cheapest forms of constructing
an object, so if you only have a collection of POD objects, don't
worry about the time needed for construction on resize. You probably
won't be able to measure it as a bottleneck.

Bart v Ingen Schenau

gwowen · Sep 28, 2010

Compared to resizing and then overwriting the elements, this saves you
a default-constructor

A copy constructor, isn't it?

My understanding is :
std::vector<T> vec(n); // Becomes
std::vector<T> vec(n, /* a_default_constructed_T */, /*
default_allocator */);

and on vec.resize() the default constructed T is *copied* as
necessary.
Is this wrong?

(I expect c++0x move-semantics to change this, at least a little).

Juha Nieminen · Sep 28, 2010

Goran said:
But note that even if you used a raw array, you are not gaining much.
With an array, you still have default construction (comes with e.g.
new TYPE[size]), and then you have to put actual data into each
object.

Actually if you call the global operator new, or std::allocator directly,
you will get uninitialized memory. You can then use placement-new to
initialize the objects in-situ (or just assignment if we are talking
about primitive types or POD structs).

Jim Langston · Sep 28, 2010

news:d7f2d294-f697-446b-af3e-> (e-mail address removed)...

All standard containers have the property that they can only contain
properly constructed objects. Otherwise it becomes impossible for the
container's destructor to tell which elements have to be cleaned up
and which are just raw memory.
This means that your understanding of vector<T>::resize() is correct
and you won't ever get a resize_no_construct function.

Okay, this is an explantion I can live with. If containers require
initialized memory then I'll just make sure they are initialized.

That said, not all is lost.
If you have a huge number of objects or objects that are expensive to
construct, you can start with a call to reserve() to pre-allocate the
required memory (this avoids expensive reallocations and possible
memory fragmentation) and then add all the objects to the container
with push_back.
Compared to resizing and then overwriting the elements, this saves you
a default-constructor call and costs you an integer increment (as the
size gets adjusted on every push_back call).

I looked at the code for push_back() and it has a bit more logic then I
would care for and using push_back() would change my algorithm. The over
head for push_back() in the end is a little lighter then using a naked array
I think, although I'm not sure.

Truth be told, however, using push_back() is probably better coding then
using a size_t index as using push_back() there is no way I can overflow the
array. 6 of one, half a dozen of the other I suppose.

And, zero-initialisation is one of the cheapest forms of constructing
an object, so if you only have a collection of POD objects, don't
worry about the time needed for construction on resize. You probably
won't be able to measure it as a bottleneck.

Yes, it is POD, just floating point values.

Jim Langston · Sep 28, 2010

Rui Maciel said:
In that scenario is it possible to use a std::list instead of a
std::vector?

Probably, but I am using std::vector because it's memory is guaranteed to be
contiguous and as I will be itereating over every triangle I believe the
data pipeline will be a little faster with contiguous memory.

Jim Langston · Sep 28, 2010

Kai-Uwe Bux said:
From your explanation, I don't understand what is wrong with reserve().

First, it is not clear to me, that the memory will be cleared at all. A
compiler might very well be smart enough to see that you are "setting the
memory in the next few lines of code".

A compiler might.

Joshua Maurice · Sep 28, 2010

So what's wrong with reserve() and then push_back()?

Using uninitialized objects is safe only for POD structs (and primitive
types). You certainly wouldn't want to have uninitialized class objects
in your vector.

C++03, 4.1 Lvalue-to-rvalue conversion / 1
"An lvalue (3.10) of a non-function, non-array type T can be converted
to an rvalue. [...] If the object to which the lvalue refers [...] is
uninitialized, a program that necessitates this conversion has
undefined behavior."

So, an implementation is free to add runtime checks for accessing
uninitialized data.

Goran · Sep 29, 2010

Probably, but I am using std::vector because it's memory is guaranteed to be
contiguous and as I will be itereating over every triangle I believe the
data pipeline will be a little faster with contiguous memory.

Given what you have written, list would be a horrible solution. You
seem to know the number of elements up front. If so, then you can
allocate space for the whole of your vector in one go. list (in common
implementations) would give you multiple allocations and a host of
other inconveniences later.

Goran.

Bart van Ingen Schenau · Sep 29, 2010

A copy constructor, isn't it?

Yes, you are right. Although that should not make a difference for the
overall performance picture.

My understanding is :
std::vector<T> vec(n); // Becomes
std::vector<T> vec(n, /* a_default_constructed_T */, /*
default_allocator */);

and on vec.resize() the default constructed T is *copied* as
necessary.
Is this wrong?

That is correct.

(I expect c++0x move-semantics to change this, at least a little).

I don't expect so for the constructor or resize. A single object can
only be the source for move-construction once. If you have one object
to initialise several container-elements with, you are still required
to use copy-construction. (You could replace the last of thos with
move-construction, but I don't think the implementations will bother.
Too little ROI)

Bart v Ingen Schenau

gwowen · Sep 29, 2010

Yes, you are right. Although that should not make a difference for the
overall performance picture.

Well, for POD types, default construction will give you junk, rather
than zeros, right? So if we could default construct the POD in a
vector.resize(), they could all be different junk. All std::vector
would have to do is maybe realloc() and increment size. But since
we're copy constructing, std::vector is (maybe) constrained to copy,
so every element (?must?) contain identical junk to what was created
at vector instantiation time.

Of course, since using any of the junk is probably UB, so the compiler
*might* be able to figure this out and "as if" us to efficiency, but
that seems a pretty big leap for the compiler to make.

NB: None of this is based on a close reading of the standard. I am
more than happy to be corrected.

Bart van Ingen Schenau · Sep 29, 2010

Well, for POD types, default construction will give you junk, rather
than zeros, right?

No. Default construction is not a term used in the standard in
relation to POD types (there is exactly one hit in N3092 in the
context of thread objects). You are thinking of 'default-
initialisation'.
When you invoke a default constructor on a POD object or create a
temporary of such type (using 'member()' or 'type()'), you get a value
initialised object.

So if we could default construct the POD in a
vector.resize(), they could all be different junk. All std::vector
would have to do is maybe realloc() and increment size. But since
we're copy constructing, std::vector is (maybe) constrained to copy,
so every element (?must?) contain identical junk to what was created
at vector instantiation time.

No, because if the user does not provide an object to copy from, the
elements are copied from a *value initialised* temporary.
This results in the value 0 for POD objects.

Bart v Ingen Schenau

Bo Persson · Sep 29, 2010

Joshua said:
So what's wrong with reserve() and then push_back()?

Using uninitialized objects is safe only for POD structs (and
primitive types). You certainly wouldn't want to have
uninitialized class objects in your vector.

Click to expand...

C++03, 4.1 Lvalue-to-rvalue conversion / 1
"An lvalue (3.10) of a non-function, non-array type T can be
converted to an rvalue. [...] If the object to which the lvalue
refers [...] is uninitialized, a program that necessitates this
conversion has undefined behavior."

So, an implementation is free to add runtime checks for accessing
uninitialized data.

Or, more likely, the host hardware might trap invalid values. I
believe that is the major reason for this being UB.

Bo Persson

Joshua Maurice · Sep 29, 2010

C++03, 4.1 Lvalue-to-rvalue conversion / 1
"An lvalue (3.10) of a non-function, non-array type T can be
converted to an rvalue. [...] If the object to which the lvalue
refers [...] is uninitialized, a program that necessitates this
conversion has undefined behavior."

Click to expand...

So, an implementation is free to add runtime checks for accessing
uninitialized data.

Click to expand...

Or, more likely, the host hardware might trap invalid values. I
believe that is the major reason for this being UB.

Indeed. I was using the term "implementation" in a very general way
ala the C++ standard. The compiler, the OS, the hardware, and probably
more, must all play along to some degree to be conforming to the C++
standard.

Rui Maciel · Sep 30, 2010

Jim said:
Probably, but I am using std::vector because it's memory is guaranteed to
be contiguous and as I will be itereating over every triangle I believe
the data pipeline will be a little faster with contiguous memory.

Oh I see. When I suggested a std::list I had interpreted your need to resize the vector not as a
way to set the vector's size upfront but to alter it's size again (re-size the vector) somewhere in
a later stage.

Rui Maciel

std::vector reserve	1	Mar 21, 2011
STL std::string difference between reserve and resize ()	2	Dec 18, 2007
Question about STL string reserve()	2	Oct 7, 2006
vector reserve() and clear()	3	Nov 2, 2006
Vector reserve in a for_each	9	Feb 24, 2007
how does reserve() work?	3	Apr 14, 2007
vector::reserve,	5	Apr 29, 2006
How to go about building a crud app when you are a noob	1	Jan 2, 2023

reserve, resize, how about something in between

Jim Langston

Rui Maciel

Kai-Uwe Bux

Juha Nieminen

Goran

Öö Tiib

Bart van Ingen Schenau

gwowen

Juha Nieminen

Jim Langston

Jim Langston

Jim Langston

Joshua Maurice

Goran

Bart van Ingen Schenau

gwowen

Bart van Ingen Schenau

Bo Persson

Joshua Maurice

Rui Maciel

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads