Fred Zwarts
I have an application in which data from one place needs to be copied to another place.
These places can be network sockets, files on disk, etc.
In all cases the data consists of unformatted unsigned 32-bit integers.
Each record starts with a 32-bit integer containing the record size.
The record data follows.
Then either the count of the next record follows, or the data ends.
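For example, a stream with two records, one of three data words and one of two,
would contain the seven integers
    3, a, b, c, 2, d, e
where 3 and 2 are the counts and a..e are the data values.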
The records usually contain up to a few thousand integers,
but occasionally a record can be much larger.
For efficiency, the records are read as a block.
Reading the data elements one by one would significantly reduce the I/O performance.
The application uses a std::vector<uint32_t> as a buffer for reading and writing the data.
First it reads the count.
Then, if the vector is too small, it can do one of two things:
either resize the vector with resize, or only increase its capacity with reserve.
Then it reads the record into the vector.
Then it writes the count and writes the data.
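A minimal sketch of that loop with the resize variant, assuming hypothetical
readBytes/writeBytes helpers that stand in for the real I/O calls (recv/send,
fread/fwrite, ...) and transfer exactly n bytes:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Placeholder I/O primitives, not the application's real functions.
    void readBytes(void *dst, std::size_t n);
    void writeBytes(const void *src, std::size_t n);

    void copyOneRecord(std::vector<std::uint32_t> &buffer)
    {
        std::uint32_t count;
        readBytes(&count, sizeof count);          // read the record size
        if (buffer.size() < count)
            buffer.resize(count);                 // new elements are value-initialized
        readBytes(buffer.data(), count * sizeof(std::uint32_t));
        writeBytes(&count, sizeof count);         // write the count, then the data
        writeBytes(buffer.data(), count * sizeof(std::uint32_t));
    }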
The question is about resizing the buffer.
Using resize has the disadvantage that all new elements of the vector are value-initialized,
which is unnecessary work, because these values are immediately overwritten
by the subsequent read operation.
Alternatively, the vector size is kept at 1, and only the capacity of the vector is increased.
The address of the first element of the vector and the size of the record are supplied
first to the read function and afterwards to the write function.
This means that elements beyond the vector's defined size are used as buffer space.
However, the standard says that all elements of a vector are contiguous in memory,
so the reserved space must be contiguous as well.
Logically, then, there should be no problem with using this reserved space in this way.
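The reserve-only variant described here would look roughly like this (same
placeholder readBytes/writeBytes helpers as in the sketch above; whether
reading into the reserved-but-unused elements is valid is exactly what the
question is about):

    void copyOneRecordReserveOnly(std::vector<std::uint32_t> &buffer)
    {
        // buffer.size() is kept at 1; only the capacity is grown.
        std::uint32_t count;
        readBytes(&count, sizeof count);
        if (buffer.capacity() < count)
            buffer.reserve(count);                // no value-initialization
        // Elements [1, count) now live only in reserved capacity.
        readBytes(buffer.data(), count * sizeof(std::uint32_t));
        writeBytes(&count, sizeof count);
        writeBytes(buffer.data(), count * sizeof(std::uint32_t));
    }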
Is there a reason to think that there are environments where this does not work?
Is there another way to resize the buffer without initializing all new elements?