memset vs fill and iterators vs pointers

Joe C

I'm a hobbyist, and made the foray into C++ from non-C-type languages about
a year ago. I was "cleaning up" some code I wrote to make it more "C++
like" and have a few questions. I'm comfortable using new/delete when
dealing with arrays, and so far haven't used the STL (e.g. vectors) very much
when dealing with POD. I'm using a class to dump files into. The class
puts the file data into a 32-bit array, then offers both 32-bit and char*
access to the file data. The reason I've committed this blasphemy is that I
need to access the file-header information from a byte-oriented viewpoint,
but the actual data is 32-bit and *will* be word-aligned with respect to the
start of the file.

First...is there a better way using streams to do the same?

Second...I've been using memcpy and memset...for POD, are there compelling
reasons to use copy() and fill/fill_n() instead (when dealing with POD
arrays)?

Third...if I load a file into a 32-bit vector and the file's byte length is
not an exact multiple of 4 bytes...what happens to the last (incomplete) word
of the file?

I hope my questions are clear.

Joe
 
David Hilsee

Joe C said:
I'm a hobbyist, and made the foray into C++ from non-C-type languages about
a year ago. I was "cleaning up" some code I wrote to make it more "C++
like" and have a few questions. I'm comfortable using new/delete when
dealing with arrays, and so far haven't used the STL (e.g. vectors) very much
when dealing with POD. I'm using a class to dump files into. The class
puts the file data into a 32-bit array, then offers both 32-bit and char*
access to the file data. The reason I've committed this blasphemy is that I
need to access the file-header information from a byte-oriented viewpoint,
but the actual data is 32-bit and *will* be word-aligned with respect to the
start of the file.

First...is there a better way using streams to do the same?

I don't understand your question. I'm sure you could use C++ stream classes
instead of C I/O functions, if that's what you're talking about. I'm not
sure if that's necessarily better, in your situation.

Second...I've been using memcpy and memset...for POD, are there compelling
reasons to use copy() and fill/fill_n() instead (when dealing with POD
arrays)?

While std::memcpy is not guaranteed to work for overlapping regions of
memory, std::copy works for overlapping sequences as long as the destination
does not begin inside the source range (std::copy_backward covers that
case). Also, std::memset sets a region of memory to all bits zero, and
that's not guaranteed to be the representation of zero for certain types
(e.g. pointers may not use all bits zero to represent null). C++'s std::fill
does not inherently write "all bits zero" to a region of memory. Using the
C++ equivalents may help you escape some of the gotchas that the C functions
bring to the table.

Third...if I load a file into a 32-bit vector and the file's byte length is
not an exact multiple of 4 bytes...what happens to the last (incomplete) word
of the file?

The std::vector is not so different from the array. In fact, the
std::vector uses an array internally. See the FAQ
(http://www.parashift.com/c++-faq-lite/), section 34 ("Container classes and
templates"), question 3 ("Is the storage for a std::vector<T> guaranteed to
be contiguous?").
 
David Hilsee

While std::memcpy is not guaranteed to work for overlapping regions of
memory, std::copy works for overlapping sequences as long as the destination
does not begin inside the source range (std::copy_backward covers that
case). Also, std::memset sets a region of memory to all bits zero, and
that's not guaranteed to be the representation of zero for certain types
(e.g. pointers may not use all bits zero to represent null). C++'s std::fill
does not inherently write "all bits zero" to a region of memory. Using the
C++ equivalents may help you escape some of the gotchas that the C functions
bring to the table.

Here I assumed that zero was being passed to memset, which is the common
usage. I also failed to mention that std::fill and std::copy are more
type-safe than their C equivalents and do not require the programmer to
consider the size of the elements (using code like numElems *
sizeof(Element)). In general, they are easier to use.

The std::vector is not so different from the array. In fact, the
std::vector uses an array internally. See the FAQ
(http://www.parashift.com/c++-faq-lite/), section 34 ("Container classes and
templates"), question 3 ("Is the storage for a std::vector<T> guaranteed to
be contiguous?").

Here, I should have instead said "In fact, the std::vector uses contiguous
storage internally".
 
tom_usenet

I'm a hobbyist, and made the foray into C++ from non-C-type languages about
a year ago. I was "cleaning up" some code I wrote to make it more "C++
like" and have a few questions. I'm comfortable using new/delete when
dealing with arrays, and so far haven't used the STL (e.g. vectors) very much
when dealing with POD. I'm using a class to dump files into. The class
puts the file data into a 32-bit array, then offers both 32-bit and char*
access to the file data. The reason I've committed this blasphemy is that I
need to access the file-header information from a byte-oriented viewpoint,
but the actual data is 32-bit and *will* be word-aligned with respect to the
start of the file.

First...is there a better way using streams to do the same?

Not really, at least not if your code is already working. You would be
better off having the class do the reading of the header, so that this
detail is encapsulated from users of the class. Then it would return a
pointer to the start of the *real* "32-bit array" (by which I assume
you mean an array of unsigned int or similar).
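The encapsulation suggested here could look roughly like the sketch below. It is only an illustration under assumptions: the class name FileImage, its member names, and the 8-byte header size (kHeaderBytes) are all hypothetical, and the header length is assumed to be a multiple of 4 so the data stays word-aligned. The byte view goes through unsigned char*, which may legally alias any object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical class: owns a file image in 32-bit storage and hides the header.
class FileImage {
public:
    // Hypothetical header size in bytes; assumed to be a multiple of 4.
    static constexpr std::size_t kHeaderBytes = 8;

    // Copies `size` raw bytes (e.g. already read from a file) into
    // zero-initialized, word-aligned storage.
    FileImage(const unsigned char* bytes, std::size_t size)
        : byteCount_(size), storage_((size + 3) / 4, 0u) {
        assert(size >= kHeaderBytes);
        std::memcpy(storage_.data(), bytes, size);
    }

    // Byte-oriented view of the header.
    const unsigned char* header() const {
        return reinterpret_cast<const unsigned char*>(storage_.data());
    }

    // Word-oriented view of the data that follows the header.
    const std::uint32_t* words() const {
        return storage_.data() + kHeaderBytes / 4;
    }

    // Number of 32-bit words after the header (a trailing partial word counts).
    std::size_t wordCount() const { return storage_.size() - kHeaderBytes / 4; }

    // Original length of the file image in bytes.
    std::size_t byteCount() const { return byteCount_; }

private:
    std::size_t byteCount_;
    std::vector<std::uint32_t> storage_;
};
```

Callers then only ever see header() for the byte-oriented part and words() for the 32-bit part, and never need to cast anything themselves.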

Second...I've been using memcpy and memset...for POD, are there compelling
reasons to use copy() and fill/fill_n() instead (when dealing with POD
arrays)?

For POD, copy is like memmove in that it works with overlapping
ranges, provided the destination does not start inside the source
range (copy_backward handles that case).
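To make the overlap rule concrete, here is a small sketch (the function names are made up for illustration). std::copy is fine when shifting elements toward the front, std::copy_backward is the one to reach for when shifting toward the back, and memmove handles either direction but counts in bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Shift elements one slot toward the front: the destination starts before the
// source range, which std::copy permits.
std::vector<int> shiftLeft(std::vector<int> v) {
    if (v.size() > 1) std::copy(v.begin() + 1, v.end(), v.begin());
    return v;
}

// Shift one slot toward the back: the destination starts inside the source
// range, so std::copy_backward is the appropriate algorithm.
std::vector<int> shiftRight(std::vector<int> v) {
    if (v.size() > 1) std::copy_backward(v.begin(), v.end() - 1, v.end());
    return v;
}

// memmove copes with either direction, but works in bytes, so the element
// size has to be multiplied in by hand.
std::vector<int> shiftRightMemmove(std::vector<int> v) {
    if (v.size() > 1) {
        std::memmove(v.data() + 1, v.data(), (v.size() - 1) * sizeof(int));
    }
    return v;
}
```

Starting from {1, 2, 3, 4}, shiftLeft gives {2, 3, 4, 4}, and both right-shift variants give {1, 1, 2, 3}.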

fill is different from memset. memset only allows you to set every
byte to the same value, whereas fill allows you to set every element
(which may be, e.g., an unsigned int) to the same value.
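A short sketch of that difference, using hypothetical helper names: passing 1 to memset writes 0x01 into every byte, so each 32-bit element becomes 0x01010101, while fill stores the value 1 in every element.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Sets every BYTE to 0x01, so each 32-bit element ends up as 0x01010101.
std::vector<std::uint32_t> setWithMemset(std::size_t n) {
    std::vector<std::uint32_t> v(n);
    if (n != 0) std::memset(v.data(), 1, n * sizeof(std::uint32_t));
    return v;
}

// Sets every ELEMENT to 1; no sizeof arithmetic is involved.
std::vector<std::uint32_t> setWithFill(std::size_t n) {
    std::vector<std::uint32_t> v(n);
    std::fill(v.begin(), v.end(), 1u);
    return v;
}
```

The two only agree for values whose bytes are all identical, such as 0 or 0xFFFFFFFF.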

Third...if I load a file into a 32-bit vector and the file's byte length is
not an exact multiple of 4 bytes...what happens to the last (incomplete) word
of the file?

Assuming the vector is 0-initialized, it depends on the byte order
your platform uses. Either the high order or low order bytes of the
last value will be 0, which will give the word a particular value.
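As an illustration, the sketch below copies raw bytes into a zero-initialized vector whose length is rounded up, so the trailing partial word is kept. The helper names toWords and isLittleEndian are invented for this example, and the bytes are taken from memory rather than read from a real file.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies raw file bytes into zero-initialized 32-bit storage, rounding the
// element count up so a trailing partial word is kept rather than dropped.
std::vector<std::uint32_t> toWords(const unsigned char* bytes, std::size_t size) {
    std::vector<std::uint32_t> words((size + 3) / 4, 0u);
    if (size != 0) std::memcpy(words.data(), bytes, size);
    return words;
}

// True on a little-endian platform (lowest-addressed byte is least significant).
bool isLittleEndian() {
    const std::uint32_t one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}
```

For a 5-byte input whose last byte is 0xAA, the second word comes out as 0x000000AA on a little-endian machine and 0xAA000000 on a big-endian one, because the three padding bytes stay zero.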

Tom
 
Joe C

tom_usenet said:
Not really, at least not if your code is already working. You would be
better off having the class do the reading of the header, so that this
detail is encapsulated from users of the class. Then it would return a
pointer to the start of the *real* "32-bit array" (by which I assume
you mean an array of unsigned int or similar).


For POD, copy is like memmove in that it works with overlapping
ranges, provided the destination does not start inside the source
range (copy_backward handles that case).

fill is different from memset. memset only allows you to set every
byte to the same value, whereas fill allows you to set every element
(which may be, e.g., an unsigned int) to the same value.


Assuming the vector is 0-initialized, it depends on the byte order
your platform uses. Either the high order or low order bytes of the
last value will be 0, which will give the word a particular value.

Tom

Thanks, Tom. Your reply is really helpful. I think that I will leave
things as they are, since the prog is working and useful for me, and has
already been fairly thoroughly streamlined. One more question...suppose I
have large amounts of memory that I want to clear. Do you know if there is
a speed advantage if it's done using fill with the native integer data-type
vs using byte-oriented memset? It's a little hard to measure, since the
operation is really fast in either case...as such I suppose it makes no
practical difference, huh?

Thanks again for the reply.

Joe
 
tom_usenet

Thanks, Tom. Your reply is really helpful. I think that I will leave
things as they are, since the prog is working and useful for me, and has
already been fairly thoroughly streamlined. One more question...suppose I
have large amounts of memory that I want to clear. Do you know if there is
a speed advantage if it's done using fill with the native integer data-type
vs using byte-oriented memset? It's a little hard to measure, since the
operation is really fast in either case...as such I suppose it makes no
practical difference, huh?

You'll love this article:

http://www.cuj.com/documents/s=7990/cujcexp1910alexandr/alexandr.htm

Note that for zeroing large amounts of memory, all reasonable
techniques work out much the same, since the bottleneck is memory
bandwidth.
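For anyone who wants to measure it anyway, a rough timing sketch follows. The function names are hypothetical, and a single timed call like this is noisy; it is meant as a starting point, not a rigorous benchmark.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <cstring>
#include <vector>

// Times one call of `clear` on the buffer and returns elapsed microseconds.
template <typename Clear>
long long timeClear(std::vector<std::uint32_t>& buf, Clear clear) {
    const auto start = std::chrono::steady_clock::now();
    clear(buf);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Byte-oriented clear: the size must be given in bytes.
void clearWithMemset(std::vector<std::uint32_t>& buf) {
    if (!buf.empty()) std::memset(buf.data(), 0, buf.size() * sizeof(std::uint32_t));
}

// Element-oriented clear: writes a 32-bit zero into each element.
void clearWithFill(std::vector<std::uint32_t>& buf) {
    std::fill(buf.begin(), buf.end(), 0u);
}
```

Calling timeClear(buf, clearWithMemset) and timeClear(buf, clearWithFill) on buffers much larger than the cache, in an optimized build and repeated several times, should give a fair comparison. An optimizing compiler may well generate similar code for both.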

Tom
 
