how much memory does vector<bool> take?

Z

zl2k

hi, there
I am using a big, sparse binary array (size of 256^3). The size may be
changed in run time. I first thought about using the bitset but found
its size is unchangeable. If I use the vector<bool>, does each element
takes 4 bytes instead of 1 bit? I am using gcc3.4.4. There is a
bit_vector which is kind of old so I wont use that. Any other choices?
Thanks ahead.
zl2k
 
V

Victor Bazarov

zl2k said:
hi, there
I am using a big, sparse binary array (size of 256^3). The size may be
changed in run time. I first thought about using the bitset but found
its size is unchangeable. If I use the vector<bool>, does each element
takes 4 bytes instead of 1 bit?

I am using gcc3.4.4. There is a
bit_vector which is kind of old so I wont use that. Any other choices?

You can always roll your own...

V
 
M

Marcus Kwok

zl2k said:
I am using a big, sparse binary array (size of 256^3). The size may be
changed in run time. I first thought about using the bitset but found
its size is unchangeable. If I use the vector<bool>, does each element
takes 4 bytes instead of 1 bit? I am using gcc3.4.4. There is a
bit_vector which is kind of old so I wont use that. Any other choices?

Since vector<bool> is required to be a specialization, you should read
the following articles by Herb Sutter before deciding on using it:

vector<bool> Is Nonconforming, and Forces Optimization Choice
http://www.gotw.ca/publications/N1185.pdf

vector<bool>: More Problems, Better Solutions
http://www.gotw.ca/publications/N1211.pdf

Technically, even though vector<bool> is mentioned in the Standard,
its use is unspecified. Quoting from the second article:

Curiously, vector<bool> is not actually specified, so no current use
of it invokes well specified behavior. Its declaration appears in
the standard, but not a single function is specified. Note that the
argument "it's just the same as vector" fails because a vector<bool>
is demonstrably not a vector: it has a different interface (i.e.,
flip()), a different structure (e.g., reference is a class, not a
typedef for T&), does not meet the same requirements (e.g., container
and iterator requirements), etc.

Since you are dealing with a sparse array, maybe you could use a
std::map<int, bool> or something.
 
G

Greg

Marcus said:
Since vector<bool> is required to be a specialization, you should read
the following articles by Herb Sutter before deciding on using it:

vector<bool> Is Nonconforming, and Forces Optimization Choice
http://www.gotw.ca/publications/N1185.pdf

vector<bool>: More Problems, Better Solutions
http://www.gotw.ca/publications/N1211.pdf

Neither of the quoted articles presents an argument against using
vector<bool>. They are merely criticisms of vector<bool>'s
classification as a vector - an issue of interest only to those
designing the C++ standard library.
Technically, even though vector<bool> is mentioned in the Standard,
its use is unspecified. Quoting from the second article:

Curiously, vector<bool> is not actually specified, so no current use
of it invokes well specified behavior. Its declaration appears in
the standard, but not a single function is specified. Note that the
argument "it's just the same as vector" fails because a vector<bool>
is demonstrably not a vector: it has a different interface (i.e.,
flip()), a different structure (e.g., reference is a class, not a
typedef for T&), does not meet the same requirements (e.g., container
and iterator requirements), etc.

Since you are dealing with a sparse array, maybe you could use a
std::map<int, bool> or something.

Why? Whether vector<bool> should be called a "vector" or not - makes no
difference to the issue of how well it can solve a particular problem.
The only question that the programmer needs to decide is whether a
std:vector<bool> can do what the program needs it to do. And if that
task is to store a dynamically-resizable container of one-bit boolean
values - then the answer is clearly "yes." And there would no reason
for a program not to use a std::vector<bool> in that case.

Greg
 
M

Marcus Kwok

Greg said:
Neither of the quoted articles presents an argument against using
vector<bool>. They are merely criticisms of vector<bool>'s
classification as a vector - an issue of interest only to those
designing the C++ standard library.

From the first article:

2. vector<bool>::iterator does not meet the requirements of a
forward, bidirectional, or random-access iterator, although the
last is strongly implied by the specialization's naming and
position. This means that it may not work with a conforming
implementation of a standard library algorithm.

The possibility of not being able to use a vector<bool> with standard
library algorithms is an argument against using it in my book.

5. vector<bool>'s name is misleading because the things inside aren't
bools.

// Example 1: Works for every T except bool
//
template<class T>
void g( vector<T>& v ) {
T& r = v.front();
T* p = &*v.begin();
// ... do something with r and *p ...
}

If something is explicitly stated as being a vector, is it unreasonable
to assume that it should behave as a vector?

6. vector<bool> forces a specific optimization choice on all users by
enshrining it in the standard. That's probably not a good idea,
even if the actual performance overhead turns out to be negligible
for a given compiler for most applications; different users have
different requirements.

In this case, vector<bool> forces the "favour less space at the
expense of potentially slower speed" optimization choice on all
programs. The implicit assumption is that virtually all users of
a vector of bools will prefer "less space" at the expense of
"potentially slower speed," that they will be more
space-constrained than performance-constrained. This is clearly
untrue.

The key word that triggered my response was "sparse". If it is known
that the data is sparse, then it may not be necessary to store all 256^3
entries.
Whether vector<bool> should be called a "vector" or not - makes no
difference to the issue of how well it can solve a particular problem.
The only question that the programmer needs to decide is whether a
std:vector<bool> can do what the program needs it to do. And if that
task is to store a dynamically-resizable container of one-bit boolean
values - then the answer is clearly "yes." And there would no reason
for a program not to use a std::vector<bool> in that case.

....unless speed requirements are more important than memory
requirements.

Since vector<bool> uses a proxy class instead of storing true bools,
there is some additional overhead associated with every element access.
If these values are accessed in a tight loop, performance considerations
can be substantial.

I'm not saying not to use vector<bool> at all or that vector<bool> won't
meet the OP's requirements; I'm just saying that the OP should be aware
of the issues with it before deciding that he should "clearly" use it.
 
J

Jerry Coffin

[ ... ]
In this case, vector<bool> forces the "favour less space at the
expense of potentially slower speed" optimization choice on all
programs. The implicit assumption is that virtually all users of
a vector of bools will prefer "less space" at the expense of
"potentially slower speed," that they will be more
space-constrained than performance-constrained. This is clearly
untrue.

Although the required specialization of vector<bool> has
the _potential_ to reduce speed, the reality is that on
almost any reasonably recent processor, it will generally
_increase_ speed unless the vector involved is _quite_
small.

The situation is pretty simple: on most current
processors, the reduced size also means reduced memory
access and improved cache utilization. Currently, memory
is a lot slower than the processor as a rule (and the
ratio favors processors more strongly all the time). This
means that even if you have to do quite a lot of
computation to avoid a memory access, it's usually worth
it. In this case, there's not really a lot of extra
computation at all, and you're typically reducing memory
references by a ratio of at least 8:1 (and 32:1 isn't
unheard of).

[ ... ]
...unless speed requirements are more important than memory
requirements.

Depending heavily upon your target -- if your target has
a cache, chances are good that vector<bool> actually
improves speed (unless your memory use is so low in
general that even without the reduction in memory usage
it would all fit in the cache).

My own advice would be to use a typedef:

vector<boolean> my_vect;

And then test and profile with both:

typedef char boolean;

and:

typedef bool boolean;

and possibly even:

typedef int boolean;

and see which works better for your situation. The
conversion rules for bool in C++ are loose enough that
this will normally work without any extra work on your
part as to how you use your boolean values. The one place
you're at all likely to run into a problem is if you want
to provide separate overloads (or specializations) for
your booleans and some of the other types mentioned above
-- the typedef only creates a new name, not a new type.
 
M

Marcus Kwok

Jerry Coffin said:
My own advice would be to use a typedef:

vector<boolean> my_vect;

And then test and profile with both:

typedef char boolean;

and:

typedef bool boolean;

and possibly even:

typedef int boolean;

and see which works better for your situation.

Yes, I will agree that performance claims can only truly be determined
through profiling. Maybe even include tests with std::map<int, boolean>
(for the various types of "boolean" above) to see how they compare too
(assuming that the map satisfies the OP's requirements).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,996
Messages
2,570,238
Members
46,826
Latest member
robinsontor

Latest Threads

Top