wrapping std::vector<> to track memory usage?

jacek.dziedzic

Hi!

I need to be able to track memory usage in a medium-sized
application I'm developing. The only significant (memory-wise) non-
local objects are of two types -- std::vector<> and of a custom class
simple_vector<> that is a hand-rolled substitute for array<>. With the
latter I have code that tracks all allocations and destructions, so I
can account for all the memory.

The question is about std::vector<> -- how can I track memory usage
by individual std::vector's? I'm thinking along the lines of a wrapper
(templated) class, like std::tracked_vector<> which would have the
original std::vector<> as a private member, delegate relevant
operations to the underlying std::vector, while doing the accounting
job behind the scenes.

Are there any particular pitfalls to such a design? Would it suffice
to delegate only the methods I actually use,
like .size(), .reserve(), .at(), [] operators, etc. or would I need to
delegate all possible methods? I fear that since I use the
std::vectors in STL algorithms and in other STL containers (vectors of
vectors), I might need to delegate all the iterator stuff and other
methods I don't use directly. Is this the case? How do I account for
vectors of vectors, so that I don't bill the same memory twice?

Surely, someone must have gone that route, are there any particular
do's and don'ts there? Is it feasible? I don't want to reinvent the
wheel.

TIA,
- J.
 
Maxim Yegorushkin

> I need to be able to track memory usage in a medium-sized
> application I'm developing. The only significant (memory-wise) non-
> local objects are of two types -- std::vector<> and of a custom class
> simple_vector<> that is a hand-rolled substitute for array<>. With the
> latter I have code that tracks all allocations and destructions, so I
> can account for all the memory.
>
> The question is about std::vector<> -- how can I track memory usage
> by individual std::vector's? I'm thinking along the lines of a wrapper
> (templated) class, like std::tracked_vector<> which would have the
> original std::vector<> as a private member, delegate relevant
> operations to the underlying std::vector, while doing the accounting
> job behind the scenes.

[]

You could use a custom allocator that maintains a counter of how
much memory has been allocated. Something like this:

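#include <cstddef>
#include <iostream>
#include <memory>
#include <vector>

// Bytes currently allocated through any counted_allocator<T>.
std::size_t allocated = 0;

void print_allocated(int n)
{
    std::cout << n << ": " << allocated << '\n';
}

// Note: this relies on std::allocator members (pointer, the hinted
// allocate overload) that were deprecated in C++17 and removed in C++20,
// so it compiles as C++17 or earlier.
template<class T>
struct counted_allocator : std::allocator<T>
{
    template<class U>
    struct rebind { typedef counted_allocator<U> other; };

    typedef std::allocator<T> base;

    typedef typename base::pointer pointer;
    typedef typename base::size_type size_type;

    counted_allocator() {}

    // Converting constructor, so that containers which rebind the
    // allocator to another type can construct one from another.
    template<class U>
    counted_allocator(counted_allocator<U> const&) {}

    pointer allocate(size_type n)
    {
        allocated += n * sizeof(T);
        return this->base::allocate(n);
    }

    pointer allocate(size_type n, void const* hint)
    {
        allocated += n * sizeof(T);
        return this->base::allocate(n, hint);
    }

    void deallocate(pointer p, size_type n)
    {
        allocated -= n * sizeof(T);
        this->base::deallocate(p, n);
    }
};

int main()
{
    typedef std::vector<int, counted_allocator<int> > IntVec;

    print_allocated(0);
    {
        IntVec v;
        v.resize(1000);
        print_allocated(1);   // v owns a buffer for at least 1000 ints
        v.resize(2000);
        print_allocated(2);   // old buffer released, larger one allocated
        IntVec u = v;
        print_allocated(3);   // u holds its own copy of the elements
    }
    print_allocated(4);       // both vectors destroyed
}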


Output (assuming a 4-byte int):
0: 0
1: 4000
2: 8000
3: 16000
4: 0
 
Juha Nieminen

> Would it suffice to delegate only the methods I actually use,
> like .size(), .reserve(), .at(), [] operators, etc. or would I need to
> delegate all possible methods?

std::vector itself obviously doesn't require anything about your
delegating functions. You can implement those functions which you need
and leave the rest. (Of course you will find that you will have to keep
adding delegating functions as you start using vector functions in your
code which you weren't using before. But as long as you have access to
the wrapper class, it shouldn't be a huge problem.)

Note, however, that by using this technique you will only be able to
track the amount of space requested from std::vector *explicitly*.
There's no way of knowing how much memory the std::vector is *really*
allocating behind the scenes. Also, even if you were able to do that, it
wouldn't tell you the real amount of RAM used. All allocations
have a certain overhead to them, and std::vector in particular easily
causes memory fragmentation when it grows, so you might end up with a
significant amount of unused memory which is nevertheless allocated from
the system. In the worst-case scenario
the real memory usage of your program (i.e. what your program requests
the OS to allocate for it) might be more than double the amount of
memory that you are *explicitly* allocating (and tracking).

Explicit memory allocation tracking is a lot less useful than one
might think, at least if what you are trying to do is to estimate how
much RAM your program is consuming. You will only get a very rough lower
limit, while the actual memory usage may be much larger (even
significantly larger in the worst cases).
 
jacek.dziedzic

> You could use a custom allocator that maintains a counter of how
> much memory has been allocated. Something like this:
> [helpful code snipped]

Thanks a lot, you've been very helpful, I'll look into that!

- J.
 
jacek.dziedzic

> Note, however, that by using this technique you will only be able to
> track the amount of space requested from std::vector *explicitly*.
> There's no way of knowing how much memory the std::vector is *really*
> allocating behind the scenes.

What would std::vector<> be allocating "behind the scenes"?
Are you talking about the capacity margin that usually causes
reallocations to consume geometrically larger amounts of memory each
time, to satisfy asymptotic performance requirements? Because apart
from that I hope that a typical implementation will not consume more
than several tens of bytes for any bookkeeping.

> Also, even if you were able to do that, it
> wouldn't tell you the real amount of RAM used. All allocations
> have a certain overhead to them, and std::vector in particular easily
> causes memory fragmentation when it grows, so you might end up with a
> significant amount of unused memory which is nevertheless allocated from
> the system. In the worst-case scenario
> the real memory usage of your program (i.e. what your program requests
> the OS to allocate for it) might be more than double the amount of
> memory that you are *explicitly* allocating (and tracking).

Yes, I realize that. Using OS-based approaches (the 'top' command)
I can see how much memory is really used up. I'm mostly attempting
to track potential memory leaks, so I'll be satisfied with the values
the program "would use if there were no fragmentation, overhead
and VM issues". This is also to calculate how memory requirements change
with respect to some parameters, such as system size.

> Explicit memory allocation tracking is a lot less useful than one
> might think, at least if what you are trying to do is to estimate how
> much RAM your program is consuming. You will only get a very rough lower
> limit, while the actual memory usage may be much larger (even
> significantly larger in the worst cases).

Right.

Thanks for the helpful answer. Are there any drawbacks to the
custom-allocator approach that Maxim Yegorushkin suggested?
It looks like a lot less work than creating a wrapper. It
seems to me that it would also take care of the vector-of-vectors
issue, i.e. not billing the same memory twice.

thanks,
- J.
 
Juha Nieminen

> What would std::vector<> be allocating "behind the scenes"?

When you perform a "vec.push_back(value);" you might track that the
size of the vector increased by sizeof(value), when in fact it may well
be that the size of the vector increased by a lot more.

> I'm mostly attempting to track potential memory leaks

Aren't there tools to do exactly that, such as valgrind?

> Thanks for the helpful answer. Are there any drawbacks to the
> custom-allocator approach that Maxim Yegorushkin suggested?

It's probably the easiest way to track the amount of memory explicitly
allocated in the program. Of course you would have to make sure that
everything is indeed using that allocator.
 
jacek.dziedzic

> Aren't there tools to do exactly that, such as valgrind?

Yes, but there are three difficulties I'm experiencing with valgrind:
- not all platforms are supported (Itanium!),
- valid code sometimes crashes valgrind's "virtual machine",
especially hand-tuned, vendor-specific code that does weird
trickery to squeeze the last cycles out of every pipeline,
- it can only be used for debugging; production code that begins
to leak after a month of uptime cannot feasibly be "simulated"
under valgrind.

> It's probably the easiest way to track the amount of memory explicitly
> allocated in the program. Of course you would have to make sure that
> everything is indeed using that allocator.

OK. Thank you!

cheers,
- J.
 
Hendrik Schober

Juha said:
> When you perform a "vec.push_back(value);" you might track that the
> size of the vector increased by sizeof(value), when in fact it may well
> be that the size of the vector increased by a lot more.

The capacity, not the size.

Schobi
 
