Dynamic Memory Allocation

N

Nehil

can i find about following 2 things, by providing my own API of
functions for a C program :

1) static, global and const section of variables (pointers only), and
to pickout only those which are dynamically allocated.

2) Local variable which are pointers and have dynamically allocated
memory.

Actually, i want to develop an API, which will help in collection
garbage in C program. i.e. programmer will not explicitely call free()
function. The algorithm choosen is : Mark & Sweep.
if somehow i get the info about above mentioned points, the path
further will be easier for me.

I'm concentrating on pointer variables only, as they only can be
allocated memory dynamically. (plz correct me if i'm wrong).

please tell me, if i'm on wrong path. and if i'm not :), plz tell me
how to proceed to get information about dynamically allocated memory.
 
E

Eric Sosman

Nehil said:
can i find about following 2 things, by providing my own API of
functions for a C program :

1) static, global and const section of variables (pointers only), and
to pickout only those which are dynamically allocated.

2) Local variable which are pointers and have dynamically allocated
memory.

There is no portable way to test whether a particular
pointer points to dynamic, static, or automatic memory.
Actually, i want to develop an API, which will help in collection
garbage in C program. i.e. programmer will not explicitely call free()
function. The algorithm choosen is : Mark & Sweep.
if somehow i get the info about above mentioned points, the path
further will be easier for me.

I'm concentrating on pointer variables only, as they only can be
allocated memory dynamically. (plz correct me if i'm wrong).

You are wrong: A pointer variable can point to memory
of all three kinds. The same pointer variable can point to
different kinds of memory at different points during the
program's execution.
please tell me, if i'm on wrong path. and if i'm not :), plz tell me
how to proceed to get information about dynamically allocated memory.

There have been some implementations of garbage collection
for C. The resulting implementation cannot, I believe, be a
conforming C implementation (unless it never decides to collect
any "garbage" at all). GIYF.
 
S

santosh

Nehil said:
can i find about following 2 things, by providing my own API of
functions for a C program :

1) static, global and const section of variables (pointers only), and
to pickout only those which are dynamically allocated.

2) Local variable which are pointers and have dynamically allocated
memory.

Pointers can point to different types of objects, static, dynamic,
automatic etc. You cannot portably determine what type of memory it's
pointing to.
Actually, i want to develop an API, which will help in collection
garbage in C program. i.e. programmer will not explicitely call free()
function. The algorithm choosen is : Mark & Sweep.
if somehow i get the info about above mentioned points, the path
further will be easier for me.

I'm concentrating on pointer variables only, as they only can be
allocated memory dynamically. (plz correct me if i'm wrong).

please tell me, if i'm on wrong path. and if i'm not :), plz tell me
how to proceed to get information about dynamically allocated memory.

Try to study the internals of existing GCs like Boehm's
implementation. A quick Google search also turns up a lot of
interesting links with information on implementing GCs and their
associated issues.
 
B

Ben Bacarisse

Nehil said:
can i find about following 2 things, by providing my own API of
functions for a C program :

1) static, global and const section of variables (pointers only), and
to pickout only those which are dynamically allocated.

2) Local variable which are pointers and have dynamically allocated
memory.

No portable solution exists.

I'm concentrating on pointer variables only, as they only can be
allocated memory dynamically. (plz correct me if i'm wrong).

Your terminology is a little off, but the logic is way off. To
implement mark-and-sweep you need to find all pointers to allocated
memory, and these will not always be in "pointer variables".
Consider:

union either {
int i;
void *p;
};

and a code fragment like:

union either *ep = malloc(10 * sizeof *ep);
ep[0].i = 42;
ep[1].p = malloc(10);

You will have to find and mark ep[1].p. I use a union to show that
you can not tell, even by looking at the declaration, what may and may
not contain a pointer.
 
D

Default User

Nehil wrote:

Actually, i want to develop an API, which will help in collection
garbage in C program. i.e. programmer will not explicitely call free()
function. The algorithm choosen is : Mark & Sweep.

I recommend you thoroughly learn the standard C language before
embarking on any such task.

Oh, and "plz" is not a standard English abreviation for anything.




Brian
 
M

Malcolm McLean

Ben Bacarisse said:
Nehil said:
can i find about following 2 things, by providing my own API of
functions for a C program :

1) static, global and const section of variables (pointers only), and
to pickout only those which are dynamically allocated.

2) Local variable which are pointers and have dynamically allocated
memory.

No portable solution exists.

I'm concentrating on pointer variables only, as they only can be
allocated memory dynamically. (plz correct me if i'm wrong).

Your terminology is a little off, but the logic is way off. To
implement mark-and-sweep you need to find all pointers to allocated
memory, and these will not always be in "pointer variables".
Consider:

union either {
int i;
void *p;
};

and a code fragment like:

union either *ep = malloc(10 * sizeof *ep);
ep[0].i = 42;
ep[1].p = malloc(10);

You will have to find and mark ep[1].p. I use a union to show that
you can not tell, even by looking at the declaration, what may and may
not contain a pointer.
However it is only a problem is you cannot tolerate a few false positives.
Unless the program is deliberately engineered to break your system, it is
most unlikely that a valid 32-bit pointer will also be an integer of use. It
will happen once or twice, in which case you hold the data until the integer
changes. Only if the block pointed to happens to be very large do you have a
problem, although then it could be a severe one.
 
B

Ben Bacarisse

Malcolm McLean said:
Ben Bacarisse said:
Nehil said:
can i find about following 2 things, by providing my own API of
functions for a C program :

1) static, global and const section of variables (pointers only), and
to pickout only those which are dynamically allocated.

2) Local variable which are pointers and have dynamically allocated
memory.

No portable solution exists.

I'm concentrating on pointer variables only, as they only can be
allocated memory dynamically. (plz correct me if i'm wrong).

Your terminology is a little off, but the logic is way off. To
implement mark-and-sweep you need to find all pointers to allocated
memory, and these will not always be in "pointer variables".
Consider:

union either {
int i;
void *p;
};

and a code fragment like:

union either *ep = malloc(10 * sizeof *ep);
ep[0].i = 42;
ep[1].p = malloc(10);

You will have to find and mark ep[1].p. I use a union to show that
you can not tell, even by looking at the declaration, what may and may
not contain a pointer.
However it is only a problem is you cannot tolerate a few false
positives.

The problem is deeper than that. How do you know how big the area
pointed to be ep is? If can get that in some non-portable way from
ep, how do know which offsets within it have pointers? On a machine
which can align pointers on any boundary, you have to look at every
single byte offset from ep. All the false positives lead to further
such searches (because you have to assume there is valid block size,
wherever you look for it). Any of these may lead to a run-time fault
and you need to find some way round that.

But you can't do any of this in a portable way.
system, it is most unlikely that a valid 32-bit pointer will also be
an integer of use. It will happen once or twice, in which case you
hold the data until the integer changes. Only if the block pointed to
happens to be very large do you have a problem, although then it could
be a severe one.

The block may appear to be huge entirely by accident and not be an
allocated block at all. You will have to follow leads even when that
are daft.
 
M

Malcolm McLean

Ben Bacarisse said:
Malcolm McLean said:
Ben Bacarisse said:
can i find about following 2 things, by providing my own API of
functions for a C program :

1) static, global and const section of variables (pointers only), and
to pickout only those which are dynamically allocated.

2) Local variable which are pointers and have dynamically allocated
memory.

No portable solution exists.

<snip>
I'm concentrating on pointer variables only, as they only can be
allocated memory dynamically. (plz correct me if i'm wrong).

Your terminology is a little off, but the logic is way off. To
implement mark-and-sweep you need to find all pointers to allocated
memory, and these will not always be in "pointer variables".
Consider:

union either {
int i;
void *p;
};

and a code fragment like:

union either *ep = malloc(10 * sizeof *ep);
ep[0].i = 42;
ep[1].p = malloc(10);

You will have to find and mark ep[1].p. I use a union to show that
you can not tell, even by looking at the declaration, what may and may
not contain a pointer.
However it is only a problem is you cannot tolerate a few false
positives.

The problem is deeper than that. How do you know how big the area
pointed to be ep is?

The block may appear to be huge entirely by accident and not be an
allocated block at all. You will have to follow leads even when that
are daft.
That's another problem. In practical terms you need control of the malloc
system so that you have a list of blocks.
Then dynamic blocks can and often do contain pointers to other dynamic
blocks. But since you have a full list of blocks the problem is only O(N).
You can sweep, and remove orphan blocks. Ring blocks are a bit trickier;
ultimately you've got to track back to scope and caller scope.

Another snag is if the caller keeps a pointer in a disguised form. Again,
normally this would only be done in a deliberate attempt to subvert the
system, but it could be legitimate if, for instance, a pointer is
incremented to point to block plus some user-defined header.
 
B

Ben Bacarisse

Malcolm McLean said:
That's another problem.

We are in danger of going round in circles. I know that with enough
assumptions about the environment, sufficient non-portable code, and a
lot of machinery, something can be done that will work for a subset of
C programs. The OP seemed to think that "dynamically allocated"
memory could be found by looking only at "pointer variables". It
seemed worth showing that that is not enough.
In practical terms you need control of the malloc system so that you
have a list of blocks.

In practical terms you need to use a language that has garbage
collection (or is amenable to it). The big question hanging over all
this is why on Earth do it at all? Why struggle (and it would be a
struggle) to do something half-baked for C when there are so many fine
programming languages languishing in obscurity?

There are one or two complex execution patterns where it is not
possible to decide in the normal way when something can be safely
freed. For example: an interpreter for a functional programming
language. But in such cases you can portably implement a garbage
collected memory area for that application. In the example of an FP,
you provide garbage collection for that language, not for the C
program that implements it.
 
G

Gordon Burditt

Another snag is if the caller keeps a pointer in a disguised form. Again,
normally this would only be done in a deliberate attempt to subvert the
system, but it could be legitimate if, for instance, a pointer is
incremented to point to block plus some user-defined header.

Another type of snag is the pointer kept in a *file*. There's nothing
wrong with that if the pointer is written and later read by the same
instance of the running program.

That could horribly slow down garbage collection if the program has
a 4TB file open, even if there aren't any pointers in it, but there
*might* be.
 
C

CBFalconer

.... snip about garbage collection ...
In practical terms you need to use a language that has garbage
collection (or is amenable to it). The big question hanging over
all this is why on Earth do it at all? Why struggle (and it
would be a struggle) to do something half-baked for C when there
are so many fine programming languages languishing in obscurity?

One (of many) possibilities is that you don't want the obscurity of
program delays at possibly awkward times. Another is the
(non-)reliability of such methods.
 
J

jacob navia

Nehil said:
can i find about following 2 things, by providing my own API of
functions for a C program :

1) static, global and const section of variables (pointers only), and
to pickout only those which are dynamically allocated.

2) Local variable which are pointers and have dynamically allocated
memory.

Actually, i want to develop an API, which will help in collection
garbage in C program. i.e. programmer will not explicitely call free()
function. The algorithm choosen is : Mark & Sweep.
if somehow i get the info about above mentioned points, the path
further will be easier for me.

I'm concentrating on pointer variables only, as they only can be
allocated memory dynamically. (plz correct me if i'm wrong).

please tell me, if i'm on wrong path. and if i'm not :), plz tell me
how to proceed to get information about dynamically allocated memory.

The garbage collector of Boehm does this for C and C++.
Read that source code first, before embarking in your own
coding, since it is working since several years.

That garbage collector is distributed (among others) with the
lcc-win32 compiler
http://www.cs.virginia.edu/~lcc-win32
 
K

Keith Thompson

Ben Bacarisse said:
In practical terms you need to use a language that has garbage
collection (or is amenable to it).
[...]

In practical terms, my understanding is that the Boehm garbage
collector works well enough for many purposes, even in C. It cannot
be implemented in 100% portable C (for one thing, I think it has to
look in registers; for another, it probably has to do pointer
comparisons that aren't guaranteed by the standard), and it's always
possible to write C code that can break it (e.g., breaking a pointer
value down to its constituent bytes, negating each one, then
re-building the pointer later). But there are circumstances where the
benefits outweigh the problems.
 
B

Ben Bacarisse

Keith Thompson said:
Ben Bacarisse said:
In practical terms you need to use a language that has garbage
collection (or is amenable to it).
[...]

In practical terms, my understanding is that the Boehm garbage
collector works well enough for many purposes, even in C.

That implementation looks very good. They have certainly covered all
the bases save for those that are logically doomed.
It cannot
be implemented in 100% portable C (for one thing, I think it has to
look in registers; for another, it probably has to do pointer
comparisons that aren't guaranteed by the standard), and it's always
possible to write C code that can break it (e.g., breaking a pointer
value down to its constituent bytes, negating each one, then
re-building the pointer later). But there are circumstances where the
benefits outweigh the problems.

Yes, there must be situations where it is useful in C, but one of the
suggested uses is to remove memory leaks from existing code. That
just rubs all my aesthetic fur the wrong way. If I were still being
paid for fixing bugs, I am sure I would use it for that too, but I
would feel it was the wrong solution.

I have a gut feeling (which I am totally unable to justify with any
evidence) that the other use cases are very rare and are likely to
ones where other solutions such using another language or writing an
application specific memory management system are also options.

Did you have any uses cases in mind?
 
K

Keith Thompson

Ben Bacarisse said:
In practical terms, my understanding is that the Boehm garbage
collector works well enough for many purposes, even in C.

That implementation looks very good. They have certainly covered all
the bases save for those that are logically doomed. [...]
Yes, there must be situations where it is useful in C, but one of the
suggested uses is to remove memory leaks from existing code. That
just rubs all my aesthetic fur the wrong way. If I were still being
paid for fixing bugs, I am sure I would use it for that too, but I
would feel it was the wrong solution.

I suspect (and this is only speculation, really) that you should
designed your code from the start to use GC (garbage collection).
Applying GC to an existing application is likely to be difficult.
Applying GC to an existing application *for the purpose of fixing
memory leaks* is likely to be an act of desperation.
I have a gut feeling (which I am totally unable to justify with any
evidence) that the other use cases are very rare and are likely to
ones where other solutions such using another language or writing an
application specific memory management system are also options.

Did you have any uses cases in mind?

Nope. I've actually never used Boehm GC, or any other garbage
collector in C. (I do a lot of programming in Perl, which has
built-in GC; in Perl, I don't really think about it, because I don't
have to. But Perl is a very different language from C.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,962
Messages
2,570,134
Members
46,690
Latest member
MacGyver

Latest Threads

Top