How to figure out the size of a buffer from malloc

BartC

It started off with the buffers being NUL-free strings. However now I
need to work with strings with embedded NUL characters. This means I
can't find the length of the buffer with strlen. And it would be too much
change to add a new argument to pass the length of the buffer. So the
only good solution is for myfunc to work out by itself 1) if the
buffer is a dynamic pointer or stack array and 2) the size of the buffer.

Is it just nul characters inside the strings, or could it be any char value?

If they are normal strings but with a zero-char included, then just use
another value to use as a terminator, with a special version of strlen().
(Such strings won't work with most of the standard string functions anyway.)
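
A minimal sketch of such a "special version of strlen()", assuming the caller
picks a terminator value that never occurs in the data. The name strlen_term
and its second parameter are made up for illustration; they are not part of
any standard library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: counts the bytes before the first occurrence of
 * 'term'. The caller must guarantee that 'term' is present in the buffer,
 * exactly as strlen() relies on a terminating '\0'.
 */
size_t strlen_term(const char *s, char term)
{
    const char *p = s;

    assert(s != NULL);
    while (*p != term)
        ++p;

    return (size_t)(p - s);
}
```

With a buffer holding 'a', '\0', 'b' followed by the terminator '\xFF', this
returns 3, where plain strlen() would stop at 1.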

If all the char values could appear in the string (eg. codes 0 to 255 for an
8-bit char), then you can't use a terminator value.

But this is no different from knowing the size of a dynamically allocated
array.

If it's not possible to supply a separate length parameter, then there are
any number of ways to encode the length in the data: as 1, 2 or 4 bytes at
the start, or just before the start, of the string, for example. Or perhaps
use an int array instead, so that you can use a terminator. Or instead pass a
pointer to a (char*, int) struct, so that the string and length are passed
together. Or... it's difficult to know what's viable and what isn't, or what
would be too much work.
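
As one illustration of "encode the length in the data", here is a rough sketch
that stores a 2-byte length at the start of the buffer. The layout (big-endian,
payload capped at 65535 bytes) and the lp_* names are assumptions chosen for
the example, not something from the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical layout: bytes 0..1 hold the payload length (big-endian),
 * and the payload, which may contain '\0' bytes, follows immediately.
 * Returns NULL if the payload is too long or the allocation fails.
 */
unsigned char *lp_create(const void *data, size_t len)
{
    unsigned char *buf;

    if (len > 0xFFFFu)
        return NULL;

    buf = malloc(2 + len);
    if (buf == NULL)
        return NULL;

    buf[0] = (unsigned char)(len >> 8);
    buf[1] = (unsigned char)(len & 0xFFu);
    if (len > 0)
        memcpy(buf + 2, data, len);

    return buf;
}

/* Reads the length back out of the 2-byte prefix. */
size_t lp_length(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((size_t)buf[0] << 8) | buf[1];
}

/* Returns a pointer to the payload that follows the prefix. */
const unsigned char *lp_data(const unsigned char *buf)
{
    assert(buf != NULL);
    return buf + 2;
}
```

The function that receives the buffer still takes a single pointer, but every
caller has to build its buffers through lp_create(), so this only helps when
the buffers come from code you control.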

You can't reliably get the info you need from malloc() anyway.
 
ImpalerCore

On 02/17/2012 02:45 PM, pembed2012 wrote: ...
It's absolutely guaranteed, at this point, that sizeof(buf) ==
sizeof(void*). As a result, 2 of your next three lines are pointless,
which suggests that you were not aware of that fact.
Why did you think it was possible for the condition of that if() to not
be met?
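
The point about sizeof can be checked with a small sketch. The function names
here are invented for the illustration; the only thing being shown is that an
array argument has already decayed to a pointer by the time the callee sees it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for myfunc: reports what sizeof sees in the callee. */
size_t size_seen_by_callee(void *buf)
{
    return sizeof buf;  /* size of the pointer itself, not of the buffer */
}

/* In the caller the array type is still visible, so sizeof gives 10 there. */
void decay_demo(void)
{
    char ary[10];

    assert(sizeof ary == 10);
    assert(size_seen_by_callee(ary) == sizeof(void *));
}
```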

Often this function will be called with a dynamic pointer from malloc.
However, it is also possible to call it with a stack array:

char ary[10];
char* ptr=malloc(10);
myfunc((void*)ary);  // should work
myfunc((void*)ptr);  // should also work

It started off with the buffers being NUL-free strings. However now I
need to work with strings with embedded NUL characters. This means I
can't find the length of the buffer with strlen. And it would be too much
change to add a new argument to pass the length of the buffer. So the
only good solution is for myfunc to work out by itself 1) if the
buffer is a dynamic pointer or stack array and 2) the size of the buffer.

Since you seem quite interested in deriving how to access the size of
an allocation from just a pointer, one method is to store the size
yourself in your own allocator that wraps 'malloc'.

\code snippet
#include <stdint.h>
#include <stdlib.h>

uintmax_t current_memory;

void* track_malloc( size_t size )
{
  void* mem_blk = NULL;
  void* p = NULL;

  /* Refuse requests whose total size would overflow size_t. */
  if ( size > SIZE_MAX - sizeof (size_t) )
    return NULL;

  /*
   * To track the amount of used memory, each memory allocation is
   * prefixed with a size_t object that allows the free function to
   * update 'current_memory' correctly when releasing memory.
   */
  mem_blk = malloc( sizeof (size_t) + size );

  if ( mem_blk )
  {
    *((size_t*)mem_blk) = size;
    p = (unsigned char*)mem_blk + sizeof (size_t);

    current_memory += size;
  }

  return p;
}

void track_free( void* p )
{
  void* mem_blk = NULL;
  size_t p_size;

  if ( p )
  {
    /* Step back over the prefix to recover the original malloc pointer. */
    mem_blk = (unsigned char*)p - sizeof (size_t);
    p_size = *((size_t*)mem_blk);

    free( mem_blk );

    current_memory -= p_size;
  }
}
\endcode

In essence, it adds a header to each pointer that stores the size of
the allocation in a 'size_t' prefix.

 -------- ----------------------------
| size_t |       ... object ...       |
 -------- ----------------------------
/        /
mem      p

Everything seems fine until you decide to allocate an object that has
stricter alignment requirements than a 'size_t'.

 -------- --------------
| size_t |    double    |  (xxx not aligned xxx)
 -------- --------------
/        /
mem      p

One of the requirements for malloc is to provide a memory block that
is aligned for any object. So while 'mem' is aligned properly for a
double, the double object at 'p' is not. For example, on my 32-bit
Windows, 'size_t' has an alignment of 4, while 'double' has an
alignment of 8. The proper method to prefix the allocation pointer
for a 'double' is to insert a 'double' sized object.

    4        4          8
 -------- -------- --------------
| size_t |  pad   |    double    |
 -------- -------- --------------
/                 /
mem               p

Now your 'double' object is guaranteed to be aligned. The general
solution is to pad the prefix of each memory allocation to the maximum
alignment of the system, referenced here as ALIGN_MAX. This can be
obtained from the compiler (under C11, as the alignment of
'max_align_t'), or estimated using offsetof and a union of a large set
of distinct types. Search for Chris Thomasson, who has posted about it
before.
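
One way ALIGN_MAX might be defined is sketched below. This is an assumption
about how to fill it in, not the definition used in the snippets in this post:
it takes the C11 route when available and otherwise falls back to an offsetof
estimate over a small, deliberately incomplete set of types. The names
align_probe_types, align_probe and align_max are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* C11: max_align_t has the strictest fundamental alignment. */
#define ALIGN_MAX _Alignof(max_align_t)
#else
/*
 * Pre-C11 estimate: the offset of a union placed after a single char is
 * the union's alignment, which is that of its most strictly aligned
 * member. Extend the union with further types if your platform needs it.
 */
union align_probe_types
{
    long l;
    double d;
    long double ld;
    void *p;
    void (*fp)(void);
};

struct align_probe
{
    char c;
    union align_probe_types u;
};

#define ALIGN_MAX offsetof(struct align_probe, u)
#endif

/* Small accessor so the value can be inspected or printed. */
size_t align_max(void)
{
    return ALIGN_MAX;
}
```

The fallback is only an estimate: a type with stricter alignment that is
missing from the union (vector types, for example) would not be accounted for.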

So instead of using 'sizeof' to calculate the size of the 'size_t'
prefix, you'll need a different version to pad it to a multiple of
ALIGN_MAX.

\code snippet
#include <stdint.h>
#include <stdlib.h>

/* ALIGN_MAX is assumed to be defined elsewhere as the maximum alignment. */

uintmax_t current_memory;

/* Rounds 'type_sz' up to the next multiple of ALIGN_MAX. */
size_t gc_maxalign_sizeof( size_t type_sz )
{ return ((type_sz + ALIGN_MAX - 1) / ALIGN_MAX) * ALIGN_MAX; }

#define c_maxalign_sizeof( type ) ( gc_maxalign_sizeof( sizeof (type) ) )

void* track_malloc( size_t size )
{
  void* mem_blk = NULL;
  void* p = NULL;

  /* Refuse requests whose padded total would overflow size_t. */
  if ( size > SIZE_MAX - c_maxalign_sizeof (size_t) )
    return NULL;

  /*
   * To track the amount of used memory, each memory allocation is
   * prefixed with a size_t object, padded to ALIGN_MAX, that allows
   * the free function to update 'current_memory' correctly when
   * releasing memory.
   */
  mem_blk = malloc( c_maxalign_sizeof (size_t) + size );

  if ( mem_blk )
  {
    *((size_t*)mem_blk) = size;
    p = (unsigned char*)mem_blk + c_maxalign_sizeof (size_t);

    current_memory += size;
  }

  return p;
}

void track_free( void* p )
{
  void* mem_blk = NULL;
  size_t p_size;

  if ( p )
  {
    /* Step back over the padded prefix to the original malloc pointer. */
    mem_blk = (unsigned char*)p - c_maxalign_sizeof (size_t);
    p_size = *((size_t*)mem_blk);

    free( mem_blk );

    current_memory -= p_size;
  }
}
\endcode

Note that while this prefixes the size, none of the string functions
(strlen) know about this size prefix, so you still have to define your
own interface. It does allow you to get away with passing one
augmented pointer since you pack the allocation size with it.

size_t track_strlen( const char* str )
{
  void* mem_blk = (unsigned char*)str - c_maxalign_sizeof (size_t);
  return *((size_t*)mem_blk);
}

I'd consider using a special typedef for these augmented string
pointers, to distinguish functions that expect them from those that
do not.

This is just one possible path. Another path is to define a
string structure that stores the length and buffer size.

struct c_string
{
  char* buf;
  size_t buf_size;
  size_t length;
};

Then you write a string interface that works with 'struct c_string',
which is able to handle embedded nul characters since you store the
length.

\code snippet
size_t c_string_length( const struct c_string* cstr )
{
  return cstr->length;
}
\endcode
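
To show how such an interface copes with embedded nul characters, here is a
rough sketch of a constructor and destructor for 'struct c_string'. The
functions c_string_init and c_string_free are hypothetical names invented for
this example; a real string library would offer a much fuller interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct c_string
{
    char *buf;
    size_t buf_size;
    size_t length;
};

/*
 * Hypothetical constructor: copies 'len' bytes, which may include '\0',
 * and appends one extra '\0' for the benefit of debuggers and printf.
 * Returns 0 on success and -1 on failure.
 */
int c_string_init(struct c_string *cstr, const void *data, size_t len)
{
    assert(cstr != NULL);

    cstr->buf = NULL;
    cstr->buf_size = 0;
    cstr->length = 0;

    if (len == (size_t)-1)      /* len + 1 would overflow */
        return -1;

    cstr->buf = malloc(len + 1);
    if (cstr->buf == NULL)
        return -1;

    if (len > 0)
        memcpy(cstr->buf, data, len);
    cstr->buf[len] = '\0';
    cstr->buf_size = len + 1;
    cstr->length = len;

    return 0;
}

/* Releases the buffer and resets the fields. */
void c_string_free(struct c_string *cstr)
{
    assert(cstr != NULL);

    free(cstr->buf);
    cstr->buf = NULL;
    cstr->buf_size = 0;
    cstr->length = 0;
}
```

Initialised from the three bytes 'a', '\0', 'b', the stored length is 3 even
though strlen() on the buffer reports 1.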

You can write your own (a lot of time/work), or decide to use one of
the string libraries out there, like bstring. I would strongly
consider using an external string library that was created to handle
strings with embedded nulls. If you can walk on someone else's
shoulders to step over a mud pit ...

Best regards,
John D.
 
Kaz Kylheku

It started off with the buffers being NUL-free strings. However now I
need to work with strings with embedded NUL characters. This means I

So the requirements changed.

can't find the length of the buffer with strlen. And it would be too much
change to add a new argument to pass the length of the buffer.

That is the right thing to do. Guess how long it will take you to do it
and then give the estimate to whoever changed the requirement.
Aren't you being paid?

So the
only good solution is for myfunc to work out by itself 1) if the
buffer is a dynamic pointer or stack array and 2) the size of the buffer.

This is not possible in any portable way. Some malloc implementations have a
function which can tell you how large the underlying block is (but not
necessarily with byte accuracy). As far as obtaining the size of an array in
automatic storage, forget it.
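
For reference, the non-portable functions alluded to above include
malloc_usable_size() on glibc, _msize() on Microsoft's C runtime and
malloc_size() on macOS. The sketch below wraps them behind one hypothetical
helper, usable_size_or_zero; the platform checks are a best guess and should
be verified against your own toolchain. The result may be larger than the
size requested, and passing anything other than a live pointer from malloc
(a stack array, for instance) is undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(_MSC_VER)
#include <malloc.h>
#define ALLOC_USABLE_SIZE(p) _msize(p)
#elif defined(__APPLE__)
#include <malloc/malloc.h>
#define ALLOC_USABLE_SIZE(p) malloc_size(p)
#elif defined(__linux__)
#include <malloc.h>
#define ALLOC_USABLE_SIZE(p) malloc_usable_size(p)
#endif

/*
 * Hypothetical wrapper: returns the usable size of a block obtained from
 * malloc, or 0 if 'p' is NULL or no extension is known for this platform.
 */
size_t usable_size_or_zero(void *p)
{
#ifdef ALLOC_USABLE_SIZE
    return p ? (size_t)ALLOC_USABLE_SIZE(p) : 0;
#else
    (void)p;
    return 0;
#endif
}
```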

You can add a length field to string objects without introducing a new
parameter everywhere. Just change the type from "char *" to "string *",
where string is a struct type such as:

typedef struct { int size; char *data; } string;

Or even:

/* flexible array hack */
typedef struct { int size; char data[1]; } string;
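
Since C99 the same idea can be written with a flexible array member instead
of the one-element hack. The following is only a sketch: the type is called
fstring here to keep it apart from the typedefs above, it uses size_t rather
than int for the size, and fstring_new is a made-up helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: 'data' takes no space in sizeof (fstring). */
typedef struct
{
    size_t size;
    char data[];
} fstring;

/*
 * Hypothetical helper: allocates header and payload in a single block and
 * copies 'n' bytes, which may include '\0'. Release with free().
 */
fstring *fstring_new(const void *src, size_t n)
{
    fstring *s;

    if (n > (size_t)-1 - sizeof *s)   /* header + payload would overflow */
        return NULL;

    s = malloc(sizeof *s + n);
    if (s == NULL)
        return NULL;

    s->size = n;
    if (n > 0)
        memcpy(s->data, src, n);

    return s;
}
```

Either way the functions take a single "string *" argument, so no extra length
parameter is needed, but every caller has to create its strings through this
type.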
 
Michael Angelo Ravera

As others have told you, the actual allocation, the size of the buffer allocated, and the actual amount of space consumed are three implementation-dependent details (and are not likely the same as each other or as the value passed to malloc()) for which the definition of malloc() provides no information.

If what you want is a string implementation, use one. It is likewise fairly easy to create a salloc() function that explicitly tags the size of the allocation as desired, or to allocate a large pool and manage it yourself.
 
Kenny McCormack

Ya-know, it is rather curious that all the answers are the typical CLC "you
can't do it portably" (and thus you should be ashamed for even asking) and
that "therefore you should invest a lot of time and effort to re-write your
application thusly" - so that you can do it in a way of which we approve.

When...

It is pretty obvious that with a little encouragement, the OP could probably
figure out how to do what he wants ON HIS IMPLEMENTATION (which is all that
matters!), with a lot less effort (on his part) and cost (to his employers)
than it would be to re-write it the way you people suggest.

 
Shao Miller

I'd agree with you, if:
- the OP was sincere, and
- they hadn't asked for 90% portability, and
- if they specified one or more targets (after being asked), and
- if they didn't appear to have some misunderstandings about 'sizeof'
and 'void *'

:)

So how about it, pembed2012: Do you have a handful of target
implementations that you'd like to consider? Maybe each one has a
method to accomplish your goal, as Mr. Kenny McCormack kindly suggests.
 
Stephen Sprunk

I doubt anyone here objects, even in principle, to writing non-portable
code. However, at minimum people should be aware that's what they're
doing, and there are many other newsgroups available to help them with
the particular implementation(s) they care about.

Yes, in this specific case there is likely a platform-specific way to
determine the size of a dynamic allocation from a pointer. However,
Standard C does not guarantee that there is or specify what it might be,
so CLC _as an implementation-neutral forum_ is not the right place to
find out how to do it.

S
 
