Pointers and Allocated Memory

dgoodmaniii

+AMDG

This may be a stupid question, but I'm having trouble
finding an answer. Even Google is yielding precious little
information comprehensible to my beginning mind.

I am writing a function itoa(int num, char *s), which
converts an integer into a character string. I've
successfully removed all the array subscripting from it in
favor of pointers, as K&R hath commanded me. However,
there's one in the calling function that doesn't seem to
work. If I call it thus:
*******
char s[MAXLINE];
itoa(num,s);
*******
It works. If I call it thus:
*******
char *s;
itoa(num,s);
*******
It works sometimes, and sometimes it throws a segfault. I
can only assume that this is due to memory allocation; it
works when I'm lucky enough to have the (s+i) characters
assigned to unallocated memory, and it doesn't when I'm not.
Is this assumption correct?

Is there a way to do this where the length of s isn't
limited to some arbitrary value? Or is this what all that
mysterious malloc() stuff is about, and I just need to wait
until later in the book?

Thanks.
 
Julienne Walker

Is there a way to do this where the length of s isn't
limited to some arbitrary value?

There are ways, yes. However, you'll find that all of them involve a
trade-off. C is especially bad at working with strings. In your case,
you're lucky enough that the maximum size needed for s is
deterministic. All you need to do is figure out how many digits are
allowed for INT_MAX, and add 2 (one for the sign and one for the
terminating null character).

Or is this what all that mysterious malloc() stuff is about

For the most part, yes.
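
A minimal sketch of that sizing argument, assuming a C99 compiler. The names INT_DIGITS_BOUND, INT_STR_SIZE and longest_int_length are made up for this illustration, and the bound deliberately overestimates instead of counting the digits of INT_MAX exactly.

```c
#include <limits.h>
#include <stdio.h>

/* Every decimal digit carries more than 3 bits of information, so an int
 * with at most (sizeof(int) * CHAR_BIT - 1) value bits never needs more
 * than this many decimal digits. It is an upper bound, not an exact count. */
#define INT_DIGITS_BOUND ((sizeof(int) * CHAR_BIT - 1) / 3 + 1)

/* Digits, plus one for the sign, plus one for the terminating null. */
#define INT_STR_SIZE (INT_DIGITS_BOUND + 2)

/* Formats the most negative int (the longest case) into a buffer of that
 * size and returns the number of characters written, excluding the null. */
int longest_int_length(void)
{
    char s[INT_STR_SIZE];
    return snprintf(s, sizeof s, "%d", INT_MIN);
}
```

Because the size is computed from sizeof(int), recompiling on a platform with a wider int adjusts the buffer without any change to the calling code.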
 
Tom St Denis

There are ways, yes. However, you'll find that all of them involve a
trade-off. C is especially bad at working with strings. In your case,
you're lucky enough that the maximum size needed for s is
deterministic. All you need to do is figure out how many digits are
allowed for INT_MAX, and add 2 (one for the sign and one for the
terminating null character).

The simpler solution would be [if you're to write a new function] have
IT allocate the memory and pass a pointer to the caller. That way if
INT_MAX changes (e.g. compiling your app on a new platform) the
calling application doesn't need to be ported.

e.g.

char *itoa(int num);

[Hint: this is what us software developers call design...]

Tom
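
One possible shape for a callee-allocates version, as a sketch only. The name itoa_alloc is hypothetical, and the body leans on C99 snprintf() to measure and format the number, which is an assumption and not the hand-written conversion from the original question.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name; returns a freshly allocated decimal string for num,
 * or NULL on failure. The caller owns the result and must free() it. */
char *itoa_alloc(int num)
{
    /* C99: with a NULL buffer and size 0, snprintf only reports the length. */
    int len = snprintf(NULL, 0, "%d", num);
    if (len < 0)
        return NULL;

    char *s = malloc((size_t)len + 1);   /* +1 for the terminating null */
    if (s == NULL)
        return NULL;

    snprintf(s, (size_t)len + 1, "%d", num);
    return s;
}
```

A caller would write something like `char *s = itoa_alloc(num);`, check the result for NULL, use it, and then call `free(s)`.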
 
Seebs

Is there a way to do this where the length of s isn't
limited to some arbitrary value? Or is this what all that
mysterious malloc() stuff is about, and I just need to wait
until later in the book?

Exactly so. "char *s" doesn't give you a pointer TO anything, it
just gives you a thing which could hold the address of some space,
if you initialized it to do so.

So, yeah, this is where stuff like malloc() comes in.

-s
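
To make that concrete, here is a small sketch of two ways to give s somewhere to point before it is passed along. write_ok is a hypothetical stand-in for itoa(), and MAXLINE is given an arbitrary value for the example.

```c
#include <stdlib.h>

#define MAXLINE 100   /* stand-in for the MAXLINE in the original post */

/* Hypothetical stand-in for itoa(): writes "ok" through the pointer it is given. */
static void write_ok(char *s)
{
    *s++ = 'o';
    *s++ = 'k';
    *s = '\0';
}

/* Returns 0 if both calls had real storage to write to, -1 if malloc failed. */
int pointer_demo(void)
{
    char line[MAXLINE];
    char *s;

    s = line;            /* s now holds the address of line[0] */
    write_ok(s);

    s = malloc(MAXLINE); /* s now holds the address of heap storage */
    if (s == NULL)
        return -1;
    write_ok(s);
    free(s);

    return 0;
}
```

In both cases s is assigned an address before it is used; the failing call in the question skips that step.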
 
Stefan Ram

Is there a way to do this where the length of s isn't
limited to some arbitrary value? Or is this what all that
mysterious malloc() stuff is about, and I just need to wait
until later in the book?

I have written a function »saldbl« to do this.

It can be used as follows:

{ double const test = 3.579;
s_type const r = saldbl( test );
if( r ){ printf( "r = \"%s\"\n", r ); sfree( r ); }}

This will possibly print:

r = "3.579"

The definition is part of my German-language C FAQ at:

http://www.purl.org/stefan_ram/pub/c_faq_de
 
dgoodmaniii

+AMDG
char *itoa(int num);

[Hint: this is what us software developers call design...]

Okay, so once I learn about malloc() and friends, I would
just call malloc() for the initial allocation, then
realloc() for an additional byte every time a digit is
added, until I'm through?

It sounds like that, for my stage of learning, then (I know
K&R through this part of Chapter 5), continuing with arbitrary
memory limits (arbitrary in the sense that they have nothing
to do with available memory) is unavoidable. Am I correct?

Thanks again.
 
Tom St Denis

+AMDG
char *itoa(int num);
[Hint: this is what us software developers call design...]

Okay, so once I learn about malloc() and friends, I would
just call malloc() for the initial allocation, then
realloc() for an additional byte every time a digit is
added, until I'm through?

It sounds like that, for my stage of learning, then (I know
K&R through this part of Chapter 5), continuing with arbitrary
memory limits (arbitrary in the sense that they have nothing
to do with available memory) is unavoidable.  Am I correct?

The point is that you can work out the size [from INT_MAX, etc] at
compile time of your library. As a user of it, I just pass you an
int, and you sort it out.

Suppose you're not converting to ascii or it's not an int. Say it's
compression. Say I give you a buffer to compress. One way to do the
API would be that I pass you an output buffer of roughly equal size
[if it doesn't compress just memcpy it out + header] or I pass you the
input and you return me the output in a buffer you allocate somehow
based on what you're doing internally.

In this case, you know you're converting values of a known range, so
you could, for example, allocate [say] 32 bytes which would be way
more than enough for 32/64 bit platforms. Or you could do the realloc
trick. The problem with that is not only the overhead of reallocing
but that you waste memory anyways with a heap block. Might as well
just allocate once with a buffer that shouldn't be too small [bonus
points for bounds checking] and then go on your way.

Tom
 
Julienne Walker

On Dec 29, 11:28 am, (e-mail address removed) wrote:
There are ways, yes. However, you'll find that all of them involve a
trade-off. C is especially bad at working with strings. In your case,
you're lucky enough that the maximum size needed for s is
deterministic. All you need to do is figure out how many digits are
allowed for INT_MAX, and add 2 (one for the sign and one for the
terminating null character).

The simpler solution would be [if you're to write a new function] have
IT allocate the memory and pass a pointer to the caller.

I'm not a fan of that approach because it drastically increases the
chances of a memory leak. Having the caller manage memory and the
callee use it (with a size parameter) is the more robust solution in
my opinion. It's easier to keep track of memory and easier to verify
correctness all around.

That way if INT_MAX changes (e.g. compiling your app on a new
platform) the calling application doesn't need to be ported.

That makes no sense. If INT_MAX changes between implementations,
recompiling will adjust it accordingly. The only portability concern
is if you make unwarranted assumptions about the value represented by
INT_MAX. I fail to see how your suggested approach and rationale are
connected.
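
A sketch of the caller-manages-memory style with a size parameter, assuming C99 (where division truncates toward zero). The name itoa_n and its return convention are invented for this example.

```c
#include <stddef.h>

/* Hypothetical variant of itoa(): writes num in decimal into s, which has
 * room for size bytes. Returns the length written (excluding the null),
 * or -1 if the buffer is too small. */
int itoa_n(int num, char *s, size_t size)
{
    char tmp[sizeof(int) * 3 + 2];   /* generous scratch space (assumes 8-bit bytes) */
    char *p = tmp;
    int negative = num < 0;

    /* Generate digits in reverse. Taking each digit's absolute value
     * separately avoids overflowing on -INT_MIN. */
    do {
        int digit = num % 10;
        *p++ = (char)('0' + (digit < 0 ? -digit : digit));
        num /= 10;
    } while (num != 0);
    if (negative)
        *p++ = '-';

    size_t len = (size_t)(p - tmp);
    if (len + 1 > size)
        return -1;   /* would not fit, so write nothing */

    while (p != tmp)
        *s++ = *--p;   /* copy back in the right order */
    *s = '\0';
    return (int)len;
}
```

The caller declares `char s[MAXLINE];` as before and passes `sizeof s` as the third argument, so nothing is ever allocated that the caller has to remember to free.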
 
Seebs

Okay, so once I learn about malloc() and friends, I would
just call malloc() for the initial allocation, then
realloc() for an additional byte every time a digit is
added, until I'm through?

Typically, you'd keep track of how much space you've allocated, and realloc
using some strategy that involves fewer reallocations.

It sounds like that, for my stage of learning, then (I know
K&R through this part of Chapter 5), continuing with arbitrary
memory limits (arbitrary in the sense that they have nothing
to do with available memory) is unavoidable. Am I correct?

Probably. It's often more practical, anyway -- there's a ton of stuff
you will benefit from developing familiarity with before you start trying
to deal with memory allocation.

Also, assuming a reasonably modern system, you can set the arbitrary memory
limits "large enough". Gone are the days when there was a compelling reason
to stick to a 256-byte buffer in a test program and risk overruns.

-s
 
Julienne Walker

Mysterious?

Q: Can I have enough memory for 20 ints please
A: Sure, put this address in your pointer. No cost. Just return it when you are
finished with it.

Hardly rocket science.

Concepts which are trivial to us can be baffling to a beginner. A
little sympathy goes a long way in helping someone struggle through
figuring out the basics.
 
Seebs

Concepts which are trivial to us can be baffling to a beginner. A
little sympathy goes a long way in helping someone struggle through
figuring out the basics.

Yeah, but you're talking to one of the forum trolls, there. As you might
expect, part of their dogged insistence that the regulars are not helpful
to newbies is that they seem to go out of their way to try to drive
newbies away.

-s
 
John Bode

+AMDG

This may be a stupid question, but I'm having trouble
finding an answer.  Even Google is yielding precious little
information comprehensible to my beginning mind.

I am writing a function itoa(int num, char *s), which
converts an integer into a character string.  I've
successfully removed all the array subscripting from it in
favor of pointers, as K&R hath commanded me.  However,
there's one in the calling function that doesn't seem to
work.  If I call it thus:
*******
char s[MAXLINE];
itoa(num,s);
*******
It works.  If I call it thus:
*******
char *s;
itoa(num,s);
*******
It works sometimes, and sometimes it throws a segfault.  I
can only assume that this is due to memory allocation; it
works when I'm lucky enough to have the (s+i) characters
assigned to unallocated memory, and it doesn't when I'm not.
Is this assumption correct?

Since s is uninitialized (I'm assuming it's not declared static or at
file scope, which would implicitly initialize it to NULL), it contains
a random bit pattern that may not correspond to a writable memory
address, or it corresponds to a writable memory address containing
something important that you're clobbering, either of which could cause
a segfault.

Is there a way to do this where the length of s isn't
limited to some arbitrary value?  Or is this what all that
mysterious malloc() stuff is about, and I just need to wait
until later in the book?

Thanks.

As others have mentioned, for this particular problem you already know
how big your buffer needs to be based on the maximum integer value
your system can support; for example, a 64-bit integer requires up to
16 hex digits, 20 decimal digits, 22 octal digits, or 64 binary digits
to be represented (not accounting for sign and the 0 terminator). You
could statically allocate a big enough buffer (22 for decimal
representation) and just not worry about memory management at all.

In the general case, it's possible to extend a dynamically allocated
buffer as needed using realloc():

/* Fragment: needs <stdlib.h> for malloc(), realloc(), free() and exit(). */
size_t i = 0;
size_t currentSize = 0;
char *buffer = malloc(SOME_INITIAL_SIZE);
if (!buffer)
{
    /**
     * The call to malloc() failed; for this example, we simply bail
     */
    exit(EXIT_FAILURE);
}

/**
 * Track the current size of the buffer
 */
currentSize = SOME_INITIAL_SIZE;
while (not_done_yet())
{
    /**
     * Do we need to extend the buffer?
     */
    if (i == currentSize)
    {
        /**
         * Yes. Use realloc() to double the buffer size.
         * Assign the result to a temporary in case the
         * realloc() call fails and returns a NULL so
         * we don't lose our original pointer
         */
        char *tmp = realloc(buffer, currentSize * 2);
        if (tmp)
        {
            buffer = tmp;
            currentSize *= 2;
        }
        else
        {
            /**
             * The call to realloc() failed; again, this example
             * treats it as a fatal error and bails immediately.
             * Free the memory we've already allocated to this
             * point and exit.
             */
            free(buffer);
            exit(EXIT_FAILURE);
        }
    }

    buffer[i++] = whatever();
}

There are a number of strategies for extending the buffer size; you
can either extend it by a fixed amount, multiply by some scaling
factor (e.g., double it as in the example above), or extend it based
on some other quantity (say the length of a string that's being
appended to the buffer). All have their pluses and minuses.
Generally, you want to minimize the number of calls to realloc(); you
definitely do *not* want to call it for every single byte.
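
For anyone who wants something that compiles as it stands, the same doubling idea can be wrapped up as below. This is only a sketch: struct strbuf and strbuf_push are illustrative names, not from any library, and the starting size of 16 is arbitrary.

```c
#include <stdlib.h>

/* A growable character buffer; the struct and function names are
 * illustrative, not from any library. */
struct strbuf {
    char  *data;
    size_t used;   /* bytes in use */
    size_t size;   /* bytes allocated */
};

/* Appends one character, doubling the allocation when it is full.
 * Returns 0 on success, -1 if memory could not be obtained (the
 * existing contents are left intact in that case). */
int strbuf_push(struct strbuf *b, char c)
{
    if (b->used == b->size) {
        size_t newsize = b->size ? b->size * 2 : 16;
        char *tmp = realloc(b->data, newsize);   /* realloc(NULL, n) acts like malloc(n) */
        if (tmp == NULL)
            return -1;
        b->data = tmp;
        b->size = newsize;
    }
    b->data[b->used++] = c;
    return 0;
}
```

Start with `struct strbuf b = { NULL, 0, 0 };`, push characters as they arrive, and call `free(b.data)` when finished.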
 
dgoodmaniii

John Bode said:
There are a number of strategies for extending the buffer size; you
can either extend it by a fixed amount, multiply by some scaling
factor (e.g., double it as in the example above), or extend it based
on some other quantity (say the length of a string that's being
appended to the buffer). All have their pluses and minuses.
Generally, you want to minimize the number of calls to realloc(); you
definitely do *not* want to call it for every single byte.

Thanks to everyone for your responses; I'm responding to
this one particularly for utility's sake, not because I
don't appreciate the others, which were very helpful.

The general run of the responses seems to be as follows:

a.) Don't worry about it right now; allocate your strings
with an array that's larger than anything you'll want and
focus on other concepts until you get to malloc() later.
b.) When you get to malloc, allocate more than you'll need;
check on it; if you need more, allocate more than you need
then; and so on. While calling realloc() for every byte
would reduce memory consumption, the processing overhead
makes it silly.

Am I following everyone correctly?
 
Nick

Thanks to everyone for your responses; I'm responding to
this one particularly for utility's sake, not because I
don't appreciate the others, which were very helpful.

The general run of the responses seems to be as follows:

a.) Don't worry about it right now; allocate your strings
with an array that's larger than anything you'll want and
focus on other concepts until you get to malloc() later.
b.) When you get to malloc, allocate more than you'll need;
check on it; if you need more, allocate more than you need
then; and so on. While calling realloc() for every byte
would reduce memory consumption, the processing overhead
makes it silly.

Am I following everyone correctly?

Spot on.
 
Moi

The general run of the responses seems to be as follows:

a.) Don't worry about it right now; allocate your strings with an array
that's larger than anything you'll want and focus on other concepts
until you get to malloc() later. b.) When you get to malloc, allocate
more than you'll need; check on it; if you need more, allocate more than
you need then; and so on. While calling realloc() for every byte would
reduce memory consumption, the processing overhead makes it silly.

Am I following everyone correctly?

Exactly.

There is a third fact hidden:

c) arrays and pointers are not the same.
At some instances they may _look_ the same, (even behave the same)
but they are not.

HTH,
AvK
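
One place where the difference shows up directly is sizeof. A small sketch, with MAXLINE given an arbitrary value and both function names invented for the example:

```c
#include <stddef.h>

#define MAXLINE 100   /* stand-in value for this illustration */

/* Size of an array object: all MAXLINE bytes of storage. */
size_t array_object_size(void)
{
    char a[MAXLINE];
    return sizeof a;
}

/* Size of a pointer object: just enough to hold one address,
 * no matter what (if anything) it points to. */
size_t pointer_object_size(void)
{
    char a[MAXLINE];
    char *p = a;   /* the array name converts to a pointer to a[0] here */
    (void)p;
    return sizeof p;
}
```

The first function returns 100; the second returns whatever a char pointer occupies on the platform, commonly 4 or 8.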
 
John Bode

Thanks to everyone for your responses; I'm responding to
this one particularly for utility's sake, not because I
don't appreciate the others, which were very helpful.

The general run of the responses seems to be as follows:

a.)  Don't worry about it right now; allocate your strings
with an array that's larger than anything you'll want and
focus on other concepts until you get to malloc() later.

Allocate an array that's as big as it *needs* to be to handle the
general case. For this particular problem, assuming a decimal
representation, you need an array that can hold 22 characters
(including the sign and nul terminator) to handle any 64-bit value.
But yeah, you can worry about dynamic allocation later.

b.)  When you get to malloc, allocate more than you'll need;
check on it; if you need more, allocate more than you need
then; and so on.  While calling realloc() for every byte
would reduce memory consumption, the processing overhead
makes it silly.

It's not just processing overhead; excessive calls to realloc() can
lead to unnecessary heap fragmentation.

One problem with the doubling strategy I showed is that you can wind
up with a lot of unused space. If my initial array size is 10 bytes
and I need to store 41 characters, I'll wind up having allocated 80
bytes; almost half of the allocated space is wasted. You want your
initial guess to be as close to right as possible, and you want your
extension strategy to get as close to right as possible in the fewest
extents possible.

Am I following everyone correctly?

More or less.
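
The arithmetic behind the 10-byte example can be checked with a few lines. doubled_capacity is a name made up for this sketch, which assumes a non-zero starting size and ignores overflow.

```c
#include <stddef.h>

/* Returns the capacity reached by starting at initial bytes and doubling
 * until at least needed bytes fit. Assumes initial > 0 and ignores overflow. */
size_t doubled_capacity(size_t initial, size_t needed)
{
    size_t cap = initial;
    while (cap < needed)
        cap *= 2;
    return cap;
}
```

Starting from 10 and needing 41, the capacity goes 10, 20, 40, 80, so the call returns 80 and 39 of those bytes go unused.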
 
James Harris

+AMDG

This may be a stupid question, but I'm having trouble
finding an answer.  Even Google is yielding precious little
information comprehensible to my beginning mind.

I am writing a function itoa(int num, char *s),

I'd avoid calling it itoa because that's rather similar to a standard
function name. In the comments below I'll call it write_int.

which
converts an integer into a character string.  I've
successfully removed all the array subscripting from it in
favor of pointers, as K&R hath commanded me.

That can be done, but I can't think of anywhere K&R says to do so.

However,
there's one in the calling function that doesn't seem to
work.  If I call it thus:
*******
char s[MAXLINE];
itoa(num,s);
*******
It works.  If I call it thus:
*******
char *s;
itoa(num,s);
*******
It works sometimes, and sometimes it throws a segfault.  I
can only assume that this is due to memory allocation; it
works when I'm lucky enough to have the (s+i) characters
assigned to unallocated memory, and it doesn't when I'm not.
Is this assumption correct?

It sounds like you are on the right lines. C pointers and arrays can
be a challenge when starting. To help grasp how they are used I'd
suggest drawing a sketch of memory. For a 32-bit machine the pointers
are 4 bytes. When you use "char *s" you get a pointer called s and
nothing else. The pointer doesn't necessarily point at anything.

s |____|

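On the other hand with "char s[9]" you get a 9-byte block of memory.
There is no separate pointer object, but in most expressions (including
a function call argument) the name s is converted to a pointer to the
first byte of that block. So in this case the pointer your function
receives is valid.

s --> |_________|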

Does that help? In each case when you call your function you are
passing the pointer so C is happy. But in the first case the pointer
hasn't been initialised to point to anywhere in particular in memory
so doesn't work in the way you are trying to use it.

If it's not too much to take in just now you can use malloc or its
friends to allocate some space for s to point to. So, running "s =
malloc(99)" before you call your function would attempt to allocate 99
bytes and point s at that space.

Is there a way to do this where the length of s isn't
limited to some arbitrary value?  Or is this what all that
mysterious malloc() stuff is about, and I just need to wait
until later in the book?

As others have said, you could use malloc inside your function. Malloc
will attempt to allocate however much memory you request. If it
succeeds (you should always check) you can then return it to whoever
calls your function. However there are good reasons to avoid this way.
A safer and more C-esque approach might be to do as follows.

1. Define write_int(char *s, char *fmt, int i) to write to s in the
format specified by fmt the integer i. As well as allowing you more
control over how the integer is written (type of integer, output
alignment, etc) the format string allows you to limit how many
characters are written. This is important as you can ensure that your
function doesn't write more characters than you have allowed for. For
example, if you have ten characters available you can specify this in
the fmt string.

2. Define another function, len_int(int i, int base) to return the
number of digits needed for i in the number base given. Or, perhaps
better, you could replace "int base" with "char *fmt" and use the same
format string as for write_int.

The len_int function is there to return the number of characters
needed in case you don't want to specify it in the format string.

Either way, the above approach allows the *caller* to deal with any
memory needed.

Having said the above, there are standard functions to do similar.
Check out sprintf and snprintf.

James
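
A rough sketch of the len_int(int i, int base) idea from point 2. The details are assumptions made for this example: it counts a leading '-' for negative values, leaves out the terminating null, and returns 0 for a base outside 2 to 36.

```c
/* Hypothetical len_int() along the lines described above: returns how many
 * characters i needs in the given base (2 to 36), counting a leading '-'
 * for negative values but not the terminating null. Returns 0 for a bad base. */
int len_int(int i, int base)
{
    int len = (i < 0) ? 1 : 0;   /* room for the sign */

    if (base < 2 || base > 36)
        return 0;

    /* Dividing toward zero works for negative values too,
     * so INT_MIN never has to be negated. */
    do {
        len++;
        i /= base;
    } while (i != 0);

    return len;
}
```

The caller can then declare or allocate len_int(i, base) + 1 bytes before writing. For decimal output, C99's snprintf(NULL, 0, "%d", i) reports the same count without any hand-written loop.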
 
Ben Bacarisse

I am writing a function itoa(int num, char *s), which
converts an integer into a character string. I've
successfully removed all the array subscripting from it in
favor of pointers, as K&R hath commanded me. However,
there's one in the calling function that doesn't seem to
work. If I call it thus:
*******
char s[MAXLINE];
itoa(num,s);
*******
It works. If I call it thus:
*******
char *s;
itoa(num,s);
*******

Something that has not been commented on (it is a detail) is that the
[]s in question do not represent subscripting so there is no need to
replace them with a * when you switch from using subscripts to
pointers.

Obviously, there is a connection between using []s to declare an array
and []s used to index it, but I doubt that K&R intended a switch from
a declared array to a pointer when suggesting that the program be
rewritten to "use pointers rather than subscripting".

<snip>
 
