malloc() -- Performance?

tweak

When should malloc() and related functions (e.g. calloc(), realloc() )
be used?

I keep hearing to keep stuff out of memory as much as possible since
it's not as fast as when stuff is in the registers.

Brian
 
Randy Howard

When should malloc() and related functions (e.g. calloc(), realloc() )
be used?

I keep hearing to keep stuff out of memory as much as possible since
it's not as fast as when stuff is in the registers.

Based upon your question, your best bet at this point is to not worry
about it. Modern compilers do a very good job of knowing when to
use CPU registers for performance and when not to.

Later on, if you have a specific need to try and outrun the optimizer
in your compiler, that may change.
 
Jack Klein

When should malloc() and related functions (e.g. calloc(), realloc() )
be used?

These functions should be used when you need to dynamically allocate
memory. One reason is when you do not know the amount of data until
run time.
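
A minimal sketch of that case, assuming the element count only becomes known while the program runs. The helper name make_squares is made up for illustration and is not from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding 0*0, 1*1, ...,
 * (n-1)*(n-1). The caller owns the result and must free() it.
 * Returns NULL if the allocation fails or the size would overflow. */
unsigned long *make_squares(size_t n)
{
    unsigned long *v;

    assert(n > 0);                    /* malloc(0) is implementation-defined */
    if (n > SIZE_MAX / sizeof *v)     /* guard the multiplication below */
        return NULL;
    v = malloc(n * sizeof *v);        /* size is decided at run time */
    if (v == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        v[i] = (unsigned long)i * i;
    return v;
}
```

A caller would pass a count obtained at run time (from user input, a file header, and so on), check the result for NULL, and call free() on it when done.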
I keep hearing to keep stuff out of memory as much as possible since
it's not as fast as when stuff is in the registers.

You keep hearing this from whom? Certainly not from any reputable
source about C.

Once upon a time the most popular computers in the world were the
Apple II with an 8-bit 6502 processor running at about 1 MHz and the
original TRS-80 with an 8-bit Z80 processor running at 1.77 MHz.
Neither of these processors could execute 1,000,000 8-bit instructions
per second on average.

Today, desktop processors are exceeding 3 GHz clock speeds, and
averaging more than one 32-bit or 64-bit instruction per clock cycle.
They have at least 5 orders of magnitude greater performance than the
computers of 20 years ago.

Here's when you need to start thinking about registers versus memory,
after you meet all of these conditions:

1. You know a lot more about C than you do today.

2. You have produced a program that is correct in all respects, that
is, it meets all of its requirements except for failing to execute
fast enough.

3. You have verified that no further improvements can be made to your
program in terms of selecting a more efficient algorithm or coding it
more efficiently in C.

4. You have profiled your program and proved that memory bandwidth is
the bottleneck.

Then, and only then, should you start worrying about registers versus
memory.
 
Emmanuel Delahaye

In 'comp.lang.c' said:
When should malloc() and related functions (e.g. calloc(), realloc() )
be used?

When flexibility is required.
I keep hearing to keep stuff out of memory as much as possible since
it's not as fast as when stuff is in the registers.

Flexibility has a cost. When not required, use static objects.
 
Bernhard Holzmayer

tweak wrote:

These are two different issues.
When should malloc() and related functions (e.g. calloc(),
realloc() ) be used?
1) If you cannot hold all variables in memory at the same time,
you can ask the allocator for a portion of the heap at run time
and give it back with free() when you are done with it.
As long as you release each block before heap space is all
used up, the same memory can be shared between those objects
which don't need it at the same time.
For all variables which need a lifetime as long as the program runs,
malloc() isn't helpful.
Whether it is fast or not only matters if you work with very large
amounts of memory or if you call malloc() and free() in loops
which run very, very often.
I keep hearing to keep stuff out of memory as much as possible
since it's not as fast as when stuff is in the registers.
2) as the other posters said: this is up to the compiler, and should
be a no-no for you, unless you're working on something very special
and you have enough skill.
A hint: in embedded systems or on microcontrollers you might have
such special requirements.
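
To illustrate the point about malloc() and free() inside frequently executed loops, here is a rough sketch with made-up function names. It compares allocating a scratch buffer on every iteration with allocating it once and reusing it. Whether the difference is measurable depends on the allocator and the workload, so treat it as an illustration rather than a benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical workload: fill the scratch buffer, return the byte sum. */
static unsigned long process(unsigned char *scratch, size_t len,
                             unsigned char fill)
{
    unsigned long sum = 0;

    memset(scratch, fill, len);
    for (size_t i = 0; i < len; i++)
        sum += scratch[i];
    return sum;
}

/* Allocates and frees on every iteration: simple, but each round pays
 * the allocator cost. Returns 0 if an allocation fails. */
unsigned long run_alloc_each_time(size_t rounds, size_t len)
{
    unsigned long total = 0;

    assert(len > 0);
    for (size_t r = 0; r < rounds; r++) {
        unsigned char *scratch = malloc(len);
        if (scratch == NULL)
            return 0;
        total += process(scratch, len, (unsigned char)(r & 0xFF));
        free(scratch);
    }
    return total;
}

/* Allocates once and reuses the buffer for every iteration.
 * Returns 0 if the allocation fails. */
unsigned long run_alloc_once(size_t rounds, size_t len)
{
    unsigned long total = 0;
    unsigned char *scratch;

    assert(len > 0);
    scratch = malloc(len);
    if (scratch == NULL)
        return 0;
    for (size_t r = 0; r < rounds; r++)
        total += process(scratch, len, (unsigned char)(r & 0xFF));
    free(scratch);
    return total;
}
```

Both functions compute the same result; only the number of allocator calls differs.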

Bernhard
 
Thomas Matthews

tweak said:
When should malloc() and related functions (e.g. calloc(), realloc() )
be used?
Other people have answered this one.

I keep hearing to keep stuff out of memory as much as possible since
it's not as fast as when stuff is in the registers.

Brian
One should keep stuff in storage according to its size,
frequency of use, and accessibility. Let me elaborate:

Registers are faster to access than RAM, but they have a
smaller capacity and there are fewer register locations
than RAM locations. Items that are consistently and
frequently accessed are placed in registers. Registers
lose their contents when power is removed from the system.

Memory (RAM) is slower than registers, but faster than
hard drives. Memory has a smaller capacity than hard drives,
and systems usually have less memory space than hard drive
space. Memory loses its contents when power is removed
from the memory.

Hard drives are slower than memory, but faster than
Compact Discs (CDs) or tape drives. Hard drives retain
their data after loss of power.

The General Rule is to keep in memory what you can.
Don't worry about register usage unless you have no
other choice and profiling has shown that your optimized
C function is the bottleneck. In this case, move to
assembly. Otherwise, let the compiler worry about
register usage. You can design functions that suggest
a better register usage. You can suggest to the
compiler that it uses registers but it still doesn't
have to obey your suggestion.


--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book
 
Dr Justice

Perhaps it is worth adding that for e.g. embedded systems where the size of
the (possibly ROM'ed) executable matters,
you may want to malloc() writeable objects that would otherwise end up as
static (initializer) data in your executable.

That is, if you care about the bits and bytes...

DJ
--
 
Keith Thompson

tweak said:
When should malloc() and related functions (e.g. calloc(), realloc() )
be used?

I keep hearing to keep stuff out of memory as much as possible since
it's not as fast as when stuff is in the registers.

You may be misunderstanding what malloc() is about. It's not about
memory vs. registers, it's about dynamically allocated memory vs.
other memory.

Using registers rather than memory can result in faster code, but
that's best left to the compiler. (There is a register keyword, but
there's little advantage in using it.)
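
For reference, a small sketch of what the register keyword looks like in use. The function name sum_ints is invented for this example. The keyword is only a hint, and a compiler is free to ignore it.

```c
#include <assert.h>
#include <stddef.h>

/* Sums an array. 'register' merely suggests keeping the variable in a
 * CPU register; the compiler may ignore the hint. */
long sum_ints(const int *v, size_t n)
{
    register long total = 0;  /* taking &total is not allowed in C */
    register size_t i;

    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

The one guaranteed effect is the restriction noted in the comment: the address of a register variable cannot be taken. An optimizing compiler will usually make the same register choices without the keyword.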
 
Malcolm

Keith Thompson said:
You may be misunderstanding what malloc() is about. It's not about
memory vs. registers, it's about dynamically allocated memory vs.
other memory.
There is some confusion here. It is advantageous to keep data in registers
whilst possible, but you are not going to do that by cutting down on use of
malloc(). In any case modern platforms do the job for you so well that you
can essentially forget it.
Many systems have a cache, and it is probably true that stack data is more
likely to be cached than heap data. This may be what the OP has heard and
misreported. However you really need to know what you are doing before you
can speed up a program appreciably by replacing calls to malloc() with fixed
buffers.
 
tweak

Malcolm said:
There is some confusion here. It is advantageous to keep data in registers
whilst possible, but you are not going to do that by cutting down on use of
malloc(). In any case modern platforms do the job for you so well that you
can essentially forget it.
Many systems have a cache, and it is probably true that stack data is more
likely to be cached than heap data. This may be what the OP has heard and
misreported. However you really need to know what you are doing before you
can speed up a program appreciably by replacing calls to malloc() with fixed
buffers.
Can you recommend any books on optimizing C code? The only one that I
have is Michael Abrash's Graphics Programming Black Book.

And where I heard about leaving stuff out of memory when possible was
from the Assembly Language book "Assembly Step by Step".

But I don't want to use assembly because it will reduce my ability to
port my code. And I'm not good at assembly.

--

I am trying to improve my use of buffers. That's basically what I am
trying to accomplish. I am using them mainly for sockets, but I won't
go into detail since that's off topic apart from the C-related part.

Currently, I am using fixed buffers (aka fixed sized arrays), not
dynamic memory allocation with malloc(). And I just wanted to
understand the performance impacts of using dynamic allocation.
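
For comparison with a fixed-size array, here is a minimal sketch of a buffer that grows with realloc(). The struct and function names (dynbuf, dynbuf_append, dynbuf_free) and the starting capacity of 64 bytes are assumptions made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct dynbuf {          /* hypothetical growable byte buffer */
    char  *data;         /* NULL while empty */
    size_t len;          /* bytes in use */
    size_t cap;          /* bytes allocated */
};

/* Appends n bytes from src, growing the buffer by doubling when needed.
 * Returns 0 on success, -1 on failure (the buffer is left unchanged). */
int dynbuf_append(struct dynbuf *b, const void *src, size_t n)
{
    assert(b != NULL);
    if (n == 0)
        return 0;
    assert(src != NULL);
    if (n > SIZE_MAX - b->len)
        return -1;                      /* length would overflow size_t */
    if (b->len + n > b->cap) {
        size_t need = b->len + n;
        size_t new_cap = b->cap ? b->cap : 64;
        char *p;

        while (new_cap < need) {
            if (new_cap > SIZE_MAX / 2) {
                new_cap = need;         /* cannot double any further */
                break;
            }
            new_cap *= 2;
        }
        p = realloc(b->data, new_cap);  /* realloc(NULL, n) acts like malloc */
        if (p == NULL)
            return -1;                  /* the old block is still valid */
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Releases the storage and resets the buffer to the empty state. */
void dynbuf_free(struct dynbuf *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

Doubling the capacity keeps the number of realloc() calls small relative to the amount of data appended. A fixed array avoids those calls entirely but has to be sized for the worst case up front.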

Thanks for all the good information.

Brian
 
Keith Thompson

tweak said:
Can you recommend any books on optimizing C code? The only one that I
have is Michael Abrash's Graphics Programming Black Book.

First law of optimization: Don't do it.

Second law of optimization (expert programmers only): Don't do it yet.

(That's not original with me, but I was unable to find the author; I
may not be quoting it quite correctly.)

Or, as Donald Knuth says, "Premature optimization is the root of all
evil."

Use reasonably efficient algorithms (quicksort rather than bubblesort,
for example), write good clean code that's as portable as you can make
it, and use whatever optimization options your compiler provides. If
the code doesn't run quickly enough, use a profiler to find out where
it's wasting time. A clever tweak (no offense) that makes your code
run 10% faster on this year's computer may make it run 20% slower on
next year's computer, and if it makes the source more complicated it
may be harder to back it out than it was to put it in in the first
place. An optimizing compiler isn't necessarily smarter than you are,
but it probably knows more about the target system than you do, and
it's certainly more patient; it can repeatedly analyze large chunks of
code and discover relationships that you would have missed.
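
As a crude stand-in for a real profiler, the standard clock() function can at least show whether a piece of code takes a noticeable amount of CPU time. The workload and the function names below are invented for the sketch, and a proper profiler will give far better information.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: sum the integers 0 .. n-1. 'volatile' keeps
 * the compiler from optimizing the loop away entirely. */
static unsigned long busy_sum(size_t n)
{
    volatile unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += i;
    return total;
}

/* Runs busy_sum(n), stores its result in *result, and returns the CPU
 * time used in seconds, or -1.0 if clock() is not available. */
double time_busy_sum(size_t n, unsigned long *result)
{
    clock_t start, end;

    assert(result != NULL);
    start = clock();
    *result = busy_sum(n);
    end = clock();
    if (start == (clock_t)-1 || end == (clock_t)-1)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

The resolution of clock() is implementation-dependent, so short runs may report zero. Measuring many repetitions gives more usable numbers.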

What I should say now is, "Having said all that, a good book on code
optimization is ..." -- but I don't know of one.
 
Jack Klein

Perhaps it is worth adding that for e.g. embedded systems where the size of
the (possibly ROM'ed) executable matters,
you may want to malloc() writeable objects that would otherwise end up as
static (initializer) data in your executable.

That is, if you care about the bits and bytes...

DJ

It makes no sense at all to have any initializer at all for memory
that you could malloc(), since malloc() won't initialize it anyway.
There may be embedded tools this poor, but even then it should be
simple to work around it.

Besides, many embedded systems don't support malloc(), as
freestanding implementations are not required to.
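
A short sketch of the point about initialization, using invented helper names. Memory from malloc() has indeterminate contents until the program writes to it, whereas calloc() returns memory with all bytes set to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* malloc() does not initialize, so every element is written here
 * before the array is handed back. Returns NULL on failure. */
int *make_filled(size_t n, int value)
{
    int *v;

    assert(n > 0);
    if (n > SIZE_MAX / sizeof *v)
        return NULL;
    v = malloc(n * sizeof *v);
    if (v == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        v[i] = value;
    return v;
}

/* calloc() sets all bytes to zero, which for integer types means the
 * value 0. Returns NULL on failure. */
int *make_zeroed(size_t n)
{
    assert(n > 0);
    return calloc(n, sizeof(int));
}
```

Either way the initial values are produced by code that runs at start-up or later, not by initializer data stored in the executable image.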
 
James Moughan

There is some confusion here. It is advantageous to keep data in registers
whilst possible, but you are not going to do that by cutting down on use of
malloc(). In any case modern platforms do the job for you so well that you
can essentially forget it.

He's sort-of half-right. Dynamically allocated memory must be
accessed through a pointer, which is the real problem. The compiler
cannot determine whether assignments to different pointers affect the
same memory location, and so values which might be safely held in a
register during a loop may be loaded and written to memory each time
they are modified.

A (stupid) example of exactly this follows.

In any case, the OP definitely shouldn't worry about this - I'd
recommend that he avoid using malloc so as to have fewer memory leaks.
<wink>

bash-2.05b$ cat stpdex.c
int main(){

    int i = 0, j = 0, *a = &i, *b = &j, count=0;
    while (++count < 10){
        *a += count;
        *b += count;
    }
}
bash-2.05b$ gcc -O3 -save-temps stpdex.c
bash-2.05b$ cat stpdex.s
.file "stpdex.c"
.text
.p2align 4,,15
.globl main
.type main,@function
main:
pushl %ebp
movl %esp, %ebp
pushl %eax
pushl %eax
movl $1, %eax
andl $-16, %esp
leal -4(%ebp), %ecx
movl $0, -4(%ebp)
movl $0, -8(%ebp)
leal -8(%ebp), %edx
.p2align 4,,7
.L5:
addl %eax, (%ecx)
addl %eax, (%edx)
incl %eax
cmpl $9, %eax
jle .L5
leave
ret
.Lfe1:
.size main,.Lfe1-main
.ident "GCC: (GNU) 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk)"
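
The listing above adds to memory through both pointers on every pass of the loop. One common way to sidestep the aliasing question, sketched here with made-up function names, is to accumulate in a local variable and store through the pointers once at the end. This is only equivalent when the two pointers refer to different objects, and a newer compiler may well perform the transformation on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Adds 1 .. limit-1 to *a and *b, writing through both pointers on
 * every iteration. If the compiler cannot prove that a and b point to
 * different objects, it has to keep memory up to date each time. */
void add_counts_through_pointers(int *a, int *b, int limit)
{
    for (int count = 1; count < limit; count++) {
        *a += count;
        *b += count;
    }
}

/* Same result when a != b: the running sum is a local with no address
 * taken, so it can sit in a register, and memory is touched only once
 * per pointer. */
void add_counts_via_local(int *a, int *b, int limit)
{
    int sum = 0;

    assert(a != NULL && b != NULL && a != b);
    for (int count = 1; count < limit; count++)
        sum += count;
    *a += sum;
    *b += sum;
}
```

With limit set to 10, as in the example program, both versions add 45 to each target.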
 
tweak

James said:
He's sort-of half-right. Dynamically allocated memory must be
accessed through a pointer, which is the real problem. The compiler
cannot determine whether assignments to different pointers affect the
same memory location, and so values which might be safely held in a
register during a loop may be loaded and written to memory each time
they are modified.

A (stupid) example of exactly this follows.

In any case, the OP definitely shouldn't worry about this - I'd
recommend that he avoid using malloc so as to have fewer memory leaks.
<wink>

bash-2.05b$ cat stpdex.c
int main(){

int i = 0, j = 0, *a = &i, *b = &j, count=0;
while (++count < 10){
*a += count;
*b += count;
}
}
bash-2.05b$ gcc -O3 -save-temps stpdex.c
bash-2.05b$ cat stpdex.s
.file "stpdex.c"
.text
.p2align 4,,15
.globl main
.type main,@function
main:
pushl %ebp
movl %esp, %ebp
pushl %eax
pushl %eax
movl $1, %eax
andl $-16, %esp
leal -4(%ebp), %ecx
movl $0, -4(%ebp)
movl $0, -8(%ebp)
leal -8(%ebp), %edx
.p2align 4,,7
.L5:
addl %eax, (%ecx)
addl %eax, (%edx)
incl %eax
cmpl $9, %eax
jle .L5
leave
ret
.Lfe1:
.size main,.Lfe1-main
.ident "GCC: (GNU) 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk)"

Thanks.

I read through the C-FAQ-list file again for malloc(). And I think
my question is very implementation specific as the only way I know of
to see how my compiler is treating my variables is to look at the
assembly, which is very platform (os and architecture) specific and
doesn't help at all with porting code across platforms. Nevertheless,
even with computers being more powerful today, I do not want to
write sloppy code. Back when I had my 80286 optimization was a big
deal. Now, I guess porting is a bigger deal.


I looked up the guy Knuth, which was mentioned in this thread, so I
think I might invest in his little book collection and spend my
time improving my ability to write better algorithms that will port,
rather than worrying about where my variables are stored right now.

Thanks Again,

Brian
 
LibraryUser

tweak said:
.... snip ...

I looked up the guy Knuth, which was mentioned in this thread,
so I think I might invest in his little book collection and
spend my time improving my ability to write better algorithms
that will port, rather than worrying about where my variables
are stored right now.

That is far and away the smartest thing you have said, or could
do.
Sedgewick's "Algorithms in C" is also recommended. Way back when,
Sedgewick was a student of Knuth's.
 
tweak

LibraryUser said:
tweak wrote:

... snip ...



That is far and away the smartest thing you have said, or could
do.
Sedgewick's "Algorithms in C" is also recommended. Way back when,
Sedgewick was a student of Knuth's.

Thanks.

I'm still learning. And I am often corrected here, which is why I
am here. I make a lot of mistakes, but hopefully, I learn from them.
And everyone has been helpful here.

Brian
 
Bernhard Holzmayer

tweak said:
Thanks.

I read through the C-FAQ-list file again for malloc(). And I
think my question is very implementation specific as the only way
I know of to see how my compiler is treating my variables is to
look at the assembly, which is very platform (os and architecture)
specific and doesn't help at all with porting code across platforms.
Nevertheless, even with computers being more powerful today, I do
not want to write sloppy code. Back when I had my 80286 optimization
was a big deal. Now, I guess porting is a bigger deal.

Hi Brian,

if you want to write portable code, you should at least stick to
code which conforms to standards like ANSI/ISO.

And I would propose that you avoid ANY optimization unless it is
unavoidable.
Not only because it's often silly to do the machine's job, and the
built-in optimizers do it pretty well nowadays.

The other and, in my opinion, more important aspect is that
optimization usually aims at increased performance with respect
to speed or memory usage ...

If you move your code from one system/architecture to another,
there's a good chance that the priorities change.

One machine might have enormous memory resources, but might be
rather slow, while the other one is rocket-fast, but short of
memory. If you optimize your code for this one, it might fail on
the other one because of the optimizations.
You might end up with lots of machine/architecture dependent
#ifdef's and almost unreadable code.

There's another way:
take a sheet of paper (or some more...) and make a good design of
your project. Then check it under different aspects:
- speed
- memory expense
- peripheral requirements
Reduce it to the minimum wherever possible; leave out all
unnecessary features and gimmicks.
Then start implementing it.

You'll probably find that you end up with a fast and tiny
program which meets all requirements without code-level
optimization.

That's my experience...
(but I don't do it always, therefore I know).

Bernhard
 
