Malloc Query

Uncle Steve

Not quite - it only has to be in the same translation unit. A
translation unit consists of a given source file plus all of the other
files merged into it by #include statements. The use of the function
must also occur within the scope of the inline declaration, which
basically means that the declaration should occur prior to the use.
These requirements also apply to macros, though the scope rules are a
bit different for them.

Sorry, I'm not up on the technical nomenclature pertinent to
compilers. I *meant* to say 'translation unit', but all my brain had
was 'source file'.

If you replace your function-like macro, wherever it is that you have it
defined, with an inline function definition in that same location, that
function will be usable pretty much wherever the macro was usable.
It's probably feasible to come up with pathological contexts where a
simple in-place replacement won't work, but such contexts are not the norm.

Agreed.


Perhaps not, but it's one that most popular modern compilers do
routinely and very well; that's part of what makes them popular.

Inline functions are easy to use, but who can measure the performance
advantage of inlining versus the code size tradeoff? I have no idea
how to accurately measure and quantify the real advantages, which may
be different for every architecture and application. I suspect the
trend of blindly setting CFLAGS=-O9, which is not all that uncommon in
makefiles, masks a profound ignorance of what really goes on in a
running program. I also suspect that the complex interaction of
compile and run-time factors defies simplistic optimization
heuristics.

Any decent compiler can be relied upon to take those issues into
consideration when deciding whether or not to inline a function. This is
something it can decide, independently of whether or not the function is
declared inline. As far as actual inlining is concerned, the 'inline'
keyword is only a hint, which a compiler is free to ignore; and it's
perfectly legal for a compiler to decide to inline a function that is
not declared 'inline', as long as doing so doesn't change the observable
behavior. In fact, that was one of the arguments given against
introducing the 'inline' keyword. Any static function written to meet
the same special requirements that currently apply to 'inline' functions
could already have been inlined, even if not so declared, so long as the
compiler thought that doing so would be a good idea.

At least these compilers allow one to individually tune the
optimization characteristic of the compilation process. If you don't
want inlining, it's trivial to turn it off.

Unless you know a lot more about the target platform than your compiler
does, it's probably best to rely upon it to make inlining decisions.

All things being equal, that is decent advice.

Defining a function (whether or not you use 'inline') gives the compiler
the option of inlining it or not. Defining it as a function-like macro
does not - it only allows inlining. Well, technically, I suppose a
sufficiently sophisticated compiler could perform anti-inlining:
recognizing a common code pattern, and replacing it wherever used with a
call to a compiler-generated function definition. The fact that the
common code was the result of a macro expansion would make it easier to
recognize the feasibility of such an optimization. However, it seems to
me to be a harder optimization to perform than inlining, and I doubt
that it is a common feature even of the most sophisticated compilers.

That kind of optimization would be the comp-sci equivalent of gilding
the lily.



Regards,

Uncle Steve
 
Ian Collins

Uncle Steve said:
*((int *) arena_obj_addr(h, i)) = -1;

The instruction above seems to write to a place it should not.
Is it possible that there is space for n elements, but here element
n+1 is written [i=n, not in 0..n-1]?

That's why I told him 5 days ago!
 
Uncle Steve

Uncle Steve said:
On Thu, May 19, 2011 at 03:22:18AM +0100, Ben Bacarisse wrote:
[snip]


Ok, here's a quick and dirty hack that measures a highly contrived
scenario. This code first allocates two arenas of different size, and
then ping-pongs the allocations between the two arenas for a defined
number of iterations. Then, I do more or less the same thing with
malloc.

The preliminary results show that my special-purpose allocator is 2-3
times faster than glibc malloc. Not quite an order-of-magnitude
difference in performance (unless you think in base-2), but very
acceptable. In some programs there may be a larger difference in
performance because of reduced memory fragmentation or reduced
overall memory use. I do not plan to isolate those factors at this
time to measure their effect.

Code is compiled with "gcc -O0 -o arena_test arena_test.c -lrt"

The test platform here is an Intel Atom netbook running at 1.333GHz,
1G RAM, OpenSuSE 11.4, libc-2.11.3, gcc 4.5.1 20101208.

Typical result:

[31/22:32] nb:pts/10 stevet ~/stuff/src/libs/tools/test: ./arena_test
Arena: Iterations: 100000; elapsed CPU time (msec): 39181
Malloc: Iterations: 100000; elapsed CPU time (msec): 107265


Code follows:

#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>


#define TEST_SIZE (1024 * 1024)
#define ITERATIONS 100000

#define DIFFERENCE 5


struct arena_head_s {
    int obj_size;
    int arena_size;
    unsigned char *arena;
    int free;
    int free_list;
};

typedef struct arena_head_s arena_head;


#define arena_obj_addr(x, n) ((void *) &(x)->arena[(n) * (x)->obj_size])


int arena_qa(arena_head *x)
{
    int n;

    n = x->free_list;

    if(n != -1) {
        x->free_list = *((int *) arena_obj_addr(x, n));
        x->free--;
    }

    return(n);
}

void arena_free(arena_head *p, int n)
{
    *((int *) arena_obj_addr(p, n)) = p->free_list;
    p->free_list = n;
    p->free++;

    return;
}

arena_head * arena_creat(size_t s, int n)
{
    arena_head * h;
    int i;

    h = malloc(sizeof(arena_head));
    if(h == NULL) {
        perror("malloc()");
        exit(1);
    }

    h->obj_size = s;
    h->arena_size = n;
    h->free = n;

    h->arena = malloc(s * n);
    if(h->arena == NULL) {
        perror("malloc()");
        exit(1);
    }

    h->free_list = 0;

    for(i = 0; i < n; i++)
        *((int *) arena_obj_addr(h, i)) = i + 1;

    *((int *) arena_obj_addr(h, i)) = -1;

The instruction above seems to write to a place it should not.
Is it possible that there is space for n elements, but here element
n+1 is written [i=n, not in 0..n-1]?

All that statement does is simulate defining the target arena object
as a union of int, and whatever the caller plans to store in each
object slot. It's actually a more convenient solution because in the
case where you do use a union, the code in the loop above would have
to look something like this.


for(i = 0; i < (n - 1); i++)
    h->arena[i].un.freelist_next = i + 1;

h->arena[i].un.freelist_next = -1;

And accessing whatever it is that is actually being stored there would
have to be similar to *arena_obj_addr(h, i) = whatever. As the free
list management is internal to the allocator, the application should
not care what is stored in the free slots of the 'array'.

I'm not very smart and I don't understand all of this code well,
but it seems to me the bigger error is that everything can overflow,
and that too much confidence is placed in the HLL that all goes well.

There was an off-by-one error in the above code from a transcription
error, which was resolved in a subsequent message. Otherwise, an
out-of-bounds access is the fault of the application programmer.
Caveat emptor.

How good assembly is :)
My defect is that I see things well only one small instruction at a
time [too close up and too far away].

    return(h);
}

As others have said, the requirements of kernel programming are
different from conventional application logic. But this code is not
part of an operating system, or anything even remotely similar to an
operating system. It is a kernel only in a limited and specialized
sense.



Regards,

Uncle Steve
 
Uncle Steve

Uncle Steve said:
*((int *) arena_obj_addr(h, i)) = -1;

The instruction above seems to write to a place it should not.
Is it possible that there is space for n elements, but here element
n+1 is written [i=n, not in 0..n-1]?

That's why I told him 5 days ago!

Did you miss the previous message that corrected the typo?



Regards,

Uncle Steve
 
Uncle Steve

On 05/26/11 06:12 PM, io_x wrote:
"Uncle Steve"<[email protected]> wrote in message

*((int *) arena_obj_addr(h, i)) = -1;

The instruction above seems to write to a place it should not.
Is it possible that there is space for n elements, but here element
n+1 is written [i=n, not in 0..n-1]?

That's why I told him 5 days ago!

Did you miss the previous message that corrected the typo?

No, why?

No reason. Just curious.



Regards,

Uncle Steve
 
David Thompson

On Mon, 23 May 2011 14:44:18 +0100, Ben Bacarisse wrote:
gcc -std=c99 -pedantic if you want to avoid extensions. gcc -ansi
-pedantic if you want to avoid more non-portable features. <snip>

Other than those using (implementation-reserved) double-underscore,
which are easy enough to search for (unless you construct them with
token-pasting, and if so you deserve whatever problems you get).

And things which are implementation-dependent (unspecified or
implementation-defined) in the standard. Those can be nonportable in
general, but not from gcc specifically, and gcc doesn't flag them as
such. (In some cases, the implementation-dependent range or signedness
of a type may trigger value-range, conversion-range, or
unsigned-comparison warnings.)
 
