Cryptic Syntax

WD

Can anybody explain what's going on in the extremely cryptic syntax
found in the return statement below? Based on observed program
functionality, it appears this statement directly launches a program
(that was created in memory during the /*create a program in memory
before returning*/ section) after main exits, but how?

main(x,y)
{
/*create a program in memory before returning*/

return(*(int(*)())*(int*)(z))(x,y);
}

Thanks,
William
 
Gene

Can anybody explain what's going on in the extremely cryptic syntax
found in the return statement below?  Based on observed program
functionality, it appears this statement directly launches a program
(that was created in memory during the /*create a program in memory
before returning*/ section) after main exits, but how?

main(x,y)
{
    /*create a program in memory before returning*/

    return(*(int(*)())*(int*)(z))(x,y);

}

Hard to say what it does because it's not syntactically correct.
There is no declaration for z. The comment is meaningless.
 
Ben Bacarisse

WD said:
Can anybody explain what's going on in the extremely cryptic syntax
found in the return statement below? Based on observed program
functionality, it appears this statement directly launches a program
(that was created in memory during the /*create a program in memory
before returning*/ section) after main exits, but how?

main(x,y)
{
/*create a program in memory before returning*/

return(*(int(*)())*(int*)(z))(x,y);

Let's break it down. The expression is in two parts:

(*(int(*)())*(int*)(z)) (x,y)

That is the form of a function call. The complex expression should
result in a pointer to a function (all C functions are called as if
through a pointer to them) and the arguments x and y are passed to it.

So what of the function part? It consists of four operators applied to
(z): two cast operators and two indirection operators.

* (int(*)()) * (int*) (z)

The ()s round z are redundant. So z is converted to a pointer to
integer (this may or may not be well-defined, it depends on what z is)
and that pointer is followed to obtain the int object it refers to.
This int is then converted to a pointer to a function (more on that
later) and then that pointer is dereferenced to get the function it
points to.

This suggests to me that the code is old. The way C is now defined,
this function is immediately converted /back/ to a pointer so the
left-most * is pointless.

We can simplify the statement in any of these ways:

return ((int (*)())*(int*)z)(x, y);

int i = *(int *)z;
return ((int (*)())i)(x, y);

typedef int function();
int i = *(int *)z;
function *f = (function *)i;
return f(x, y);

The conversion from an int to a function pointer is implementation
defined (you need to find out if it means anything for your
implementation and if so what).
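
The steps above can be traced in a small self-contained sketch. This is not WD's program: add, binary_fn and call_through are made-up names, and uintptr_t stands in for the original int so that an address is not truncated where int is narrower than a pointer. The conversion from an integer to a function pointer is still implementation-defined.

```c
#include <stdint.h>

/* Hypothetical function type used only for this illustration. */
typedef int binary_fn(int, int);

/* Hypothetical target function. */
static int add(int a, int b) { return a + b; }

/* z holds the address of a slot; the slot holds the address of a function. */
static int call_through(uintptr_t z, int x, int y)
{
    uintptr_t slot_value = *(uintptr_t *)z;   /* plays the role of *(int*)(z)   */
    binary_fn *f = (binary_fn *)slot_value;   /* plays the role of (int(*)())   */
    return f(x, y);                           /* the call; no leading * needed  */
}

/* Builds the slot and calls through it, like the original return statement. */
static int demo(void)
{
    uintptr_t slot = (uintptr_t)&add;         /* implementation-defined cast */
    return call_through((uintptr_t)&slot, 2, 3);
}
```

Calling demo() goes through the same cast, dereference, cast, call sequence as the original one-liner, just with one named step per line.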
 
Gene

Can anybody explain what's going on in the extremely cryptic syntax
found in the return statement below?  Based on observed program
functionality, it appears this statement directly launches a program
(that was created in memory during the /*create a program in memory
before returning*/ section) after main exits, but how?

main(x,y)
{
    /*create a program in memory before returning*/

    return(*(int(*)())*(int*)(z))(x,y);

}

Thanks,
William

Hard to say what it does because it's nonsense. There is no
declaration for z. The comment is meaningless.

If z were declared to be a pointer of any kind, then the (int*) would
cast it to a pointer to int. The * dereferences this pointer.
Consequently
*(int*)(z)
has type int. The
(int (*)())
casts this to a pointer to function returning int. The leftmost *
dereferences this to a function type. The function (if z actually
pointed to one) would be applied to x and y. The value of x is the
program argument count. The value of y is undefined, but in many
implementations, will be an integer version of the pointer argv.

Of course we know z doesn't point to anything because it's undefined.
The code would only make a rough, ugly kind of non-compliant sense if
z contained a pointer to a pointer to a function accepting an int and
a pointer to pointer to char. Here:

#include <stdio.h>

f(x, y)
char **y;
{
    printf("Hello! x=%d and y[0]=\"%s\"\n", x, y[0]);
    return 0;
}

main(x,y)
{
    int w = &f;
    int *z = &w;
    return(*(int(*)())*(int*)(z))(x,y);
}

C:\>gcc foo.c -o foo.exe
foo.c: In function 'main':
foo.c:12: warning: initialization makes integer from pointer without a
cast

C:\>foo
Hello! x=1 and y[0]="foo"

C:\>
 
WD

Can anybody explain what's going on in the extremely cryptic syntax
found in the return statement below?  Based on observed program
functionality, it appears this statement directly launches a program
(that was created in memory during the /*create a program in memory
before returning*/ section) after main exits, but how?
main(x,y)
{
    /*create a program in memory before returning*/
    return(*(int(*)())*(int*)(z))(x,y);

Thanks,
William

Hard to say what it does because it's nonsense. There is no
declaration for z.  The comment is meaningless.

If z were declared to be a pointer of any kind, then the (int*) would
cast it to a pointer to int.  The * dereferences this pointer.
Consequently
*(int*)(z)
has type int.  The
(int (*)())
casts this to a pointer to function returning int. The leftmost *
dereferences this to a function type.  The function (if z actually
pointed to one) would be applied to x and y.  The value of x is the
program argument count.  The value of y is undefined, but in many
implementations, will be an integer version of the pointer argv.

Of course we know z doesn't point to anything because it's undefined.
The code would only make a rough, ugly kind of non-compliant sense if
z contained a pointer to a pointer to a function accepting an int and
a pointer to pointer to char.  Here:

#include <stdio.h>

f(x, y)
char **y;
{
  printf("Hello! x=%d and y[0]=\"%s\"\n", x, y[0]);
  return 0;

}

main(x,y)
{
    int w = &f;
    int *z = &w;
    return(*(int(*)())*(int*)(z))(x,y);

}

C:\>gcc foo.c -o foo.exe
foo.c: In function 'main':
foo.c:12: warning: initialization makes integer from pointer without a
cast

C:\>foo
Hello! x=1 and y[0]="foo"

C:\>

z is declared as an int, not a pointer, and it really works as I
described. After creating a program in memory (like a compiler, only
no executable is written to disk) it appears that the return statement
somehow invokes the compiled (in memory) program. The full program
compiles with gcc and runs in the Linux environment, if that somehow
makes a difference.

William
 
Dr Malcolm McLean

z is declared as an int, not a pointer, and it really works as I
described.  After creating a program in memory (like a compiler, only
no executable is written to disk) it appears that the return statement
somehow invokes the compiled (in memory) program.  The full program
compiles with gcc and runs in the Linux environment, if that somehow
makes a difference.
The program will assemble the machine code at an absolute address
which it holds in the integer z. It then has to fool the C compiler
into thinking that this is a C-language function, and make it call it.
Hence the funny syntax. There might be some other trickery going on at
address z to make the whole thing work.
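
As a rough sketch of that idea (not WD's actual program): the code below assumes an x86 or x86-64 POSIX system that permits writable and executable anonymous mappings, and the name run_generated_code is invented for illustration. It places six bytes of machine code in memory, keeps the address in an integer, and converts that integer to a function pointer. WD's code adds one more level of indirection, since z there holds the address of a location that in turn holds the code address.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* assumption: glibc; exposes MAP_ANONYMOUS */
#endif
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Returns 42 if the generated code ran, -1 if this sketch cannot run here. */
static int run_generated_code(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* x86 machine code for: mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    void *mem = mmap(NULL, sizeof code, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;                            /* system refused the mapping */
    memcpy(mem, code, sizeof code);           /* "create a program in memory" */

    intptr_t z = (intptr_t)mem;               /* address kept as an integer */
    int (*entry)(void) = (int (*)(void))z;    /* integer -> function pointer */
    int result = entry();                     /* value comes back in eax */

    munmap(mem, sizeof code);
    return result;
#else
    return -1;                                /* other architectures: not attempted */
#endif
}
```

The integer-to-function-pointer cast is the only part C itself is involved in; everything that makes the bytes executable is outside the language.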
 
bartc

I had to do something similar once, in my case the function took no
parameters and returned no result.

I used a typedef to simplify things (not that it shows):

typedef void (*(*F))(void);
(**(F)(p))();

Here F is the name of the typedefed part, and p is an int containing, not a
function address, but a location where the function address resides (iirc).

I had to enlist the help of c.l.c to set this up, and now I keep it under
lock and key in case I need to use it again.
 
Paul N

Hard to say what it does because it's nonsense. There is no
declaration for z.  The comment is meaningless.
If z were declared to be a pointer of any kind, then the (int*) would
cast it to a pointer to int.  The * dereferences this pointer.
Consequently
*(int*)(z)
has type int.  The
(int (*)())
casts this to a pointer to function returning int. The leftmost *
dereferences this to a function type.  The function (if z actually
pointed to one) would be applied to x and y.  The value of x is the
program argument count.  The value of y is undefined, but in many
implementations, will be an integer version of the pointer argv.
Of course we know z doesn't point to anything because it's undefined.
The code would only make a rough, ugly kind of non-compliant sense if
z contained a pointer to a pointer to a function accepting an int and
a pointer to pointer to char.  Here:
#include <stdio.h>
f(x, y)
char **y;
{
  printf("Hello! x=%d and y[0]=\"%s\"\n", x, y[0]);
  return 0;

main(x,y)
{
    int w = &f;
    int *z = &w;
    return(*(int(*)())*(int*)(z))(x,y);

C:\>gcc foo.c -o foo.exe
foo.c: In function 'main':
foo.c:12: warning: initialization makes integer from pointer without a
cast
C:\>foo
Hello! x=1 and y[0]="foo"

z is declared as an int, not a pointer, and it really works as I
described.  After creating a program in memory (like a compiler, only
no executable is written to disk) it appears that the return statement
somehow invokes the compiled (in memory) program.  The full program
compiles with gcc and runs in the Linux environment, if that somehow
makes a difference.

I think you're wrong in saying that the "compiled" program runs
*after* main exits; rather, it is the last thing to run before main
exits. For instance, if main ends:

return runstuff(x, y);

this is the same as

temp = runstuff(x, y);
return temp;

except that the first form doesn't need an extra variable to store the
result in. Either way, runstuff runs first, then main exits, passing
on the result from runstuff.
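
A small sketch of the two forms side by side, with a counter to show that the callee has already run by the time the function returns. The body of runstuff here is a made-up stand-in for the generated code, and direct, via_temp and calls are invented names.

```c
/* Counts how many times the stand-in callee has run. */
static int calls = 0;

/* Hypothetical stand-in for the code built in memory. */
static int runstuff(int x, int y)
{
    calls++;
    return x * 10 + y;
}

/* Form 1: the call sits inside the return statement. */
static int direct(int x, int y)
{
    return runstuff(x, y);
}

/* Form 2: the same thing with an explicit temporary. */
static int via_temp(int x, int y)
{
    int temp = runstuff(x, y);   /* runstuff has finished by this point */
    return temp;
}
```

Both functions hand back the same value, and in both the counter is incremented before control returns to the caller.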

Other than that, you seem to realise exactly what is going on, even if
you don't realise that you realise it... By the looks of it, it is
indeed "compiling" a program into memory and running it. This
necessarily requires system-specific features, as different systems
have different machine codes, and it also requires some nasty
conversions in order to persuade the computer that the data you've
just built up is actually a function it can run. Hence the cryptic
syntax to convert an int into a function pointer.

Hope that helps.
Paul.
 
Ben Bacarisse

bartc said:
I had to do something similar once, in my case the function took no
parameters and returned no result.

I used a typedef to simplify things (not that it shows):

typedef void (*(*F))(void);
(**(F)(p))();

You can simplify this a little if you want to:

typedef void (**F)(void);
(*(F)p)();

Functions are called through a pointer and functions get converted to
pointers so any extra *s are harmless but redundant so (*****(F)p)()
also works.
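
A compilable sketch of that claim, using the simplified typedef. The names bump, hits, slot and call_three_ways are invented, and a void * stands in for the integer p so the example does not depend on pointer width.

```c
typedef void (**F)(void);            /* pointer to pointer to function */

static int hits = 0;
static void bump(void) { hits++; }   /* hypothetical target function */

/* The location where the function address resides. */
static void (*slot)(void) = bump;

/* Calls bump three times using three spellings of the same call. */
static int call_three_ways(void)
{
    void *p = &slot;                 /* stands in for the integer p */
    (*(F)p)();                       /* minimal form */
    (**(F)p)();                      /* bartc's form */
    (*****(F)p)();                   /* extra stars change nothing */
    return hits;
}
```

Each extra * produces a function designator that is converted straight back to a pointer, so all three statements call the same function.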

<snip>
 
BGB / cr88192

Can anybody explain what's going on in the extremely cryptic syntax
found in the return statement below? Based on observed program
functionality, it appears this statement directly launches a program
(that was created in memory during the /*create a program in memory
before returning*/ section) after main exits, but how?

main(x,y)
{
/*create a program in memory before returning*/

return(*(int(*)())*(int*)(z))(x,y);

}

Thanks,
William

<--
Hard to say what it does because it's nonsense. There is no
declaration for z. The comment is meaningless.
-->

I will partly disagree some on this point.

since the only thing done directly on z here is to cast it, the initial
declaration and type of z is not particularly important (it could be an
integer, a pointer, a long long, ...). (it is sufficient to assume that z
exists and is not a struct or union or similar, as this would be a compiler
error...).

after the cast, types are known, and so the behavior can be understood.
(granted, in this form, it still could not be run).


so, we can infer:
z is treated as some sort of pointer to a pointer to an area of memory which
is then called as if it were a function, with the return value returning to
the caller.

it can also be noted that the code will work on some architectures but not
others (it will work on x86, but not on x86-64, ...).


but, anyways, this sort of thing is "par for the course" if doing really any
kind of self-modifying code in a C program...
 
Rod Pemberton

The basics of the casting was accurately described by Ben and Gene.
Hard to say what it does because it's nonsense. There is no
declaration for z. The comment is meaningless.

If z were declared to be a pointer of any kind, then the (int*) would
cast it to a pointer to int. The * dereferences this pointer.
Consequently
*(int*)(z)
has type int. The
(int (*)())
casts this to a pointer to function returning int. The leftmost *
dereferences this to a function type. The function (if z actually
pointed to one) would be applied to x and y. The value of x is the
program argument count. The value of y is undefined, but in many
implementations, will be an integer version of the pointer argv.

[...]
z is declared as an int, not a pointer

As BGB pointed out, not really important as to what it is, as long as it's
legally cast-able.
and it really works as I
described.

Yes, as Dr. McLean described, it's likely an address for the memory location
of the code which is called. Some compilers allow memory locations to be
converted to C pointers correctly, although the C spec.'s don't require
this.
After creating a program in memory (like a compiler, only
no executable is written to disk) it appears that the return statement
somehow invokes the compiled (in memory) program.

Yes and No. I don't think anyone really clarified this part of the
process...

The call of the function, i.e., code in memory, is due to the dereferencing
of the function pointer. A function pointer was created by the cast:
(int(*)()) By placing a * in front: *(int(*)()) the * dereferences the
function pointer, i.e., it calls the "function" - which is likely some
assembly code in your case. Since the cast to a function pointer says the
function returns an int, the called code (or "function") should be returning
an int, likely in register eax - due to reasons outside C. Then, that int
value is returned from main, via return() statement. Of course,
"pedantically correct" C should use exit() and not return(). However,
exit() can only return a couple values.


Rod Pemberton
 
Ben Bacarisse

The call of the function, i.e., code in memory, is due to the dereferencing
of the function pointer.
True.

A function pointer was created by the cast:
(int(*)())

Also true.
By placing a * in front: *(int(*)()) the * dereferences the
function pointer, i.e., it calls the "function"

Not true or at least a little misleading. In standard C (i.e. after
K&R C) a function is called through a pointer. I.e. the call operator
does the dereference. No * is needed though it does no harm. The *
turns the pointer into a function and that function is automatically
turned back into a pointer for the call to happen. Of course, none of
this means that anything like this happens at run time -- the compiler
arranges for the code to be called with or without the *.

<snip>
 
Rod Pemberton

Ben Bacarisse said:
Also true.


Not true or at least a little misleading. In standard C (i.e. after
K&R C) a function is called through a pointer.

Ok, it was you who said this was old code a couple posts earlier, yes? So,
why are you now describing in terms of standard C? ... And, using that to
declare "misleading"?
I.e. the call operator
does the dereference.

False.

There is no "call operator" in any generation of C or C specification. So,
*it* can't possibly "do the dereference". Who's misleading who here?

But, let's say a "call operator" did exist. Even so, that's still false.
The "call operator" wouldn't be able to determine if it was trying to call a
function using function type or a function pointer. And, therefore the
operator wouldn't be able to determine when to and when not to dereference.

I think you mean that the function is called once the function pointer is
dereferenced. Isn't that exactly what I said? (No? Close enough? * is the
last syntactical element...) FYI, it's not a cast to a function, see the *
between the int and arg-list? That means function pointer instead of a
function.

There is a cast to a function pointer and a dereference (via indirection
operator) of said pointer. Once dereferenced, the function is called.
There are only a few things you can do with a function, such as call it or
convert to function pointer.
No * is needed though it does no harm.

Yes, for ANSI C or later, the syntax is optional. It's not implicit. But, the
functionality is required. Optional syntax doesn't mean my statement above
is misleading. The description describes what occurs. In this case, the
code's syntax accurately does so also. We have a function pointer and not a
function. That function pointer must still be dereferenced to a function
type so that the function can be called. The compiler implements the
indirection operator behind the scenes for the optional syntax. Once the
function pointer is dereferenced, then the function can be called. The
dereferencing is not implicit because the syntax for a function call, for
ANSI C or later, is for calling either a function or a function via a
function pointer. The compiler needs to dereference sometimes, but not
always. It must "know" which is which.
The *
turns the pointer into a function

Yes, that's called dereferencing a pointer. It's done by * - the
indirection operator. Once it's a function, the function is called.
and that function is automatically
turned back into a pointer for the call to happen.

Feel free to cite... (any C spec. or H&S will do.) I believe this to be
dependent on the implementation.


Rod Pemberton
 
Eric Sosman

Ben Bacarisse said:
[...]
I.e. the call operator
does the dereference.

False.

There is no "call operator" in any generation of C or C specification. So,
*it* can't possibly "do the dereference". Who's misleading who here?

(Should be "whom," by the way.)

While it's true that the C Standard does not use the phrase
"call operator" in normative text, "function-call operator" appears
in the index, with a reference to 6.5.2.2. That section is titled
"Function calls," and the section one level higher, 6.5.2, is called
"Postfix operators." So although you're correct about the letter
of the law, I think you're being unnecessarily picky about the
spirit thereof when you pull the pin on a "False" and hurl it
over the parapet.
But, let's say a "call operator" did exist. Even so, that's still false.
The "call operator" wouldn't be able to determine if it was trying to call a
function using function type or a function pointer. And, therefore the
operator wouldn't be able to determine when to and when not to dereference.

6.3.2.1p4: "[...] Except when it is the operand of the sizeof
operator or the unary & operator, a function designator with type
‘‘function returning type’’ is converted to an expression that has
type ‘‘pointer to function returning type’’." In other words, the
() operator (okay, okay, here's my wrist: slap it) never needs to
worry about encountering a function type, because the left-hand
operand is *always* a function pointer.

(Interested onlookers: Don't get too excited about the mention
of applying sizeof to a function designator. It's syntactically
valid, yes, but it violates a constraint. 6.5.3.4p1: "The sizeof
operator shall not be applied to an expression that has function
type [...]" So, `sizeof sqrt' is an error, not the number of
bytes in a function pointer -- and certainly not the number of
bytes in the square-root function.)
 
Ben Bacarisse

Rod Pemberton said:
Ok, it was you who said this was old code a couple posts earlier, yes? So,
why are you now describing in terms of standard C? ... And, using that to
declare "misleading"?

Yes, I said the code might be old but even if it is it is still valid
today. The only thing that is a little misleading is your statement
that the * calls the function. The function will be called with or
without the dereference.

It's often referred to as that, but I am happy to be more formal.
There is no "call operator" in any generation of C or C specification. So,
*it* can't possibly "do the dereference". Who's misleading who here?

But, let's say a "call operator" did exist. Even so, that's still false.
The "call operator" wouldn't be able to determine if it was trying to call a
function using function type or a function pointer. And, therefore the
operator wouldn't be able to determine when to and when not to
dereference.

I think you mean that the function is called once the function pointer is
dereferenced. Isn't that exactly what I said? (No? Close enough? * is the
last syntactical element...) FYI, it's not a cast to a function, see the *
between the int and arg-list? That means function pointer instead of a
function.

That is why the left-most * is not required for the function to be
called. A function call requires an expression denoting a pointer and
an argument list in parentheses. The example given pointlessly turned
the function pointer into a function. That's not (as it stands) a
valid function call. Fortunately, an expression with function type
(called a function designator) is automatically converted to a pointer
(with a few exceptions that don't apply here).
There is a cast to a function pointer and a dereference (via indirection
operator) of said pointer. Once dereferenced, the function is called.
There are only a few things you can do with a function, such as call it or
convert to function pointer.

In standard C, all you can really do is convert it to a pointer.
Oddly enough, you can't call it unless it has been so converted.
Yes, for ANSI C or later, the syntax is optional. It's not implicit. But, the
functionality is required. Optional syntax doesn't mean my statement above
is misleading. The description describes what occurs. In this case, the
code's syntax accurately does so also. We have a function pointer and not a
function. That function pointer must still be dereferenced to a function
type so that the function can be called. The compiler implements the
indirection operator behind the scenes for the optional syntax. Once the
function pointer is dereferenced, then the function can be called. The
dereferencing is not implicit because the syntax for a function call, for
ANSI C or later, is for calling either a function or a function via a
function pointer. The compiler needs to dereference sometimes, but not
always. It must "know" which is which.


Yes, that's called dereferencing a pointer. It's done by * - the
indirection operator. Once it's a function, the function is called.

No, technically it has to be converted back to a pointer. Odd, I
know, and not what used to happen in K&R C. In K&R C the left-most *
is required.

One reason why I think your explanation could mislead people learning
standard C is that it does not help to explain why

(puts)("text");
(*puts)("text");
(**puts)("text");
(***puts)("text");

are all valid function call expressions (well, statements consisting
of a valid function call expression).
Feel free to cite... (any C spec. or H&S will do.) I believe this to be
dependent on the implementation.

6.3.2.1 paragraph 4 and 6.5.2.2 paragraph 1 in C99 are the most
important. C90 has the same wording.
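
A short sketch of what those two paragraphs imply in practice. The function square and the two helpers are invented for illustration; the point is that every spelling produces the same pointer value before the call happens.

```c
/* Hypothetical example function. */
static int square(int v) { return v * v; }

/* Returns 1 if all four spellings yield the same function pointer. */
static int all_spellings_agree(void)
{
    int (*plain)(int) = square;      /* designator converted to pointer */
    int (*addr)(int)  = &square;     /* operand of &: no conversion first */
    int (*star)(int)  = *square;     /* converted, dereferenced, converted again */
    int (*stars)(int) = ***square;   /* the same, three times over */
    return plain == addr && addr == star && star == stars;
}

/* Calls square three times with different numbers of stars. */
static int call_with_stars(int v)
{
    return (square)(v) + (*square)(v) + (***square)(v);
}
```

With v equal to 3, call_with_stars adds three results of 9, since each spelling calls the same function.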
 
BGB / cr88192

bartc said:

because in the process of the cast, it casts a pointer through an int.

on a typical C compiler for x86-64, this will truncate the higher order
bits, quite possibly destroying the pointer (unless the pointer points to
within the low 2GB of the address space).
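
A minimal sketch of that truncation, assuming an implementation that provides intptr_t and where narrowing to int simply drops the high bits (as gcc does). The helper names survives_int and survives_intptr are invented.

```c
#include <stdint.h>

/* Returns 1 if the pointer value is unchanged after passing through an int. */
static int survives_int(const void *p)
{
    int narrowed = (int)(intptr_t)p;           /* may drop the high bits */
    return (const void *)(intptr_t)narrowed == p;
}

/* Same round trip through intptr_t, which is wide enough to hold a pointer. */
static int survives_intptr(const void *p)
{
    intptr_t wide = (intptr_t)p;
    return (const void *)wide == p;
}
```

Where int and pointers have the same width, both helpers return 1; where pointers are wider, survives_int returns 0 for any address that does not fit in an int.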

 
Rod Pemberton

Eric Sosman said:
(Should be "whom," by the way.)

"Whom" is singular object like "him" or "her". The object pronoun referred
to by "whom" is someone other than the subject pronoun. I did not mean
"whom" for the second "who". I meant "who" as in "you". The same "you" as
in the subject. "Whom" would've referred to WD, while "who" refers to Ben.
Even if there was a method in English to use the objective pronoun to
reference the subjective pronoun, it doesn't really matter anyway. The
usage of "whom" is dying out in preference of "who", much like "you" for
both singular object and singular subject.

"Who is commonly used for both objective and nominative cases, similar to
the word you."
http://en.wikipedia.org/wiki/Objective_pronoun

http://en.wikipedia.org/wiki/You
While it's true that the C Standard does not use the phrase
"call operator" in normative text, "function-call operator" appears
in the index, with a reference to 6.5.2.2. That section is titled
"Function calls," and the section one level higher, 6.5.2, is called
"Postfix operators." So although you're correct about the letter
of the law, I think you're being unnecessarily picky about the
spirit thereof when you pull the pin on a "False" and hurl it
over the parapet.

In my book, it was flat out false. If he had quoted "call operator" to
indicate that it was his personal usage or internal representation he was
describing and not C itself, there wouldn't have been a need to point out
the truth.

I also think it's rather deceptive of you to strongly imply that because
something, like a function call, has precedence that it too is an operator.
It isn't.


Rod Pemberton
 
Eric Sosman

Eric Sosman said:
(Should be "whom," by the way.)
[...]
"Who is commonly used for both objective and nominative cases, similar to
the word you."
http://en.wikipedia.org/wiki/Objective_pronoun

Ah! Well, if Wikipedia is authoritative concerning usage ...
[... function call is/is not an operator ...]
In my book, it was flat out false. If he had quoted "call operator" to
indicate that it was his personal usage or internal representation he was
describing and not C itself, there wouldn't have been a need to point out
the truth.

... then you might want to look at Wikipedia's page on C and
C++ operators, with special attention to the second item listed
under "Other operators."

"For 'tis the sport to have the enginer
Hoist with his own petar ..."
I also think it's rather deceptive of you to strongly imply that because
something, like a function call, has precedence that it too is an operator.
It isn't.

Deceptive of me? Or of the Standard? It's the latter that
describes function calls in a section titled "Postfix operators"
and indexes the description under the term "function-call operator."
Take your complaint to the authors thereof, I'd say.
 
Rod Pemberton

Ben Bacarisse said:
Yes, I said the code might be old but even if it is it is still valid
today. The only thing that is a little misleading is your statement
that the * calls the function.

That's true for old code. It's an error without it, for old code. How is
that misleading? You're describing in terms of the "modern" meaning, and
saying _not_ that the old meaning is misleading or no longer applies, but
that *my* statement is misleading... How warped is that?
The function will be called with or
without the dereference.

In modern code, yes, the dereference is superfluous. Old code, no. It must
be dereferenced for old code.
The example given pointlessly turned
the function pointer into a function.

It wasn't pointless. It was needed at one point in time. It also presents
a cleaner model of dereferencing.
No, technically it has to be converted back to a pointer. Odd, I
know, and not what used to happen in K&R C. In K&R C the left-most *
is required.
See...

One reason why I think your explanation could mislead people learning
standard C is that it does not help to explain why

(puts)("text");
(*puts)("text");
(**puts)("text");
(***puts)("text");

are all valid function call expressions (well, statements consisting
of a valid function call expression).

Who would code anything other than the first line for standard C? There is
no need to dereference a function pointer (or function) in standard C. If
students are being taught that, they won't either. Who cares if the example
dereferenced lines above are syntactically valid? The IOCCC has many
examples of syntactically valid C code too. Many are completely unreadable
by humans. So, your point seems moot to me - a solution in search of a
problem.

For pre-standard C, the last three are errors since puts is a function, not
pointer to a function. The second would work for pre-standard C if a
function pointer, and not a function like puts, was being dereferenced. So,
no one using pre-standard C would code the last two lines.
6.3.2.1 paragraph 4 and 6.5.2.2 paragraph 1 in C99 are the most
important. C90 has the same wording.

Well, it is in H&S 3rd... So, I guess it's not that important. Old code
with functions and single dereferenced function pointers work with the "new"
model.


Rod Pemberton
 
