printf doubt

Michael Wojcik

What systems use separate stacks for return addresses and
arguments? I am not aware of any in the modern world, so it
would be educational to hear about them.

The ever-lovin' AS/400 (now iSeries) OPM (Original Program Model)
appears to, based on the documentation for MI, the pseudo-assembly
language supplied with OS/400. (I say "appears to" because I'm
taking this from the somewhat-sparse documentation, my own
recollections, and what I've read from IBM insiders on Usenet; I'm no
expert on AS/400 internals.)

The first thing to remember about the AS/400, for programs running
in normal iOS mode (and not PACE or NT or Linux or whatever else
might be supported these days), is the "single-level store". Every
object in the machine has one address in a single 64-bit virtual
address space. (Pointers are actually 128 bits because they contain
a bunch of other information.) There's no per-process virtual
addressing.

Thus in a sense return addresses and parameters are in the "same"
area, but only because there's only one area. They probably won't
be contiguous, and it's not really a stack.

Security is provided by a combination of trusted compilation from
intermediate code into the native instruction stream, the
underlying capability architecture, and some hardware protections,
though the last play less of a role than in more typical process-
virtual-addressing systems.

MI (and the languages based on it such as OPM COBOL) supports three
call operations:

Call Internal (CALLI)
Call External (CALLX)
Call Program with Variable Length Argument List (CALLPGMV)

(There's also "program activation", which is sort of like dynamic
loading without invoking; and "transfer control", which is like
the exec system call in Unix; but I'll ignore those here.)

For an internal call, which creates a subactivation within the
current activation (the equivalent of a "normal call" on one of
those puny regular computers), the calling routine provides an
instruction pointer as the third operand, and the return address
is placed in that pointer. The called routine receives this
pointer and uses it as the operand of a Branch (B) instruction
to return. (Yes, an internal return is just a branch.)

Said instruction pointer must be allocated from (what passes for)
the heap or (what passes for) static data by the calling routine.

Parameters, on the other hand, are specified in an operand list (OL)
pointer which is another operand of the CALLI operation. The called
routine should declare an operand list corresponding to the
parameters it expects to receive. When it's called, it can reference
those parameters, which will create a temporary mapping in the form
of space pointers to the parameter objects (all parameters are passed
by reference).

External calls are similar, except that a new activation is created
using the specified program object (which is sort of like a program
or shared library on less-interesting systems), and control is passed
to its program entry procedure (if it's a bound program) or external
entry point (if it's non-bound). It also takes an OL as an operand,
but rather than a return instruction pointer, the calling routine can
provide an optional "instruction definition list", which is basically
a list of addresses that the called program can return to. So the
called program doesn't have to return to the place from whence it was
called. Fun!

An external-called program returns using the External Return (RETX)
opcode. If the called program is non-bound and the caller supplied
a return list, the called program can provide an integer operand
that's an index into the return list to say which return point to
return to.

There are a bunch of other differences between external calls to
bound and non-bound programs, but I'll gloss over them.

Call Program with Variable Length Argument List is like CALLX except
that it takes an array of parameters and a count, and it omits the
list of possible return locations (the called program has to return
to the place it was called from).

Of course, these days not many people use OPM; they've mostly
switched to ILE, the Integrated Language Environment. ILE does use a
call stack and supports pass-by-value (and pass-by-reference and
pass-by-reference-to-temporary-copy), though it's still rather more
complicated than just the sort of move-and-decrement that's typical
of less-CISCy architectures.

These and many more intriguing details can be found at:

http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp

(Requires Javascript and all manner of such things.) Look especially
at Programming -> APIs -> MI Programming and Programming -> Languages
-> ILE Concepts.

Such a system might be
more resistant to "stack smashing" buffer overflow attacks, for
one thing.

Well, the '400 is certainly more resistant than many other platforms
to stack smashing.
 
Barry Schwarz

What systems use separate stacks for return addresses and
arguments? I am not aware of any in the modern world, so it
would be educational to hear about them. Such a system might be
more resistant to "stack smashing" buffer overflow attacks, for
one thing.

I don't know of any systems using separate stacks but I am familiar with
one that uses no stacks. IBM's MVS system and all its descendants
through the current z/OS use a combination linked list (for return
addresses) and pointer to pseudo-array (for arguments). Ignoring
superfluous arguments requires no action in this system since they are
always after the relevant arguments in the pseudo-array. And it is
very resistant to buffer overflow attacks but the calling convention
is only one of many reasons.

A common convention is for the callee to pop off the return
address and leave the arguments on the stack. The caller then
pops the arguments.

An obvious solution which I'm embarrassed to admit I didn't consider
before I asked the question.


Coos Haak

On Tue, 11 Jul 2006 19:13:09 -0700, Ben Pfaff wrote:
What systems use separate stacks for return addresses and
arguments? I am not aware of any in the modern world, so it
would be educational to hear about them. Such a system might be
more resistant to "stack smashing" buffer overflow attacks, for
one thing.

I think of Forth processors with code generated by a C compiler.
 
robertwessel2

Ben said:
What systems use separate stacks for return addresses and
arguments? I am not aware of any in the modern world, so it
would be educational to hear about them. Such a system might be
more resistant to "stack smashing" buffer overflow attacks, for
one thing.


IPF (Itanium) more-or-less does. The registers are saved by the RSE
(Register Stack Engine) in one stack, which is distinct from the
"normal" application stack. And since subroutine calls always save the
return address in a register, return addresses get saved in the RSE
stack. Parameters passed in registers also get saved in the RSE stack,
which is why I said more-or-less.

Of course that assumes that anything gets saved in the RSE stack, which
only happens when there are no more free registers.

More-or-less the same thing happens with SPARC register windows. C on
a number of the smaller embedded processors works that way too.

And yes, separating the returns (and register saves) pretty much makes
that particular type of stack smash impossible (of course that doesn't
help if any other function pointers are stored on the data stack).

FWIW, implementing a compiler that did this on, say, x86 would be
trivial, and would add only modest overhead to each function call. You
could continue to use esp for return addresses and register saves, and
use ebp for stack frames.
 
Joe Wright

Ben said:
What systems use separate stacks for return addresses and
arguments? I am not aware of any in the modern world, so it
would be educational to hear about them. Such a system might be
more resistant to "stack smashing" buffer overflow attacks, for
one thing.


A common convention is for the callee to pop off the return
address and leave the arguments on the stack. The caller then
pops the arguments.

Missing the point maybe. There are some number of arguments to printf,
one of them being the format string. Let's assume a format string and
four other expressions. That's five arguments. If the format string
describes what to do with only two expressions, so be it. Only two will
be treated. But five arguments were pushed and five will be popped.
 
Keith Thompson

Joe Wright said:
Missing the point maybe. There are some number of arguments to printf,
one of them being the format string. Let's assume a format string and
four other expressions. That's five arguments. If the format string
describes what to do with only two expressions, so be it. Only two
will be treated. But five arguments were pushed and five will be
popped.

Which means that a calling convention where the callee needs to know
how many arguments were passed won't work for printf() or other
variadic functions (unless there's an extra implicit argument that
provides that information).

A C compiler *could* use such a convention for non-variadic functions,
but I think most compilers use the same convention for both (since
there was no such distinction in early C).
 
Flash Gordon

Keith said:
Which means that a calling convention where the callee needs to know
how many arguments were passed won't work for printf() or other
variadic functions (unless there's an extra implicit argument that
provides that information).

A C compiler *could* use such a convention for non-variadic functions,
but I think most compilers use the same convention for both (since
there was no such distinction in early C).

At least one popular compiler does not. In the documentation for
Microsoft Visual Studio .NET under the description for __stdcall it says:
The __stdcall calling convention is used to call Win32 API functions.
The callee cleans the stack, so the compiler makes vararg functions
__cdecl. Functions that use this calling convention require a
function prototype.

So if the compiler is set to use __stdcall by default it will use
different calling conventions for variadic and non-variadic functions.
 
Al Balmer

At least one popular compiler does not. In the documentation for
Microsoft Visual Studio .NET under the description for __stdcall it says:
The __stdcall calling convention is used to call Win32 API functions.
The callee cleans the stack, so the compiler makes vararg functions
__cdecl. Functions that use this calling convention require a
function prototype.

At least one compiler, Watcom C (now OpenWatcom) allows the user to
specify one of several calling conventions. In fact, I think you can
even make up your own.
 
Flash Gordon

Al said:
At least one compiler, Watcom C (now OpenWatcom) allows the user to
specify one of several calling conventions. In fact, I think you can
even make up your own.

That would be even more interesting than MSVC++. MSVC++ doesn't let you
define your own, but it does allow you to select which of the calling
conventions to use by default (if not overridden using an MS extension).
 
Mark L Pappin

Ben Pfaff said:
What systems use separate stacks for return addresses and
arguments? I am not aware of any in the modern world, so it
would be educational to hear about them. Such a system might be
more resistant to "stack smashing" buffer overflow attacks, for
one thing.

There are still a few chips around with dedicated call/return stacks
and no way to shove arbitrary bytes on to them (and the dedicated
stacks are usually pretty small so you wouldn't want to, anyway).
Flavour of the <insert time unit> in the embedded world is the PIC,
which also happens to be Harvard architecture and so can't quietly
execute arbitrary data. The Venerable 8051 may have the former (I'm
not so familiar with it) and certainly also has the latter.

A common convention is for the callee to pop off the return
address and leave the arguments on the stack. The caller then
pops the arguments.

Known to the "All the world's a PeeCee with Turbo Pascal and MS C"
crowd as "C calling convention" (you'll still see "CDECL" or "cdecl"
scattered through source), contrasted with "everything else calling
convention" precisely because a variadic callee can NOT know how many
args were actually passed - ONLY the caller can possibly know.

mlp
 
