Decreasing order of address within main

Kenny McCormack

That's the gist of what I've been saying all along.

Lemma: Most newsgroups have a general ethos that questions that are
covered (i.e., answered) in the FAQs or other generally available
material are inappropriate for posting. I.e., the response to "what does
'i = i++' do?" is "Read the FAQ! (Don't bother us!)". While this
condemnation is not precisely that the question is "off topic", the
effect is the same - i.e., that the question is "inappropriate".

Therefore, when you combine the above lemma with the strict ban on
anything *not* in the C standard, you come to the (obvious to anyone
with a lick of sense) conclusion that nothing is acceptable here.

Notes:
1) Clearly, I am including the C standard documents as among the
"generally available material" (that everyone is assumed to have
access to and to have read cover-to-cover before posting here - even
though most of the posters [*] to this group have probably never
even heard of it).

2) Yes, there is a small window for so-called "language lawyering" -
that is, where people who really have no lives argue about tiny
minutiae in the standards documents - that no sensible person or
working programmer is likely to care about. At best, this accounts
for about 5% of the volume of postings here.


[*] Measured by actual numbers of posters, not by volume of postings
(of course...!)
 
Keith Thompson

Mark McIntyre said:
Firstly it's my understanding that n1256 is the final draft, not the
edited final version.

I suppose that depends on what you mean by those terms.

The current official C standard, as I understand it, consists of
C99 plus the three Technical Corrigenda. There is currently no
*official* single document that is the C standard. n1256 is an
attempt (and a darned good one with two very minor exceptions)
at creating what that single official document would look like if
it existed.

The term "draft" implies an early version of something that will
become official. In that sense, I don't think n1256 is really a
"draft".

On the other hand, it does say "Committee Draft" at the top of
each page (right before the "Septermber"), so perhaps I'm missing
something.

Secondly, I can get all sorts of stuff for free which I choose to pay
for in order to support the authors and/or the service they render.

Sure, but my understanding is that the $18 I paid for the C99
standard, or the $30 I'd pay if I bought it today, or the $???
I'd have to pay for a hard copy, doesn't go to the people who did
the actual work of writing the standard.
 
Nobody

Any sufficiently aggressive optimiser won't even bother setting
any variable set to such an undefined value. Nor will it set any
future values dependent on that variable.

I'm not sure if there are any sufficiently aggressive optimisers
out there, but I'm not prepared to sloppily write UB in order to find
out.

Consider the following code (paraphrasing a bug which was recently
discovered in the Linux kernel):

1 int x = p->x;
2 if (!p) return;
...

Some versions of gcc are sufficiently aggressive that they optimise line
2 out of existence.

The rationale is that because p->x had already been evaluated at line 1,
p being null leads to undefined behaviour. If p is not null, the return on
line 2 won't occur, but if p is null, the compiler is free to return, or
not return, or do whatever else it feels like doing.

At first, I was a bit confused as to what kind of optimisation strategy
would do this. My guess is that it uses a form of /reductio ad absurdum/,
(i.e. "if x implies UB, x is false") when making deductions about the
possible values an expression can have.
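
A minimal sketch of the pattern, for anyone who wants to try it. The struct node type, the function names and the -1 sentinel are made up for illustration and are not taken from the kernel code; whether a given compiler actually deletes the test depends on its version and optimisation flags.

```c
#include <stddef.h>

struct node {
    int x;
};

/* Risky ordering: p->x is evaluated before p is tested. If p is null the
 * behaviour is already undefined at the first statement, so a compiler may
 * assume p is non-null and delete the test that follows. */
int read_x_risky(const struct node *p)
{
    int x = p->x;
    if (!p)
        return -1;
    return x;
}

/* Safer ordering: test first, dereference afterwards. Nothing has touched
 * *p before the test, so the test cannot be proven redundant. */
int read_x_checked(const struct node *p)
{
    if (!p)
        return -1;
    return p->x;
}
```

Only read_x_checked may be called with a null pointer; calling read_x_risky with one is undefined no matter what the compiler does with the test.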
 
Seebs

Consider the following code (paraphrasing a bug which was recently
discovered in the Linux kernel):

1 int x = p->x;
2 if (!p) return;
...

Some versions of gcc are sufficiently aggressive that they optimise line
2 out of existence.

Yes.

Ran into at least one of these in a very nasty bit of kernel internals
which relied on actually performing an apparently-irrelevant test for
null. Caused hangs on exactly one architecture.

-s
 
Phil Carmody

Seebs said:
Yes.

Ran into at least one of these in a very nasty bit of kernel internals
which relied on actually performing an apparently-irrelevant test for
null. Caused hangs on exactly one architecture.

More subtle is when the null value appears during linked list
traversal, and you simply 'pre-calculate' something dependent
on that pointer in order to simplify the code.

Phil
 
Processor-Dev1l

No you didn't.


No, it doesn't.


You can never confirm that something is "defined" by running code.

See, "defined" doesn't mean "it happened to work that way once".  It
means "we are guaranteed that this will always work, or at least that
failure to do so is clearly a bug in the compiler."

Imagine that you live in apartment #323.

You could make the claim:  "All apartment numbers are defined to be
palindromes."  You write a little program to check it, you plug in 323,
it confirms:  The number is a palindrome.

But you haven't actually checked what you'd *need* to check, which is
*every possible number*.

To make the claim that subtraction is "defined", you must not only
run your program on every compiler, for every kind of computer, with every
set of options.  You must also do it on every compiler that will exist
in the future, for machines that haven't even been designed yet.

... Or you could just use the language *definition* to tell you what is or
is not *defined*.  Note the relationship there; "defined" is a function of
the language specification, not of any specific real-world implementation.

The reason this matters is that you may someday want to use a different
compiler, and you may encounter a system where the subtraction crashes in
all cases, for instance.

-s

Well, I think the position of variables in memory is not caused by UB
but by CPU itself (if it uses memory in big or little endian).
x86 uses way when less significant bit is higher in memory so it is
reversed.
If int is set to 4B then sequence of variables will have addresses
0,-4,-8,-12, etc.
 
Stephen Sprunk

Processor-Dev1l said:
Well, I think the position of variables in memory is not caused by UB
but by CPU itself (if it uses memory in big or little endian).
x86 uses way when less significant bit is higher in memory so it is
reversed.
If int is set to 4B then sequence of variables will have addresses
0,-4,-8,-12, etc.

As far as Standard C is concerned, it's UB to even _try to find_ this
information.

Other standards, such as your platform's ABI, might define the behavior;
for instance, x86 systems put "auto" variables in a continuous,
downward-growing stack, but not all systems do and any code that relies
on this (or any other) behavior is inherently non-portable. It's left
undefined in Standard C for a reason.

S
 
Stephen Sprunk

Nobody said:
Consider the following code (paraphrasing a bug which was recently
discovered in the Linux kernel):

1 int x = p->x;
2 if (!p) return;
...

Some versions of gcc are sufficiently aggressive that they optimise line
2 out of existence.

The rationale is that because p->x had already been evaluated at line 1,
p being null leads to undefined behaviour. If p is not null, the return on
line 2 won't occur, but if p is null, the compiler is free to return, or
not return, or do whatever else it feels like doing.

At first, I was a bit confused as to what kind of optimisation strategy
would do this. My guess is that it uses a form of /reductio ad absurdum/,
(i.e. "if x implies UB, x is false") when making deductions about the
possible values an expression can have.

GCC has a feature that tracks whether it's possible for a pointer to be
null; if you dereference a pointer, GCC then sets the "notnull"
attribute on it and any future checks for a null pointer are optimized
away. If the code branches after a check for null, the branch taken in
the not-null condition will also have the attribute set until the two
branches merge again. The programmer can also set the attribute
manually if desired, though I can't think of any scenario where that'd
be safe and useful.

I assume that this optimization is to remove redundant tests/branches
and therefore improve performance; presumably it wouldn't be there if it
didn't help in at least some cases.
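
To illustrate the kind of redundant test this is presumably aimed at, here is a small sketch. The struct item type and both function names are hypothetical, and it makes no claim about what GCC does internally; it only shows a case where dropping the test is harmless.

```c
#include <stddef.h>

struct item {
    int value;
};

/* Hypothetical defensive accessor: tolerates a null argument. */
static inline int item_value_or_zero(const struct item *it)
{
    return it ? it->value : 0;
}

/* The caller dereferences 'it' first, so 'it' must be non-null here. */
int sum_twice(const struct item *it)
{
    int first = it->value;

    /* Once the helper is inlined, its 'it ?' test is provably redundant on
     * this path, so an optimiser may drop the branch. */
    return first + item_value_or_zero(it);
}
```

Here the removed branch could never have been taken by a correct program, which is the case where the optimisation pays off without surprising anyone.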

S
 
Keith Thompson

Kenneth Brody said:
Keith said:
Kenneth Brody said:
Well, to be fair, not everyone has access to the "final" version of
the Standard. From what I understand, you need to pay for that,
though "near final" draft versions are available for free. Also, not
everyone understands the "legalese" of the text.
[...]

The C99 standard itself costs money (something like $30 US for a PDF
copy). But
<http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf>
is free; it includes the full C99 standard with all the changes
specified in the three Technical Corrigenda folded in. For most
purposes, I consider it better than the C99 standard itself. (For
some purposes, C99 plus copies of the three Technical Corrigenda might
be better, since n1256 is marginally less official.)

Thanks for the link. I currently have n1124, which I assume is
superseded by n1256?

Yes. n1124 incorporates TC1 and TC2. n1256 incorporates TC1, TC2,
and TC3 (plus the creative spelling of "Septermber").

(And thank you for spelling "superseded" correctly!)
 
Keith Thompson

Processor-Dev1l said:
Well, I think the position of variables in memory is not caused by UB
but by CPU itself (if it uses memory in big or little endian).
x86 uses way when less significant bit is higher in memory so it is
reversed.
If int is set to 4B then sequence of variables will have addresses
0,-4,-8,-12, etc.

Undefined behavior doesn't "cause" anything. It just means that
the behavior is undefined. It gives the implementation permission
to do quite literally anything. Whatever actual behavior happens
to occur is the result of -- well, of whatever caused it. But it's
outside the scope of the C language and standard.

If I fail to tell you what to do, and you go off and do X, I didn't
cause you to do X.

Incidentally, when you post a followup, please snip any quoted
text that isn't relevant to your followup. In particular, don't
quote signatures. Keep just enough quoted text so your followup
makes sense on its own to someone who didn't necessarily see the
parent article. See this followup for an example.
 
Morris Keesan

As far as Standard C is concerned, it's UB to even _try to find_ this
information.

Surely not.

printf("%p %p %p\n", (void *)&a, (void *)&b, (void *)&c);

doesn't invoke any undefined behaviour as far as I can tell.
The standard doesn't specify what values will be printed, and
the way those values will be represented as printing characters is
implementation-defined, but there's no UB there.

Similarly, this code
#include <stdint.h>

...

intptr_t aptr, bptr; /* or uintptr_t */
aptr = (intptr_t)(void *)&a;
bptr = (intptr_t)(void *)&b;

printf("a has a %s address than b\n",
       (aptr < bptr) ? "lower" : "higher");

allows one to try to find the information. There's no guarantee that
intptr_t or uintptr_t is available, but the worst that can happen there
is failure to compile, not UB. And even if the type exists, that doesn't
mean that the values of aptr and bptr will correspond in any expected way
to numerical memory addresses, but again, no UB.
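
If the possible failure to compile is a concern, the availability of the type can be tested, since <stdint.h> defines UINTPTR_MAX only when it provides uintptr_t. A sketch along those lines follows; the function names and the -1 "not available" return value are my own inventions for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if pa converts to a smaller integer than pb, 0 if it does not,
 * and -1 if this implementation provides no uintptr_t. */
int converted_less(void *pa, void *pb)
{
#ifdef UINTPTR_MAX
    uintptr_t a = (uintptr_t)pa;
    uintptr_t b = (uintptr_t)pb;
    return a < b;               /* integer comparison, not pointer comparison */
#else
    (void)pa;
    (void)pb;
    return -1;
#endif
}

/* Compares two locals in both directions. Returns -1 without uintptr_t;
 * otherwise nonzero when exactly one direction reports "lower", which is
 * what one would expect for two distinct objects. */
int compare_two_locals(void)
{
    int a = 0, b = 0;
    int ab = converted_less(&a, &b);
    int ba = converted_less(&b, &a);

    if (ab < 0)
        return -1;
    printf("a converts to a %s integer than b\n", ab ? "lower" : "higher");
    return ab != ba;
}
```

Which of the two words gets printed is up to the implementation, and it says nothing portable about where a and b actually live.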
 
Seebs

Well, I think the position of variables in memory is not caused by UB
but by CPU itself (if it uses memory in big or little endian).

This is totally wrong.

No one was arguing that the position of variables in memory was caused by
undefined behavior; rather, only that your attempt to figure out the
differences between those positions invoked undefined behavior.

x86 uses way when less significant bit is higher in memory so it is
reversed.

Doesn't affect location of variables in memory at all.

If int is set to 4B then sequence of variables will have addresses
0,-4,-8,-12, etc.

Except they won't always. They might be stored out of order. They might
be stored in totally different regions of memory, such that comparisons
between them yield nonsense.

To understand C, you have to learn that you *don't need to know*. And that
the answer can vary wildly. There is nothing prohibiting a system where
the relative addresses of the variables might be different between one call
and another to the same function. (And indeed, I can describe a plausible
real-world example...*)

-s
[*] Left as an exercise for the reader, for now.
 
Keith Thompson

Kenneth Brody said:
Keith said:
Kenneth Brody said:
Keith Thompson wrote: [...]
<http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf> [...]
Thanks for the link. I currently have n1124, which I assume is
superseded by n1256?

Yes. n1124 incorporates TC1 and TC2. n1256 incorporates TC1, TC2,
and TC3 (plus the creative spelling of "Septermber").

Hmm... My n1124 says "May 6, 2005", and neither "Septermber" nor
"September" appears anywhere in it. (Or was it TC3 that had that
typo?)

It's n1256 that has

ISO/IEC 9899:TC3 Committee Draft Septermber 7, 2007 WG14/N1256

at the top of almost every page (except that it doesn't appear on page
1, and the order of the fields alternates on even and odd pages).
Neither n1124 nor TC3 has that error. (TC3 is a 10-page document
listing just the changes.)

[...]
 
Stephen Sprunk

Morris said:
Surely not.

printf("%p %p %p\n", (void *)&a, (void *)&b, (void *)&c);

doesn't invoke any undefined behaviour as far as I can tell.
The standard doesn't specify what values will be printed, and
the way those values will be represented as printing characters is
implementation-defined, but there's no UB there.

That's arguable. Technically it's not UB, but in effect you're causing
the same UB as exhibited below, just performed by a human instead of the
computer.

Similarly, this code
#include <stdint.h>

...

intptr_t aptr, bptr; /* or uintptr_t */
aptr = (intptr_t)(void *)&a;
bptr = (intptr_t)(void *)&b;

printf("a has a %s address than b\n",
       (aptr < bptr) ? "lower" : "higher");

allows one to try to find the information. There's no guarantee that
intptr_t or uintptr_t is available, but the worst that can happen there
is failure to compile, not UB.

Using a relative comparison operator on pointers that do not point into
the same object is UB. Only testing for (in)equality is defined in that
case.

And even if the type exists, that doesn't mean that the values of aptr
and bptr will correspond in any expected way to numerical memory
addresses, but again, no UB.

Many, many implementations (AFAIK all the ones with a flat address
space) define this, but not the C Standard itself.

Consider a segmented architecture, such as the AS/400 or x86 real mode,
where each object may be in a different segment; relative comparisons
between segments are meaningless, which is _why_ those operations had to
be left undefined.
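
For contrast, a sketch of the pointer comparisons that Standard C does define: relational operators on pointers into the same array, and equality on any two valid pointers. The function names are hypothetical, chosen only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Defined: both operands point into the same array, so '<' simply reflects
 * the order of the subscripts. The caller must keep i and j in range. */
bool comes_before(const int *arr, size_t i, size_t j)
{
    return &arr[i] < &arr[j];
}

/* Defined for any two valid pointers, even into unrelated objects:
 * only (in)equality is being tested. */
bool same_object(const void *p, const void *q)
{
    return p == q;
}
```

Passing pointers into two different arrays to a function like comes_before would be the undefined case described above.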

S
 
Keith Thompson

Stephen Sprunk said:
Morris Keesan wrote: [...]
Similarly, this code
#include <stdint.h>

...

intptr_t aptr, bptr; /* or uintptr_t */
aptr = (intptr_t)(void *)&a;
bptr = (intptr_t)(void *)&b;

printf("a has a %s address than b\n",
       (aptr < bptr) ? "lower" : "higher");

allows one to try to find the information. There's no guarantee that
intptr_t or uintptr_t is available, but the worst that can happen there
is failure to compile, not UB.

Using a relative comparison operator on pointers that do not point into
the same object is UB. Only testing for (in)equality is defined in that
case.

And even if the type exists, that doesn't mean that the values of aptr
and bptr will correspond in any expected way to numerical memory
addresses, but again, no UB.

Many, many implementations (AFAIK all the ones with a flat address
space) define this, but not the C Standard itself.

Consider a segmented architecture, such as the AS/400 or x86 real mode,
where each object may be in a different segment; relative comparisons
between segments are meaningless, which is _why_ those operations had to
be left undefined.

Even for flat-address-space implementations, the standard (optionally)
provides both intptr_t, a signed type, and uintptr_t, an unsigned
type, with no indication of which is more suitable. Addresses
corresponding to the intptr_t values -1 and 0 might be adjacent, or
they might be at opposite ends of the address space; likewise
for UINTPTR_MAX and 0.
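
The arithmetic behind that can be sketched without involving any pointers at all. This assumes the implementation provides both types; the two function names are made up for the example.

```c
#include <stdint.h>

/* Order two values as the signed pointer-sized type sees them. */
int less_as_signed(intptr_t a, intptr_t b)
{
    return a < b;
}

/* Order two values as the unsigned pointer-sized type sees them. */
int less_as_unsigned(uintptr_t a, uintptr_t b)
{
    return a < b;
}
```

less_as_signed(-1, 0) is true, so in the signed view -1 sits immediately below 0. Converting -1 to uintptr_t gives UINTPTR_MAX, and less_as_unsigned((uintptr_t)-1, 0) is false, so in the unsigned view the same conversion result sits at the far end of the range from 0. Neither view is required to match how the hardware lays out addresses.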
 
Phil Carmody

Stephen Sprunk said:
That's arguable. Technically it's not UB, but in effect you're causing
the same UB as exhibited below, just performed by a human instead of the
computer.

How can something which is not UB cause UB? Care to point to somewhere
in the standard which permits that?

Using a relative comparison operator on pointers that do not point into
the same object is UB. Only testing for (in)equality is defined in that
case.

Straw man - what pointers? I see a comparison of integer types, viz
integer types capable of holding object pointers.

Phil
 
Phil Carmody

Seebs said:
To understand C, you have to learn that you *don't need to know*. And that
the answer can vary wildly. There is nothing prohibiting a system where
the relative addresses of the variables might be different between one call
and another to the same function. (And indeed, I can describe a plausible
real-world example...*)

-s
[*] Left as an exercise for the reader, for now.

I was about to say "no way!", but think that with the joys of inlining
and as-if, it becomes quite easy.

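//...
/* Copies *q into *p when direction is true, *p into *q otherwise.
   Needs <string.h>, <stdbool.h> and a complete struct thing. */
static inline void copy(struct thing *p,
                        struct thing *q,
                        bool direction)
{
    void *pv = p, *qv = q;
    if (direction) { memcpy(pv, qv, sizeof(struct thing)); }
    else           { memcpy(qv, pv, sizeof(struct thing)); }
}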

I imagine that copy(x,y,0) and copy(y,x,1) could cause the values
representing pv and qv to be in different relative locations in
memory on register-sparse systems.

One doesn't even need inlining for that, simply a cooperative-enough
optimiser.

Phil
 
Nick Keighley

it is usual not to quote sigs (the bit after "-- ")

Well, I think the position of variables in memory is not caused by UB

you are correct, the "position" of variables in memory is not caused by
UB. But then he didn't say that. He said it is undefined behaviour to
subtract (or compare) two pointers that do not point to the same
object.

so
int i, j;
long diff = &i - &j;

is Undefined Behaviour (even if long is big enough to hold the result
of a pointer subtraction). In a sense i and j are not even in the same
address space.
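
For comparison, a sketch of the case where the subtraction is defined: both pointers point into the same array and the result has type ptrdiff_t. The function name is hypothetical, and the caller is assumed to keep both subscripts in range.

```c
#include <stddef.h>

/* Number of elements from arr[i] to arr[j]; negative when j < i.
 * Defined because both operands point into the same array. */
ptrdiff_t element_distance(const int *arr, size_t i, size_t j)
{
    return &arr[j] - &arr[i];
}
```

The result counts elements, not bytes, and it only means anything because the two pointers share an array.
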
but by CPU itself (if it uses memory in big or little endian).

I don't think you understand what endianness is. It has nothing
to do with the way addresses are allocated to variables.

x86 uses way when less significant bit is higher in memory so it is
reversed.
If int is set to 4B then sequence of variables will have addresses
0,-4,-8,-12, etc.

nonsense, I'm afraid
 
Nick Keighley

the output might be

":red-segment: :blue-segment: :beige-segment:"

How can something which is not UB cause UB? Care to point to somewhere
in the standard which permits that?

you're comparing pointers to different objects which is UB.
I like the idea that my mind can exhibit undefined behaviour...
Have I reformatted my hard drive just by thinking about this stuff?
:)


hmm. well that's unspecified behaviour. Though we know aptr
and bptr will end up with valid integers.

Straw man - what pointers? I see a comparison of integer types, viz
integer types capable of holding object pointers.

interesting. The compiler would have to remember that they had been
pointers.
 
Richard Tobin

Stephen Sprunk said:
GCC has a feature that tracks whether it's possible for a pointer to be
null; if you dereference a pointer, GCC then sets the "notnull"
attribute on it and any future checks for a null pointer are optimized
away.
[...]
I assume that this optimization is to remove redundant tests/branches
and therefore improve performance; presumably it wouldn't be there if it
didn't help in at least some cases.

As I've said before, I wish it would tell you when it's doing
this, as it traditionally has with simpler optimisations such as
always-true comparisons. Being able to remove a chunk of code
can be a sign of a mistake by the programmer, and just removing
it often makes the results of the error even more obscure.

-- Richard
 
