Pre And Post Increment Operator Output

  • Thread starter Yogesh Yadav Pacheria

Bartc

Ben Bacarisse said:
For those who haven't read it, Bartc is characterising the question so
as to make it look absurd. -1 < 5 is always 1 in C, and no relational
expression is ever "False" (I'm not sure what the significance of the
capital F is but it looks significant).

The expression was something like: (-1 < sizeof("test")), and it gave a
result of 0.

The problem is that sizeof("test") is 5u, not 5, which is the cause of the
trouble.
I am not sure, but I suspect that Eric's remark about "rash opinions" is
more likely to be about your suggestions for how C's mixed type
arithmetic should be done. I think it *is* rash to suggest how a
programming language should behave unless you know it very well indeed.

I don't think that is necessary. There have been plenty of proposals in this
group by people who cannot all have been experts in the language.
Just in case, here's mine again: the notion of common sense is not
applicable to programming languages, as this very thread shows. For
people who don't program but who have common sense in abundance, the
only response to seeing

i=0; i=++i+ ++i+ ++i;

should be to ask what that notation means. People who know one or two
languages in which that sequence of tokens has a well-defined meaning
would be most rash to assume it has the same meaning in another
language. To make such an assumption is not common sense -- it's the
opposite of common sense. Such people (coming from a C# and Java
background for example) may well have a deceptive expectation, but their
common sense should save them from it.

Try it yourself. What does this do:

f = (+1);

and is any language which does not match your expectation flawed as
a result?

You've obviously got one in mind, perhaps some functional language, where
the above does something different.

The obvious guess is that it sets f to the value 1 (and half-a-dozen
languages back me up). That would make my guess 'reasonable'.

Is there something about that syntax? The "=", "++" and "+" symbols of the
original example were fairly standard, and I would expect the "=", "+" and
"()" of yours to be the same.

(There has to be some set of programming constructions that are understood,
more-or-less, by everybody, otherwise there would be no such thing as
pseudo-code.)

And I didn't say C was flawed in that respect. I can understand exactly why
C doesn't like modifying the same thing several times in the same
expression; it would be a nightmare guaranteeing consistent, predictable
results. So it bans all such things, rather than allow certain simpler forms
to be well-defined.
 

Keith Thompson

Bartc said:
The expression was something like: (-1 < sizeof("test")), and it gave a
result of 0.

The problem is that sizeof("test") is 5u, not 5, which is the cause of the
trouble.
[...]

A quibble: The result of sizeof("test") is (size_t)5 .

5u is specifically of type unsigned int; size_t is *some* unsigned type,
but not necessarily unsigned int. For example, sizeof("test") could
easily be 5ul.
 

Ben Bacarisse

Bartc said:
The expression was something like: (-1 < sizeof("test")), and it gave a
result of 0.

The problem is that sizeof("test") is 5u, not 5, which is the cause of the
trouble.

Yes. That would have been a less... provocative summary of the old
thread.
I don't think that is necessary. There have been plenty of proposals in this
group by people who cannot all have been experts in the language.

I think we are just talking at cross purposes. I did not say that
expert knowledge is necessary, just that making proposals without it is
rash. And, lest anyone assume otherwise, I don't believe all rash
behaviour is wrong or bad.
You've obviously got one in mind, perhaps some functional language, where
the above does something different.

Yes, I did, but the details hardly matter. To understand a program
text you need to know what the symbols are and what they mean. I know
two wildly different meanings for that arrangement of symbols, but I
would not be surprised if there were others.
The obvious guess is that it sets f to the value 1 (and half-a-dozen
languages back me up). That would make my guess 'reasonable'.

Is there something about that syntax? The "=", "++" and "+" symbols of the
original example were fairly standard, and I would expect the "=", "+" and
"()" of yours to be the same.

"Fairly standard" just means you are used to it. There are lots of
languages where = is not assignment. There are many where ++ is just two
unary operators. Brackets mean many things, even in C, let alone in
all the other languages that use them for their own purposes.

The available character set is limited, so symbols get re-used for many
meanings.
(There has to be some set of programming constructions that are understood,
more-or-less, by everybody, otherwise there would be no such thing as
pseudo-code.)

Yes, but i = ++i + ++i + ++i; should never appear in anything that goes
by the name pseudo-code.
And I didn't say C was flawed in that respect. I can understand exactly why
C doesn't like modifying the same thing several times in the same
expression; it would be a nightmare guaranteeing consistent, predictable
results. So it bans all such things, rather than allow certain simpler forms
to be well-defined.

OK, I conflated the two threads. You did consider C flawed for its
definition of mixed type arithmetic, and I just assumed that saying
the result of the above expression "ought to be 6" meant you considered
it a flaw. I accept that you don't.
 

Richard Damon

I've never liked the view that anything can happen with undefined code;
it sounds too much like a cheap pedagogical trick to say that the above
could format your drive.

One issue with C is that when C was first being defined, computer
architectures could have some strange rules. Writing to and then reading
from a given location in sequence could cause weird results, sometimes
totally random numbers, sometimes bus faults. Hardware designers didn't
feel bad about putting restrictions on "silly" combinations of
operations, or making the software have to wait for results to be done.

C, in an effort to be as efficient as possible, chose to make some of
these corner cases "undefined behavior" rather than make the compiler
add tests to the code to detect them. While in the example the double
access looks obvious, in real code it might be better hidden
with something like:


#include <stdio.h>

/* Well-defined only when a and b point to different objects. */
int foo(int *a, int *b) {
    return ++*a + ++*b;
}

int main(void) {
    int i = 0;
    i = foo(&i, &i);    /* a == b here, so the behavior is undefined */
    printf("%d\n", i);
    return 0;
}

Where the code doing the dual access has two pointers, which happen to
be set to point to the same spot elsewhere. Note that the function foo
has well-defined behavior as long as a != b.

Now, it is more common that the processor has logic to detect this sort
of conflict and, rather than "getting it wrong", stalls the
processor a bit so the answer can get there, but that took too much
logic to be worth it in the earlier days.
 

Eric Sosman

One issue with C is that when C was first being defined, computer
architectures could have some strange rules. Writing to and then reading
from a given location in sequence could cause weird results, sometimes
totally random numbers, sometimes bus faults. Hardware designers didn't
feel bad about putting restrictions on "silly" combinations of
operations, or making the software have to wait for results to be done.

C, in an effort to be as efficient as possible, chose to make some of
these corner cases "undefined behavior" rather than make the compiler
add tests to the code to detect them. While in the example the double
access looks obvious, in real code it might be better hidden
with something like:


#include <stdio.h>

/* Well-defined only when a and b point to different objects. */
int foo(int *a, int *b) {
    return ++*a + ++*b;
}

int main(void) {
    int i = 0;
    i = foo(&i, &i);    /* a == b here, so the behavior is undefined */
    printf("%d\n", i);
    return 0;
}

Where the code doing the dual access has two pointers, which happen to
be set to point to the same spot elsewhere. Note that the function foo
has well-defined behavior as long as a != b.

Now, it is more common that the processor has logic to detect this sort
of conflict and, rather than "getting it wrong", stalls the
processor a bit so the answer can get there, but that took too much
logic to be worth it in the earlier days.

Even if every CPU of interest can generate a consistent result
for such a case, it's not a given that two different CPUs -- even
different steppings of "the same" CPU -- will yield the same answer.

Also, let's rewrite your function just a little:

int foo(int *a, int *b) {
return *a == *++b;
}

What "should" the result be when a == b? "Should" *a yield the
original or the incremented value, and why? Or how about

int foo(int *a, int *b, int *c) {
return *a * *++b + *a * *++c;
}

... when a==b and a==c? May the compiler apply the distributive
law and do common sub-expression elimination? Why, or why not,
and what are the implications for optimizers?
 

Richard Damon

Even if every CPU of interest can generate a consistent result
for such a case, it's not a given that two different CPUs -- even
different steppings of "the same" CPU -- will yield the same answer.

Also, let's rewrite your function just a little:

int foo(int *a, int *b) {
return *a == *++b;
}

What "should" the result be when a == b? "Should" *a yield the
original or the incremented value, and why? Or how about

int foo(int *a, int *b, int *c) {
return *a * *++b + *a * *++c;
}

... when a==b and a==c? May the compiler apply the distributive
law and do common sub-expression elimination? Why, or why not,
and what are the implications for optimizers?

If the variation was just based on the unspecified order of operations,
but CPUs would always properly execute it, then I suspect the standard
would have given the multiple modification UNSPECIFIED behavior, not
UNDEFINED behavior; unspecified behavior has far less dire possible
consequences.

One comment on your example

*a == *++b

will yield a different type of UB. ++b can NOT change the value of *a,
as we are changing b, the parameter/local variable, not the location
pointed to. If foo was called with &i as in my example, ++b has a
problem: pointer arithmetic is defined only within an array, and a
scalar counts as an array of one element, so ++b points one past i and
dereferencing it is undefined. If the original had been static i; then
it is possible that i is at the end of the memory segment in which it
lives, and ++b may then generate an address outside the segment bounds,
which might cause a trap on the dereference, and if that doesn't happen,
the value retrieved surely is not what is desired.

Changing the *++ to ++* to get back to the original case: which value *a
retrieves in *a == ++*b when a == b would, if we hold to the ideals of
the C language, be unspecified, so a standard conforming
implementation could return either 1 or 0. Other languages prefer to
reduce undefined/unspecified behavior, and may state that the expression
SHALL be evaluated left to right (for example) and then *a would always
get the previous value, so the expression would always be evaluated as 0.

In the second expression, since C doesn't specify order, the compiler
can apply the distributive law assuming that the overflow behavior of
the machine makes it applicable (for example, if we were dealing with
floats, then it wouldn't apply). A language which tries to define
order of operations would have problems with the second, as the ++*b
changes the meaning of the second *a, so it would need to disallow the
use of the distributive law (one reason why C doesn't specify order of
operations).
 

Michael Press

Kenneth Brody said:
"Because printing '9' is one of an infinite possible results."

Whoever you quote is incoherent, and incorrect
if a charitable reading is given to him.
 
