Why this works?

alexstrf · Dec 20, 2012

I have this:

#include <iostream>

int& foo() {}

int main(void)
{
    foo() = 1 + 1;

    std::cout << foo() << std::endl;

    return 0;
}

It compiles with gcc, and when executed
the output is "2".

What the hell is this, and why does it work?

Alex

Victor Bazarov · Dec 20, 2012

I have this:

#include <iostream>

int& foo() {}

int main(void)
{
    foo() = 1 + 1;

    std::cout << foo() << std::endl;

    return 0;
}

It compiles with gcc, and when executed
the output is "2".

What the hell is this, and why does it work?

You need to ask folks who implemented gcc. The Standard doesn't specify
what should happen when a function that's supposed to return something
does not return it. It's what's known as "undefined behavior" (UB).
Read up on it and try to write code that doesn't have it, and at the
least does not depend on it.

V
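
For comparison, here is a minimal sketch of a well-defined variant of the function in question. It assumes the intent was for foo() to return a reference to an int that outlives the call; the function-local static named `value` is an assumption for illustration, not something from the original post.

```cpp
#include <cassert>

// Returns a reference to an object that outlives the call, so assigning
// through the reference and reading it back are both well defined.
// The function-local static `value` is a hypothetical stand-in for whatever
// storage the original foo() was meant to expose.
int& foo() {
    static int value = 0;
    return value;  // every path ends in a return with an expression
}
```

With a definition along these lines, `foo() = 1 + 1;` followed by `std::cout << foo()` prints 2 because the language guarantees it, not because of what happened to be left in a register.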

Rui Maciel · Dec 20, 2012

I have this:

#include <iostream>

int& foo() {}

int main(void)
{
    foo() = 1 + 1;

    std::cout << foo() << std::endl;

    return 0;
}

It compiles with gcc, and when executed
the output is "2".

What the hell is this, and why does it work?

In C++, flowing off the end of a function is equivalent to a return with no
value, and a return statement without an expression is only permitted in
functions that don't return a value. Therefore, although the standard
states that this behavior is undefined, it contradicts other behavior
defined in the standard. Therefore, if it isn't treated as a bug, it should be.

Rui Maciel

Rui Maciel · Dec 20, 2012

Rui said:
In C++, flowing off the end of a function is equivalent to a return with
no value, and a return statement without an expression is only permitted
in functions that don't return a value. Therefore, although the
standard states that this behavior is undefined, it contradicts other
behavior defined in the standard. Therefore, if it isn't treated as a bug,
it should be.

I've submitted a bug report, which is available at:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55767

Rui Maciel

Rui Maciel · Dec 20, 2012

Victor said:
You need to ask folks who implemented gcc. The Standard doesn't specify
what should happen when a function that's supposed to return something
does not return it. It's what's known as "undefined behavior" (UB).

The standard actually states that a return statement without an expression
can only be used in functions that do not return a value, which is a clear
indication that going against this behavior constitutes a problem.

The only problem that is present in the standard is that a loophole was
explicitly added for this specific case, in which the standard specifies
that when this happens then essentially they let each implementation do what
they wish. Nevertheless, among the list of "permissible undefined
behavior", they also include the possibility of a compiler throwing an
error. So, in spite of the loophole, there is no excuse for doing the right
thing.

Rui Maciel

Victor Bazarov · Dec 20, 2012

The standard actually states that a return statement without an expression
can only be used in functions that do not return a value, which is a clear
indication that going against this behavior constitutes a problem.

The Standard actually states <<Flowing off the end of a function is
equivalent to a return with no value; this results in undefined behavior
in a value-returning function.>> ([stmt.return]/2, last sentence).

What's against what here? Specifying a 'return' without an expression
is prohibited in a value-returning function. But it doesn't say that
absence is "specifying". It says "equivalent", and it most likely means
"equivalent behaviorally" not "equivalent syntactically".

The only problem that is present in the standard is that a loophole was
explicitly added for this specific case, in which the standard specifies
that when this happens then essentially they let each implementation do what
they wish. Nevertheless, among the list of "permissible undefined
behavior", they also include the possibility of a compiler throwing an
error. So, in spite of the loophole, there is no excuse for doing the right
thing.

Did you mean "no excuse for NOT doing the right thing"?

V

Rui Maciel · Dec 21, 2012

Victor said:
The standard actually states that a return statement without an
expression can only be used in functions that do not return a value,
which is a clear indication that going against this behavior constitutes
a problem.

The Standard actually states <<Flowing off the end of a function is
equivalent to a return with no value; this results in undefined behavior
in a value-returning function.>> ([stmt.return]/2, last sentence).

Yes, that was the undefined behavior loophole I mentioned earlier.

What's against what here? Specifying a 'return' without an expression
is prohibited in a value-returning function. But it doesn't say that
absence is "specifying". It says "equivalent", and it most likely means
"equivalent behaviorally" not "equivalent syntactically".

The standard states that the behavior is equivalent, without adding any
qualifier to narrow down the sense in which it should and should not be
equivalent. Hence, the standard only states that flowing off the end of a
function corresponds to the same language construct as a return statement
without an expression, which is not a valid construct according to the
standard if it occurs in a function that returns a value.

The only thing that makes flowing off a function not an error in C++ is the
undefined behavior loophole, whose only effect is to grant implementations
some leeway to deal however they wish with this, which often ends up in
disaster, as we've seen in this case.

Did you mean "no excuse for NOT doing the right thing"?

Yes, oops.

Rui Maciel

Richard Damon · Dec 21, 2012

In C++, flowing off the end of a function is equivalent to a return with no
value, and a return statement without an expression is only permitted in
functions that don't return a value. Therefore, although the standard
states that this behavior is undefined, it contradicts other behavior
defined in the standard. Therefore, if it isn't treated as a bug, it should be.

Rui Maciel

The issue is that, in general, the compiler cannot always know whether a
function can flow off its end, as that requires solving the halting
problem, and the compiler may be missing key information to even try.

For example, should this be required to generate a diagnostic?

int foo() {
    bar();
}

What if bar always calls exit(), or throws?

Because of this problem, the standard would have to require a diagnostic
for any "possible" path that flows off the end, and the committee probably
felt it wasn't worth trying to define a minimal set of cases that are
certain to do so (because that is actually fairly hard to do at the level
of rigor required by the standard).

Now it does turn out that many cases, like the one presented originally,
can often be detected by the compiler, and thus a "good" compiler will
issue a warning where it can detect that this is the likely case.

Paul Rubin · Dec 21, 2012

Richard Damon said:
int foo() {
    bar();
}
What if bar always calls exit(), or throws?

Then foo should have type void rather than int.

Paul Rubin · Dec 21, 2012

Robert Wessel said:
int foo() {
    if (somecondition)
        return 7;
    else
        bar(); // never returns
}

I'd want the compiler to signal a type error here, unless the type of
bar() is int. I don't know what the standard says about the situation
though.

Rui Maciel · Dec 21, 2012

Richard said:
The issue is that, in general, the compiler cannot always know whether a
function can flow off its end, as that requires solving the halting
problem, and the compiler may be missing key information to even try.

For example, should this be required to generate a diagnostic?

int foo() {
    bar();
}

What if bar always calls exit(), or throws?

The compiler can ignore scenarios such as calls to exit(), throwing
exceptions, or invoking functions that do not return, and check which code
path doesn't include a properly formatted return statement. It's better if
the compiler requires unnecessary return statements than it is to support
invalid constructs such as returning uninitialized references of objects
that don't exist.

Because of this problem, the standard would have to require a diagnostic
for any "possible" path that flows off the end, and the committee probably
felt it wasn't worth trying to define a minimal set of cases that are
certain to do so (because that is actually fairly hard to do at the level
of rigor required by the standard).

Now it does turn out that many cases, like the one presented originally,
can often be detected by the compiler, and thus a "good" compiler will
issue a warning where it can detect that this is the likely case.

GCC does detect this problem if the code is compiled with the
-Werror=return-type flag.

Rui Maciel
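
The same check can also be switched on from inside a source file. The following is a sketch only, assuming GCC or a compatible compiler such as Clang, both of which document `#pragma GCC diagnostic`. The names broken() and counter() are made up for illustration.

```cpp
#include <cassert>

// Promote the missing-return warning to a hard error for the rest of this
// translation unit. Command-line equivalent: g++ -Werror=return-type file.cpp
#pragma GCC diagnostic error "-Wreturn-type"

// Rejected once the pragma is active, because control can reach the closing
// brace of a value-returning function:
//
//     int& broken() {}

// Accepted: the only path returns an lvalue that outlives the call.
int& counter() {
    static int value = 0;  // hypothetical storage, for illustration only
    return value;
}
```

Other compilers may ignore the pragma or warn that it is unknown, so this does not replace setting the flag in the build system.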

Rui Maciel · Dec 21, 2012

Robert said:
Which doesn't help:

int foo() {
    if (somecondition)
        return 7;
    else
        bar(); // never returns
}

There is absolutely no problem if a compiler requires a dummy return
statement after bar(), or at the end of foo().

Rui Maciel

Victor Bazarov · Dec 21, 2012

Which doesn't help:

int foo() {
    if (somecondition)
        return 7;
    else
        bar(); // never returns
}

Rewrite it

int foo() {
    if (!somecondition)
        bar();
    return 7;
}

and you have no such problem.

V

Paul Rubin · Dec 21, 2012

Victor Bazarov said:
Rewrite it
int foo() {
    if (!somecondition)
        bar();
    return 7;
}
and you have no such problem.

But that changes the semantics, if the compiler thinks bar returns an
int. If bar is declared int and never returns, that's fine, functions
in C++ aren't required to be total. However, modularity suggests that
if it's declared int, at some point it may change to actually return an
int. At that point the old code would have returned bar()'s value while
the new code returns 7.

If bar is supposed to never return, it would be good if there were a
type saying so (I think Ada has something like that) but the closest
thing available is void. In that case the caller should return some
appropriate int, which might not be 7.
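
C++11 offers something close to this, though as an attribute rather than a type: a function declared [[noreturn]] promises never to return to its caller. The sketch below applies it to Robert Wessel's example. The body of bar() and the choice to pass somecondition as a parameter are assumptions made only so the example is complete.

```cpp
#include <cassert>
#include <stdexcept>

// [[noreturn]] (C++11) declares that bar() never returns normally.
// Throwing is one hypothetical way to honor that promise; calling
// std::exit() or std::abort() would be another.
[[noreturn]] void bar() {
    throw std::runtime_error("bar() never returns");
}

int foo(bool somecondition) {
    if (somecondition)
        return 7;
    else
        bar();  // control cannot continue past this call
}
```

Here foo(true) returns 7 and foo(false) leaves via the exception, so no path flows off the end. If a [[noreturn]] function does return, the behavior is undefined, so the attribute is a promise by the programmer rather than something the language enforces.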

Victor Bazarov · Dec 21, 2012

But that changes the semantics, if the compiler thinks bar returns an
int.

I don't understand. Please elaborate.

If bar is declared int and never returns, that's fine, functions
in C++ aren't required to be total.

What's the meaning of "total" here?

However, modularity suggests that
if it's declared int, at some point it may change to actually return an
int.

So? It's like calling 'printf' or 'scanf' and ignoring its return
value. If I ignore it, what difference does it make whether the return
type is 'int' or anything else?

At that point the old code would have returned bar()'s value while
the new code returns 7.

The old code (that you have snipped here) did NOT return the value
returned by 'bar()'. Go check it.

If bar is supposed to never return, it would be good if there were a
type saying so (I think Ada has something like that) but the closest
thing available is void. In that case the caller should return some
appropriate int, which might not be 7.

Again, I am not sure I understand what your point here is. Please look
at the code that Robert Wessel posted and review your reasoning, and
perhaps elaborate a bit. Thanks!

V

Richard Damon · Dec 22, 2012

The compiler can ignore scenarios such as calls to exit(), throwing
exceptions, or invoking functions that do not return, and check which code
path doesn't include a properly formatted return statement. It's better if
the compiler requires unnecessary return statements than it is to support
invalid constructs such as returning uninitialized references of objects
that don't exist.

I would disagree here. If the standard mandated a diagnostic, then to
avoid the diagnostic, and the possibility that the compiler might not
generate a program at all, the programmer would have to add a return
with a value even for paths that he knows can never be executed (and
which some helpful compiler may even generate a warning about). If
the function returns a class object, especially one without a
default constructor, this might take a non-trivial amount of code.

The philosophy of the language has always been that, in general, things
become constraint violations that require a diagnostic only when they
are easily detectable and clearly wrong; things are left as undefined
behavior if they are hard to diagnose in general. The designers of the
language do not make things a constraint violation just because they
"might" be bad. Implementations are allowed, and in general encouraged,
to generate additional diagnostics (often called warnings) for things
that are questionable, or that are "undefined behavior" which is hard to
detect in general but which the compiler can detect in the case at hand.
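
A minimal sketch of the kind of function this argument is about, using hypothetical names: Widget has no default constructor, and make_widget() ends in a throw rather than a return. A throw-expression never completes normally, so control cannot reach the closing brace, and no dummy object has to be built just to satisfy a return-on-every-path rule.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical type without a default constructor.
struct Widget {
    explicit Widget(int n) : size(n) {}
    int size;
};

Widget make_widget(int n) {
    if (n > 0)
        return Widget(n);

    // Ends the only remaining path; no Widget is constructed here.
    throw std::invalid_argument("size must be positive");
}
```

Calling make_widget(3) yields a Widget whose size is 3, while make_widget(0) throws std::invalid_argument.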

Paul Rubin · Dec 28, 2012

Juha Nieminen said:
SomeType foo() ...
If constructing an object of 'SomeType' is extremely complicated,
having to construct a dummy object that will never be executed would
be completely unnecessary.

If it will never be executed, why is the type SomeType instead of void?

Öö Tiib · Dec 28, 2012

Being forced to construct such a
complicated object can in some cases be completely unnecessary, for
example in situations like

SomeType foo()
{
    if(something)
        someCodeHere;

    throw("this should never happen!");
}

That is the optimistic "if things are good then we rock" style that
leaves the bad things to the end. Typical branch prediction in processors,
however, is also optimistic and hopes that a branch won't be taken. So
the pessimistic "eventually all things go wrong" style might be better:

SomeType foo()
{
    if (!someThing)
        throw "there must be always something!";

    switch (someOtherThing)
    {
    default:          // always have default
    case BadValue:    // problems first
        throw "bad or unhandled case";

    case GoodValue1:
        // ... react to GoodValue1
        break;

    case GoodValue2:
        // ... work with GoodValue2
        break;
        // ^ so maintainer can't forget to add break
        // <- when he adds new case here
    }

    // ... rest of the code here
    // very last thing is return of good result
}

As a bonus, that style also suppresses most of the warnings whose
presence (or absence) this thread is about.

Rui Maciel · Dec 28, 2012

Juha said:
AFAIK C++ inherits this optionality of a return value from C. I'm guessing
that it exists in C because in the 70's they didn't want to force the
programmer to increase the size of functions by even one byte if they
absolutely didn't have to.

Nowadays increasing the size of a function by a few bytes due to an
extraneous 'return' might not be a problem, but the thing is, there
might be cases where the 'return' actually takes a lot more than just
a few bytes.

The return value might in fact be very large, like several kilobytes.
*Creating* the return value might be very complicated or laborious
(such as it being an object with a mandatory constructor that takes
hundreds of mandatory parameters.) Being forced to construct such a
complicated object can in some cases be completely unnecessary, for
example in situations like

SomeType foo()
{
    if(something)
        someCodeHere;

    throw("this should never happen!");
}

If constructing an object of 'SomeType' is extremely complicated,
having to construct a dummy object that will never be executed would
be completely unnecessary.

I don't understand what you are trying to say. This issue is about a bug
caused by the way a compiler accepts an erroneous construct, and in the
message you replied to I suggested that a solution for this problem would be
to interpret that type of erroneous construct as an error, even in the
corner case where that code isn't actually executed. If we are dealing with
a code path which will actually never be executed then it doesn't make any
sense to worry about what that code might do, or how expensive it might be.

In addition, the problem you described has nothing to do with the flowing
off problem. If someone defined a function in a way that is needlessly
inefficient and expensive then the problem lies in the way that function was
defined, not in how a compiler doesn't accept erroneous constructs.

Rui Maciel

James Kanze · Dec 28, 2012

Hmm, it looks like I misread the discussion upthread, about what's
supposed to happen if a function flows off the end without returning a
value. OK, it's undefined, so anything can happen if bar gets called.
Still, with some compilers like KCC[1], the program doing something
undefined is supposed to guarantee an error or crash (so you know
there's a bug to fix) and in that case the semantics have changed.
IIRC, there's a GCC extension (for C but maybe not for C++) that flowing
off the end returns the last value computed in the block.

And what happens if the last computed value has a totally
unrelated type?

For simple types, like `int`, most compilers will return the
last computed value. Not intentionally, but because they return
the value in a register, and the last computed value often
happens to be in that register.

There's a very simple solution for compiler writers. If the
function returns a value, just insert an illegal instruction or
something similar where the code might fall off the end. It
doesn't cost much, just one extra byte. And it has no impact
unless the code actually does fall off the end.
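
A programmer can approximate the same effect by hand. The following is a sketch under stated assumptions: the enum PixelFormat and the function channels() are hypothetical, and std::abort() stands in for the illegal instruction. GCC and Clang also provide __builtin_trap() for this purpose.

```cpp
#include <cassert>
#include <cstdlib>

enum class PixelFormat { Gray, Rgb };  // hypothetical example type

int channels(PixelFormat format) {
    switch (format) {
        case PixelFormat::Gray: return 1;
        case PixelFormat::Rgb:  return 3;
    }
    // Reached only if format holds a value outside the enumerators.
    // Aborting turns the would-be undefined behavior into a reliable crash.
    std::abort();
}
```

The extra call costs nothing on the normal paths, which matches the point above: it has no impact unless the code actually does fall off the end.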
