integer literals


Armen Tsirunyan

Hi all.
Consider the following program
#include <iostream>
#include <typeinfo>

template <class T>
void f(T x)
{
    std::cout << typeid(x).name() << std::endl;
}

int main()
{
    f(2000000000);
    f(3000000000u);
    f(3000000000);
}

I am using Microsoft Visual Studio 2008 and on my machine int and long
are both 32 bits.
As far as I understood from the 2003 C++ standard, this program should
print

int
unsigned int
unsigned int

however it prints

int
unsigned int
unsigned long

Is this a bug in MSVC 9.0, or have I misinterpreted the standard?
Also, please note that I am aware that the standard imposes no
considerable requirements on type_info::name(). So please let's not
say "the output is correct since name can be anything".
Also, if this is indeed a bug in MSVC (this is a bit off-topic now), is
there any use in reporting that bug to them? I mean, do they care? :)
Thank you in advance for your comments,
Armen Tsirunyan.
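
Since type_info::name() is implementation-defined, one way to take it out of the picture is to let overload resolution name the type instead. A minimal sketch: type_name is a hypothetical helper that does not appear in the post, and the long long overloads assume a compiler that provides that type (it is not part of C++03).

```cpp
#include <string>

// Hypothetical helper: overload resolution picks the overload that
// exactly matches the argument's type, so the returned name does not
// depend on what type_info::name() happens to produce.
inline std::string type_name(int)                { return "int"; }
inline std::string type_name(unsigned int)       { return "unsigned int"; }
inline std::string type_name(long)               { return "long"; }
inline std::string type_name(unsigned long)      { return "unsigned long"; }
// long long is an extension in C++03 compilers (assumed available here).
inline std::string type_name(long long)          { return "long long"; }
inline std::string type_name(unsigned long long) { return "unsigned long long"; }
```

On a platform with 32-bit int, type_name(2000000000) and type_name(3000000000u) should give "int" and "unsigned int"; what type_name(3000000000) gives is exactly the question asked here.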
 

Kai-Uwe Bux

Armen said:
Hi all.
Consider the following program
#include <iostream>
#include <typeinfo>

template <class T>
void f(T x)
{
    std::cout << typeid(x).name() << std::endl;
}

int main()
{
    f(2000000000);
    f(3000000000u);
    f(3000000000);
}

I am using Microsoft Visual Studio 2008 and on my machine int and long
are both 32 bits.
As far as I understood from the 2003 C++ standard, this program should
print

int
unsigned int
unsigned int

however it prints

int
unsigned int
unsigned long

Is this a bug in MSVC 9.0, or have I misinterpreted the standard?

From the standard [2.13.1/2]

The type of an integer literal depends on its form, value, and suffix. If
it is decimal and has no suffix, it has the first of these types in which
its value can be represented: int, long int; if the value cannot be
represented as a long int, the behavior is undefined. If it is octal or
hexadecimal and has no suffix ...

The last literal seems to be decimal, without suffix, and not representable
as long int. My reading is that the program has UB.
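
The rule quoted above can be written down as a small model, which makes it easy to see why the third call falls off the end of the list when int and long are both 32 bits. This is only a sketch under stated assumptions: the function names are made up for illustration, the widths are passed in rather than taken from the compiler, and two's-complement ranges are assumed.

```cpp
#include <string>

// Largest value of a two's-complement signed type with `bits` bits
// (assumption: 2 <= bits <= 64).
inline unsigned long long signed_max(int bits) {
    return (1ULL << (bits - 1)) - 1;
}

// Hypothetical model of C++03 [2.13.1/2] for a decimal literal with no
// suffix: try int, then long int; anything larger has no type and the
// behavior is undefined.
inline std::string cxx03_decimal_literal_type(unsigned long long value,
                                              int int_bits, int long_bits) {
    if (value <= signed_max(int_bits))  return "int";
    if (value <= signed_max(long_bits)) return "long int";
    return "undefined";
}
```

With int_bits = 32 and long_bits = 32, the value 2000000000 maps to "int" and 3000000000 maps to "undefined"; with a 64-bit long the latter would map to "long int".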

Now, as a matter of implementation quality, I would strongly expect a
compilation error. The standard also gives license for that (one could even
read it as a requirement) [5/5]:

If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable
values for its type, the behavior is undefined, unless such an expression
is a constant expression (5.19), in which case the program is ill-formed.

Also, please note that I am aware that the standard imposes no
considerable requirements on type_info::name(). So please let's not
say "the output is correct since name can be anything".
Also, if this is indeed a bug in MSVC (this is a bit off-topic now), is
there any use in reporting that bug to them? I mean, do they care? :)
Thank you in advance for your comments,
Armen Tsirunyan.


Best

Kai-Uwe Bux
 

Armen Tsirunyan

From the standard [2.13.1/2]

  The type of an integer literal depends on its form, value, and suffix. If
  it is decimal and has no suffix, it has the first of these types in which
  its value can be represented: int, long int; if the value cannot be
  represented as a long int, the behavior is undefined. If it is octal or
  hexadecimal and has no suffix ...

The last literal seems to be decimal, without suffix, and not representable
as long int. My reading is that the program has UB.

Yes, you are right, I just saw the part where it said the first of
int, unsigned int, long, unsigned long, and missed the part which said
that this referred to octal and hexadecimal literals :)
Now, as a matter of implementation quality, I would strongly expect a
compilation error.

How would you expect a compilation error if the standard says it's
undefined behavior?
The standard also gives license for that (one could even
read it as a requirement) [5/5]:

License for what? For treating undefined behavior as a compilation
error?
Is this true or false? "Undefined behavior refers to syntactically
well-formed programs. The behavior of the program is undefined, not
the behavior of the compiler."

Thanks,
Armen Tsirunyan
 

Francesco S. Carta

From the standard [2.13.1/2]

The type of an integer literal depends on its form, value, and suffix. If
it is decimal and has no suffix, it has the first of these types in which
its value can be represented: int, long int; if the value cannot be
represented as a long int, the behavior is undefined. If it is octal or
hexadecimal and has no suffix ...

The last literal seems to be decimal, without suffix, and not representable
as long int. My reading is that the program has UB.

Yes, you are right, I just saw the part where it said the first of
int, unsigned int, long, unsigned long, and missed the part which said
that this referred to octal and hexadecimal literals :)
Now, as a matter of implementation quality, I would strongly expect a
compilation error.

How would you expect a compilation error if the standard says it's
undefined behavior?
The standard also gives license for that (one could even
read it as a requirement) [5/5]:

License for what? For treating undefined behavior as a compilation
error?
Is this true or false? "Undefined behavior refers to syntactically
well-formed programs. The behavior of the program is undefined, not
the behavior of the compiler."

Kai-Uwe referred to this part in particular:
"unless such an expression is a constant expression (5.19), in which
case the program is ill-formed."

Since in your case we have a constant expression which cannot be
represented by a long int, a reading of the standard could lead to the
compiler rejecting your compilation unit as ill-formed. Honestly, I
think this is the only reasonable reading.

Please keep attribution lines in place when quoting the messages you're
replying to.
 

Francesco S. Carta

From the standard [2.13.1/2]

The type of an integer literal depends on its form, value, and suffix. If
it is decimal and has no suffix, it has the first of these types in which
its value can be represented: int, long int; if the value cannot be
represented as a long int, the behavior is undefined. If it is octal or
hexadecimal and has no suffix ...

The last literal seems to be decimal, without suffix, and not representable
as long int. My reading is that the program has UB.


Yes, you are right, I just saw the part where it said the first of
int, unsigned int, long, unsigned long, and missed the part which said
that this referred to octal and hexadecimal literals :)

Now, as a matter of implementation quality, I would strongly expect a
compilation error.

How would you expect a compilation error if the standard says it's
undefined behavior?

The standard also gives license for that (one could even
read it as a requirement) [5/5]:


License for what? For treating undefined behavior as a compilation
error?
Is this true or false? "Undefined behavior refers to syntactically
well-formed programs. The behavior of the program is undefined, not
the behavior of the compiler."

Kai-Uwe referred to this part in particular:
"unless such an expression is a constant expression (5.19), in which
case the program is ill-formed."

Since in your case we have a constant expression which cannot be
represented by a long int, a reading of the standard could lead to the
compiler rejecting your compilation unit as ill-formed. Honestly, I
think this is the only reasonable reading.

Right, in a compiler that strictly enforces C++03's rules. But most
compilers these days have long long and unsigned long long, and apply
the C++0x rules for interpreting integer literals; they're the obvious
analog to the C++03 rules.

My MinGW 4.4.0, despite having long long, which (I think) should be
considered more fitting (as the original literal reported by the OP has
no "u" suffix), interprets it as an unsigned long; furthermore, it
reports a warning saying that "this decimal constant is unsigned only
in ISO C90".

I have no idea about why it is using a C90 rule here.
 

Francesco S. Carta

On 2010-09-26 09:16:30 -0400, Francesco S. Carta said:



From the standard [2.13.1/2]

The type of an integer literal depends on its form, value, and suffix. If
it is decimal and has no suffix, it has the first of these types in which
its value can be represented: int, long int; if the value cannot be
represented as a long int, the behavior is undefined. If it is octal or
hexadecimal and has no suffix ...

The last literal seems to be decimal, without suffix, and not representable
as long int. My reading is that the program has UB.


Yes, you are right, I just saw the part where it said the first of
int, unsigned int, long, unsigned long, and missed the part which said
that this referred to octal and hexadecimal literals :)

Now, as a matter of implementation quality, I would strongly expect a
compilation error.

How would you expect a compilation error if the standard says it's
undefined behavior?

The standard also gives license for that (one could even
read it as a requirement) [5/5]:


License for what? For treating undefined behavior as a compilation
error?
Is this true or false? "Undefined behavior refers to syntactically
well-formed programs. The behavior of the program is undefined, not
the behavior of the compiler."

Kai-Uwe referred to this part in particular:
"unless such an expression is a constant expression (5.19), in which
case the program is ill-formed."

Since in your case we have a constant expression which cannot be
represented by a long int, a reading of the standard could lead to the
compiler rejecting your compilation unit as ill-formed. Honestly, I
think this is the only reasonable reading.


Right, in a compiler that strictly enforces C++03's rules. But most
compilers these days have long long and unsigned long long, and apply
the C++0x rules for interpreting integer literals; they're the obvious
analog to the C++03 rules.

My MinGW 4.4.0, despite having long long, which (I think) should be
considered more fitting (as the original literal reported by the OP
has no "u" suffix), interprets it as an unsigned long; furthermore, it
reports a warning saying that "this decimal constant is unsigned only
in ISO C90".

I have no idea about why it is using a C90 rule here.

Hmm, I don't either. The rule in C++0x is the first that fits from int,
long int, long long int. Well, maybe gcc hasn't implemented the C++0x
rules yet.

That really seems to be so: using compiler flags such as -std=c++0x or
-std=gnu++0x doesn't change anything; that warning still gets raised and
that literal still gets interpreted as an unsigned long.
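
On a compiler that does implement the C++0x rule (first fit among int, long int, long long int), the result can be checked directly instead of being read off typeid. A sketch assuming decltype and <type_traits> are available; literal_type and the two helper functions are names made up for this example.

```cpp
#include <limits>
#include <type_traits>

// The type the compiler actually assigns to the literal from the
// original post.
typedef decltype(3000000000) literal_type;

// True if the compiler chose a signed type, as the C++0x list
// (int, long int, long long int) requires.
inline bool literal_is_signed() {
    return std::is_signed<literal_type>::value;
}

// True if the chosen type can represent the literal's value.
inline bool literal_fits() {
    return static_cast<unsigned long long>(
               std::numeric_limits<literal_type>::max()) >= 3000000000ULL;
}
```

A compiler following the C++0x rule should return true from both functions. With the behaviour described above (the literal treated as unsigned long), literal_is_signed() would presumably return false.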
 

Alf P. Steinbach /Usenet

* Francesco S. Carta, on 26.09.2010 16:40:
That really seems to be so: using compiler flags such as -std=c++0x or
-std=gnu++0x doesn't change anything; that warning still gets raised and that
literal still gets interpreted as an unsigned long.

I can confirm that for MinGW g++ 4.4.1.


Cheers,

- Alf
 

Juha Nieminen

Pete Becker said:
behavior, such as might arise upon use of an erroneous program
construct or erroneous data, for which this International Standard
imposes no requirements. Undefined behavior may also be expected
when this International Standard omits the description of any explicit
definition of behavior. [Note: permissible undefined behavior ranges
from ignoring the situation completely with unpredictable results, to
behaving during translation or program execution in a documented
manner characteristic of the environment (with or without the
issuance of a diagnostic message), to terminating a translation
or execution (with the issuance of a diagnostic message). Many
erroneous program constructs do not engender undefined behavior;
they are required to be diagnosed. -- end note ]

Btw, what's the standard definition of "ill-formed"?
 

Armen Tsirunyan

  Btw, what's the standard definition of "ill-formed"?

The 2003 standard, clause 1.3.14, defines "well-formed program":
A C++ program constructed according to the syntax rules, diagnosable
semantic rules, and the One Definition Rule.

I guess a program is ill-formed if it is not well-formed. Am I right?
 

Kai-Uwe Bux

Juha said:
Pete Becker said:
behavior, such as might arise upon use of an erroneous program
construct or erroneous data, for which this International Standard
imposes no requirements. Undefined behavior may also be expected
when this International Standard omits the description of any
explicit definition of behavior. [Note: permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message). Many erroneous program constructs do not
engender undefined behavior; they are required to be diagnosed.
-- end note ]

Btw, what's the standard definition of "ill-formed"?

Not well-formed :)

From the standard [1.3.4]:

1.3.4 ill-formed program [defns.ill.formed]
input to a C++ implementation that is not a well-formed program (1.3.14).


Best

Kai-Uwe Bux

Just in case you now wonder what a well-formed program is:

1.3.14 well-formed program [defns.well.formed]
a C++ program constructed according to the syntax rules, diagnosable
semantic rules, and the One Definition Rule (3.2).
 

Juha Nieminen

Kai-Uwe Bux said:
Btw, what's the standard definition of "ill-formed"?

Not well-formed :)

From the standard [1.3.4]:

1.3.4 ill-formed program [defns.ill.formed]
input to a C++ implementation that is not a well-formed program (1.3.14).


Best

Kai-Uwe Bux

Just in case you now wonder what a well-formed program is:

1.3.14 well-formed program [defns.well.formed]
a C++ program constructed according to the syntax rules, diagnosable
semantic rules, and the One Definition Rule (3.2).

I seem to remember some constructs which are ill-formed but which would
nevertheless still compile (although I can't remember any concrete examples
right now). That definition would seem to imply that an ill-formed program
cannot even compile.
 

Kai-Uwe Bux

Juha said:
Kai-Uwe Bux said:
Btw, what's the standard definition of "ill-formed"?

Not well-formed :)

From the standard [1.3.4]:

1.3.4 ill-formed program [defns.ill.formed]
input to a C++ implementation that is not a well-formed program
(1.3.14).


Best

Kai-Uwe Bux

Just in case you now wonder what a well-formed program is:

1.3.14 well-formed program [defns.well.formed]
a C++ program constructed according to the syntax rules, diagnosable
semantic rules, and the One Definition Rule (3.2).

I seem to remember some constructs which are ill-formed but which would
nevertheless still compile (although I can't remember any concrete
examples right now). That definition would seem to imply that an
ill-formed program cannot even compile.

It does not imply that: off the top of my head, violations of the One
Definition Rule do not require a diagnostic. Maybe, there are more cases
like this.


Best

Kai-Uwe Bux
 

James Kanze

Armen Tsirunyan wrote:
From the standard [2.13.1/2]
The type of an integer literal depends on its form, value, and suffix. If
it is decimal and has no suffix, it has the first of these types in which
its value can be represented: int, long int; if the value cannot be
represented as a long int, the behavior is undefined. If it is octal or
hexadecimal and has no suffix ...
The last literal seems to be decimal, without suffix, and not representable
as long int. My reading is that the program has UB.
Now, as a matter of implementation quality, I would strongly expect a
compilation error.

As a matter of implementation quality, I would expect the
compiler to implement long long, and use that.

(The intent is clear: in no case should a decimal constant be
interpreted as signed. The existing practice is just as clear:
use unsigned long if long doesn't fit. The compromise is to
make it undefined behavior, and so allow both.)
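
The existing practice mentioned here matches the ISO C90 list for decimal literals without a suffix (int, long int, unsigned long int), which is also what the g++ warning quoted earlier in the thread refers to. A small model of that list, again only a sketch: the function names are hypothetical, the widths are parameters, and two's-complement ranges are assumed.

```cpp
#include <string>

// Largest value of a two's-complement signed type with `bits` bits
// (assumption: 2 <= bits <= 64).
inline unsigned long long max_of_signed(int bits) {
    return (1ULL << (bits - 1)) - 1;
}

// Largest value of an unsigned type with `bits` bits; 64 is handled
// separately because shifting by the full width is not allowed.
inline unsigned long long max_of_unsigned(int bits) {
    return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

// Hypothetical model of the C90 rule: int, long int, unsigned long int.
inline std::string c90_decimal_literal_type(unsigned long long value,
                                            int int_bits, int long_bits) {
    if (value <= max_of_signed(int_bits))    return "int";
    if (value <= max_of_signed(long_bits))   return "long int";
    if (value <= max_of_unsigned(long_bits)) return "unsigned long int";
    return "no type fits";
}
```

With 32-bit int and long, 3000000000 maps to "unsigned long int" under this model, which is consistent with the output reported for MSVC in the original post.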
 

Kai-Uwe Bux

James said:
Armen Tsirunyan wrote:
From the standard [2.13.1/2]
The type of an integer literal depends on its form, value, and suffix.
If it is decimal and has no suffix, it has the first of these types in
which its value can be represented: int, long int; if the value cannot
be represented as a long int, the behavior is undefined. If it is octal
or hexadecimal and has no suffix ...
The last literal seems to be decimal, without suffix, and not
representable as long int. My reading is that the program has UB.
Now, as a matter of implementation quality, I would strongly expect a
compilation error.

As a matter of implementation quality, I would expect the
compiler to implement long long, and use that.

That would be a very bad C++03 compiler. I would expect it to use long long
when invoked as a C++0x compiler. And then, I would expect the compiler to
barf at some higher literals.
(The intent is clear: in no case should a decimal constant be
interpreted as signed.

Huh? is that the reason why f(2000000000) prints int? You mean *un*signed,
right?
The existing practice is just as clear:
use unsigned long if long doesn't fit. The compromise is to
make it undefined behavior, and so allow both.)

But from the point of view of a programmer, undefined behavior is not a
license but a prohibition. It is a license only from the point of view of
the implementor. I agree that an implementor might go for UB and ignore
[5/5] -- I just consider that an indication of laziness on the implementor's
part.


Best

Kai-Uwe Bux
 

James Kanze

[...]
That would be a very bad C++03 compiler. I would expect it to
use long long when invoked as a C++0x compiler. And then,
I would expect the compiler to barf at some higher literals.

That would be a very usable C++ compiler. If invoked in strict
mode, an error is called for, but from a usability point of
view, I would expect support for long long to be the default.
And an option (or combination of options) for strict mode,
except long long. (It's not often that I suggest a default
other than strict conformance. But in this case, it seems more
appropriate.)
Huh? is that the reason why f(2000000000) prints int? You mean
*un*signed, right?

Yes. I miscounted my negations.
But from the point of view of a programmer, undefined behavior
is not a license but a prohibition. It is a license only from
the point of view of the implementor. I agree that an
implementor might go for UB and ignore [5/5] -- I just
consider that an indication of laziness on the implementor's
part.

In this case, I don't think it's laziness so much as doing what
we've always done, to avoid breaking code.
 
