Fun with casts

Mike Schilling

Consider the following code, which does not compile. What do you think is
wrong with it?

public class Stuff
{
    int i = (int)12;
    int j = (int)+12;
    int k = (int)(+12);
    Integer ii = (Integer)12;
    Integer ij = (Integer)+12;
    Integer ik = (Integer)(+12);
}

There is one illegal line:

Integer ij = (Integer)+12; // illegal

While 12 can be cast to Integer, +12 cannot (nor can -12). Why is this, you
wonder?

AFAICT, it stems from the ambiguity between casts and parenthesized
expressions. Consider

(1) long l = (long)+12;
(2) long ll = (l)+12;

The two have the same shape syntactically, but:

In (1), () indicates a cast and + is unary.
In (2), () parenthesizes an expression and + is binary.
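
To make the difference concrete, here is a minimal sketch that evaluates both forms. The class and method names are made up for illustration; the two expressions are the ones from (1) and (2), with l passed in as 12.

```java
class CastVersusParens {
    // Form (1): the parentheses hold a primitive type, so this is a cast
    // applied to the unary expression +12.
    static long castOfUnaryPlus() {
        return (long)+12;
    }

    // Form (2): the parentheses hold a variable, so this is the binary
    // addition l + 12.
    static long parenthesizedPlus(long l) {
        long ll = (l)+12;
        return ll;
    }

    public static void main(String[] args) {
        System.out.println(castOfUnaryPlus());      // prints 12
        System.out.println(parenthesizedPlus(12L)); // prints 24
    }
}
```

Same token shape, two different meanings, which is why the parser needs some rule to pick one.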

Java parsers seem to disambiguate these using the rule that a possible cast
preceding '+' or '-' should be parsed as a cast only if what's inside the
parentheses is a built-in type. That's easy to apply since it doesn't
require determining all of the types that are in scope at that point: the
set of built-in types is fixed. You can see it being applied in errors
produced by javac:

boolean b = (boolean)+7;
Object o = (Object)+7;

inconvertible types
found : int
required: boolean
boolean b = (boolean)+7;
^

vs.

cannot find symbol
symbol : variable Object
Object o = (Object)+7;

Thus

(Integer)+12

is apparently disallowed because it would break that rule.
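
For what it's worth, there are a few spellings that should get the same value past the compiler. This is only a sketch (class and method names are invented here): the first form is line ik from the original example, the second puts a primitive cast in front of the sign, and the third skips the cast syntax entirely by boxing with Integer.valueOf.

```java
class BoxedCastSpellings {
    // Parenthesizing the operand: what follows the cast no longer starts
    // with '+' or '-', so (Integer) can be read as a cast.
    static Integer parenthesizedOperand() {
        return (Integer)(+12);
    }

    // (int)+12 is a cast to a built-in type, which is allowed before a
    // sign; the outer (Integer) cast then boxes the resulting int.
    static Integer primitiveCastFirst() {
        return (Integer)(int)+12;
    }

    // Explicit boxing avoids cast syntax altogether.
    static Integer explicitBoxing() {
        return Integer.valueOf(-12);
    }

    public static void main(String[] args) {
        System.out.println(parenthesizedOperand()); // prints 12
        System.out.println(primitiveCastFirst());   // prints 12
        System.out.println(explicitBoxing());       // prints -12
    }
}
```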
 
Joshua Cranmer

Mike Schilling wrote:
> Java parsers seem to disambiguate these using the rule that a possible cast
> preceding '+' or '-' should be parsed as a cast only if what's inside the
> parentheses is a built-in type. That's easy to apply since it doesn't
> require determining all of the types that are in scope at that point: the
> set of built-in types is fixed. You can see it being applied in errors
> produced by javac:

Or maybe it's because boolean is a keyword in the lexer, while Integer
and Object et al. are identifiers.

The JLS grammar for cast expressions reads:

CastExpression:
    ( PrimitiveType ) UnaryExpression
    ( ReferenceType ) UnaryExpressionNotPlusMinus

So if your unary expression starts with + or -, the grammar forbids you
from casting it to a reference type, which keeps the parse unambiguous.
Keep in mind that an Identifier is both a ReferenceType and an
ExpressionName, so without that restriction you could build either the
parse tree [1]

AdditiveExpression
    Primary               +    IntegerLiteral
    ( ExpressionName )

or

CastExpression
    ( ReferenceType )    UnaryExpression
                         + IntegerLiteral

Having unambiguous grammars is a good thing. It makes things easier to
parse.
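
A small sketch of why the identifier case really is ambiguous without that restriction: if a variable happens to be called Integer, the very expression that was rejected above should compile, as an addition. The local variable name is a deliberately perverse assumption for illustration, and the class and method names are made up.

```java
class ShadowedTypeName {
    static int plusAfterParens() {
        // Hypothetical local variable that reuses the name of
        // java.lang.Integer; legal, but not something to do in real code.
        int Integer = 30;
        // Parsed as the binary addition Integer + 12, not as a cast of +12.
        return (Integer)+12;
    }

    static int minusAfterParens() {
        int Integer = 30;
        // Likewise a subtraction: Integer - 12.
        return (Integer)-12;
    }

    public static void main(String[] args) {
        System.out.println(plusAfterParens());  // prints 42
        System.out.println(minusAfterParens()); // prints 18
    }
}
```

That fits the "cannot find symbol ... variable Object" message: the parser has already committed to the expression reading and goes looking for a variable.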

[1] I'm ignoring the multiple levels of single children here.
 
