The >> token

P

Peter Ammon

As we know, due to C++'s "longest match" rule, the >> token causes
headaches when working with nested templates, e.g.

vector<vector<int>>

will not parse correctly without inserting a space between the two >
signs. Why have a >> token at all? Why not have > be the token, and
handle >> in the grammar as two > tokens?

This would permit code like 3 > > 1, but that seems harmless to me.
 
D

Dave

Peter Ammon said:
As we know, due to C++'s "longest match" rule, the >> token causes
headaches when working with nested templates, e.g.

vector<vector<int>>

will not parse correctly without inserting a space between the two >
signs. Why have a >> token at all? Why not have > be the token, and
handle >> in the grammar as two > tokens?

This would permit code like 3 > > 1, but that seems harmless to me.


Hmmm, then what would you use for "greater than"? If it's "overloaded", you
then end up with context sensitivity issues, which makes grammars much
harder to deal with...
 
A

Andrey Tarasevich

Peter said:
Why have a >> token at all? Why not have > be the token, and
handle >> in the grammar as two > tokens?

That would make the grammar much more complex than it is now. It is not
worth it.
 
R

Rolf Magnus

Dave said:
Hmmm, then what would you use for "greater than"? If it's
"overloaded", you then end up with context sensitivity issues, which
makes grammars much harder to deal with...

There are already context sensitivity issues. That's the reason why you
can't write vector<vector<int>>. The "greater" token already is
"overloaded".
 
M

M. Akkerman

Because it's pretty usefull. I'm sure a desktop programmer won't
bother much with stuff like individual bits but if you're going to
code for a lower level layer (example: device driver) then
manuipulating bits is your only friend.
 
U

Unforgiven

M. Akkerman said:
Because it's pretty usefull. I'm sure a desktop programmer won't
bother much with stuff like individual bits but if you're going to
code for a lower level layer (example: device driver) then
manuipulating bits is your only friend.

I don't believe that that's what the OP meant. He wants to keep bitshift of
course, but wants to have the compilers not treat '>>' as a seperate token,
but instead have the compiler determine by context what two consecutive '>'
tokens mean. If that had been done, it would have been possible to write >>
(without a space) at the end of nested templates because the compiler would
see the two seperate '>' tokens and determine that they can't be a bitshift
in that context so correctly see them as the end of the template
instantiation. Currently, the grammatical analyzer gets a '>>' token from
the lexical analyzer in that situation, and concludes that that token is
invalid in that context.
 
J

Jerry Coffin

[ ... ]
There are already context sensitivity issues. That's the reason why you
can't write vector<vector<int>>. The "greater" token already is
"overloaded".

That's not context sensitivity. Context sensitivity is when your
grammar contains at least one production like:

xA ::= whatever

where an 'A' is recognized as a particular syntactic element ONLY in the
context of an 'x'. Otherwise, it's recognized as some other syntactic
element.

In the case of '<<' or '>>', there's no such thing -- distinguishing
between '>' and '>>' is done entirely at the lexical level, before the
grammar sees either one at all. By the time the parser sees any of
these, the lexer has converted each one to a token. The lexer doesn't
use any context sensitivity either -- it just creates a token out of the
longest sequence of input characters that it can. I.e. it reads in
characters until it encounters one that can't possibly be part of any
token that started with the characters that have already been read. At
that point, it does one of two things: returns the characters its
already read as a token, or else signals an error because what it's read
isn't a token, and the next character in the input can't be part of any
token that could start with the characters that have already been read
either.

There are a few parts of C++ that involve context sensitivity, but
they're mostly there to resolve ambiguities in the grammar proper --
e.g. in some cases, the choice between a declaration and a definition is
context sensitive.
 
R

Rolf Magnus

Jerry said:
[ ... ]
There are already context sensitivity issues. That's the reason why
you can't write vector<vector<int>>. The "greater" token already is
"overloaded".

That's not context sensitivity. Context sensitivity is when your
grammar contains at least one production like:

xA ::= whatever

where an 'A' is recognized as a particular syntactic element ONLY in
the
context of an 'x'. Otherwise, it's recognized as some other syntactic
element.

In the case of '<<' or '>>', there's no such thing --

I was talking about something like 'if (a<3)' vs. 'vector<int>', not
about the '<<' token. Sorry, I should have said that more clearly.
 
J

Jerry Coffin

[ talking about context sensitivity ]
I was talking about something like 'if (a<3)' vs. 'vector<int>', not
about the '<<' token. Sorry, I should have said that more clearly.

That's still not really context sensitivity, at least in the way the
term is normally defined. Basically, the grammar just has something
like (simplifying drastically):

cmpop: '=' | '<' | '>' | '<=' | '>='
;

expression: operand cmpop operand
| lots of other possibilities elided
;

/* ... */
template_instantiation: template_name '<' templ_args '>' name ';'
;

and a given input will only match one of these. There is (usually) a
bit of trickery involved in recognizing whether 'x' is the name of a
template or some other name (e.g. of a variable), and while this means
the parser needs access to the symbol table, it still isn't context
sensitivity in the classic sense.

In the end, none of this is really new or different with C++ though --
in C, the compiler runs into the same kinds of things, such as '&' being
both a unary operator to take an address and a binary operator to do a
bitwise AND. Here again, the parser has to
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,156
Messages
2,570,878
Members
47,404
Latest member
PerryRutt

Latest Threads

Top