John G Harris
<snip>The primary way this works is by having each production continue until
meeting some character not allowed in the production. In this case this
means that the Identifier production will parse straight through the
entire string until encountering whitespace, lineterminator, punctuator
or div.
This is really quite simple...
There! You've said it yourself. The lexical parser takes in as many
characters as it can when parsing the next input element (identifier,
whitespace, etc). It does this not because the programmer thought it was
a good idea, nor because it is a common practice, but because the
language standard says the parser bloody well *must* do so. The standard
uses the now-famous text
"The source text of an ECMAScript program is first converted into a
sequence of input elements, which are tokens, line terminators,
comments, or white space. The source text is scanned from left to right,
repeatedly taking the longest possible sequence of characters as the
next input element."
to say so. This is really even simpler. I hope Evertjan has understood
that this text supplements the syntax specification.
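
For anyone who wants to see the longest-match rule in action, here is a rough sketch of a toy scanner. It is only an illustration under loud assumptions: the names (scan, matchAt, PUNCTUATORS, longestPunctuator) are made up for this post, the punctuator list and identifier pattern are a tiny subset, and comments, strings, numbers and the div/regexp distinction are ignored. It is not the grammar from the standard, just the "take the longest possible sequence" idea.

```javascript
// Toy illustration of the longest-match rule. Hypothetical names throughout;
// this covers only a small subset of the real input elements.
const PUNCTUATORS = ["+", "++", "+=", "-", "--", "=", "==", "===", ";", "(", ")"];

// Sticky (y) patterns match only at the position stored in lastIndex.
const WHITE_SPACE = /[ \t]+/y;
const LINE_TERMINATOR = /\r\n|[\n\r]/y;
const IDENTIFIER = /[A-Za-z_$][A-Za-z0-9_$]*/y;

// Return the text matched by `regex` exactly at `pos`, or "" if none.
function matchAt(regex, src, pos) {
  regex.lastIndex = pos;
  const m = regex.exec(src);
  return m ? m[0] : "";
}

// Return the longest listed punctuator that starts at `pos`, or "" if none.
function longestPunctuator(src, pos) {
  let best = "";
  for (const p of PUNCTUATORS) {
    if (src.startsWith(p, pos) && p.length > best.length) best = p;
  }
  return best;
}

// Scan left to right, repeatedly taking the longest candidate at `pos`.
function scan(src) {
  const elements = [];
  let pos = 0;
  while (pos < src.length) {
    const candidates = [
      { type: "WhiteSpace", text: matchAt(WHITE_SPACE, src, pos) },
      { type: "LineTerminator", text: matchAt(LINE_TERMINATOR, src, pos) },
      { type: "Identifier", text: matchAt(IDENTIFIER, src, pos) },
      { type: "Punctuator", text: longestPunctuator(src, pos) },
    ];
    let best = candidates[0];
    for (const c of candidates) {
      if (c.text.length > best.text.length) best = c;
    }
    if (best.text.length === 0) {
      throw new SyntaxError("Unexpected character at position " + pos);
    }
    elements.push(best);
    pos += best.text.length;
  }
  return elements;
}

// a+++b is split as a, ++, +, b because "++" is the longest match after "a".
console.log(scan("a+++b").map((e) => e.text));
```

Note that the scanner never asks whether a, +, ++, b would have made a nicer program. It just grabs the longest element each time, which is the whole point of the quoted text.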
John