Determining the last statement exercise

C

Csaba Gabor

Suppose you have some javascript statements in a string.
Can you determine whether the entire string is syntactically
valid, and if so, the starting position of the last statement?
In other words,

function lastStatementPos(code) {
// returns the starting position within code of the last
// javascript statement, and -1 if code is not syntactiaclly
// valid.


This came up in a different context today, and I thought
it would make an interesting exercise.

Csaba Gabor from Vienna
 
S

SAM

Le 11/3/09 4:21 PM, Csaba Gabor a écrit :
Suppose you have some javascript statements in a string.

can you give an example of that ?
Can you determine whether the entire string is syntactically
valid, and if so, the starting position of the last statement?
In other words,

function lastStatementPos(code) {
// returns the starting position within code of the last
// javascript statement, and -1 if code is not syntactiaclly
// valid.


This came up in a different context today, and I thought
it would make an interesting exercise.

eval(code); ?
 
E

Evertjan.

Csaba Gabor wrote on 03 nov 2009 in comp.lang.javascript:
Suppose you have some javascript statements in a string.
Can you determine whether the entire string is syntactically
valid, and if so, the starting position of the last statement?
In other words,

function lastStatementPos(code) {
// returns the starting position within code of the last
// javascript statement, and -1 if code is not syntactiaclly
// valid.


This came up in a different context today, and I thought
it would make an interesting exercise.

A string like?

str = "alert(11);{alert('alert(11); is correct Javascript');};"

Would that be usefull?
I don't think it would.
 
R

Richard Cornford

Suppose you have some javascript statements in a string.
Can you determine whether the entire string is syntactically
valid, and if so, the starting position of the last statement?

It would be possible to write a javascript tokeniser/parser with
javascript, and have that determine the syntactic correctness of the
source and expose the contained statements in a way that would allow
you to determine the starting position of the last.
In other words,

function lastStatementPos(code) {
// returns the starting position within code of the last
// javascript statement, and -1 if code is not syntactiaclly
// valid.

This came up in a different context today, and I thought
it would make an interesting exercise.

Maybe a bit too big a task to be considered an 'exercise'. Quickest
results would probably start with something that was already doing
most of the job like Narcissus or JSLint.

<URL: http://mxr.mozilla.org/mozilla/source/js/narcissus/ >

Richard.
 
T

Thomas 'PointedEars' Lahn

Csaba said:
Suppose you have some javascript statements in a string.

Define "javascript".
Can you determine whether the entire string is syntactically
valid,

Depends. Short of writing an ECMAScript parser, which would only be able
to cover specified syntax rules, where available you could try-catch the
SyntaxError that the eval() of the specific implementation would throw if
the code would not conform to its syntax rules. However, AFAICS that can
become quite a nuisance in Konqueror 4.3.2 as syntax errors cannot be
catched there (syntax errors caused Konqueror 3.5.x to crash, therefore I
had to modify the ECMAScript Support Matrix.)
and if so, the starting position of the last statement?

Depends. What is, in your book, considered "the last statement" in

if (x)
y;
else
z;

(we are looking at the following production here:

/IfStatement/ :
if (/Expression/) /Statement/ else /Statement/

see ES3F, 12.5)


PointedEars
 
C

Csaba Gabor

Csaba  Gabor wrote on 03 nov 2009 in comp.lang.javascript:




A string like?

str = "alert(11);{alert('alert(11); is correct Javascript');};"

Would that be usefull?
I don't think it would.

Csaba Gabor wrote on 03 nov 2009 in comp.lang.javascript:




A string like?

str = "alert(11);{alert('alert(11); is correct Javascript');};"

Would that be usefull?
I don't think it would.

You've made a good example, and yes it can be useful.
But I think I better give context to my question
and recast it, especially in light of the fact that my
proposed solution does not work across browsers.

The question arose in the following context: The
user is asked to enter some javascript statements
to affect a value. I want to ensure that all the
statements are syntactically correct, AND to insert
a return before the last statement, if it doesn't
start with return, but does make syntactic sense.

The validation is fairly straightforward:

function syntax_check(code) {
// returns false is code is not syntactically OK
// returns browser's interpretation of the code if it's OK
try {
var f = new Function(code);
return f.toString(); }
catch (err) { return false; } // syntax error
}


On firefox, the returned string is cleaned of all
comments, and is recast into a 'standard form'.
I was thinking that if the penultimate line of
this returned string did not consist of "}",
then a return could be prefixed, if it wasn't
already there. Any exceptions to this?

However, with IE the returned string is not gussied up
into standard form and is left pretty much as is.

So the question, exercise, or problem is: given
a function which will tell you whether or not you
have a syntactically correct javascript, can you
semi reasonably isolate the last statement to
determine whether it should be prefixed with a
return, and to do so in such case.


Some examples:
code = "'Fred'" => return 'Fred';

code = "var i=7; i *= 9"; =>
var i=7;
return i *= 9;

code = "x='word'\nx+='s' // making a plural\n" +
" return x /* pluralizing a word */";
=> no change

Evertjan's code =>
alert(11);
return alert('alert(11); is correct Javascript');

code = '{a = 1; b = 2}' =>
a = 1;
return b = 2;


Two problem cases:
code = "var y=6;\n y*= 2 // difficult; hard case";
code = "var y=6;\n y*= 2; // difficult; hard case";
 
E

Evertjan.

Csaba Gabor wrote on 04 nov 2009 in comp.lang.javascript:
Evertjan's code =>
alert(11);
return alert('alert(11); is correct Javascript');

Which is utter nonsense as

1 alert() does not have a return value

2 return is only sensible in a function
The question arose in the following context:
The user is asked to enter some javascript statements
to affect a value.

Sorry, that is not only not very useful, but it could even be dangerous.
 
T

Thomas 'PointedEars' Lahn

Csaba said:
The validation is fairly straightforward:

AISB, it is not.
function syntax_check(code) {
// returns false is code is not syntactically OK

Based on which syntax rules? ECMAScript's, JavaScript's, JScripts or
others'?
// returns browser's interpretation of the code if it's OK
try {

And if this already constitutes a syntax error, the whole thing breaks.
var f = new Function(code);

And if the Function constructor is not supported, the whole thing breaks.
return f.toString(); }
catch (err) { return false; } // syntax error
}

Does not work reliably. You have failed to realize, among other things,
that syntax is context-sensitive. For example, `return' is allowed in a
function, it is not allowed elsewhere.


PointedEars
 
V

VK

Thomas said:
Based on which syntax rules?  ECMAScript's, JavaScript's, JScripts or
others'?

And if the Function constructor is not supported, the whole thing breaks.

By trying get nasty do not get dorky ;)

To OP: I am wondering if you are looking for a regexp solution
recreating parser rules - or some creative eval or eval-like approach.
Or it's up to respondents to find the most effective way?
 
T

Thomas 'PointedEars' Lahn

VK said:
By trying get nasty do not get dorky ;)

Please shut up until you know what you are talking about (suppose
that is ever going to happen). And lose the bogus smileys.


PointedEars
 
J

John G Harris

code = '{a = 1; b = 2}' =>
a = 1;
return b = 2;
<snip>

b=2 is not the last statement. The last statement is the one that ends
with } .

You do know that a block statement, { ... }, is a statement, don't you ?

John
 
V

VK

Csaba said:
So the question, exercise, or problem is: given
a function which will tell you whether or not you
have a syntactically correct javascript, can you
semi reasonably isolate the last statement to
determine whether it should be prefixed with a
return, and to do so in such case.

No, it is not possible because it would violate Rice's Theorem and
respectively would violate the halting problem which is known to be
undecidable. For the full description see http://en.wikipedia.org/wiki/Rice's_theorem
in a very simplified yet useful form it states that "Any question
about what an arbitrary script does with an arbitrary input is
undecidable, unless it is trivial".

For "semi reasonably" practical use as a helper for beginners it can
be a script that
1) takes the string input
2) gets guarantees that it is a function body with curly brackets
removed
3) gets guarantees that there are not inner functions in the body
( steps 2 and 3 assure that any closing curly bracket is closing an
inner block of statements and nothing else )
4) find the first line break going from the end that followed by non-
white-space chars
5) if it's return whatever then OK, exit, else go to 6)
6) if the previous line contains "return \s*\n" then remove \s*\n, OK,
exit
(often beginners' mistake of
return
myReturnValue;)
else go to 7)
7) if 4) contains "}" the return error, the problem is not decidable
8) if 4) contains something else than change to "return something
else", OK, exit.

That is the physical maximum you can get, the actual coding and tune
up is skipped.
 
C

Csaba Gabor

  <snip>

b=2 is not the last statement. The last statement is the one that ends
with } .

You do know that a block statement, { ... }, is a statement, don't you ?

A rhetorical question?
Nevertheless, since it has come up a few times in this
thread... This exercise is about determining the last
statement and its location (in possibly transformed code),
and the method is more interesting to me than the exact
definition used to get at an answer. What I am saying
is that I would be just as happy with either the original
{a = 1; b = 2} in the example above being returned, or
the b = 2 portion that FF indicates.
 
C

Csaba Gabor

No, it is not possible because it would violate Rice's Theorem and
respectively would violate the halting problem which is known to be
undecidable. For the full description seehttp://en.wikipedia.org/wiki/Rice%27s_theorem
in a very simplified yet useful form it states that "Any question
about what an arbitrary script does with an arbitrary input is
undecidable, unless it is trivial".

An interesting approach VK. I am familiar with Rice's Theorem,
but let me show you that it does not apply, and then perhaps
you will be able to see WHY it does not apply.

Let's consider this same question in the PHP world,
where it is easier to handle. In PHP land, all statements
except the last one must end with ; or }.

Thus, ignoring the final character in the code, find
the prior occurrence of either ; or }. Strip the
string from the next character to the end and syntax
check it. If it doesn't pass, keep looking back for
the next prior occurrence of ; or }.

If instead, the syntax check passes, then the final
character is either the end of the prior statement or
at the end of a // comment. The latter is easily
checked by syntax checking the code with ' x y' affixed.
If at the end of a // comment, keep looking back
for the next prior occurrence of ; or }. Otherwise,
we're done.

So, since I've demonstrated a solution to the problem
(in PHP land), it's clear that Rice's Theorem does
not hold here. My question to you is why not?

Javascript syntax is more complicated, however, and
the question remains whether the approach outlined
above is viable in JS land. In particular, how to
differentiate between (arbitrarily messy versions of):

while(truth)
x+="*"

and

whole(truth)
x+="*"

Csaba
 
J

John G Harris

No, it is not possible because it would violate Rice's Theorem and
respectively would violate the halting problem which is known to be
undecidable. For the full description see http://en.wikipedia.org/wiki/
Rice%27s_theorem
in a very simplified yet useful form it states that "Any question
about what an arbitrary script does with an arbitrary input is
undecidable, unless it is trivial".
<snip>

He isn't trying to check that the code does a particular job, so Rice's
theorem isn't relevant. Nor is he trying to check that the code will
terminate on all inputs, so Turing's halting theorem isn't relevant.

He wants to find the start and end of each statement in the code.
Javascript parsers are able to do that, as is well known.

John
 
J

John G Harris

A rhetorical question?
Nevertheless, since it has come up a few times in this
thread... This exercise is about determining the last
statement and its location (in possibly transformed code),
and the method is more interesting to me than the exact
definition used to get at an answer. What I am saying
is that I would be just as happy with either the original
{a = 1; b = 2} in the example above being returned, or
the b = 2 portion that FF indicates.

The only kind of statement you can put a 'return' in front of is an
expression statement. E.g a=b; becomes return a=b;
So you have to worry about this piece of code :

if ((new Date()).getDay() == 0) a = 23; else a = 42;

Which value do you want to return ?

I can't help thinking that you need to spend more time deciding what you
want, and also whether it's really needed.

John
 
V

VK

Csaba said:
Let's consider this same question in the PHP world,
where it is easier to handle. In PHP land, all statements
except the last one must end with ; or }.

It doesn't matter because Rice's Theorem doesn't depend on a formal
language syntax. The only thing that matters is if it's a general
purpose higher level programming language (skipping for now on exact
formal definitions of these terms). Perl also requires ; after each
statement, so does C++ if I recall properly - and their cases for
Rice's Theorem have been proven 2 and 1 year ago respectively. No one
bothered to make it for PHP yet simply because after the generalized
prove is made it is not so interesting anymore.
Thus, ignoring the final character in the code, find
the prior occurrence of either ; or }. Strip the
string from the next character to the end and syntax
check it. If it doesn't pass, keep looking back for
the next prior occurrence of ; or }.

I gave the possible block schema in my previous answer and
demonstrated that for the most trivial cases it is well possible to
write a "return adder". The only extension I would still suggest is to
add correction for cases like:
return
Oops_ReturnOnTheNextLine;

For a general case finding the right place for the return statement
means to algorithmically decide for an arbitrary program with an
arbitrary input if it has a return point and then to find that
intended return point of it, so, unlike Harris claimed, it is
necessary to prove that the halting problem is decidable and
consecutively to resolve the Entscheidungsproblem. Another option is
to write a program that successfully passes the Turing test so having
all qualities of a free-will human identity.
I guess that either of these three tasks from above is way beyond the
humble frames of clj. The future Nobel laureate should post such
solution in a serious scientific preprint journals right away :)
 
T

Thomas 'PointedEars' Lahn

VK said:
No, it is not possible because it would violate Rice's Theorem and
respectively would violate the halting problem which is known to be
undecidable. For the full description see
http://en.wikipedia.org/wiki/Rice's_theorem in a very simplified yet
useful form it states that "Any question about what an arbitrary script
does with an arbitrary input is undecidable, unless it is trivial".

Neither Rice's theorem nor the halting problem applies here.

If they did, no compiler could throw a compile error, and in particular no
script engine could throw a syntax error.

This is neither about deciding whether the result of an algorithm meets a
particular value (as it suffices to get the value) nor whether this
algorithm holds (as it always holds, either with success or failure). It is
simply about whether an input can be produced by a (context-free) grammar,
and that is a decidable problem.

As usual, you do not know what you are talking about.


PointedEars
 
T

Thomas 'PointedEars' Lahn

Csaba said:
A rhetorical question?

Apparently not.
Nevertheless, since it has come up a few times in this
thread... This exercise is about determining the last
statement and its location (in possibly transformed code),

That would be the /Block/ statement, then.
and the method is more interesting to me than the exact
definition used to get at an answer.

But the exact definition is a prerequisite for finding an answer that you
would consider correct.
What I am saying is that I would be just as happy with either the original
{a = 1; b = 2} in the example above being returned, or the b = 2 portion
that FF indicates.

That would be the last statement *that is executed*, provided
the execution of the statement does not depend on a condition.
As you can see, your oversimplification is not helpful.


PointedEars
 
J

John G Harris

It doesn't matter because Rice's Theorem doesn't depend on a formal
language syntax. The only thing that matters is if it's a general
purpose higher level programming language (skipping for now on exact
formal definitions of these terms).
<snip>

It doesn't have to be a high-level language. It can be a Turing machine,
and you can't get any lower-level than that.

Rice's theorem is about testing a function to see if it meets its
specification. That's not what's wanted here.

For a general case finding the right place for the return statement
means to algorithmically decide for an arbitrary program with an
arbitrary input if it has a return point and then to find that
intended return point of it, so, unlike Harris claimed, it is
necessary to prove that the halting problem is decidable

Read the question. The OP wants to find the last part of the submitted
code and make it a return statement if it's an expression. It doesn't
matter in the least if execution never reaches that last part : he isn't
interested.

and
consecutively to resolve the Entscheidungsproblem.

That's the problem of proving that a well formed formula is/isn't a
theorem, given a logic language and a set of axioms. It can be solved
for some, relatively simple, languages but not in general.

Another option is
to write a program that successfully passes the Turing test so having
all qualities of a free-will human identity.
I guess that either of these three tasks from above is way beyond the
humble frames of clj. The future Nobel laureate should post such
solution in a serious scientific preprint journals right away :)

They don't award Nobel Prizes for mathematics.

John
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,819
Latest member
masterdaster

Latest Threads

Top