Parsing String of Named Function & Converting To Source

Seni Seven

Suppose I have HTML markup with a SCRIPT element as shown below:

<script type="text/javascript">
function multiply(a, b) {
return a * b;
}
var i = multiply(2, 5);
document.write("The value of <i>i</i> is " + i);
</script>

All of this markup and contained script code is retrieved as a string using
Ajax.

I have an HTML parser with support for parsing the string containing the
script code.

In parsing the contained code (using JavaScript, of course), how would you
convert the contained script code from string into source?

What I have found so far (using Firebug in Firefox for development) is this:

1) Isolating the string "function multiply(a, b) { return a * b; }" and then
passing that string as an argument to eval() does not cause the definition
of the function named "multiply," it seems.

2) I can parse the function definition in a way to create the string:

"var multiply = new Function (\"a\", \"b\", \"{ return a * b; }\");"

and then use eval() on it (without error). And then when I eval() the
global level code in a debugger (Firebug in Firefox), like so

eval("var i = multiply(2, 5); document.write(\"The value " +
"of <i>i</i> is \" + i);"); // parsed code as string on one line


the value of identifier 'i' is immediately set to 10 in the debugger, but
the debugger exits with error 'multiply is not defined', which indicates
multiple levels of execution contexts, I suppose. In fact, if I just do an
eval() on the "multiply" assignment of the Function constructor, the
identifier 'multiply' is still not defined.

QUESTION: What's the solution to the goal I want to achieve?

[don't worry about the 'document.write()' call for the moment: that string
pattern gets replaced eventually with a function that injects DOM nodes into
the tree in the same way document.write() would on document loading]


For those who are curious as to why I would want to do this, there is a
basis for this. See BACKGROUND below.



BACKGROUND

I have a modular interactive chemistry calculator running entirely on the
client side.

The "modularity" is such that a main HTML document provides a base interface
and I extend functionality by reading in one JS file which specifies a
document in HTML markup--the markup is valid, but not the document--which is
read in using Ajax. The JS file also contains all the code for doing the
chemistry calculations specific for that module.

The string representing the markup obtained by Ajax retrieval is then passed
to a parsing function which is supposed to do it all: it parses the HTML
markup and uses DOM methods to insert the resulting DocumentFragment into
the main document tree. It also parses CSS (stylesheets and style
attributes), as well as any JavaScript contained inside SCRIPT elements.
Everything becomes part of the interactive document through DOM method calls:

HTML --> element nodes with attributes;
CSS --> stylesheets in LINK element nodes for external stylesheets, and
style property changes on DOM element nodes where style attributes are
identified;
JavaScript --> converting strings into executable source, parsing named
functions from global level source (which might be wrapped in anonymous
functions and called).

So far there has been no problem with the HTML and CSS. It's the Javascript
that remains a barrier now.

Security considerations: all the module HTML documents and JS files are
delivered from the same server, the one from which the main document was
loaded. Inherent Javascript security features against accessing the file
system without user permissions are always in play.
 
Andreas Bergmaier

Seni said:
Suppose I have HTML markup with a SCRIPT element as shown below:

<script type="text/javascript">
function multiply(a, b) {
return a * b;
}
var i = multiply(2, 5);
document.write("The value of<i>i</i> is " + i);
</script>

All of this markup and contained script code is retrieved as a string using
Ajax.

I have an HTML parser with support for parsing the string containing the
script code.

In parsing the contained code (using JavaScript, of course), how would you
convert the contained script code from string into source?

The string representing the markup obtained by Ajax retrieval is then passed
to a parsing function which is supposed to do it all: parses the HTML
markup using DOM methods to insert the DocumentFragment into the main
document tree. It also parses CSS (stylesheets and style attributes), as
well as any Javascript contained inside SCRIPT elements. Everything becomes
part of the interactive document by using DOM method calls, HTML --> element
nodes with attributes; CSS --> as stylesheets in LINK element nodes for
external stylesheets, as style property changes to DOM element nodes where
style attributes are identified; JavaScript --> converting strings into
executable source, parsing named functions from global level source (which
might be wrapped in anonymous functions and called).

Why do you try to build an (x)html parser in javascript (@PE: or one of
its dialects)?
I see two solutions to your problem:
* Use your parser to append the "script" node from the ajaxed fragment
to your document, just like style nodes and everything else. The script
should then be executed in your global environment.
* Why don't you use iframes? You would then need valid documents (instead
of your fragments), but you wouldn't have to build an error-prone parser.
As the fragments load from the same server, you could even copy the
DOM tree from the (hidden) iframe into your document if you need that.

Bergi
 
Lasse Reichstein Nielsen

Seni Seven said:
Suppose I have HTML markup with a SCRIPT element as shown below:

<script type="text/javascript">
function multiply(a, b) {
return a * b;
}
var i = multiply(2, 5);
document.write("The value of <i>i</i> is " + i);

Invalid HTML 4 markup. You should escape the </ inside the string.
</script>

All of this markup and contained script code is retrieved as a string using
Ajax.

So, it's a string with the above content (newlines and all).

I have an HTML parser with support for parsing the string containing the
script code.

Ok.

In parsing the contained code (using JavaScript, of course), how would you
convert the contained script code from string into source?

What's the difference? I assume your HTML parser extracts the content
of the script element as a string.
What do you want to do with that string?

What I have found so far (using Firebug in Firefox for development) is this:

1) Isolating the string "function multiply(a, b) { return a * b; }" and then
passing that string as an argument to eval() does not cause the definition
of the function named "multiply," it seems.

It probably does, but in the scope where the eval is executed, not at
the global scope.
For that, you can use, e.g.,
window.eval(sourceString);
(or any other non-direct call to eval).
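
To illustrate the point about scope, here is a minimal sketch comparing a direct and an indirect call to eval. The helper names loadWithDirectEval and loadWithIndirectEval are made up for this example, and globalThis is assumed to be available (in older browsers, window plays the same role).

```javascript
// Source text as the HTML parser might hand it over (hypothetical variable name).
var sourceString = "function multiply(a, b) { return a * b; }";

function loadWithDirectEval(code) {
  // Direct call: the declaration is created inside this function's scope
  // (or inside eval's own scope in strict mode) and is gone on return.
  eval(code);
}

function loadWithIndirectEval(code) {
  // Indirect call: any call that is not written literally as eval(...).
  // The code is then evaluated as global code.
  var globalEval = eval;
  globalEval(code);
}

loadWithDirectEval(sourceString);
var afterDirect = typeof globalThis.multiply;   // "undefined"

loadWithIndirectEval(sourceString);
var afterIndirect = typeof globalThis.multiply; // "function"
```

This matches the symptom described in the question: the definition does happen with a direct eval(), but only in the scope of whatever function made the call.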

2) I can parse the function definition in a way to create the string:

"var multiply = new Function (\"a\", \"b\", \"{ return a * b; }\");"

How wasteful :) The function syntax is perfectly fine, no need to make
it more convoluted.

and then use eval() on it (without error). And then when I eval() the
global level code in a debugger (Firebug in Firefox), like so

eval("var i = multiply(2, 5); document.write(\"The value " +
"of <i>i</i> is \" + i);"); // parsed code as string on one line


the value of identifier 'i' is immediately set to 10 in the debugger, but
the debugger exits with error 'multiply is not defined', which indicates
multiple levels of execution contexts, I suppose. In fact, if I just do an
eval() on the "multiply" assignment of the Function constructor, the
identifier 'multiply' is still not defined.

QUESTION: What's the solution to the goal I want to achieve?

Try:
var script = document.createElement("script");
script.textContent = sourceString;
document.body.appendChild(script);
This executes the code as top-level, non-eval code (which there is no way
to do in pure Javascript).
No guarantees wrt. old browsers, but it seems to work in the current versions.
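
The three lines above could be wrapped in a small helper along these lines. This is only a sketch: the function name runAsTopLevelScript and the doc parameter are assumptions introduced here, and the fallback to documentElement is a guess at what one might want when no BODY exists yet.

```javascript
// Hypothetical helper: inject source text as a SCRIPT element so that the
// browser runs it as ordinary top-level code.
function runAsTopLevelScript(doc, sourceString) {
  var script = doc.createElement("script");
  script.textContent = sourceString;
  // Appending the element is what triggers execution in a browser.
  (doc.body || doc.documentElement).appendChild(script);
  return script;
}

// Browser-only usage, guarded so the sketch also loads outside a browser.
if (typeof document !== "undefined") {
  runAsTopLevelScript(document, "function multiply(a, b) { return a * b; }");
}
```

Passing the document in as a parameter keeps the helper usable for other documents from the same origin, such as one living in an iframe.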

Alternatively, just use a non-direct call to eval.
var topLevelEval = eval;
topLevelEval(sourceString);

The string representing the markup obtained by Ajax retrieval is then passed
to a parsing function which is supposed to do it all: parses the HTML
markup using DOM methods to insert the DocumentFragment into the main
document tree. It also parses CSS (stylesheets and style attributes), as
well as any Javascript contained inside SCRIPT elements. Everything becomes
part of the interactive document by using DOM method calls, HTML --> element

Why not pass it as something easier to parse and separate into
CSS/HTML/JS, e.g., JSON?
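
A minimal sketch of what such a JSON payload might look like. The property names html, css and js, and the function name moduleMultiply, are assumptions for illustration; nothing in the thread fixes a particular layout. globalThis is assumed to be available.

```javascript
// Hypothetical module payload; in practice this string would be the Ajax
// response text rather than something built locally.
var responseText = JSON.stringify({
  html: "<p id=\"result\"></p>",
  css: "#result { font-weight: bold; }",
  js: "function moduleMultiply(a, b) { return a * b; }"
});

// One JSON.parse call separates the three parts; no HTML parsing is needed
// to find the script code.
var moduleParts = JSON.parse(responseText);

// Run the script part as global code via an indirect call to eval.
var globalEval = eval;
globalEval(moduleParts.js);

var moduleResult = globalThis.moduleMultiply(2, 5); // 10
```

The html and css parts would then go to whatever DOM insertion code already handles them.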

/L
 
Thomas 'PointedEars' Lahn

Andreas said:
Why do you try to build an (x)html parser

Because it is an interesting problem to solve; `innerHTML' is proprietary,
unreliable and undocumented by comparison. I would probably skip the XHTML
part, though, because there's DOMParser.

in javascript (@PE: or one of its dialects)?

There is no "javascript", so there are no "javascript dialects" :)

I might comment on the rest later.


PointedEars
 
Seni Seven

Invalid HTML 4 markup. You should escape the </ inside the string.

Yes, my parser actually stopped on the "</i>" etago markup, of all things.

And yes, the HTML 4.01 recommendation specifies this is invalid.

Then I read one article (I believe there are others) on this subject of
CDATA content inside a SCRIPT element. It asserted that (1) it was nonsense
to have this in the recommendation and (2) no browser implementation
actually followed this part of the recommendation: that is, all textual
content between the start and end tag of a SCRIPT element was effectively
not regarded as part of the HTML markup. So I made my parser ignore all
content between the start and end tag of a SCRIPT element and process it
as script code.

But even so, I admit that I should backslash-escape the string pattern "</"
appropriately to be in compliance with the recommendation.
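
A short sketch of that escape, using the string from the original example (the variable names are made up here):

```javascript
var i = 10;

// "<\/" is read by the JavaScript engine as "</", but an HTML parser
// scanning the SCRIPT content no longer sees an end-tag open delimiter.
var escaped = "The value of <i>i<\/i> is " + i;

// The same text built without the escape, split so this line is also safe.
var plain = "The value of <i>i<" + "/i> is " + i;

var sameText = (escaped === plain); // true
```

Either form yields the same string at run time; the only difference is what the HTML parser sees before the script is executed.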
So, it's a string with the above content (newlines and all).

Ajax effectively recovers data that become strings if handled by
ECMAScript/JavaScript, doesn't it?

What's the difference? I assume your HTML parser extracts the content
of the script element as a string.
What do you want to do with that string?

Well, normally, all script code, whether in SCRIPT elements in HTML documents
or loaded from external JS files, gets processed during HTML document loading
in the browser. This is before the document.close() call has effectively been
invoked and completed, right?

But what do you do if you bring in ECMAScript code (as strings via Ajax)
AFTER the document.close() call has been made? You want to make function
definitions (named functions) part of the executable code in the HTML
document (object?), just as if it were made during document loading. As for
script code outside of named function definition scope, you want to parse
and execute immediately, in order of retrieval via Ajax.
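
One way to picture this requirement is a loop that runs each retrieved chunk as global code, in order. The sketch below is an assumption about how that could be done, not a description of the actual parser: the names runScriptsInOrder, chemMultiply and chemResult are invented, and it relies on an indirect call to eval and on globalThis being available.

```javascript
// Hypothetical helper: evaluate each source string as global code, in the
// order given, and collect any errors instead of stopping at the first one.
function runScriptsInOrder(sources) {
  var globalEval = eval; // indirect call, so declarations become global
  var errors = [];
  for (var k = 0; k < sources.length; k++) {
    try {
      globalEval(sources[k]);
    } catch (e) {
      errors.push({ index: k, error: e });
    }
  }
  return errors;
}

var chunkErrors = runScriptsInOrder([
  "function chemMultiply(a, b) { return a * b; }",
  "var chemResult = chemMultiply(2, 5);"
]);
// globalThis.chemResult is now 10 and chunkErrors is empty.
```

Because the second chunk depends on the first, the order of the array matters in the same way the order of SCRIPT elements matters during normal document loading.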

It probably does, but in the scope where the eval is executed, not at
the global scope.
For that, you can use, e.g.,
window.eval(sourceString);
(or any other non-direct call to eval).

Okay.


How wasteful :) The function syntax is perfectly fine, no need to make
it more convoluted.


Try:
var script = document.createElement("script");
script.textContent = sourceString;
document.body.appendChild(script);
This executes the code as top-level, non-eval code (which there is no
way to do in pure Javascript).
No guarantees wrt. old browsers, but it seems to work in the current
versions.

Yes, tried, and found it works.

Alternatively, just use a non-direct call to eval.
var topLevelEval = eval;
topLevelEval(sourceString);

Yes, tried, and found it works.

I am looking at search results (http://is.gd/jwhNln) to understand why
indirect calls to eval() work.

Why not pass it as something easier to parse and separate into
CSS/HTML/JS, e.g., JSON?

I will look into this most certainly and try to make it workable.


The suggestion by poster "Andreas Bergmaier" to use iframes probably is
worthy of exploring. Perhaps I made this overly complicated.
 
Dr J R Stockton

I've done that, or something rather similar, in
<http://www.merlyn.demon.co.uk/js-randm.htm> - the code for inserting
Johannes Baagoe's (or other) functions. IIRC, it was necessary to be
slightly subtle about the argument to 'eval', perhaps such as something
like prepending " XXX = " to the function - see the source code of the
page.
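
As a rough illustration of the "prepend an assignment" idea, and not a copy of what the cited page does, the sketch below uses a hypothetical variable named loaded in the role of XXX.

```javascript
var sourceString = "function multiply(a, b) { return a * b; }";

// Hypothetical holder variable, standing in for XXX.
var loaded;

// With "loaded = " in front, the text is no longer a function declaration
// but a function expression whose value is assigned to the holder.
eval("loaded = " + sourceString);

var product = loaded(2, 5); // 10
```

The function is then reachable through the holder regardless of which scope the name multiply itself would have ended up in.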


Try:
var script = document.createElement("script");
script.textContent = sourceString;
document.body.appendChild(script);
This executes the code as top-level, non-eval code (which there is no way
to do in pure Javascript).
No guarantees wrt. old browsers, but it seems to work in the current versions.

Alternatively, just use a non-direct call to eval.
var topLevelEval = eval;
topLevelEval(sourceString);

Would any of your contribution give a worthwhile improvement in the
cited js-randm.htm, given that the latter works?
 
