Jeffrey
Hello,
I've found an oddity with HTML/JavaScript that I'm hoping someone on
this list can shed some light on. It came up while I was using the
libxml parser to parse some HTML web pages.
The following page renders oddly:
http://www.cs.washington.edu/homes/jbigham/test/js-test.html
The source of the page is:
<html>
<head>
<script>
function alert_me() {
alert("<script>function foo() { alert("Hello!"); }</script>");
}
</script>
</head>
<body>
The body of the page.
<input type="button" onclick="alert_me();" value="Click me!">
</body>
</html>
It produces:
"); }
The body of the page. [[Button]]
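To make the output above easier to follow, here is a minimal sketch of what I believe is going on, assuming the browser ends the script content at the first "</script" it sees, even inside a string literal. The helper name splitAtFirstScriptEnd is hypothetical, and this is only an approximation of a real tokenizer:

```javascript
// Hypothetical helper: split page source the way a browser-style tokenizer
// appears to, ending the script content at the first "</script".
function splitAtFirstScriptEnd(source) {
  const open = source.search(/<script[^>]*>/i);
  if (open === -1) return null;
  const contentStart = source.indexOf(">", open) + 1;
  const rest = source.slice(contentStart);
  const end = rest.search(/<\/script/i);
  if (end === -1) return null;
  const closeEnd = rest.indexOf(">", end) + 1;
  return {
    script: rest.slice(0, end),      // what gets handed to the JS engine
    leftover: rest.slice(closeEnd),  // what falls back into the HTML
  };
}

// The <head> script from the test page, line for line.
const page = [
  "<script>",
  "function alert_me() {",
  'alert("<script>function foo() { alert("Hello!"); }</script>");',
  "}",
  "</script>",
].join("\n");

const result = splitAtFirstScriptEnd(page);

// Drop the now-stray closing tag and collapse whitespace, as rendering would.
const visible = result.leftover
  .replace(/<\/script>/i, "")
  .replace(/\s+/g, " ")
  .trim();

console.log(visible);
```

Under that assumption, the leftover text is exactly the stray fragment that shows up at the top of the page.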
According to this page, I should expect this behavior, because HTML
end tags are not allowed to appear inside a <script> element:
http://www.htmlhelp.com/tools/validator/problems.html#script
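The usual workaround for that rule, as I understand it, is to write "<\/" instead of "</" inside the script, since JavaScript treats "\/" in a string literal as a plain "/". A minimal sketch, assuming a hypothetical helper named escapeEndTags that is applied to the script text before it is placed in the page:

```javascript
// Hypothetical helper: rewrite every "</" as "<\/" so that no end-tag
// opener appears inside the script content.
function escapeEndTags(jsSource) {
  return jsSource.replace(/<\//g, "<\\/");
}

const unsafe = 'document.write("<div></div>");';
const safe = escapeEndTags(unsafe);

console.log(safe);
```

This is a naive textual replacement over the whole script, so it is only a sketch; it does not distinguish string literals from other parts of the code.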
But my problem is that some very popular websites seem to violate this
rule, which apparently trips up the libxml SAX parser
(http://xmlsoft.org/). For example, Yahoo does this, as this excerpt
from their page shows:
<script language=javascript>
if(typeof(YAHOO)!='undefined') {
document.write('<map name="yodel"><area shape="rect"
coords="209,30,216,39" href="http://www.yahoo.com"
onclick="callYodel();return false;"><area shape="poly"
coords="211,0,222,1,215,26,211,25" href="http://www.yahoo.com"
onclick="callYodel();return false;"></map><div id=l_fl
style="position:absolute"></div>');
var
lr0='http://us.ard.yahoo.com/SIG=12ldjm8...cSkA/Y=YAHOO/EXP=1160765162/A=3912593/R=0/*';
var lcap=0,lncap=0,ad_jsl=0,lnfv=6,ylmap=0;
var ldir="http://us.i1.yimg.com/us.yimg.com/i/mntl/ww/06q3/";
var swfl1=ldir+"yodel.swf";
var swflw=1,swflh=1;
}
....
</script>
The libxml parser treats those end tags as errors, which causes
problems for me when I try to use it to traverse the DOM. Is Yahoo
incorrect? Is libxml interpreting the standard incorrectly? Are they
both somehow correct?
Thanks!
Jeff