Various DOM-related wrappers (Code Worth Recommending Project)

Richard Cornford

David said:
On Dec 8, 8:32 pm, Peter Michaux wrote:

A lot of the extra code is for the IE5 problem with the
"*" parameter (gEBTN.)

One of the advantages with a multiple implementations approach would be
that it can take advantage of the fact that using the "*" parameter is
itself a bit of a novelty. Certainly the vast bulk of DOM scripts that
use - getElementsByTagName - are not using it to get a NodeList of all
of the elements in the DOM.

With proper documentation it should be feasible to present a simple
wrapper version that states that using "*" would be problematic if IE5
was among the receiving browsers, and so allow those who either knew
that they would not use "*" (as most would not), or that they had an
Intranet context where IE 5 was excluded, to employ the faster/simpler
version.

Then the more complex version becomes a "if this caveat to the simple
version is likely to be an issue use this other version instead".
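
A minimal sketch of how the two documented versions might sit side by side. The names getEBTNSimple and getEBTNStarSafe are hypothetical, and the fallback to the proprietary "all" collection is an assumption about how the IE5 case could be handled, not the project's actual code:

```javascript
// Simple version. Documented caveat: '*' is unreliable in IE5, so callers
// must not pass it if IE5 is among the receiving browsers.
function getEBTNSimple(tag, root) {
  return root.getElementsByTagName(tag);
}

// More complex version (hypothetical): when '*' yields an empty list and
// the proprietary `all` collection exists, use that collection instead.
function getEBTNStarSafe(tag, root) {
  var list = root.getElementsByTagName(tag);
  if (tag == '*' && !list.length && root.all) {
    return root.all;
  }
  return list;
}
```

A real "*"-safe version would probably also need to filter non-element nodes out of the fallback collection.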

Richard.
 
David Mark

if (typeof getEBI != 'undefined' &&
typeof getEBTN != 'undefined') {
Looking at the above I'm a little puzzled why I wrote it that way.

Probably because you were testing for variables created in the same
script. But of course, it may not be created at all and it would be
possible for another script to define them as something other than
functions.


It seems the following would be a better test
if (typeof getEBI == 'function' &&
typeof getEBTN == 'function') {

This is more robust.

I should add that it will make no difference if a build process wraps
the functions in an anonymous function and then populates object
methods. I assume that will be the standard approach as otherwise the
global namespace will be polluted.
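
A rough sketch of the difference between the two tests, plus the kind of wrapping a build process might do. The API object and the helper names bothDefined and bothFunctions are hypothetical, introduced only for illustration:

```javascript
// Loose check: passes for anything that is not undefined, including
// values another script may have assigned that are not functions.
function bothDefined(a, b) {
  return typeof a != 'undefined' && typeof b != 'undefined';
}

// Stricter check: passes only when both values report as functions.
function bothFunctions(a, b) {
  return typeof a == 'function' && typeof b == 'function';
}

// Hypothetical build output: everything lives inside one anonymous
// function and only object methods are exposed.
var API = (function () {
  var api = {};
  function getEBI(id, d) { return d.getElementById(id); }
  function getEBTN(tag, d) { return d.getElementsByTagName(tag); }
  if (bothFunctions(getEBI, getEBTN)) {
    api.getEBI = getEBI;
    api.getEBTN = getEBTN;
  }
  return api;
})();
```

Inside the wrapper the functions are local, so another script cannot redefine them and the two tests behave the same.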
 
Peter Michaux

[snip]
if (typeof getEBI != 'undefined' &&
typeof getEBTN != 'undefined') {
Looking at the above I'm a little puzzled why I wrote it that way.
Probably because you were testing for variables created in the same
script. But of course, it may not be created at all and it would be
possible for another script to define them as something other than
functions.

I don't think I was being that clever...but thanks.


I've changed to this.

I should add that it will make no difference if a build process wraps
the functions in an anonymous function and then populates object
methods. I assume that will be the standard approach as otherwise the
global namespace will be polluted.

I think that will be the common practice.
 
Thomas 'PointedEars' Lahn

Peter said:
[...]
I called it "isFeaturedMethod" to make it clear that it was for
feature testing the host environment.

I think isHostMethod is probably better because a plain JavaScript
object could "feature" a method. "featured" is a bit ambiguous.

The documentation should note that
1) the first argument must be a qualified host object
2) the second argument is the name of a property that the developer
expects to be a callable
3) if the function returns true then it is pretty safe to assume
that the method can be called on the object

However, further precautions might be indicated.
Would it be possible to remove the word "host" from rule #1?

It would be necessary, and the word `reference' should be appended
or it should replace the word `object'. You can not pass *objects*.
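
A minimal sketch of a test along the lines being documented here. The body is an assumption (the thread does not show it); it relies only on typeof, treating 'function', a non-null 'object' and the 'unknown' type that some IE host methods report as signs of a callable:

```javascript
// o: reference to an object (normally a host object)
// m: name of the property expected to be callable
function isHostMethod(o, m) {
  var t = typeof o[m];
  return t == 'function' ||
         !!(t == 'object' && o[m]) || // excludes null
         t == 'unknown';
}
```

Because the check is typeof-based, a true result says the property looks callable; as noted above, further precautions may still be indicated before calling it.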


PointedEars
 
Peter Michaux

RobG wrote:



That was probably inevitable with the first few examples. It is a good
idea that these things are kicked around a bit in public. And people
will want to state opinions, question definitions and raise points.

With the possible exception of people who want to be deliberately
obtuse/disruptive the likelihood is that much that is under debate now
will settle down with compromise or agreements to disagree. Peter is
trying to formalize the compromises into statements about the starting
points for design/development.

That is roughly my intent. Really I just want a reasonable set of
rules for what goes into the repository. Infinite code in the
repository is not that appealing. I think it is quite likely that I'll
be doing the brunt of the documentation.
And part of the point of the 'multiple
implementations' of layered interfaces design is that it can accommodate
differing attitudes towards some browser scripting principles.

Indeed multiple implementations does allow for this. There will be
some line that needs to be drawn about what goes into the repository.
Implementations that are "edge cases" or only appeal to more-paranoid-
than-normal developers, can easily be maintained by those developers.
That is another nice feature about the multiple implementations
approach.

What is determined to be an "edge case" or overly paranoid is of
course subjective. The situation I worry about is if the repository
has implementations that

* support IE4 & NN4
* support IE4 but not NN4
* support NN4 but not IE4
* support neither IE4 nor NN4

That is a lot of code when there will be many functions in the
repository. It is true not all the functions could be written for all
four situations listed above but still. To a certain degree it is only
an academic exercise to write such implementations for the
combinations of support for IE4 and NN4.
So even if a base position of not worrying about actively
supporting/accommodating browsers as old as IE and Netscape 4 is
employed there is no reason for the repository not to include (properly
documented) implementation versions that do accommodate those browsers
(any less than it would be acceptable for it to include implementations
for much more restrictive contexts such as browser specific Intranets
and web applications that specify a very small set of browsers (with the
motivation that those applications be OS neutral but may require
specific browsers on each OS)).

Testing, documentation and maintenance are my justification for
limiting what goes into the repository.

It may be that later, when the repository is more stable and we have
all the kinks of the system worked out, that a broader set of
implementations can be added. I think for the time being just a few
implementations of each function is practical to get going. Already
just a few implementations have brought out many issues. Some of these
issues require me touching all the files that I currently have for the
repository. Having fewer files during this unstable period is helpful.

Does that mean that you have come to some conclusion as to what is, and
what is not, a 'function'?

Personally I think wikis are far more trouble than they are worth. And
they are pretty much useless off-line.

If Rob is interested in writing some substantial pages and wants a
wiki to do so, then I think he should have the tools he wants. No one
else has jumped in to volunteer.

So far there does seem to be a deficit in proposals being explicit about
what the interface being created is supposed to do up front. That seems
unfortunate as when a system is based around interface design with
multiple underlying implementations then it seems unreasonable to have
to deduce how a second implementation should behave from examining the
actual behaviour of the first (not to mention the question of
determining if the first implementation is correct without a clear
assertion of what "correct" behaviour is expected to be).

I think leading up to the first group of functions that will be in the
repository it is ok if the system is a little chaotic. Seeing a body
of code is helpful for formalizing the system. After a system has
emerged it can be followed in the future.

I strongly disagree with that. If something is to be "Code Worth
Recommending" then who is supposed to be recommending it, and why? If
there is some notion that the end result be "Recommended" by this group
then that end result must be presented to this group, and the whole
group (regular and intermittent participants, lurkers, casual visitors
and anyone stumbling across posts as a result of web searches). It is
the strength of this group that it is public, unmoderated and open to
anyone.

I agree.
There can never be anything to stop additional discussion happening
elsewhere. It is extremely unlikely that I would bother to participate
in such discussions.

I agree.
 
AKS

Here is the getEBCS that can get elements only by class name

if (typeof getEBTN != 'undefined') {

var getEBCS = function(s, d) {
var m, // regexp matches and temp var in loop
i, // loop index
ilen, // loop limit
el, // temp element variable
ns = [], // elements to return
cn = s.substring(1), // className in s
els = getEBTN('*', d); // candidate elements for return

for (i=0, ilen=els.length; i<ilen; ++i) {
el = els[i];
// Could call an external hasClassName function but in line
// it here for efficiency. There are no cross browser issue
// with checking className.
if ((m = el.className) &&
(' ' + m + ' ').indexOf(' ' + cn + ' ') >- 1) {
ns[ns.length] = el;
}
}
return ns;
};

}


This condition:

(' ' + m + ' ').indexOf(' ' + cn + ' ') >- 1

will be a problem if you will meet this kind of markup:

<div class='jquery
sucks'>
</div>

It is absolutely valid markup, and such a class name (consisting of two
names) will be correctly processed by browsers. FF will replace "\n"
with "\s", but IE won't. Therefore getEBCS('.sucks') won't find this
node. But if getEBCS uses a regexp instead (David Mark suggested new
RegExp('(^|\\s)' + cn + '(\\s|$)')), it'll be ok...
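
A small sketch of the two checks applied to a plain string, so the difference can be seen without a browser. The helper names are hypothetical, and the regexp version assumes cn contains no regular expression metacharacters:

```javascript
// Space-padded indexOf check, as in the getEBCS posted above.
function hasClassIndexOf(className, cn) {
  return (' ' + className + ' ').indexOf(' ' + cn + ' ') > -1;
}

// Regexp check, using the pattern David Mark suggested.
function hasClassRegExp(className, cn) {
  return new RegExp('(^|\\s)' + cn + '(\\s|$)').test(className);
}

// The className a browser would report if it kept the line break.
var multiLine = 'jquery\nsucks';
```

With multiLine as input, the indexOf version misses 'sucks' because the separator is a line feed rather than a space, while the regexp version finds it.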
 
David Mark

Here is the getEBCS that can get elements only by class name
if (typeof getEBTN != 'undefined') {
var getEBCS = function(s, d) {
var m, // regexp matches and temp var in loop
i, // loop index
ilen, // loop limit
el, // temp element variable
ns = [], // elements to return
cn = s.substring(1), // className in s
els = getEBTN('*', d); // candidate elements for return
for (i=0, ilen=els.length; i<ilen; ++i) {
el = els[i];
// Could call an external hasClassName function but in line
// it here for efficiency. There are no cross browser issue
// with checking className.
if ((m = el.className) &&
(' ' + m + ' ').indexOf(' ' + cn + ' ') >- 1) {
ns[ns.length] = el;
}
}
return ns;
};


This condition:

(' ' + m + ' ').indexOf(' ' + cn + ' ') >- 1

will be a problem if you will meet this kind of markup:

<div class='jquery
sucks'>
</div>

It is absolutely valid markup and such class name (consisting of two
names) will be correctly processed by browsers. FF will replace "\n"
with "\s", but IE won't. Therefore getEBCS('.sucks') won't find this
node. But if getEBCS will use regexp (David Mark suggested new
RegExp('(^|\\s)' + cn + '(\\s|$)')) instead it'll be ok...-


Thanks. I will update that here.

Despite the fact that the project is currently not open to a more
advanced CSS selector function, I am going to post my take on it later
today. It uses a few functions (getAttribute, elementDocument and
elementChildren) that we should discuss shortly anyway. Even if it
never goes into the project, it should spark some useful discussion.

If it turns out to be a practical selector query function, I'll wrap a
simple "object factory" function around it with methods to chain the
project's DOM-related functions (e.g. setOpacity.) Of course,
developers won't have to use such an object with the API (I certainly
wouldn't), but I think such an interface will help the project gain
popularity.

Even without Peter's proposed optimizations with Function
construction, I am getting very favorable results on a hacked version
of this test:

http://mootools.net/slickspeed/

The slow branch using IE7 is winning most tests (some by a
landslide.) The XPath branch (tested in FireFox) wins virtually all
of them.

I left out some of the selectors (e.g. nth child, even, odd) for the
moment, but adding them will just create additional cases for switch
statements, so they won't slow down what is there now.

On that subject, does anybody know what this is supposed to mean?

div[class|=dialog]

The three libraries in the test don't seem to agree on it. It looks
like jQuery doesn't support it. Speaking of disagreements, none of
them agree on "*" either. I'm not sure what the deal is with that,
but I am still using the (inefficient) filtering system I originally
proposed for gEBTN. AFAIK, it accurately returns elements only. The
other three libraries in the test all return one or more additional
nodes, so it makes me wonder if they are not filtering accurately.

I need to do a little more testing of adjacency in the slow branch as
I think I botched that one, despite the fact that it passes the test.
Speaking of adjacency, what is this supposed to mean?

div ~ div

I made it the opposite of:

div + div

It passes the test, but I wonder if it is using the correct logic.
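
For reference, the W3C Selectors specification defines [att|=val] as matching a value that is exactly "val" or begins with "val" followed by a hyphen. E + F matches an F immediately preceded by a sibling E, and E ~ F matches an F preceded by any earlier sibling E. A minimal sketch of those three rules over plain strings and arrays of tag names (the function names are hypothetical and no DOM is involved):

```javascript
// [att|=val]: exact match, or val followed by a hyphen.
function matchesDashPrefix(attrValue, val) {
  return attrValue === val || attrValue.indexOf(val + '-') === 0;
}

// "a + b": indexes of siblings named b whose previous sibling is a.
function adjacentMatches(siblings, a, b) {
  var out = [], i;
  for (i = 1; i < siblings.length; i++) {
    if (siblings[i] === b && siblings[i - 1] === a) { out.push(i); }
  }
  return out;
}

// "a ~ b": indexes of siblings named b that come after any sibling a.
function generalSiblingMatches(siblings, a, b) {
  var out = [], seen = false, i;
  for (i = 0; i < siblings.length; i++) {
    if (seen && siblings[i] === b) { out.push(i); }
    if (siblings[i] === a) { seen = true; }
  }
  return out;
}
```

By these definitions "~" is a looser form of "+" rather than its opposite: both look backwards from the matched element to an earlier sibling.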
 
Diego Perini

Here is the getEBCS that can get elements only by class name
if (typeof getEBTN != 'undefined') {
var getEBCS = function(s, d) {
var m, // regexp matches and temp var in loop
i, // loop index
ilen, // loop limit
el, // temp element variable
ns = [], // elements to return
cn = s.substring(1), // className in s
els = getEBTN('*', d); // candidate elements for return
for (i=0, ilen=els.length; i<ilen; ++i) {
el = els[i];
// Could call an external hasClassName function but in line
// it here for efficiency. There are no cross browser issue
// with checking className.
if ((m = el.className) &&
(' ' + m + ' ').indexOf(' ' + cn + ' ') >- 1) {
ns[ns.length] = el;
}
}
return ns;
};


This condition:

(' ' + m + ' ').indexOf(' ' + cn + ' ') >- 1

will be a problem if you will meet this kind of markup:

<div class='jquery
sucks'>
</div>

It is absolutely valid markup and such class name (consisting of two
names) will be correctly processed by browsers. FF will replace "\n"
with "\s", but IE won't. Therefore getEBCS('.sucks') won't find this
node. But if getEBCS will use regexp (David Mark suggested new
RegExp('(^|\\s)' + cn + '(\\s|$)')) instead it'll be ok...


These are the facts: it really is valid HTML markup, but I would
discard any CMS or automation producing that kind of markup, first for
using single quotes and second for the linefeed embedded in it...

However since we are hunting solutions not modifying specifications
in my selectors I will use the following:

(' '+element.className+' ').replace(/\s+/g, ' ').indexOf(' '+cn+' ') >
-1

In any case the "indexOf()" method seems to be faster than a match,
even accounting for the time spent concatenating the blanks. I will run
some tests with the above fix and see whether it is still faster when
the replace itself uses an in-line regular expression.

If that proves to be slower, I would prefer to drop support for
anything but simple spaces as class name separators, so as to educate
HTML writers...
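
The normalizing check above can be sketched as a standalone function on a string. The name hasClassNormalized is hypothetical; the expression is the one quoted above:

```javascript
// Collapse every run of whitespace to a single space, then do the
// space-padded indexOf test.
function hasClassNormalized(className, cn) {
  return (' ' + className + ' ')
    .replace(/\s+/g, ' ')
    .indexOf(' ' + cn + ' ') > -1;
}
```

This keeps indexOf for the actual match but pays for one regular expression replace per element, which is the cost still to be measured.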
 
David Mark

Here is the enhanced getEBCS. As mentioned, the only pseudo-selectors
supported are first-child and last-child. The others should be
trivial to add without any severe impact on performance. As no
attempt is made to validate selectors, passing unsupported types will
result in anything from the wrong results to script errors.

Tested with the MooTools "SlickSpeed" page in IE7, FireFox, Opera 9
and Windows Safari Beta. That last one threw a monkey wrench into the
works in that it doesn't support the child pseudo-selectors properly
with XPath. As the design is split between browsers that support
XPath and those that don't, XPath is completely disabled when this
"feature" is detected. It would be worth changing it to choose based
on the selector as Safari appears to have the fastest XPath
implementation. Luckily, it is relatively fast with the DOM too.

Certainly there are more optimizations that could be made. I don't
see any urgency there as even IE wins most of the tests as it sits.
All tests were done with the most inefficient implementations of gEBI
and gEBTN. All of the code was wrapped in an anonymous function so
that global variables would not be part of the equation (I have heard
that some browsers are slower to access those.)

I successfully tested quite a few selector combinations, but I have no
illusions that this is a pat hand.

Lots of lines will wrap. I don't have time to format it for the ng.
If you want to help with testing and don't feel like extricating the
code from this post, send me an email and I will send my test page. I
can't send the hacked MooTools test page as it has a copyright notice,
but I can offer instructions on how to create a local copy and add a
fourth column to it.

I think I included everything required that is not currently in the
repository. IMO, some of the support functions should be discussed
for the project in the near future.

var $; // Declare globally for automated testing

// Put the rest in an anonymous function.

var doc = this.document;
var html = getAnElement();
var getEBCN, getEBXP, resolve, selectByXPath,
xPathChildSelectorsBad;
var attributeAliases = {'for':'htmlFor', accesskey:'accessKey',
maxlength:'maxLength', 'class':'className', readonly:'readOnly'};
var attributesBad = (html && html.getAttribute &&
html.getAttribute('style') && typeof(html.getAttribute('style')) ==
'object');
var reCamel = new RegExp('([^-]*)-(.)(.*)');

// Used to convert array-like host objects to arrays
// IIRC, Array.prototype.slice didn't work with node lists
function toArray(o) {
var a = [];
var l = o.length;
while (l--) { a[l] = o[l]; }
return a;
}

if (isFeaturedMethod(doc, 'evaluate')) {
resolve = function() { return 'http://www.w3.org/1999/xhtml'; };
getEBXP = function(s, d) {
d = d || doc;
var i, q = [], r, docNode = (d.nodeType == 9)?d:
(d.ownerDocument);
r = docNode.evaluate(s, d,
(xmlParseMode(docNode))?resolve:null,

global.XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
null);
i = r.snapshotLength;
while (i--) { q[i] = r.snapshotItem(i); }
return q;
};
}

function elementDocument(el) {
if (el.ownerDocument) {
return el.ownerDocument;
}
while (el.parentNode) {
el = el.parentNode;
}
return el;
}

var hasAttribute = (function() {
if (isFeaturedMethod(html, 'hasAttribute')) {
return function(el, name) { return el.hasAttribute(name); };
}
if (isFeaturedMethod(html, 'attributes')) {
return function(el, name) { return !!(el.attributes[name] &&
el.attributes[name].specified); };
}
})();

function camelize(name) {
var m = name.match(reCamel);
return (m)?([m[1], m[2].toUpperCase(), m[3]].join('')):name;
}

var getAttribute = (function() {
var att, alias, nameC, nn, reEvent, reNewLine, reFunction,
reBoolean, reURI;

if (html && html.getAttribute) {
if (attributesBad) {
reEvent = new RegExp('^on');
reNewLine = new RegExp('[\\n\\r]', 'g');
reFunction = new RegExp('^function anonymous\\(\\) *{(.*)}$');
reBoolean = new RegExp('checked|selected|disabled|multiple');
reURI = new RegExp('href|src|longdesc');

return function(el, name) {
if (!hasAttribute || hasAttribute(el, name)) {
// HTML embedded in an XML document
if ((elementDocument(el)).selectNodes) { return el.getAttribute(name, 2); }
name = name.toLowerCase();
alias = attributeAliases[name];
if (!alias) {
if (name == 'style') { return (el.style)?
(el.style.cssText || null):null; }
if (reBoolean.test(name)) { return (el[name])?
name:null; }
if (reURI.test(name)) { return el.getAttribute(name,
2); }
if (reEvent.test(name) && el[name]) {
att = el[name].toString();
if (att) {
att = att.replace(reNewLine, '');
if (reFunction.test(att)) { return att.replace(reFunction, '$1'); }
}
return null;
}
nn = el.tagName;
if (nn == 'select' && name == 'type') { return null; }
if (nn == 'form' && el.getAttributeNode) {
att = el.getAttributeNode(name);
return (att && att.nodeValue)?att.nodeValue:null;
}
}
nameC = camelize(alias || name);
if (typeof(el[nameC]) == 'unknown') {
return '[unknown]';
}
else {
return ((typeof(el[nameC]) != 'string' && typeof(el[nameC]) !=
'undefined' && el[nameC] !== null && el[nameC].toString)?
el[nameC].toString():el[nameC]) || null;
}
}
return null;
};
}
return function(el, name) { return el.getAttribute(name); };
}
})();

var getChildren = (function() {
if (isFeaturedMethod(html, 'children')) {
return function(el) {
return el.children;
};
}
if (isFeaturedMethod(html, 'childNodes')) {
return function(el) {
// Should use XPath here when possible
// Doesn't matter for getEBCS as XPath branch never calls this
var nl = el.childNodes, r = [];
var i = nl.length;

while (i--) {
// Code duplicated for performance
if ((nl[i].nodeType == 1 && nl[i].tagName != '!') ||
(!nl[i].nodeType && nl[i].tagName)) {
r.push(nl[i]);
}
}
return r.reverse();
//return filter(toArray(el.childNodes), elementFilter);
};
}
})();

function parseAtom(s) {
var ai, m, mv, ml;
var o = {};

s = s.replace(/\x00/g, ' '); // Change nulls back to spaces
m = s.match(/^([>\+~])/);
if (m) {
o.combinator = m[1];
s = s.substring(1);
}

m = s.match(/^([^#\.\[:]+)/);
o.tag = m ? m[1] : '*';

m = s.match(/#([^\.]+)/);
o.id = m ? m[1] : null;

m = s.match(/\.([^\[\:]+)/);
o.cls = m ? m[1] : null;

m = s.match(/:(.+)$/);
o.pseudo = m ? m[1] : null;

m = s.match(/\[[^\]]+\]/g);
if (m) {
ml = m.length;
o.attributes = [];
o.attributeValues = [];
o.attributeOperators = [];
for (ai = 0; ai < ml; ai++) {
o.attributes[ai] = m[ai].substring(1, m[ai].length - 1);
m[ai] = m[ai].replace(/^%/, '');
mv = m[ai].match(/(~|!)?="*([^"\]]*)"*/);
if (mv) {
o.attributeOperators[ai] = mv[1];
o.attributeValues[ai] = mv[2];
o.attributes[ai] = o.attributes[ai].replace(/(~|!)?=.*/,
'');
}
}
}
return o;
}

if (typeof(getEBXP) != 'undefined') {
selectByXPath = function(d, a) {
var atts, m, o, r, s;
var docNode = (d.nodeType == 9)?d:elementDocument(d);
var i = a.length;
while (i--) {
o = parseAtom(a[i]);
if (s) {
if (o.combinator) {
s += (o.combinator == '>')?'/':
(o.combinator == '~')?'/preceding-sibling::':'/following-sibling::';
}
else {
s += '//';
}
}
else {
s = './/';
}
s = [s, ((xmlParseMode(docNode))?'html:':''),
(o.pseudo)?'*':o.tag,
((o.cls)?"[contains(concat(' ', @class, ' '), ' " + o.cls + " ')]":'')].join('');
if (o.pseudo) {
s += ((o.pseudo == 'last-child')?'[last()]':'[1]') +
'[self::' + o.tag + ']';
}
if (o.id) {
s += ['[@id="', o.id, '"]'].join('');
}
if (o.attributes) {
atts = [];
m = o.attributes.length;
while (m--) {
switch(o.attributeOperators[m]) {
case '~':
atts.push(['contains(@', o.attributes[m], ',"',
o.attributeValues[m], '")'].join(''));
break;
case '!':
atts.push(['not(@', o.attributes[m], '="',
o.attributeValues[m], '")'].join(''));
break;
default:
atts.push((o.attributeValues[m])?['@', o.attributes[m],
'="', o.attributeValues[m], '"'].join(''):['@',
o.attributes[m]].join(''));
}
}
s = [s, '[', atts.join(' and '), ']'].join('');
}
}
return getEBXP(s, d);
};
}

var getEBCS = (function() {
var els, // candidate elements for return
ns, // elements to return
o, // selector atom object
docNode,
cache = {}, // cached select functions
aCache = {}, // cached select atom functions
qid = 0, // query id (marks branches as traversed)
bAll; // indicates if "all" object is featured for elements

function getDocNode(d) {
return (d.nodeType == 9 || (!d.nodeType && !d.tagName))?
d:elementDocument(d);
}

bAll = (isFeaturedMethod(html, 'all'));

// adjacent selectors check this to determine comparison
// (currently only checking for tag)
var previousAtom;
var selectAtomFactory = function(id, tag, cls, combinator,
attributes, attributeValues, attributeOperators, pseudo) {
var ai, al, att, b, c, d, el, i, j, k, m, r, sibling;
return function(a, docNode) {
if (attributes) { al = attributes.length; }
r = [];
k = a.length;
qid++;
while (k--) {
d = a[k];
if (id) {
if (!d.tagName || (combinator && combinator != '>')) {
els = (el = getEBI(id, docNode)) ? [el] : [];
}
else {
els = (bAll && (el = d.all[id]))?[el]:((combinator ==
'>')?getChildren(d):getEBTN(d, tag));
}
}
else {
els = (combinator == '>')?getChildren(d):getEBTN(d, tag);
}
i = els.length;
while (i--) {
el = els[i];
b = ((!cls || ((m = el.className) &&
(' ' + m + ' ').indexOf(cls) > -1)) &&
(!id || el.id == id)
);
if (b) {
switch (combinator) {
case '~':
case '+':
sibling = el;
do {
sibling = (combinator == '~')?
sibling.nextSibling:sibling.previousSibling;
}
while (sibling && sibling.nodeType != 1);
b = b && (sibling && ((!previousAtom.id ||
previousAtom.id == sibling.id) && (previousAtom.tag == '*' ||
sibling.tagName.toLowerCase() == previousAtom.tag) &&
(!previousAtom.cls || ((m = sibling.className) &&
(' ' + m + ' ').indexOf(previousAtom.cls) > -1))));
break;
default:
b = b && (tag == '*' || (!combinator && !id) ||
el.tagName.toLowerCase() == tag);
}

if (pseudo && el.parentNode) {
c = getChildren(el.parentNode);
b = b && (c[(pseudo == 'first-child')?0:c.length - 1]
== el);
}
if (attributes) {
ai = al;
while (ai-- && b) {
switch(attributeOperators[ai]) {
case '~':
att = getAttribute(el, attributes[ai]);
b = b && att && att.indexOf(attributeValues[ai]) != -1;
break;
case '!':
b = b && getAttribute(el, attributes[ai]) !=
attributeValues[ai];
break;
default:
b = b && (attributeValues[ai])?getAttribute(el,
attributes[ai]) == attributeValues[ai]:(!hasAttribute &&
getAttribute(el, attributes[ai])) || hasAttribute(el, attributes[ai]);
}
}
}
if (b && el._qid != qid) { r[r.length] = el; el._qid =
qid; if (id) { break; } }
}
}
}
return r;
};
};

var selectFactory = function(a) {
var i, j;

return function(d) {
i = a.length;
j = 1;
docNode = getDocNode(d);
ns = [[d]];
while (i--) {
o = parseAtom(a[i]);
if (!aCache['_' + a[i]]) {
aCache['_' + a[i]] = selectAtomFactory(o.id,
o.tag.toLowerCase(), (o.cls)?' ' + o.cls + ' ':null, o.combinator,
o.attributes, o.attributeValues, o.attributeOperators, o.pseudo);
}
ns[j] = aCache['_' + a[i]](ns[j - 1], docNode);
previousAtom = o;
j++;
}
return ns[j - 1].reverse();
};
};

var get = (function() {
var el, getD, r;

if (typeof getEBI != 'undefined' &&
typeof getEBTN != 'undefined' &&
typeof getChildren != 'undefined' &&
typeof getAttribute != 'undefined') {
getD = function(d, a, s, qid) {
if (a.length == 1) {
o = parseAtom(a[0]);
if (!o.pseudo && !o.attributes) {
if (o.id && !o.pseudo && !o.cls && !o.attributes) {
// Optimization for #foo
el = getEBI(o.id, getDocNode(d));
return (el && (o.tag == '*' || o.tag ==
el.tagName.toLowerCase()))?[el]:[];
}
if (!o.id && !o.cls) {
// Optimization for foo
r = getEBTN(d, o.tag);
return (typeof(r.reverse) == 'function')?r:toArray(r);
}
}
}
s = '_' + s;
if (!cache[s]) { // avoid toString conflict
cache[s] = selectFactory(a);
}
return cache[s](d, qid);
};
}

if (getD) {
return function(d, a, s, qid) {
// Really only need to disable XPath for specific selectors
if (selectByXPath && !xPathChildSelectorsBad) {
return (get = selectByXPath)(d, a, s);
}
else {
return (get = getD)(d, a, s, qid);
}
};
}
})();

if (get) {
return function(s, d) {
var a = [], aSel = [], chr, i, inQuotes, r = [], used = {};

d = d || doc;
s = s.replace(/^\s+/,'').replace(/\s+$/,''); // trim
// remove spaces before and after commas
s = s.replace(/\s+,/g, ',').replace(/,\s+/g, ',');
i = s.length;
while (i--) {
chr = s.charAt(i);
switch (chr) {
case ',':
if (inQuotes) {
aSel[aSel.length] = chr;
}
else {
a[a.length] = aSel.reverse().join('');
aSel = [];
}
break;
case ' ':
// change quoted spaces to nulls temporarily
// changed back in parseAtom
aSel[aSel.length] = (inQuotes)?'\x00':' ';
break;
case '"':
inQuotes = !inQuotes;
aSel[aSel.length] = chr;
break;
default:
aSel[aSel.length] = chr;
}
}
if (aSel.length) { a[a.length] = aSel.reverse().join(''); }
a.reverse();

i = a.length;
while (i--) {
a[i] = a[i].replace(/\s+/g, ' '); // collapse multiple spaces
a[i] = a[i].replace(/([^\s])([>\+])/g, '$1 $2');
a[i] = a[i].replace(/([^\s])([~])[^=]/g, '$1 $2');
a[i] = a[i].replace(/([>\+~])\s/g, '$1');
if (!used['_' + a[i]]) { // prevent dupes (e.g. div, div, div)
r = r.concat(get(d, a[i].split(' ').reverse(), a[i]));
}
used['_' + a[i]] = 1;
}
return r;
};
}
})();

if (getEBCS) {
$ = getEBCS;

getEBCN = function(s, d) {
return getEBCS('.' + s, d);
};
}

// Safari 3 bug test (tested Windows Beta version)
// This logic needs to go in a central DOMContentLoaded wrapper

if (isFeaturedMethod(this, 'addEventListener')) {
this.addEventListener('load', function() {
if (getEBXP) {
xPathChildSelectorsBad = !!getEBXP('.//*[1][self::body]').length;
}
}, false);
}
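
The comma-splitting step inside getEBCS can be isolated as a forward-scanning sketch. This is a simplified illustration, not the posted code: splitSelectorGroups is a hypothetical name, it scans left to right instead of in reverse, and it only handles double quotes:

```javascript
// Split a selector string into comma-separated groups, leaving commas
// inside double-quoted attribute values alone.
function splitSelectorGroups(s) {
  var groups = [], current = '', inQuotes = false, i, chr;

  function trim(str) {
    return str.replace(/^\s+/, '').replace(/\s+$/, '');
  }

  for (i = 0; i < s.length; i++) {
    chr = s.charAt(i);
    if (chr == '"') {
      inQuotes = !inQuotes;
      current += chr;
    } else if (chr == ',' && !inQuotes) {
      groups.push(trim(current));
      current = '';
    } else {
      current += chr;
    }
  }
  if (trim(current)) { groups.push(trim(current)); }
  return groups;
}
```

Each resulting group would then be split on spaces into atoms and handed to parseAtom, as in the code above.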
 
Peter Michaux

[snip]

Lots of lines will wrap. I don't have time to format it for the ng.
If you want to help with testing and don't feel like extricating the
code from this post, send me an email and I will send my test page. I
can't send the hacked MooTools test page as it has a copyright notice,

It is an MIT copyright so you can do anything you want with it except
remove the copyright notice or sue the author.

but I can offer instructions on how to create a local copy and add a
fourth column to it.

I think I included everything required that is not currently in the
repository. IMO, some of the support functions should be discussed
for the project in the near future.

I agree. If you are logged in to trac then you can make tickets for
future topics here

<URL: http://cljs.michaux.ca/trac/newticket?component=future+topics>

If anyone else would like to be able to create tickets please send me
an email and I will create a trac account.

[snip]
 
Peter Michaux

Here is the getEBCS that can get elements only by class name
if (typeof getEBTN != 'undefined') {
var getEBCS = function(s, d) {
var m, // regexp matches and temp var in loop
i, // loop index
ilen, // loop limit
el, // temp element variable
ns = [], // elements to return
cn = s.substring(1), // className in s
els = getEBTN('*', d); // candidate elements for return
for (i=0, ilen=els.length; i<ilen; ++i) {
el = els[i];
// Could call an external hasClassName function but in line
// it here for efficiency. There are no cross browser issue
// with checking className.
if ((m = el.className) &&
(' ' + m + ' ').indexOf(' ' + cn + ' ') >- 1) {
ns[ns.length] = el;
}
}
return ns;
};
}

This condition:
(' ' + m + ' ').indexOf(' ' + cn + ' ') >- 1
will be a problem if you will meet this kind of markup:
<div class='jquery
sucks'>
</div>
It is absolutely valid markup and such class name (consisting of two
names) will be correctly processed by browsers. FF will replace "\n"
with "\s", but IE won't. Therefore getEBCS('.sucks') won't find this
node. But if getEBCS will use regexp (David Mark suggested new
RegExp('(^|\\s)' + cn + '(\\s|$)')) instead it'll be ok...

These are the facts, it is really valid HTML markup but I would
discard
any CMS or automation producing that kind of markup, first for using
single quotes second for the linefeed embedded in it...

However since we are hunting solutions not modifying specifications
in my selectors I will use the following:

(' '+element.className+' ').replace(/\s+/g, ' ').indexOf(' '+cns+' ') >
-1

In any case the "indexOf()" method seems to be faster than a match,
even accounting some time for concatenating the blanks, I will do
some test with the above fix and see if it is still faster with
a replace using itself an in-line regular expression.

If that proves to be slower,


I think it will necessarily be slower than the indexOf version because it
is doing both the replace and the indexOf. I just ran the following
tests in Firefox and the indexOf is certainly the fastest.

var el = document.getElementById('advantages');
console.log(el);

var cn = 'baz';

var re = /(^|\s)baz(\s|$)/;
var cns = ' ' + cn + ' ';

var start = (new Date()).getTime();
var i = 10000;
while(i--) {

// 350 ms
//(' '+el.className+' ').replace(/\s+/g,' ').indexOf(cns);

// 325 ms
//(' '+el.className.replace(/\s+/g,' ')+' ').indexOf(cns);

// 290 ms
//(' '+el.className.replace('\n', ' ')+' ').indexOf(cns);

// 240 ms
//(' '+el.className+' ').indexOf(cns);

// 375 ms
el.className.match(re);

}

console.log((new Date()).getTime() - start);

I will prefer to drop support for anything but simple spaces
as class name separator, so to educate HTML writers...

That seems like a reasonable choice if the documentation explains that
choice.
 
Peter Michaux

[snip]
function elementDocument(el) {
if (el.ownerDocument) {
return el.ownerDocument;
}
while (el.parentNode) {
el = el.parentNode;
}
return el;
}

The above function will be defined in browsers without ownerDocument
or parentNode. NN4 seems to be one example.

Wouldn't it be better to write the following?


if (typeof getAnElement == 'function') {

var html = getAnElement();

var elementDocument = (function() {

if (isRealObjectProperty(html, 'ownerDocument')) {
return function(el) {
return el.ownerDocument;
};
}
else if (isRealObjectProperty(html, 'parentNode')) {
return function(el) {
var e;
while (e = el.parentNode) {
el = e;
}
return el;
};
}

})();

}

I'm assuming the extra "e" variable is faster than accessing
parentNode property twice.
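
The parentNode branch can be tried outside a browser with plain objects standing in for nodes. This is a sketch under that assumption; the name topAncestor and the mock objects are made up for illustration.

```javascript
// Mock nodes: plain objects linked through parentNode (an assumption
// made so that the sketch runs without a DOM).
var doc = { nodeName: '#document', parentNode: null };
var html = { nodeName: 'HTML', parentNode: doc };
var body = { nodeName: 'BODY', parentNode: html };

// Same loop as the parentNode branch: the assignment inside the
// condition reads parentNode once per step and stops on null/undefined.
function topAncestor(el) {
  var e;
  while ((e = el.parentNode)) {
    el = e;
  }
  return el;
}

console.log(topAncestor(body) === doc); // true
console.log(topAncestor(doc) === doc);  // true
```

Note that for a node that is not attached to a document, this walk returns the topmost detached ancestor rather than a document.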

[snip]
 
A

AKS

On Dec 13, 1:16 pm, AKS <[email protected]> wrote:
...I will prefer to drop support for anything but simple spaces
as class name separator, so to educate HTML writers...

"To educate HTML writers"?
It won't be so easy, because they already know from the spec, that:

Multiple class names must be separated by -white space characters-.

They also know, that (from Wikipedia):

In Unicode (Unicode Character Database) the following codepoints are
defined as whitespace:

* U0009-U000D (Control characters, containing TAB, CR and LF)
* U0020 SPACE
* U0085 NEL
* U00A0 NBSP
* U1680 OGHAM SPACE MARK
* U180E MONGOLIAN VOWEL SEPARATOR
* U2000-U200A (different sorts of spaces)
* U2028 LSP
* U2029 PSP
* U202F NARROW NBSP
* U205F MEDIUM MATHEMATICAL SPACE
* U3000 IDEOGRAPHIC SPACE

So I think that "to drop support" would be too simple a decision.
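
Which of these code points a regular expression's \s actually matches depends on the script engine, so it may be worth checking rather than assuming. The following sketch prints the result for a few entries from the list; the function name and the selection of code points are choices made for this example, and no particular output is claimed for the less common characters.

```javascript
// A few of the code points listed above, as [label, character] pairs.
var candidates = [
  ['U+0009 TAB', '\u0009'],
  ['U+000A LF', '\u000A'],
  ['U+0020 SPACE', '\u0020'],
  ['U+0085 NEL', '\u0085'],
  ['U+00A0 NBSP', '\u00A0'],
  ['U+180E MONGOLIAN VOWEL SEPARATOR', '\u180E'],
  ['U+3000 IDEOGRAPHIC SPACE', '\u3000']
];

// Hypothetical helper: true when the single character ch is matched by \s
// in the engine running this code.
function matchesWhitespaceClass(ch) {
  return /^\s$/.test(ch);
}

for (var i = 0; i < candidates.length; i++) {
  console.log(candidates[i][0] + ': ' + matchesWhitespaceClass(candidates[i][1]));
}
```

Whatever the results, the set matched by \s and the set the HTML specification treats as white space in a class attribute are defined separately, so they need not coincide.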
 
P

Peter Michaux

[snip]
var getChildren = (function() {
if (isFeaturedMethod(html, 'children')) {
return function(el) {
return el.children;
};
}
if (isFeaturedMethod(html, 'childNodes')) {

Need to test for tagName support with this?

&& isRealObjectProperty(html, 'tagName')

If no tagName support the following will never find any children.
return function(el) {
// Should use XPath here when possible
// Doesn't matter for getEBCS as XPath branch never calls this
var nl = el.childNodes, r = [];
var i = nl.length;

while (i--) {
// Code duplicated for performance
if ((nl[i].nodeType == 1 && nl[i].tagName != '!') ||
(!nl[i].nodeType && nl[i].tagName)) {
r.push(nl[i]);


If performance is a concern wouldn't the following be better?

var n = nl[i];
if ((n.nodeType == 1 && n.tagName != '!') ||
(!n.nodeType && n.tagName)) {
r.push(nl[i]);

}
}
return r.reverse();
//return filter(toArray(el.childNodes), elementFilter);
};
}
})();


Could the || be determined at feature test time?

if (typeof getAnElement == 'function') {
var html = getAnElement();


var getChildren = (function() {
if (isFeaturedMethod(html, 'children')) {
return function(el) {
return el.children;
};
}
if (isFeaturedMethod(html, 'childNodes') &&
isRealObjectProperty(html, 'tagName')) {
if (isRealObjectProperty(html, 'nodeType')) {
return function(el) {
var nl = el.childNodes, r = [];
var i = nl.length;

while (i--) {
var n = nl[i];
if (n.nodeType == 1 && n.tagName != '!') {
r.push(n);
}
}
return r.reverse();
};

}
else {
return function(el) {
var nl = el.childNodes, r = [];
var i = nl.length;

while (i--) {
var n = nl[i];
if (n.tagName) {
r.push(n);
}
}
return r.reverse();
};

}
}
})();
}

Even if this is incorrect, this points out something I don't like
about optimizing for speed. It creates a lot of code.
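
One possible middle ground, sketched here with hypothetical names and mock nodes, is to pick the element test once at feature-test time and keep a single copy of the loop. It trades the duplicated code for one extra function call per node, so it is not the fastest form.

```javascript
// Hypothetical factory: hasNodeType stands for the result of a feature
// test such as isRealObjectProperty(html, 'nodeType').
function makeGetChildren(hasNodeType) {
  // The element test is chosen once, not on every iteration.
  var isElement = hasNodeType ?
    function (n) { return n.nodeType == 1 && n.tagName != '!'; } :
    function (n) { return !!n.tagName; };

  return function (el) {
    var nl = el.childNodes, r = [], i = nl.length, n;
    while (i--) {
      n = nl[i];
      if (isElement(n)) {
        r.push(n);
      }
    }
    return r.reverse();
  };
}

// Mock element with two elements, a text node and a comment reported as '!'.
var parent = { childNodes: [
  { nodeType: 1, tagName: 'P' },
  { nodeType: 3 },
  { nodeType: 1, tagName: '!' },
  { nodeType: 1, tagName: 'DIV' }
] };

var getChildren = makeGetChildren(true);
var kids = getChildren(parent);
console.log(kids.length);                      // 2
console.log(kids[0].tagName, kids[1].tagName); // P DIV
```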

[snip]
 
P

Peter Michaux

"To educate HTML writers"?

I think Diego didn't mean to educate them about what is possible in
HTML but about the limitations of the function. The documentation for
the JavaScript function would state that the function should only be
used when the only whitespace characters in the class attribute are spaces. If
the documentation states this limitation and the HTML authors follow
this then there will not be a problem and the JavaScript will run
faster. My guess is most HTML authors would think an attribute cannot
contain a line break or any of the other exotic whitespace below.
It won't be so easy, because they already know from the spec, that:

Multiple class names must be separated by -white space characters-.

They also know, that (from Wikipedia):

In Unicode (Unicode Character Database) the following codepoints are
defined as whitespace:

* U0009-U000D (Control characters, containing TAB, CR and LF)
* U0020 SPACE
* U0085 NEL
* U00A0 NBSP
* U1680 OGHAM SPACE MARK
* U180E MONGOLIAN VOWEL SEPARATOR

I had to look it up...

"The MONGOLIAN VOWEL SEPARATOR is used to separate the vowel A/E at
the end of a word and the consonant before them. On the computer
screen, a graphic symbol of a fourth part of the length of a whole
character should be shown; but in printing it suffices to leave a
space without the graphic symbol."

<URL: http://ra.dkuug.dk/jtc1/sc2/wg2/docs/n1972.htm>

"A thin space character used in Mongolian to cause the final two
characters of a word to take on different shapes."

* U2000-U200A (different sorts of spaces)
* U2028 LSP
* U2029 PSP
* U202F NARROW NBSP
* U205F MEDIUM MATHEMATICAL SPACE
* U3000 IDEOGRAPHIC SPACE

So I think that "to drop support" would be too simple a decision.

This is a great point you've brought up. It certainly requires
consideration and if an implementation dropped support then it would
require documentation saying so.
 
A

AKS

My guess is most HTML authors would think an attribute cannot
contain a line break or any of the other exotic whitespace below.

I have tried to describe a hypothetical situation. Actually, there's
no chance of meeting such HTML code (it is all because of my
imagination :) ).
I just want to help you to create not only the fastest method, but also
a reliable one. And I think that regexps could be useful in this case.
 
D

Diego Perini

I have tried to describe a hypothetical situation. Actually, there's
no chance of meeting such HTML code (it is all because of my
imagination :) ).
I just want to help you to create not only the fastest method, but also
a reliable one. And I think that regexps could be useful in this case.

AKS, your imagination has just cut most of the well-known frameworks on
the market out of their compliance claims. The reality is that few web
sites use that syntax, but a few is still important to me, if possible.

You are correct: dropping support is too heavy a statement for a couple
of milliseconds, and since we have solutions that avoid using match I
believe this is the right way to go (using replace).

I said "solutions" because Peter left out the most obvious:

(' '+e.className+' ').split(/\s/).indexOf(cn) > -1

But the "replace()" method is still faster, and working on strings
doesn't require an extended array "indexOf()" to be present. No dependence.

I would like to point out that concatenating the className with
spaces also avoids checking that the className is not empty, and this is
another boost to the conditional.

I am more or less forced not to use "match()" for the simple reason that
I use compiled selectors, and with "match()" I would have to compile a
more complex (new RegExp()) in-line for each iteration; just compiling
a (/\s/) is faster and is readable. It also means we can tell why it is
faster instead of just guessing.

Thank you AKS for bringing this up, I for myself patched my code with
"replace()".
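
For comparison, the cost of building a RegExp per class name can be paid once by caching the compiled expression. This is only a sketch with hypothetical names; it assumes cn contains no regular expression metacharacters, and whether it suits compiled selectors is not something the thread settles.

```javascript
// Cache of compiled expressions, keyed by class name.
var classRegExpCache = {};

// Hypothetical helper: returns the same RegExp object for the same cn.
// Assumes cn contains no regular expression metacharacters.
function classRegExp(cn) {
  // Prefix the key so that it cannot collide with property names
  // inherited from Object.prototype (e.g. "constructor").
  var key = 'cn:' + cn;
  if (!classRegExpCache[key]) {
    classRegExpCache[key] = new RegExp('(^|\\s)' + cn + '(\\s|$)');
  }
  return classRegExpCache[key];
}

function hasClassRe(className, cn) {
  return classRegExp(cn).test(className);
}

console.log(hasClassRe('foo\tbaz', 'baz'));             // true
console.log(hasClassRe('bazz', 'baz'));                 // false
console.log(classRegExp('baz') === classRegExp('baz')); // true
```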
 
V

VK

In comp.lang.javascript message <45cfaa54-8670-4472-b3a6-ec14838231d3@e10g2000prf.googlegroups.com>, Tue, 11 Dec 2007 12:04:25, VK


If the facts (below) are as I believe, then that's false logic (if not
natural perversity).

(1) IE4, being distributed with Win98, was and is more widely used than
NN4.

At the moment the Battle of the 4th Versions began (October 8, 1997)
the market share was about NN 60% / IE 40%.
By the end of the Battle (November 24, 1998) the market share was
about NN 10% / IE 90%.
It is still neither 1% nor 0.001% so the logic "no NN4.x have left for
sure but some noticeable amount of IE4.x very well can be" is too
convoluted for my blood. Either say both "A" and "B", or just don't
say anything.
(2) Support for IE4 is, for at least one common task, easier than for
NN4.

... OK... So the reason to support one legacy browser and not
support the other is based on how easy it is to do? Is this the real
truth of the "correct support"?

You see, in the summer of 1998, when I was building a web-site for one
Silicon Valley computer company, the CEO was using nothing but IE4 while
the CFO who signed my checks was using nothing but NN4. That was a very
frequent situation at that time, with people sometimes even arguing
with each other about "who is cooler and more advanced", with IE or with
NN. So not in the wildest dream would any developer think of issuing a
solution without full testing in both applications. And no, even at
that time I never relied on userAgent. I used a document.layers check,
iframe-layer-nolayers and a bunch of other, often unspeakably crooked,
ways to make things work together despite the intentional design to
make them incompatible with the rival. But that was a _practical_ task
required by life. And now I often get the impression that some coding
is done by young people who just have too much spare time on
their hands plus no better ideas of how to use this spare time.

So back the round one:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>Fail me!</title>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
<script type="text/javascript">
function init() {
var probe = document.getElementById('Header1');
window.alert('I am OK');
}
window.onload = init;
</script>
</head>

<body>
<h1 id="Header1">Fail me!</h1>
<p>If you don't see alert message &quot;I am OK&quot;
on page load then it means that the script could not
be executed because of missing DOM features</p>
<noscript>
<p style="color:red">If you see this text then it means
that the current browser settings do not allow script
execution for this page. This attempt doesn't count</p>
</noscript>
</body>
</html>

Please provide at least one failure case for the page above, indicating
the browser name, version and OS. I will tell you if it is worth
investing a single penny in a source fix, or if the programmer caught
at it would be fired for using company time and money for
business-unrelated activity during work hours.
 
T

Thomas 'PointedEars' Lahn

AKS said:
My guess is most HTML authors would think an attribute cannot
contain a line break or any of the other exotic whitespace below.

I have tried to describe a hypothetical situation. Actually, there's
no chance of meeting such HTML code [...]

Yes, there is.


<... onwhatever="property.access['with'].assignment =
'foobar';">


PointedEars
 
T

Thomas 'PointedEars' Lahn

Peter said:
Peter said:
[...] Thomas 'PointedEars' Lahn [...] wrote:
Peter Michaux wrote:
if (document.getElementById) {
var getEBI = function(id, d) {
return (d||document).getElementById(id);
};
}
// id is a string
// d is some optional node that implements the Document interface.
Now, what is wrong with that?
Besides the fact that your code is still lacking the necessary feature test,
you falsely assume that, because `document.getElementById' yields a
true-value, `d.getElementById' has to be callable.
Have you ever seen a host where document.getElementById is not
callable?
Several existing (and used) UAs do not support the W3C DOM,

I know that. What is your answer to my question? I'll make it clearer

Have you ever seen a host that has document.getElementById where
document.getElementById is not callable?

Not built-in, but you miss the point. The technical possibility for that
exists and one should be prepared for it.
What is a "fitting" UA?

One that allows for example

document.getElementById = {};

Firefox 2.0.0.11 appears to allow that, and there are Firefox extensions.
Although it isn't likely that an extension will screw up like this, I hope
you see my point now.
I see no difference between language features and host features.

Then you really should read the ECMAScript Specification, Edition 3 (again).
I think by now you will agree that for feature testing the line has to
be drawn somewhere and different developers will draw the line in
different places. Do you agree?

I think I did that just before.
I'd be interested to read your design guidelines.

My design guidelines include that when I provide code for possible public
use I want to be prepared for the eventuality of an unknown execution
environment as much as I can, instead of fixing the code after *maybe* it
has been reported to break. I think that is a part of proper QA.


PointedEars
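
The point about callability can be illustrated with a wrapper that tests the method on the object it is actually about to use. This is a sketch only: makeGetEBI and the mock documents are hypothetical, and the strict typeof test is an assumption that suits these mocks. Some older hosts report 'object' for host methods, so a real feature test would probably need to be broader.

```javascript
// Hypothetical factory: returns a getEBI wrapper, or undefined when the
// default document has no callable getElementById.
function makeGetEBI(defaultDoc) {
  if (!defaultDoc || typeof defaultDoc.getElementById != 'function') {
    return undefined; // callers are expected to feature-test the wrapper
  }
  return function (id, d) {
    var target = d || defaultDoc;
    // Test the object that will be used, not only the default document.
    if (typeof target.getElementById != 'function') {
      return null;
    }
    return target.getElementById(id);
  };
}

// Mock documents standing in for host objects.
var goodDoc = {
  getElementById: function (id) {
    return id == 'Header1' ? { id: id } : null;
  }
};
var brokenDoc = { getElementById: {} }; // truthy but not callable

var getEBI = makeGetEBI(goodDoc);
console.log(typeof getEBI);                // function
console.log(getEBI('Header1').id);         // Header1
console.log(getEBI('Header1', brokenDoc)); // null
console.log(typeof makeGetEBI(brokenDoc)); // undefined
```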
 
