Microsoft and attributes--will they ever figure them out?

David Mark · Nov 29, 2009

I get 2 errors now (on 9530).

Removed column span: '2' is not null
Input checked property set: false is not true

I see that the feature test I added for IE8-ish boolean woes was
checked in the getter, but not the setter in one fork. Oops. I
strongly suspect Blackberry was taking that fork and that should
account for the second one. Odd that only the checked property had a
problem as there are a few other tests with similar attribute sets and
boolean property checks. Should work now.

I also suspect that like the relative/absolute file URI issues (1 in
Opera 9.27, 6 in Opera 9.27), the first one is broken beyond repair on
Blackberry. Those two issues seem worthy of flagging externally (apps
may need them or not).

I need to add some better reporting (i.e. the state of the internal
flags) so I don't have to guess about the results as much. But it's
worked pretty well so far.

ISTM that a CSS selector query engine needs to start with a foundation
like this. Likewise apps with exacting parsing requirements. Editors
and apps that need a consistent innerHTML interface would need such
wrappers for any semblance of cross-browser compatibility.

But virtually everyone else can do without them, perhaps using
individual tests (like those on the test page) on the rare occasion
that direct attribute manipulation is needed (DOM properties usually
suffice).

I know. That simulator is a PITA to set up. Good luck

I got it set up, but it didn't do anything impressive.

David Mark · Nov 29, 2009

David said:

David Mark wrote:
[...]
To run Blackberry Simulator, you need Email and MDS and then one or more
Simulators.
https://www.blackberry.com/Downloads/
Page not found. I would like to see the results posted for this
device.
Which results?
Whatever was "document.written" at the time you ran it. You reported
20-something errors, but didn't say if that total was for both sets of
tests (I assume it was). I'm curious about which tests failed as
well. So far I've been able to smooth out the kinks in every tested
browser (other than file URI's of course).
I get 2 errors now (on 9530).

That means either some of the same quirks found in the old Opera
versions were also present in Blackberry or (less likely) the beefed

Yet another proof of proper feature testing taking care of "bug
copying".

Yes, and testing features directly and with a plan (e.g. having
some idea what to expect, which allows the results of the tests to be
correctly analyzed).

There are ubiquitous Opera/IE similarities; I found some of
the IE-like innerHTML bugs in Konqueror recently; and now Blackberry.

I bet.

If only more people would understand this.

It's coming. The days of testing for
!navigator.userAgent.indexOf('Opera') are clearly over. That strategy
was always aimed at trying to keep up with the (seemingly impossible)
present, never mind the
future. For cross-browser (or even multi-browser) scripting 2000-2005
was hell on earth compared to the last five years, so it is easy to
see how the browser sniffing craze gained so much momentum at the
start and is slowly petering out at the end of the decade. If only
more people had read the FAQ notes.
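
A rough sketch of the contrast, not code from the test page. The host objects are hypothetical stand-ins built by makeFakeHost, and the user agent strings are invented, so the comparison can run outside a browser. The quirk modelled is getAttribute returning an empty string instead of null for a missing attribute.

```javascript
// Hypothetical stand-in for a browser environment. `quirky` controls
// whether getAttribute returns '' instead of null for a missing attribute.
function makeFakeHost(userAgent, quirky) {
  return {
    navigator: { userAgent: userAgent },
    document: {
      createElement: function () {
        return {
          getAttribute: function () {
            return quirky ? '' : null;
          }
        };
      }
    }
  };
}

// Browser sniffing: true only when the UA string starts with the one
// name that was hard-coded.
function sniffForQuirk(host) {
  return !host.navigator.userAgent.indexOf('KnownQuirky');
}

// Feature testing: asks the host object directly, whatever it calls itself.
function testForQuirk(host) {
  var el = host.document.createElement('td');
  return el.getAttribute('colspan') !== null;
}

var hosts = [
  makeFakeHost('KnownQuirky/1.0', true),
  makeFakeHost('BugCopier/2.0', true),   // same bug, different name
  makeFakeHost('Conforming/3.0', false)
];

var report = [];
for (var i = 0; i < hosts.length; i++) {
  report.push({
    userAgent: hosts[i].navigator.userAgent,
    sniffed: sniffForQuirk(hosts[i]),
    tested: testForQuirk(hosts[i])
  });
}
```

The sniff misses the second host even though it carries the same defect, while the feature test flags both quirky hosts and leaves the conforming one alone.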

Yep. 23 on the left. 2 on the right.

I think it should be down to the one now. ISTM that maybe Blackberry
won't allow the removal of a colspan attribute at all. That would be
an extreme use case anyway, so I'll just flag it (in case an app must
rely on colspan removal). Have to figure rowspan has the same problem
too. It would be more apparent if I made the test table visible.

And I added a notes section after the first set of tests to make it
easier to interpret results.

Thanks again!

Garrett Smith · Nov 29, 2009

Eric said:
Since there are numerous people posting to this group who appear to
think that SGML validation is a relevant tool for HTML QA, I am
delighted to say that you are wrong about that.

Good point. In fact, I distinctly remember an angry situation with URLs
containing literal ^ (unencoded %5E), somebody's unwillingness to fix
the URLs' generator, and HTML 4.01 Appendix B.2.

As it turns out, many browsers do not follow HTML 4.01 Appendix B.2
completely, and will not encode ^. Safari 4 for Windows does (or did).

In the source code - of HTML documents, at least - are attribute value
*literals*.

Thank you for bringing this point up.

The implication of this is outerHTML cannot be used directly for reading
attribute values. It could be used intermediately, to first get the
value, then set textContent of an element to that value, then get the
innerText of the element read. Hackery.

Entities in outerHTML may be resolved or they may be unresolved.
Seems IE chooses a mix of the two, plus entitification of various other
characters:

IE6-IE8:
+--------------+-----------+
| Source Code  | outerHTML |
+--------------+-----------+
| &            | &         |
| tab literal  |           |
| CRLF literal |           |
| '            | '         |
| »            | »         |
+--------------+-----------+

The results in the table show that some entities are resolved while
others are not, and that some literal characters result in creation of
entities where none occurred.

Pursuing a comprehensive way to read attribute values (like realAttr)
seems not worth the effort. Testing various encodings, all entities,
etc, sounds like a tremendous effort.

IE8 still does not support &apos;

My next table is going to be a list of clientHeight clientWidth for
documentElement and body in quirks mode and standards mode. I will test
tall, wide documents and short, narrow documents.

This will require the creation of many documents and a considerable
amount of time.

David Mark · Nov 29, 2009

Garrett said:
[...]

My next table is going to be a list of clientHeight clientWidth for
documentElement and body in quirks mode and standards mode. I will test
tall, wide documents and short, narrow documents.

Don't forget borders on the body. But I think you are wasting your
time.

This will require the creation of many documents and a considerable
amount of time.

So why bother? There's already a test for it. Why not try the test
page from the FAQ 9.3 thread in something other than the browsers
already tested? I really think it is close to perfect on the
measurement at this point and am going to update with scroll position
reporting (and setting) shortly.

David Mark · Nov 29, 2009

Garrett said:
[...]

The implication of this is outerHTML cannot be used directly for reading
attribute values. It could be used intermediately, to first get the
value, then set textContent of an element to that value, then get the
innerText of the element read. Hackery.

Oh, I missed this part. The outerHTML property is used (when
possible) in broken MSHTML implementations (and those that would mimic
them) to determine if attributes _exist_. That property is not used
to read attributes at all. So your worries are of no consequence to
the example at hand. I think you'll find the viewport example to be
the same case.
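
To illustrate the distinction between checking that an attribute exists and reading its value, here is a minimal sketch. It is an assumption about how such a check could look, not the code behind the test page. The hypothetical hasAttributeInOuterHTML takes the outerHTML string as an argument so it can run without a DOM, and it assumes the attribute name contains no regular expression metacharacters.

```javascript
// Reports whether the opening tag of an outerHTML string mentions the
// given attribute name. Only existence is checked; the value is never read,
// so entity handling in the serialized value does not matter here.
function hasAttributeInOuterHTML(outerHTML, name) {
  // Capture the attribute portion of the opening tag. Quoted values are
  // consumed whole so a '>' inside a value does not end the tag early.
  var match = /^\s*<[^\s>]+((?:[^>"']|"[^"]*"|'[^']*')*)>/.exec(outerHTML);
  if (!match) {
    return false;
  }
  // Blank out quoted values so a name inside a value cannot match.
  var attributeText = match[1].replace(/"[^"]*"|'[^']*'/g, '""');
  // `name` is assumed to be a plain attribute name (letters, digits, hyphens).
  var pattern = new RegExp('(?:^|\\s)' + name + '(?:\\s|=|/|$)', 'i');
  return pattern.test(attributeText);
}
```

For example, hasAttributeInOuterHTML('<TD id=testee colSpan=2></TD>', 'colspan') is true, while the same call on '<TD id="testee"></TD>' is false.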

David Mark · Nov 30, 2009

David said:

David Mark wrote: [...]
up test for broken MSHTML DOM's put it on the right path. Either way,
I'm quite pleased with the progress on the bizarre and ancient browser
front.
Removed column span: '2' is not null
Input checked property set: false is not true
The former likely means the Blackberry DOM can't remove that attribute
(and probably others).
Looking at the latter, I can see how that slipped through the net
(never feature tested that the boolean properties were consistent in
their reflections).
2 on the right (wrapped) and 30-something on the left (raw), right?
Yep. 23 on the left. 2 on the right.

I think it should be down to the one now. ISTM that maybe Blackberry

Yep, just the colspan one is left. I toyed with a minimal test case a
little and it looks like Blackberry does delete the attribute after all.

The reason the test is failing is that getAttribute('colspan') returns
an empty string instead of `null`.

Thanks! Will look into it. Would you tell me what the Notes section
says (after first set, link at the top?) I am just curious.

David Mark · Nov 30, 2009

David said:

Yep, just the colspan one is left. I toyed with a minimal test case a
little and it looks like Blackberry does delete the attribute after all.

The reason the test is failing is that getAttribute('colspan') returns
an empty string instead of `null`.

A test case was:

<table>
  <tbody>
    <tr><td colspan="2" id="testee"></td></tr>
  </tbody>
</table>

<script type="text/javascript">
(function(){
  var el = document.getElementById('testee');
  el.removeAttribute('colspan');
  // Escape the markup so the serialized row is displayed as text.
  document.write(
    el.parentNode.innerHTML
      .replace(/</g, '&lt;').replace(/>/g, '&gt;'));
  document.write('<br>' + el.hasAttribute('colspan'));
  document.write('<br>' + (el.getAttribute('colspan') === ''));
})();
</script>

and it resulted in:

<TD id="testee"></TD>
false
true

[...]

Yes, there are several attributes in odd browsers that fail in this
way. This is a typical feature test that is likely related to this
last quirk in Blackberry:-

var cellSpanAttributesBad = (function() {
  var el = doc.createElement('td');
  return el.getAttribute('colspan') !== null;
})();

After reducing the equations, outside of broken MSHTML
implementations, there are two forks that are nearly identical. The
difference is in when they use hasAttr to guard against unreliable
getAttribute results. Let me know if the above flag is mentioned in
the first notes section. ISTM it should be there as the only
difference in your test is that you removed the attribute (and the
tested element is in the document). But if that flag is set, the
workaround should be happening.
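
For readers following along, the guard described above might look roughly like this. It is a sketch under assumptions, not the wrapper from the test page: areCellSpanAttributesBad restates the feature test with the document passed in, and makeGetAttr is a hypothetical name. It also assumes the host's hasAttribute is present and trustworthy, as it appeared to be in the Blackberry test case.

```javascript
// The feature test from the post, parameterised on the document so it
// can be exercised with a stand-in object outside a browser.
function areCellSpanAttributesBad(doc) {
  var el = doc.createElement('td');
  return el.getAttribute('colspan') !== null;
}

// Hypothetical wrapper factory. The returned function yields null for an
// absent attribute even when the host's getAttribute returns '' (or some
// default) instead. The flag is computed once, up front.
function makeGetAttr(doc) {
  var guardWithHasAttribute = areCellSpanAttributesBad(doc);
  return function (el, name) {
    if (guardWithHasAttribute && el.hasAttribute && !el.hasAttribute(name)) {
      return null;
    }
    return el.getAttribute(name);
  };
}
```

In a browser this would be used as var getAttr = makeGetAttr(document); followed by getAttr(el, 'colspan'). Hosts that pass the feature test never pay for the extra hasAttribute call.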

Need to make the feature testing a little more specific to merge (or
diverge) the two forks.

Thanks again for your help on this. If you are curious, there are two
more sets of tests that deal with DOM properties.

David Mark · Nov 30, 2009

David said:

David Mark wrote:
[...]

Thanks! Will look into it. Would you tell me what the Notes section
says (after first set, link at the top?) I am just curious.

Cell span attributes bad

Okay, that should indicate the problem then. Thanks!

David Mark · Nov 30, 2009

David said:

[...]
Cell span attributes bad

Okay, that should indicate the problem then. Thanks!

Yeah, I think I see it. Based on feedback, I determined it was safe
to merge the two quasi-standard forks and eliminate all but that one
feature test, which not coincidentally relates to table cell spans. I
provided an easier out for hasAttr for browsers that botch table cell
attributes as well. That should do it.

I will add some more unit tests when I get a chance. I'm sure there
are more table-related attributes that should be considered for this
workaround. I know it isn't all attributes that correspond to number
properties as there is a test that removes the tabindex attribute.
I'll probably end up testing all of the numeric table-related
attributes and flagging for the GP workaround if one fails.

Thomas 'PointedEars' Lahn · Dec 8, 2009

Garrett said:
[BlackBerry browser] has some really weird javascript bugs in it, some
very undesirable behavior with respect to DOM recalc (it skips many), but
has decent DOM support and pretty good support of ECMA-262 r3.

That is a contradiction. No wait, two.

PointedEars

Thomas 'PointedEars' Lahn · Dec 15, 2009

David said:
The DOM properties interpret the attribute values.

DOM properties are separate from attribute values for the most part. They
represent the current value, not the value in the markup, where there is a
corresponding attribute to begin with.

In the case of URI's, you get the full path (in all browsers). That's why
you can't use properties - for example - to write an innerHTML emulation.

Yes, you can.

PointedEars

David Mark · Dec 15, 2009

DOM properties are separate from attribute values for the most part.

I wouldn't say _most_ part. There is a lot of reflection. It varies,
so you have to test at least some cases.

They represent the current value, not the value in the markup, where
there is a corresponding attribute to begin with.

Setting properties creates attributes in many cases. If a script
creates a DIV:-

var div = document.createElement('div');
div.id = 'test';

...the resulting structure is:-

<div id="test"></div>

div.getAttribute('id') == 'test'

The distinction is that many properties have _defaults_, so there is
no way to know if the attribute is there or not. That's where you
need to call hasAttribute (or an emulation). A common case where this
is necessary is:-

<option value="">Test</option>

...because the value property will vary cross-browser. What a
serialization function needs to get here is "", not "Test".
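
A minimal sketch of that OPTION case, assuming hasAttribute (or an emulation with the same behaviour) is available on the element. The names escapeAttr and serializeOptionOpenTag are hypothetical and used only for illustration.

```javascript
// Escapes the characters that matter inside a double-quoted attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// Builds the opening tag for an OPTION-like element. The value attribute is
// written only if it exists, and its content comes from getAttribute, never
// from the value property (which may fall back to the option's text).
function serializeOptionOpenTag(option) {
  var html = '<option';
  if (option.hasAttribute('value')) {
    html += ' value="' + escapeAttr(option.getAttribute('value')) + '"';
  }
  return html + '>';
}
```

With the markup above this produces an opening tag that carries value="", whereas an OPTION without the attribute serializes with no value attribute at all, regardless of what its value property reports.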

This must be an old post. As we've seen on the test page, some
browsers return unresolved URI's for some properties. I consider that
a bug, though there is no formal spec that says so, as it only makes
sense for the property to hold the resolved path (else how would you
get it?) Most modern browsers do this for a.href anyway. Where they
fail, the - prop - wrapper compensates (when possible). From the
version that is up there now:-

Known Exceptions

* IE6/7 and IE8 compatibility mode return unresolved paths for the
action, usemap, longdesc and link href attributes.

I think I've added a few to that list.

And BTW, in case you are curious, there's a typo in that test in the
version up there now. There's a line that sets an href property of a
dummy anchor, but in the test it is setting the wrong property. It's
only working by coincidence, but I have tested without the coincidence
locally. I should have the new version, which has a lot more tests
and filters out user input (another source of distinction between
properties and attributes) up soon. I think it will turn out to be a
good test page for browser developers (especially the IE team).

Yes, you can.

A pretty crappy one. But the realAttr (renamed attr now) wrapper will
serialize a document without such contamination.

Thomas 'PointedEars' Lahn · Dec 15, 2009

David said:
I wouldn't say _most_ part. There is a lot of reflection. It varies,
so you have to test at least some cases.

Setting properties creates attributes in many cases. If a script
creates a DIV:-

var div = document.createElement('div');
div.id = 'test';

...the resulting structure is:-

<div id="test"></div>

div.getAttribute('id') == 'test'

The distinction is that many properties have _defaults_, so there is
no way to know if the attribute is there or not.

True. But I do not need to know.

That's where you need to call hasAttribute (or an emulation).

No.

A common case where this
is necessary is:-

<option value="">Test</option>

...because the value property will vary cross-browser. What a
serialization function needs to get here is "", not "Test".

This must be an old post.

Yes, I am trying hard to keep up with you ;-)

As we've seen on the test page, some browsers return unresolved URI's for
some properties.

There is no such thing as an "unresolved URI". There are URIs (e.g.
scheme://host/path?query#fragment) and there are URI-references (e.g.
path?query#fragment). See also RFC 3986, which obsoletes RFC 2396 as
referenced in W3C DOM Level 2 HTML.
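
As an illustration of the difference, here is a small sketch using the URL constructor found in current browsers and Node.js, which postdates this thread. The host names and paths are invented for the example.

```javascript
// A URI-reference as it might appear in markup, and the URI of the
// document that contains it (both invented for this example).
var reference = 'path?query#fragment';
var base = 'http://host.example/dir/page.html';

// The URL constructor resolves the reference against the base,
// producing a full URI.
var resolved = new URL(reference, base).href;

// A string that already has a scheme is a URI in its own right,
// so the base does not change it.
var alreadyAbsolute = new URL('http://other.example/x', base).href;
```

Here resolved is 'http://host.example/dir/path?query#fragment' and alreadyAbsolute is 'http://other.example/x'. A property that reflects the first form is returning the URI; one that returns 'path?query#fragment' is returning the URI-reference from the markup.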

I consider that a bug, though there is no formal spec that says so,

But there is. You can find "URI" everywhere in

<http://www.w3.org/TR/DOM-Level-2-HTML/html.html>

sometimes strengthened by the word "absolute". You will find no occurrence
whatsoever of "URI-reference".

as it only makes sense for the property to hold the resolved path (else
how would you get it?)

Using the proprietary `location' property, I presume. And yes, it is a bug
for an otherwise conforming implementation not to yield a URI there.

Most modern browsers do this for a.href anyway. Where they
fail, the - prop - wrapper compensates (when possible).

ACK

A pretty crappy one.

How so?

But the realAttr (renamed attr now) wrapper
will serialize a document without such contamination.

Hmmm. Is it not more reasonable to assume that if an attribute property has
been feature-tested to exist and has the default value the corresponding
attribute specification does not need to be part of the serialized version
at all?

PointedEars

David Mark · Dec 16, 2009

True. But I do not need to know.

Depends on what you are trying to do.

No.

Did you read the example below?

That's what I'm talking about.

Yes, I am trying hard to keep up with you ;-)

There is no such thing as an "unresolved URI". There are URIs (e.g.
scheme://host/path?query#fragment) and there are URI-references (e.g.
path?query#fragment). See also RFC 3986, which obsoletes RFC 2396 as
referenced in W3C DOM Level 2 HTML.

But there is. You can find "URI" everywhere in

<http://www.w3.org/TR/DOM-Level-2-HTML/html.html>

sometimes strengthened by the word "absolute". You will find no occurrence
whatsoever of "URI-reference".

Somewhere in this thread, it was asserted that the spec left all but
a.href open to interpretation.

Using the proprietary `location' property, I presume. And yes, it is a bug
for an otherwise conforming implementation not to yield a URI there.

How so?

Would be full of DOM defaults, user input, resolved paths, etc., so it
would vary wildly from one browser to the next and would never give a
clear view of the underlying document.

Hmmm. Is it not more reasonable to assume that if an attribute property has
been feature-tested to exist and has the default value the corresponding
attribute specification does not need to be part of the serialized version
at all?

But generally you have no way of knowing what the default is. If you
want to serialize something like this:-

<div id="test"></div>

...you don't normally want this:-

<div id="test" maxlength="1234567" tabindex="0" ... ></div>

What would you do with such a novelty?

Similarly, if you have this structure:-

<input name="test" value="test">

the "proper" serialization would not normally be:-

<input name="test" value="last thing the user typed">

The value attribute is reflected by defaultValue, not its namesake
property. The same goes for checked and selected.
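
A short sketch of that distinction. The helper names getMarkupState and getLiveState are hypothetical, and fakeInput is a plain object standing in for an INPUT element so the example runs without a DOM.

```javascript
// Reads the state that corresponds to the markup of an INPUT-like element.
// defaultValue and defaultChecked reflect the value and checked attributes.
function getMarkupState(input) {
  return {
    value: input.defaultValue,
    checked: input.defaultChecked
  };
}

// For comparison: the live state, which changes as the user types or clicks.
function getLiveState(input) {
  return {
    value: input.value,
    checked: input.checked
  };
}

// Stand-in for <input name="test" value="test"> after the user has typed.
var fakeInput = {
  name: 'test',
  defaultValue: 'test',
  value: 'last thing the user typed',
  defaultChecked: false,
  checked: false
};

var markupState = getMarkupState(fakeInput);
var liveState = getLiveState(fakeInput);
```

Here markupState.value is 'test' while liveState.value is 'last thing the user typed'. A serializer that wants the document rather than the session would read the former. For OPTION elements the corresponding property is defaultSelected.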

These details matter for a number of applications (though they should
not for a basic Web app). A basic (consistent) innerHTML emulation is
the first thing that comes to mind. Serialization of edited HTML is
another. The proprietary innerHTML as seen by the host does not make
a good canonical form (e.g. the form to send to the DB on the server).

Then there are these silly CSS selector query engines, which have
become a ludicrous standard fixture for "Real World" Web apps. A lot
of them use XPath. Some purport to support both HTML and XML DOM's.
Others use QSA. So the requirements for an alternate jQuery-ish fork
are clear. Any variations between the various forks will lead to
sporadic incompatibilities that will be virtually impossible to track
down without writing a dissertation on the underlying library code
(probably more than the average code monkey bargains for).

Thomas 'PointedEars' Lahn · Dec 16, 2009

David said:
Somewhere in this thread, it was asserted that the spec left all but
a.href open to interpretation.

Then that assertion was wrong. A URI is not a URI-reference. Read RFC 3986
(or the referred 2396, for that matter).

Would be full of DOM defaults, user input,

No.

resolved paths,

So what?

etc., so it would vary wildly from one browser to the next and would never
give a clear view of the underlying document.

What you appear to be overlooking is that it was never supposed to.

But generally you have no way of knowing what the default is. If you
want to serialize something like this:-

<div id="test"></div>

...you don't normally want this:-

<div id="test" maxlength="1234567" tabindex="0" ... ></div>

What would you do with such a novelty?

It is not going to happen in the first place. My implementation would
consider only properties specified in W3C DOM Level 2 HTML. We are not
dealing with any markup language here, but with HTML. Since responsible Web
development forbids augmenting host objects, proprietary attributes and
properties, I would simply ignore them. Granted, that is not the same as
`innerHTML'; it is a lot better.

Similarly, if you have this structure:-

<input name="test" value="test">

the "proper" serialization would not normally be:-

<input name="test" value="last thing the user typed">

The value attribute is reflected by defaultValue, not its namesake
property. The same goes for checked and selected.

Some exceptions to the rule need to be made, of course. That does not mean
one needs a full-blown getAttribute() fixing wrapper to do this.

These details matter for a number of applications (though they should
not for a basic Web app).

Exactly, they should not.

A basic (consistent) innerHTML emulation is
the first thing that comes to mind.

That kind of emulation would need to be restricted to attributes specified
in HTML and properties specified in W3C DOM Level 2 HTML, right?

Serialization of edited HTML is another.

I do not see your point.

The proprietary innerHTML as seen by the host does not make
a good canonical form (e.g. the form to send to the DB on the server).

Exactly my point. So is it not next to stupid to try to emulate it as
exactly as possible?

Then there are these silly CSS selector query engines, which have
become a ludicrous standard fixture for "Real World" Web apps. A lot
of them use XPath.

XPath does not work with HTML in MSHTML, and in Gecko & friends I can use
the native XPath implementation. I do not see your point.

Some purport to support both HTML and XML DOM's.

So what?

Others use QSA. So the requirements for an alternate jQuery-ish fork
are clear. Any variations between the various forks will lead to
sporadic incompatibilities that will be virtually impossible to track
down without writing a dissertation on the underlying library code
(probably more than the average code monkey bargains for).

You cannot fix junk, you can only replace it with something better which
means that you must not implement all its quirks.

PointedEars

David Mark · Dec 16, 2009

Then that assertion was wrong. A URI is not a URI-reference. Read RFC 3986
(or the referred 2396, for that matter).

Then the spec agrees with my original opinion that they should all
resolve to URI's. Garrett said it was only specified for a.href.

No.

No? How would you know which are defaults?

So what?

So, it's not the markup you are trying to serialize.

What you appear to be overlooking is that it was never supposed to.

What wasn't?

Typo. That was supposed to be an INPUT example.

It is not going to happen in the first place. My implementation would
consider only properties specified in W3C DOM Level 2 HTML.

See above.

We are not
dealing with any markup language here, but with HTML. Since responsible Web
development forbids augmenting host objects, proprietary attributes and
properties, I would simply ignore them.

I wasn't talking about those.

Granted, that is not the same as
`innerHTML'; it is a lot better.

Depends on the context. It wouldn't be better for the examples I
listed.

Some exceptions to the rule need to be made, of course. That does not mean
one needs a full-blown getAttribute() fixing wrapper to do this.

I've mentioned that several times in this thread. Best to test (and
fix) only the features you need. However I've listed a few examples
where you would need the whole thing.

Exactly, they should not.

Unfortunately, the current linchpins for the "major" libraries are
CSS selector queries. So everyone using them is potentially affected
by these variations.

That kind of emulation would need to be restricted to attributes specified
in HTML and properties specified in W3C DOM Level 2 HTML, right?

Not necessarily, no.

I do not see your point.

Have you ever written an editor?

Exactly my point. So is it not next to stupid to try to emulate it as
exactly as possible?

That's not what I said at all. The emulation I propose would be
consistent cross-browser. If the host innerHTML properties were 100%
consistent and standardized, there would be no need for such a
solution.

XPath does not work with HTML in MSHTML,

Exactly.

and in Gecko & friends I can use
the native XPath implementation.

Right.

I do not see your point.

You just reinforced it. Have you ever written a CSS selector query?
If not, think about what you use XPath for in Gecko and imagine what
you would need to do to duplicate it in IE.

So what?

So, you seem lost.

You cannot fix junk, you can only replace it with something better which
means that you must not implement all its quirks.

You have missed the point entirely. I am not talking about
replicating quirks at all.

Thomas 'PointedEars' Lahn · Dec 16, 2009

David said:
No? How would you know which are defaults?

From the specifications.

So, it's not the markup you are trying to serialize.

Not exactly, that is correct. Like innerHTML, the innerHTML replacement
implementation only needs to provide something that resembles the original
markup enough for it to work; in the case of the replacement that means that
it needs to be consistent in one user agent, and interoperable among user
agents if possible.

What wasn't?

`innerHTML' does not give "a clear view of the underlying document" either.
But it does not need to nor would it appear that it was supposed to.

[...] If you want to serialize something like this:-

<div id="test"></div>

...you don't normally want this:-

<div id="test" maxlength="1234567" tabindex="0" ... ></div>

Typo. That was supposed to be an INPUT example.

It is not going to happen in the first place. My implementation would
consider only properties specified in W3C DOM Level 2 HTML.

See above.

The default values for `maxLength' and `tabIndex' are -1 and 0 in a Gecko-
based browser. Obviously the former (or the value 0) does not make sense so
it can be safely ignored for serialization. Per HTML 4.01, it only needs to
be considered for type="text" or type="password" anyway.

As for the latter, if one were to avoid the attribute specification, when in
doubt hasAttribute() or getAttribute() can be called for comparison.

I wasn't talking about those.

Look, I am not to guess your thoughts; you will have to tell them or
consider your "argument" discarded.

Depends on the context. It wouldn't be better for the examples I
listed.

Unfortunately, your "examples" are too general to be useful in a discussion.

I've mentioned that several times in this thread.

I do not want to read that whole mostly full-quoted thread and sieve your
possible points out of it. If you want to prove something, prove it here.
If you do not want to repeat yourself too much, you can support the argument
with a Message-ID to one of your postings.

Best to test (and fix) only the features you need. However I've listed a
few examples where you would need the whole thing.

Name them. And no more commonplace examples, please.

Unfortunately, the current linchpins for the "major" libraries are
CSS selector queries. So everyone using them is potentially affected
by these variations.

Their problem. Your argument is too general to be useful, again.

Not necessarily, no.

It was a rhetorical question. Most certainly the answer is yes.
Those two Specifications are the lowest common denominator.

Have you ever written an editor?

No, but I have debugged one. Get to the point, please.

That's not what I said at all. The emulation I propose would be
consistent cross-browser.

I will look into it if and when I find the time. Until then, I will
continue writing my own.

If the host innerHTML properties were 100% consistent and standardized,
there would be no need for such a solution.

Correct, but useless.

You just reinforced it. Have you ever written a CSS selector
query?

No.

If not, think about what you use XPath for in Gecko

I use it to retrieve elements by type identifier or attribute or ancestor-
descendant relationship. Aside from the `class' attribute in (X)HTML, CSS
does not even enter into my considerations.

and imagine what you would need to do to duplicate it in IE.

This is not a guessing game. Get to the point, please.

So, you seem lost.

Likewise.

You have missed the point entirely. I am not talking about
replicating quirks at all.

So, what are you talking about then?

And try to keep your quotes short, would you, please?

PointedEars

David Mark · Dec 16, 2009

From the specifications.

I see. Unfortunately, some browser developers (e.g. MS) don't see the
specifications as firm rules.

Not exactly, that is correct. Like innerHTML, the innerHTML replacement
implementation only needs to provide something that resembles the original
markup enough for it to work; in the case of the replacement that means that
it needs to be consistent in one user agent, and interoperable among user
agents if possible.

Interoperability through consistency is the point.

`innerHTML' does not give "a clear view of the underlying document" either.

That's the point. My example is a replacement that does. As
mentioned, if the innerHTML were standardized to the point of
interoperability, this would be a moot point.

But it does not need to nor would it appear that it was supposed to.

See directly above.

[...] If you want to serialize something like this:-
<div id="test"></div>
...you don't normally want this:-
<div id="test" maxlength="1234567" tabindex="0" ... ></div>

Typo. That was supposed to be an INPUT example.

See above.

The default values for `maxLength' and `tabIndex' are -1 and 0 in a Gecko-
based browser.

So?

Obviously the former (or the value 0) does not make sense so
it can be safely ignored for serialization.

The latter? How do you figure a default tab index of 0 makes no
sense? It makes perfect sense to me.

Per HTML 4.01, it only needs to
be considered for type="text" or type="password" anyway.

That illustrates just how old that spec is. It was just the starting
point. Obviously if you ignored tab index for all but text and
password inputs today, you would miss a lot of significant
information.

As for the latter, if one were to avoid the attribute specification, when in
doubt hasAttribute() or getAttribute() can be called for comparison.

Not sure what you mean about avoiding the attribute specification. As
for hasAttribute, that was introduced by MS in IE8 (standards mode
only). And, as I hope we all know by _now_ (two years since this
subject was brought up and beaten to death), get/set/removeAttribute
are all screwy in IE < 8 (and IE8 compatibility mode). So what are
you saying?

For a "Real World" example, at the recent jQuery attribute summit,
somebody pointed out that jQuery UI uses these DOM methods
(sparingly), but calls jQuery's odd assortment of "wrappers" more
often.

"47 occurrences of .attr() (a mix of string and object argument
syntaxes) and 12 .removeAttr()'s"

What does that tell you? It was determined by the panel that:-

"jQuery UI is more then expected to work browser independently, its
implied by its use."

I wouldn't expect their cunning plan to work any better than whatever
it is you mean by using has/getAttribute "when in doubt". My position
is there shouldn't be any real doubt about these methods at this
point.

Look, I am not to guess your thoughts; you will have to tell them or
consider your "argument" discarded.

Discard at will. Like I said, never mind custom attributes.
Obviously, we are talking about the standard ones.

Unfortunately, your "examples" are too general to be useful in a discussion.

That's your opinion.

I do not want to read that whole mostly full-quoted thread and sieve your
possible points out of it.

Okay.

If you want to prove something, prove it here.

I'm not trying to prove anything.

If you do not want to repeat yourself too much, you can support the argument
with a Message-ID to one of your postings.

Thanks.

Name them. And no more commonplace examples, please.

What do you consider commonplace?

Their problem.

Exactly.

Your argument is too general to be useful, again.

I don't see that at all.

It was a rhetorical question. Most certainly the answer is yes.

I don't see that either. I say no.

Those two Specifications are the lowest common denominator.

You could still include custom attributes in a serialization. I'm not
saying it would be particularly useful though.

No, but I have debugged one. Get to the point, please.

I made the point about the editor.

I will look into it if and when I find the time. Until then, I will
continue writing my own.

Look into what? There are some related wrappers in My Library (e.g.
getElementHtml, getElementOuterHtml).

Correct, but useless.

I don't follow that.

No.

Well, then perhaps you haven't considered what goes into it. Think
about it.

I use it to retrieve elements by type identifier or attribute or ancestor-
descendant relationship. Aside from the `class' attribute in (X)HTML, CSS
does not even enter into my considerations.

Okay.

This is not a guessing game. Get to the point, please.

How would you duplicate any or all of those XPath tasks in IE?

Likewise.

Well, I'm not.

So, what are you talking about then?

At this point, that's my line.

And try to keep your quotes short, would you, please?

Sure.

Thomas 'PointedEars' Lahn · Dec 16, 2009

David said:
I see. Unfortunately, some browser developers (e.g. MS) don't see the
specifications as firm rules.

One can determine which default values of attribute properties are necessary
to include in the serialization and which are not.

Interoperability through consistency is the point.

However, this contradicts your requirement that there would need to be a
bijection between the element object and the serialization of it.

That's the point. My example is a replacement that does. As
mentioned, if the innerHTML were standardized to the point of
interoperability, this would be a moot point.

See directly above.

Still nobody needs that.

The latter?

No, the former, maxLength < 0 or maxLength == 0.

How do you figure a default tab index of 0 makes no
sense? It makes perfect sense to me.

Check your assumptions. tabIndex == 0 is the same as if the `tabindex'
attribute was not supported on an element or was not specified (omitted).

That illustrates just how old that spec is.

No, if it illustrates anything then that it is the lowest common
denominator, the target for achieving interoperability.

It was just the starting point. Obviously if you ignored tab index for
all but text and password inputs today, you would miss a lot of
significant information.

Not information significant to interoperability, which was the point of the
whole exercise.

Not sure what you mean about avoiding the attribute specification.

It means not serializing an attribute property and its value into
`attribute="value"' because it would not make a difference.

As for hasAttribute, that was introduced by MS in IE8 (standards mode
only). And, as I hope we all know by _now_ (two years since this
subject was brought up and beaten to death), get/set/removeAttribute
are all screwy in IE < 8 (and IE8 compatibility mode).

So "screwy" that you cannot use it to differentiate whether the attribute
was specified or not? I doubt it.

That's your opinion.

Which should be relevant to you as you are discussing this with me. Unless
your purpose here is just to state something and to hell with the
contradictions.

I'm not trying to prove anything.

What are you up to, then?

What do you consider commonplace?

Too general a description.

I don't see that at all.

Well, you are talking rather nebulously about potential problems. Why not
take this opportunity to name some of the perceived problems explicitly and
concisely instead, to support your argument?

You could still include custom attributes in a serialization. I'm not
saying it would be particularly useful though.

Most importantly, it would not be interoperable, so we can safely ignore
them by default.

I made the point about the editor.

No, you failed to do that by asking a closed question which could be
understood as a red herring.

I don't follow that.

While stating the obvious is no doubt a correct statement in itself, it does
not help with this discussion.

Well, then perhaps you haven't considered what goes into it. Think
about it.

I am growing tired of your commonplace arguments.

How would you duplicate any or all of those XPath tasks in IE?

Type identifier is easy as is ancestor-descendant relationship. Attributes
are a bit harder, but not unsolvable.
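
As a rough sketch of the two "easy" cases, assuming nothing beyond the basic DOM node members `nodeType`, `nodeName`, `childNodes` and `parentNode`: the function names `getDescendantsByType` and `hasAncestorOfType` are hypothetical, and the attribute case (the hard part) is deliberately left out.

```javascript
// Collect descendant elements of `root` whose nodeName matches `typeName`
// ('*' matches any element), in document order.
function getDescendantsByType(root, typeName) {
  var wanted = typeName.toUpperCase();
  var found = [];
  (function walk(node) {
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      var child = children[i];
      if (child.nodeType == 1) { // element nodes only
        if (wanted == '*' || child.nodeName.toUpperCase() == wanted) {
          found.push(child);
        }
        walk(child);
      }
    }
  })(root);
  return found;
}

// True if some element in the parentNode chain of `node` has the given type.
function hasAncestorOfType(node, typeName) {
  var wanted = typeName.toUpperCase();
  for (var p = node.parentNode; p; p = p.parentNode) {
    if (p.nodeType == 1 && p.nodeName.toUpperCase() == wanted) {
      return true;
    }
  }
  return false;
}
```

Filtering the result of getDescendantsByType(root, 'a') with hasAncestorOfType(node, 'div') gives approximately what the XPath expression //div//a would select below that root.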

That said, I would not even attempt using XPath in IE except for XHTML
served as text/xml or application/xml where there is a native
implementation. So you see, you are tackling the problem from the wrong
side: XPath in HTML is a (proprietary) bonus in Gecko (and perhaps in other
!MSHTMLs) that can be taken advantage of on occasion, not a lack in MSHTML
that needs to be compensated for.

PointedEars

David Mark · Dec 17, 2009

One can determine which default values of attribute properties are necessary
to include in the serialization and which are not.

You don't know _which_ property values are defaults. For instance,
maxlength has a default of some very large number in MSHTML. And
this:-

<a href="..." tabindex="0">

...is not the same as this:-

<a href="...">

...and no, you cannot just go with the latter.

However, this contradicts your requirement that there would need to be a
bijection between the element object and the serialization of it.

You are very confused (and not making sense at all). Bijection?

Still nobody needs that.

Those who would rely on CSS selector queries would certainly need it.
Same for an editor that must save its results.

And you said yourself that you would "refer" to hasAttribute and
getAttribute, when in "doubt". How would you do that if those methods
are missing or broken?
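
To make the question concrete, here is a sketch of one commonly used fallback when hasAttribute is missing: asking the attribute node for its `specified` flag. This is an illustration only, not the code in My Library; `isHostMethod` and `isAttributeSpecified` are names chosen for this example, and whether `specified` is reliable for every attribute in every agent is exactly what is in dispute here.

```javascript
// Host methods in older MSHTML report typeof 'object' or 'unknown',
// so a plain typeof == 'function' test is not enough.
function isHostMethod(obj, name) {
  var type = typeof obj[name];
  return type == 'function' ||
    (type == 'object' && !!obj[name]) ||
    type == 'unknown';
}

// Returns true or false when the host can say whether the attribute was
// specified, and null when it offers no way to find out.
function isAttributeSpecified(el, name) {
  if (isHostMethod(el, 'hasAttribute')) {
    return !!el.hasAttribute(name);
  }
  if (isHostMethod(el, 'getAttributeNode')) {
    var attr = el.getAttributeNode(name);
    return !!(attr && attr.specified);
  }
  return null;
}
```

A caller that gets null back has, in the words above, nothing to go on and must decide for itself whether to emit the attribute.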

No, the former, maxLength < 0 or maxLength == 0.

Regardless, the default is some huge integer in MSHTML. Will you
throw out every value that is either very large or negative?

Check your assumptions. tabIndex == 0 is the same as if the `tabindex'
attribute was not supported on an element or was not specified (omitted).

Nope. You are dead wrong on that. Leaving it off will disallow
tabbing to that element in some agents. And if it is not supported,
the property value is typically undefined. Also, leaving it off - for
example - a DIV will result in a default property value of -1 in some
agents. You've got nothing to go on.

It's really simple. Either you can serialize a document or you
can't. I've demonstrated how to do it (e.g. getElementOuterHtml in My
Library) and you have speculated how you might do it. I'm telling you
that your proposed algorithm will result in markup that looks like the
output of MS Word.

<http://www.w3.org/TR/html401/interact/forms.html#adef-tabindex>

Forms? It's not as if form elements are the only concern for
tabindex.

No, if it illustrates anything then that it is the lowest common
denominator, the target for achieving interoperability.

All I am saying is that you could optionally allow non-standard
attributes.

Not information significant to interoperability, which was the point of the
whole exercise.

Wrong. See above.

It means not serializing an attribute property and its value into
`attribute="value"' because it would not make a difference.

Still not clear what you mean.

So "screwy" that you cannot use it to differentiate whether the attribute
was specified or not? I doubt it.

You better believe it. And if you don't, that explains your
"position" on this.

Which should be relevant to you as you are discussing this with me. Unless
your purpose here is just to state something and to hell with the
contradictions.

Your assertion about "general examples" has no technical meaning.

What are you up to, then?

Too general a description.

Whatever.

Well, you are talking rather nebulously about potential problems. Why not
take this opportunity to name some of the perceived problems explicitly and
concisely instead, to support your argument?

As you mentioned, you are way behind on this. Catch up and your
questions will have been answered.

Most importantly, it would not be interoperable, so we can safely ignore
them by default.

No, you failed to do that by asking a closed question which could be
understood as a red herring.

You just can't get your brain around this or you are deliberately
obfuscating the points made. I don't really care which at this point.

While stating the obvious is no doubt a correct statement in itself, it does
not help with this discussion.

I am growing tired of your commonplace arguments.

There's that word again.

Type identifier is easy as is ancestor-descendant relationship. Attributes
are a bit harder, but not unsolvable.

Read that last bit again. You summarized my point exactly. Take it a
step further and realize that none of the "major" libraries have even
tried to solve it.

That said, I would not even attempt using XPath in IE except for XHTML
served as text/xml or application/xml where there is a native
implementation.

Of course you wouldn't attempt to use XPath with MSHTML (for an HTML
DOM). That's why you would have to write an equivalent script. And
that requires...

So you see, you are tackling the problem from the wrong
side: XPath in HTML is a (proprietary) bonus in Gecko (and perhaps in other
!MSHTMLs) that can be taken advantage of on occasion, not a lack in MSHTML
that needs to be compensated for.

You clearly don't know what I'm doing or why. Strange.
