A little afternoon WTF

markspace

Mike said:
That is, that LF, CR, and CRLF are all translated to simple LF by a
conforming XML processor.


Hmm, you're right. The combination of CRLF is translated to a single
LF. In some ways that's worse. It seems to me it could be difficult in
many cases to remove a character already in a buffer. If you're using
something like String and not a stream, then that's a lot of places that
one character will have to be removed from the input, causing two
strings to have to be concatenated.
 
Mike Schilling

markspace said:
Hmm, you're right. The combination of CRLF is translated to a single
LF. In some ways that's worse. It seems to me it could be difficult
in many cases to remove a character already in a buffer. If you're
using something like String and not a stream, then that's a lot of
places that one character will have to be removed from the input,
causing two strings to have to be concatenated

If you're converting bytes to chars, dropping the CR is a very minor issue.
If not, dropping the CR is a bit of a performance issue, since it does
require copying characters around, whether to create a String (as in DOM),
or to present the client with an array of chars (as in SAX). In either
case, it's simpler than handling escapes (like &gt; for >).
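
A minimal sketch of the copy in question, assuming the input is already decoded to chars. The class and method names are made up for illustration and come from no real parser; it just walks the input once and writes into a fresh buffer, so nothing is ever removed from the middle of a String.

```java
// Hypothetical helper for illustration only; not taken from any XML parser.
final class LineEndings {
    private LineEndings() {}

    /** Translates CRLF and lone CR to LF in a single pass over the input. */
    static String normalize(CharSequence in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == '\r') {
                out.append('\n');
                // Skip the LF of a CRLF pair so the pair yields a single LF.
                if (i + 1 < in.length() && in.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

The cost is one copy of the text, which is the copy a DOM or SAX implementation has to make anyway to hand characters to the client.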
 
Lew

Possibly so that all the XML is on lines with nothing BUT XML,
possibly for grepping/searching purposes.

To which the empty string component contributes absolutely nothing whatsoever
at all in the least.

Tom said:
Oh god, i've [sic] just spotted another one: the hardcoded CRLF! This is a
linux [sic]-only project (up to and including developing on linux [sic] VMs - the
only time you'd ever look at this file would be on a linux [sic] machine),
and XML normalises all line breaks to LF anyway. Why would you do that?
Perhaps in multi-platform environment, the coder had occasion
to open up the XML in a windows [sic] text editor.

All but one of which handle LF-terminated lines just fine. And which tom
explained, in the passage you cited no less, is not the case here regardless.
 
Tom Anderson

Perhaps in multi-platform environment, the coder had occasion
to open up the XML in a windows text editor.

I doubt it's quite that, but it is surely along those lines - to do with
the fact that the person who wrote this is more used to Windows: to them,
CRLF is the standard line ending, and they probably wrote the above
reflexively, just as i would have incorrectly-ish used just LF if i was
using Windows for some reason.

tom
 
Tom Anderson

Well, I don't know if I can consider the SQL to be fine, not if you include
the hardcoded "id" value of 2057. I can see that it's evidently test data but
that still makes me queasy.

Oh, don't worry, that's there in the production version too!

tom
 
Roedy Green

What XML package? I'm not sure what you mean.

He may have experimented with various XML read/write packages, see
http://mindprod.com/jgloss/xml.html and had trouble generating
precisely what he wanted. He then decided to eschew XML packages (at
least for writing) since it is not particularly difficult to do
manually, and since you have precise control.

--
Roedy Green Canadian Mind Products
http://mindprod.com

Beauty is our business.
~ Edsger Wybe Dijkstra (born: 1930-05-11 died: 2002-08-06 at age: 72)

Referring to computer science.
 
Roedy Green

Possible, although it seems a bit of a stretch. There are string constants
all over that bit of the codebase, and i can't think of a single instance
of the ""+int trick in the entire system. I think it's pretty bad
practice, so i'd remember if i'd seen it. It's quite possible he learned
the idiom from that use, though.

The only other way I could see a "" + "legitimately" appearing in code
is when you convert some string to "" with global search replace. My
gut feel is the guy was just an idiot who "stuttered" for much the
same non-motive of other stuttering.

See http://mindprod.com/jgloss/stuttering.html
 
Andreas Leitgeb

John B. Matthews said:
Much better! Double + is surely ungood.

I'd prefer this:

private static final String HEADER = ""
+ "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
+ "<initech:tps-report><initech:coversheet> etc";

I'm a bit over-biased towards simple diffs, therefore I prefer
if a line can easily be removed or added without also having to
change something in a neighbouring line. (that tends to spoil
tkdiff's presentation of the file's changes)
If the last line was expected to be volatile as well, I'd even
place the semicolon on a line by itself just below the +'es.
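
Spelled out, that last variant would look something like the sketch below. The enclosing class name is hypothetical, added only so the snippet compiles on its own; the string content is the same placeholder as above.

```java
// Hypothetical wrapper class so the constant compiles stand-alone.
final class TpsReportXml {
    private TpsReportXml() {}

    // The leading "" lets every real line start with '+', and the lone ';'
    // means even the last line can be added or removed as a one-line diff.
    static final String HEADER = ""
        + "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
        + "<initech:tps-report><initech:coversheet> etc"
        ;
}
```

All the operands are compile-time constants, so the compiler folds the whole expression into one string literal and the extra "" costs nothing at run time.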
 
David Lamb

I'd prefer this:

private static final String HEADER = ""
+ "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
+ "<initech:tps-report><initech:coversheet> etc";

I'm a bit over-biased towards simple diffs,

I seem to recall, from about 30 years ago, the company I worked at had
similar formatting rules (details escape me) exactly so that simple text
diffs would work better. In your example the reason would have been
that the real first line of the string (line 2) shouldn't look any
different from the later ones, so that it would not change if one added
another string before it, e.g.
private static final String HEADER = ""
+ "some other new string"
+ "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
+ "<initech:tps-report><initech:coversheet> etc";

A simple diff would show a 1-line insertion, with no change to the other
lines.

The thing is, with current programming environments, is this still an
issue for anybody?
 
Lew

I seem to recall, from about 30 years ago, the company I worked at had
similar formatting rules (details escape me) exactly so that simple text
diffs would work better. In your example the reason would have been that
the real first line of the string (line 2) shouldn't look any different
from the later ones, so that it would not change if one added another
string before it, e.g.
private static final String HEADER = ""
+ "some other new string"
+ "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
+ "<initech:tps-report><initech:coversheet> etc";

A simple diff would show a 1-line insertion, with no change to the other
lines.

The thing is, with current programming environments, is this still an
issue for anybody?

The other thing is, with the rule that the <?xml ...?> line has to be first in
this case, would this trick ever actually have any value in this case?
 
Andreas Leitgeb

I'm a bit over-biased towards simple diffs,
I seem to recall, from about 30 years ago, the company I worked at had
similar formatting rules (details escape me) exactly so that simple text
diffs would work better. [...]
The thing is, with current programming environments, is this still an
issue for anybody?

I do use tkdiff a lot, myself, for code. It is a substantial part of my
programming environment. I make no claim about whether my p.e. is in any
way "typical", though. (It also involves vim and CVS)

Lew said:
The other thing is, with the rule that the <?xml ...?> line has to be first in
this case, would this trick ever actually have any value in this case?

It could(*) happen, that at some point, a new method is created that writes
the boilerplate itself (to avoid repetition among several xml-generating
parts of code), followed by the less boilerplate parts passed in as String.
In that case, the String itself shouldn't any longer contain the header,
so the first line would need to be removed from each such literal.

*: I won't make any guesses at how often/likely that would actually happen.
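
A rough sketch of what such a method might look like. The names XmlDocs and withDeclaration are invented for this example and appear nowhere in the code under discussion.

```java
// Hypothetical helper; names are invented for illustration.
final class XmlDocs {
    private XmlDocs() {}

    private static final String XML_DECLARATION =
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";

    /** Prepends the XML declaration; callers pass only the document body. */
    static String withDeclaration(String body) {
        return XML_DECLARATION + body;
    }
}
```

Once something like that exists, each literal loses its first "+" line, and with the leading "" in place that is a one-line deletion per literal.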
 
Lars Enderin

On 17/05/2010 7:07 AM, Andreas Leitgeb wrote:
I'm a bit over-biased towards simple diffs,
I seem to recall, from about 30 years ago, the company I worked at had
similar formatting rules (details escape me) exactly so that simple text
diffs would work better. [...]
The thing is, with current programming environments, is this still an
issue for anybody?

I do use tkdiff a lot, myself, for code. It is a substantial part of my
programming environment. I make no claim about whether my p.e. is in any
way "typical", though. (It also involves vim and CVS)

I find emacs and its ediff indispensable in my programming environment.
 
Jim Janney

Roedy Green said:
He may have experimented with various XML read/write packages, see
http://mindprod.com/jgloss/xml.html and had trouble generating
precisely what he wanted. He then decided to eschew XML packages (at
least for writing) since it is not particularly difficult to do
manually, and since you have precise control.

For what it's worth, here's some code I wrote in 2003. It's part of a
JUnit test for a custom XML parsing framework.

private final static String header = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";

private final static String doc1 =
header
+ "<root foo='bar' baz='quux'>"
+ "<property name='var' value='value'/>"
+ "<property name='nested' value='var'/>"
+ "yabba dabba doo"
+ "<nested blah='blah blah' subst='${var}' nested-subst='${${nested}}'/>"
+ "</root>";

I didn't use an XML writer because I didn't see any particular point in it.
 
David Lamb

The other thing is, with the rule that the <?xml ...?> line has to be
first in this case, would this trick ever actually have any value in
this case?

In this specific case I suppose you're saying there's no reasonable
expectation that the line would change. OTOH one wants one's
programmers to develop coding habits to the point where they're automatic.
 
Lew

David said:
In this specific case I suppose you're saying there's no reasonable
expectation that the line would change. OTOH one wants one's programmers
to develop coding habits to the point where they're automatic.

But not to the point where they give up thinking altogether.

Yes, you are correct about my conclusion.
 
John B. Matthews

Chris Riesbeck said:
Is that page correctly generated? The examples look fine but their
English captions are just "An," "A," "It," "A," "It" etc. as if only the
first word was being put into the page.

I see the same appearance in recent Safari and Firefox.
 
ClassCastException

I find emacs and its ediff indispensable in my programming environment.

Ewwww! Emacs!

Why do you use that waste of cycles instead of opening sixty xterms most
running vim instances like Real Men do?

ObJava: see the from-line.

Oh, all right. ObJava 2: I don't see a problem with hardcoding XML
instead of using StAX if it's just a little bit here and there versus the
complexity overhead of bringing in a whole additional tool, and with it
two whole additional dependencies (one of the project upon the tool, and
another of the programmers upon knowing that additional tool).
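
To make the comparison concrete, here is a small sketch of the same tiny fragment done both ways. The class name and the fragment are invented for the example; the StAX calls are the standard javax.xml.stream ones, and the exact text the writer emits (quote style, for instance) depends on the implementation in use.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical class comparing a hardcoded fragment with a StAX-written one.
final class StaxVersusLiteral {
    private StaxVersusLiteral() {}

    // The hardcoded version: one line, no extra tool to learn.
    static final String LITERAL =
        "<root foo=\"bar\"><property name=\"var\"/></root>";

    // The StAX version: more ceremony, but the writer handles escaping.
    static String viaStax() {
        StringWriter sink = new StringWriter();
        try {
            XMLStreamWriter w =
                XMLOutputFactory.newInstance().createXMLStreamWriter(sink);
            w.writeStartElement("root");
            w.writeAttribute("foo", "bar");
            w.writeEmptyElement("property");
            w.writeAttribute("name", "var");
            w.writeEndElement();   // closes <root>
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
        return sink.toString();
    }
}
```

For a fixed fragment like this the literal wins on brevity; the writer starts to earn its keep once attribute values come from outside and need escaping.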

Er, that's more XML-and-generic-programming than Java. Okay, ObJava 3:
did anyone notice the further inconsistency that the TEST_DATA_QUERY *is*
declared "final" and *is* capitalized while "header" isn't?

In fact it's likely that the two different chunks of code had different
authors, based just on that.
 
