"heredoc" in javascript

Garrett Smith

Stefan said:
In my Greasemonkey code written specifically for Firefox, I use this
"heredoc" syntax a lot:

var myBigBlob = (<><![CDATA[

... insert a bunch of free-form text here ...

]]></>).toString();

Now that Chrome offers built-in Greasemonkey support, I'd like to
support it as well, but it breaks on this syntax.

Does anyone know of a better method that will work in both browsers?

Sorry, I don't. I've always wondered why multi-line string literals
aren't possible in JavaScript. AFAICS, there's nothing ambiguous about
this syntax:

var heredoc = "I am a multi-line
string; I end when the tokenizer
sees a double quote";

This is standard practice in many other programming languages. Maybe ES4
had something like this in the queue, but with ES5 geared towards
backwards compatibility, we probably won't see it for a long time.

The two usual workarounds are

var str = "I wish\n"
+ "I was\n"
+ "a multi-line string";

and

var str = "I am the closest thing\
to multi-line strings that we can get\
with JavaScript";

I don't know how well supported the second version is. At least the
first version can be optimized into a single string when it's
processed with a minimizer.
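
For what it's worth, a third variant that stays within plain ES3 syntax is to join a list of lines with "\n". The following is only a minimal sketch: the helper name `lines` is made up for illustration and does not come from the thread.

```javascript
// Hypothetical helper (an assumption, not from the thread):
// joins all of its arguments with a newline character.
function lines() {
  return Array.prototype.slice.call(arguments).join("\n");
}

// The first workaround quoted above: explicit "\n" plus concatenation.
var viaConcat = "I wish\n"
  + "I was\n"
  + "a multi-line string";

// The same text, one source line per string line.
var viaJoin = lines(
  "I wish",
  "I was",
  "a multi-line string"
);

// Both expressions build the same three-line string.
var same = (viaConcat === viaJoin);
```

Like the concatenation version, this uses no syntax beyond ordinary string literals and function calls, so it should not depend on E4X or on line continuations.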

Says in Ecma 262r3:
7.3 Line Terminators
A line terminator cannot occur within any token, not even a string.

It seems unsafe to rely on that. It might produce the desired outcome,
but if that failed, then it would be your fault.

Seems to have changed in ES5.

Ecma 262r5
7.3
| Line terminators may only occur within a StringLiteral token as part
| of a LineContinuation.

And

| A line terminator character cannot appear in a string literal, except
| as part of a LineContinuation to produce the empty character sequence.
| The correct way to cause a line terminator character to be part of the
| String value of a string literal is to use an escape sequence such as
| \n or \u000A.

And
| The SV of LineContinuation :: \ LineTerminatorSequence is the empty
| character sequence.
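
To illustrate the ES5 wording quoted above, here is a small sketch, assuming an engine that implements LineContinuation as ES5 describes it (ES3 engines are not required to accept the first literal). The variable names are only illustrative.

```javascript
// Backslash followed by a line terminator is a LineContinuation.
// Under ES5 it contributes the empty character sequence, so the
// string value contains NO newline at that point. Note that the
// continuation line is not indented: any indentation would become
// part of the string.
var continued = "I am the closest thing \
to multi-line strings";

// To get an actual line terminator into the value, use an escape.
var escaped = "line one\nline two";

var continuedHasNewline = continued.indexOf("\n") !== -1;
var escapedHasNewline = escaped.indexOf("\n") !== -1;
```

So the backslash form only splits the source text across lines; it does not produce a multi-line string value unless "\n" is written as well.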

Good to see you back, BTW.
 
Thomas 'PointedEars' Lahn

Garrett said:
Stefan said:
[...]
var str = "I am the closest thing\
to multi-line strings that we can get\
with JavaScript";

I don't know how well supported the second version is. At least the
first version can be optimized into a single string when it's
processed with a minimizer.

Says in Ecma 262r3:
7.3 Line Terminators
A line terminator cannot occur within any token, not even a string.

It seems unsafe to rely on that. It might produce the desired outcome,
but if that failed, then it would be your fault.

Seems to have changed in ES5.

Good morning, sunshine.

PointedEars
 
nick

John G Harris wrote:

You really are pitiable.

Given the existence of preprocessor macros, almost any piece of junk can be
made to compile by a C++ compiler provided there are further definitions
and declarations.  Your code *as it is*, however, is clearly not C++ code;
that is, code that compiles *as it is* (in a main() function) without
syntax error messages.

What does this have to do with preprocessor macros?

"typedef const char * var;" is not a preprocessor macro.

Something like "#include <string>" is a preprocessor directive.

Thomas said:
Yes, and by contrast this example compiles as it is (in a main() function).

....but only if you remember to #include <string>.

....and it will only compile to an executable program if you remember
to link the C++ standard library, which John's example would not
require "as it is".

I don't think your definition of "valid C++ code" being equal to "code
that will compile inside of main() 'as it is'" (with no other code
inside main? main has to at least return an int!) really holds much
water...

Here is a full example of John's (implied) code, this will compile
into an executable program (which does nothing) without errors. No
libraries need to be linked, and no preprocessor instructions are
used.

int main()
{
    typedef const char * var;
    var heredoc = "I am a multi-line "
                  "string; I end when the tokenizer "
                  "sees a double quote";
    return 0;
}

Here is an example of your (implied) code.

#include <string>
int main()
{
    std::string heredoc = "I am a multi-line"
                          " string; I end when the tokenizer"
                          " sees a double quote";
    return 0;
}
// make sure you link the standard C++ library!
// gcc foo.cpp -lstdc++

In other words, both examples would really need more stuff in addition
to main() to compile, but since this isn't really a C++ audience it
doesn't matter. The point was the way the quoted text is written:

(doesn't matter) = "text, text "
"and other text";

....but you seem to have missed it.

-- Nick
 
John G Harris

Pot, kettle, black.


No, it is not.


You really are pitiable.

Given the existence of preprocessor macros, almost any piece of junk can be
made to compile by a C++ compiler provided there are further definitions
and declarations. Your code *as it is*, however, is clearly not C++ code;
that is, code that compiles *as it is* (in a main() function) without
syntax error messages.

Get a life.
<snip>

Here's a little program that compiles, links, and runs using Borland (as
was) C++ Builder 6 Pro :

//-- 'var' demo 2010-2-4
#pragma hdrstop
#include <iostream>

class var
{
private:
    const char * str;
public:
    var(const char * s)
        { str = s; }
    const char * value()
        { return str; }
}; // class var

#pragma argsused
int main(int argc, char* argv[])
{
    var heredoc = "I am a multi-line "
                  "string; I end when the tokenizer "
                  "sees a double quote";
    std::cout << heredoc.value() << std::endl;
    return 0;
} // main


When run from a Win Console and redirected to a file it outputs :

I am a multi-line string; I end when the tokenizer sees a double quote


Prove me wrong if you can.

John
 
Thomas 'PointedEars' Lahn

John said:
Thomas said:
John said:
Thomas 'PointedEars' Lahn wrote:
John G Harris wrote:
Stefan Weiss wrote:
I've always wondered why multi-line string literals
aren't possible in JavaScript. AFAICS, there's nothing ambiguous
about this syntax:

var heredoc = "I am a multi-line
string; I end when the tokenizer
sees a double quote";

This is standard practice in many other programming languages.
<snip>

In C++ you would write

var heredoc = "I am a multi-line "
"string; I end when the tokenizer "
"sees a double quote";
No, that is not C++ code. [...]
It's perfectly valid C++ code. [...]
var is obviously a type-name, for a type having a constructor with the
signature
var(const char * )

[...]
Given the existence of preprocessor macros, almost any piece of junk can
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
be made to compile by a C++ compiler provided there are further
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
definitions and declarations. Your code *as it is*, however, is clearly
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
not C++ code; that is, code that compiles *as it is* (in a main()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
function) without syntax error messages.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[...]

Here's a little program that compiles, links, and runs using Borland (as
was) C++ Builder 6 Pro :

[...]
class var
^^^^^^^^^
[...]
int main(int argc, char* argv[])
{
var heredoc = "I am a multi-line "
"string; I end when the tokenizer "
"sees a double quote";
std::cout << heredoc.value() << std::endl;
return 0;
} // main


When run from a Win Console and redirected to a file it outputs :

I am a multi-line string; I end when the tokenizer sees a double quote


Prove me wrong if you can.

I'll rest my case instead. You still manage to miss the point.


PointedEars
 
Michael Wojcik

Thomas said:
Still it remains to be seen if there are enough programming languages that
do not require special syntax to justify your "many". For example, it does
not apply to the following languages I know rather well: BASIC (and
variants), Pascal (variants, and derivates), C (variants, and derivates),
Tcl (and derivates), Java, and Python. (And maybe I forgot some.)

Add COBOL and FORTRAN, so if the metric were source lines of code, we
could say that most existing software is written in a language that
does not allow string literals to be split across lines.

Personally, I prefer C's compromise: string literals can't contain
line terminators, but adjacent string literals are concatenated during
translation, so it's still convenient to create long literals. I
generally prefer this even to here-doc syntax in Bourne shell and
derivatives, as the latter tends to break the source formatting, even
with the tab-removal option.
 
John G Harris

John G Harris wrote:

I wrote an extract from my little program, thus :

<snip>

If I'd included the rest of the program I'm sure that some loud-mouth
would have complained about including so much irrelevant text, see
RFC ... , and maybe the FAQ.

You still manage to miss the point.

The point is that Stefan wrote an example of a multi-line string literal
and wished it were legal in ECMAScript. I wrote an example showing how
C++ does multi-line string literals, a better way in my opinion.

Another thing: you forgot to tell Stefan that his 'program' would be
rejected by any good Lint program on the grounds that heredoc is never
used.

John
 
Thomas 'PointedEars' Lahn

Michael said:
Add COBOL and FORTRAN, so if the metric were source lines of code, we
could say that most existing software is written in a language that
does not allow string literals to be split across lines.

ACK, thanks.
Personally, I prefer C's compromise: string literals can't contain
line terminators, but adjacent string literals are concatenated during
translation, so it's still convenient to create long literals. I
generally prefer this even to here-doc syntax in Bourne shell and
derivatives, as the latter tends to break the source formatting, even
with the tab-removal option.

Perhaps that is why Python supports both approaches:

# equivalent to "b\na\nr"
foo = """b
a
r"""

# equivalent to "bar"
foo = "b" \
      "a" \
      "r"

JavaScript™ becoming more pythonic as each year passes, I would really like
JavaScript™ or ECMAScript to start supporting any or all of that. Breaking
up longer RegExp literals into concatenated strings passed to RegExp() is a
PITA, too; I had expected ES5 to allow

/f\\o\\o\
\sb\na\/r/

in order to avoid

new RegExp("f\\\\o\\\\o"
+ "\\sb\\na/r");

but it didn't. (There is a way to make this easier, but it does not always
work.)
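
As a sanity check on the escaping in the workaround above, here is a short sketch comparing the single-line RegExp literal with the pattern built from concatenated strings. The sample string and variable names are assumptions chosen for illustration.

```javascript
// The pattern written as a single-line RegExp literal.
var literal = /f\\o\\o\sb\na\/r/;

// The same pattern built from concatenated string literals. Every
// backslash has to be doubled once more for the string literal,
// while "/" needs no escaping inside a string.
var built = new RegExp("f\\\\o\\\\o"
  + "\\sb\\na/r");

// Sample text: f, backslash, o, backslash, o, space, b, newline, a, /, r
var sample = "f\\o\\o b\na/r";

var bothMatch = literal.test(sample) && built.test(sample);
var neitherMatchesPlain = !literal.test("foo bar") && !built.test("foo bar");
```

The doubled backslashes are what makes the string form tedious: each regular-expression escape has to survive string-literal parsing first.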


PointedEars
 
Michael Haufe (\TNO\)

On Feb 4, 7:08 pm, Thomas 'PointedEars' Lahn wrote:

[snipped multiline string examples]
JavaScript™ becoming more pythonic as each year passes, I would really like
JavaScript™ or ECMAScript to start supporting any or all of that.

There was some discussion a short while ago on es-discuss about this
topic, so it's not far fetched to expect something like it to crop up
in the near future.
(sorry, can't find the relevant thread(s) atm)

Breaking up longer RegExp literals into concatenated strings passed to
RegExp() is a PITA, too; I had expected ES5 to allow

  /f\\o\\o\
\sb\na\/r/

in order to avoid

  new RegExp("f\\\\o\\\\o"
    + "\\sb\\na/r");

but it didn't.  (There is a way to make this easier, but it does not always
work.)

It looks like Mozilla plans to introduce the /x flag in JS1.9 which
would allow something like this.

https://bugzilla.mozilla.org/show_bug.cgi?id=384232
http://wiki.ecmascript.org/doku.php?id=proposals:extend_regexps&s=flag#x_flag
 
