How do I evaluate a JSON response?


Garrett Smith

The FAQ mentions JSON, but only for "when do I use eval". That entry is
not focused on a specific task.

An FAQ entry on JSON, in contrast, would be focused on a specific task, and
a common one. It would be useful to mention JSON.parse support there.

FAQ Entry Proposal:

| How do I evaluate a JSON response?
|
| An XMLHttpRequest's responseText can be evaluated in a few ways. The
| Function constructor and eval are both widely supported; either can
| be used to evaluate trusted code.
|
| The Function constructor creates a globally-scoped Function. In
| contrast, eval runs in the calling context's scope.
|
| To evaluate code with the Function constructor, you could use:
|
| function evalResponse(responseText) {
|   return new Function("return(" + responseText + ");")();
| }
|
| Where supported, JSON.parse may be used.
|
| var NATIVE_JSON_PARSE_SUPPORT = window.JSON &&
|   typeof JSON.parse === 'function' &&
|   JSON.parse('true').test;
|
| function evalResponse(responseText) {
|   if(NATIVE_JSON_PARSE_SUPPORT) {
|     try {
|       return JSON.parse(responseText);
|     } catch(ex) {
|       return "Error";
|     }
|   } else {
|     return new Function("return(" + responseText + ")")();
|   }
| }
|
| If the argument to JSON.parse is not JSON, a SyntaxError will be
| thrown.

Garrett
 

Asen Bozhilov

Garrett said:
| In contrast, eval runs in the calling context's scope.

As you know, that is not true in ECMA-262-5 strict mode. And `eval' is
not run in calling execution context. ECMA-262-3 define `10.1.2 Types
of Executable Code` and eval code is a part of that section. So eval
code is running at separate execution context. That execution context
use the same Variable Object and `this` value as calling execution
context.
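
The scoping difference under discussion can be seen in a small sketch. The names `scopeDemo` and `localOnlyVariable` are made up for illustration; the sketch assumes no global variable called `localOnlyVariable` exists.

```javascript
// Hypothetical demo: a direct eval call can read the caller's local
// variables, while code created by the Function constructor only sees
// global scope.
function scopeDemo() {
  var localOnlyVariable = "inner";

  // Direct eval: resolves identifiers through the calling context.
  var viaEval = eval("typeof localOnlyVariable");

  // Function constructor: the body is compiled against global scope,
  // so the local variable is not visible there.
  var viaFunction = new Function("return typeof localOnlyVariable;")();

  return [viaEval, viaFunction];
}

var scopes = scopeDemo();
```

Here `scopes[0]` reports the type of the local variable, while `scopes[1]` is `"undefined"` because the identifier cannot be resolved from global scope.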

| Where supported, JSON.parse may be used.
|
| var NATIVE_JSON_PARSE_SUPPORT = window.JSON &&
|   typeof JSON.parse === 'function' &&
|   JSON.parse('true').test;

window.JSON ? Why do you use `window'? At environment where Global
Object does not have property `window' or refer host object your code
has two options:

- ReferenceError `window' is not defined
- Result of expression is evaluated to `false'
| function evalResponse(responseText) {
|   if(NATIVE_JSON_PARSE_SUPPORT) {
|     try {
|       return JSON.parse(responseText);
|     } catch(ex) {
|       return "Error";
|     }
|   } else {
|     return new Function("return(" + responseText + ")")();
|   }
| }

What do you think about the follow string:

'{JSON : false}'

By JSON syntax grammar that is not syntactical valid. So if I run your
code in implementation where is presented built-in `JSON' I will have
result "Error". If I run the code in implementation without `JSON'
that string will be evaluated as well because is bound by syntax rules
of `11.1.5 Object Initialiser`. Think about it ;~)
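
The mismatch described above can be reproduced with a short sketch, assuming an environment that has a conforming native `JSON.parse`. The variable names are illustrative only.

```javascript
var text = "{JSON : false}";

// Native parser: unquoted property names are not part of the JSON
// grammar, so a conforming JSON.parse throws.
var nativeOutcome;
try {
  JSON.parse(text);
  nativeOutcome = "parsed";
} catch (ex) {
  nativeOutcome = ex.name;
}

// Function-constructor fallback: the same text is a valid ECMAScript
// Object Initialiser, so it evaluates without error.
var fallbackResult = new Function("return(" + text + ")")();
```

The two branches of the proposed `evalResponse` therefore disagree on this input: one reports an error, the other returns an object whose `JSON` property is `false`.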
 

Thomas 'PointedEars' Lahn

Garrett said:
| var NATIVE_JSON_PARSE_SUPPORT = window.JSON &&
| typeof JSON.parse === 'function' &&
| JSON.parse('true').test;

Don't. `JSON' is supposed to be a built-in object in conforming
implementations, so the name of a property of the _Global Object_.
And as Asen already noted, the test is bogus. Better:

var _global = this;
var NATIVE_JSON_PARSE_SUPPORT =
  typeof _global.JSON == "object" && _global.JSON
  && typeof JSON.parse == "function"
  && JSON.parse('{"x": "42"}').x == "42";
| function evalResponse(responseText) {
| if(NATIVE_JSON_PARSE_SUPPORT) {
| try {
| return JSON.parse(responseText);
| } catch(ex) {
| return "Error";
| }

Don't. Let the code throw the original exception (`SyntaxError') or a user-
defined one (e.g., `JSONError') instead. (Unfortunately, the ES5 authors
neglected to define an easily distinguishable exception type for JSON
parsing.)
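
A minimal sketch of the user-defined exception idea follows. `JSONError` is only the name suggested above, not a standard type, and `parseStrict` is a hypothetical wrapper; the sketch assumes native `JSON.parse` is available.

```javascript
// Hypothetical user-defined exception type for JSON parse failures.
function JSONError(message) {
  this.name = "JSONError";
  this.message = message;
}
JSONError.prototype = new Error();
JSONError.prototype.constructor = JSONError;

// Hypothetical wrapper: rethrows parse failures as JSONError instead of
// swallowing them and returning a sentinel string such as "Error".
function parseStrict(text) {
  try {
    return JSON.parse(text);
  } catch (ex) {
    throw new JSONError("JSON parse error: " + ex.message);
  }
}
```

Callers can then distinguish parse failures with `instanceof JSONError`, while a caller that only checks for `Error` still catches them.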


PointedEars
 

Garrett Smith

| As you know, that is not true in ECMA-262-5 strict mode. And `eval' is
| not run in calling execution context. ECMA-262-3 define `10.1.2 Types
| of Executable Code` and eval code is a part of that section. So eval
| code is running at separate execution context. That execution context
| use the same Variable Object and `this` value as calling execution
| context.

Yes, that is what I meant to say. eval gets its scope from the calling
context.

| window.JSON ? Why do you use `window'? At environment where Global
| Object does not have property `window' or refer host object your code
| has two options:

Can use `this.JSON` in global context.

| - ReferenceError `window' is not defined
| - Result of expression is evaluated to `false'

True, for an environment with no window.


[...]
| What do you think about the follow string:
|
| '{JSON : false}'

Benign. Worse: any new or call expressions -- those would run too, for
Function().

| By JSON syntax grammar that is not syntactical valid. So if I run your
| code in implementation where is presented built-in `JSON' I will have
| result "Error". If I run the code in implementation without `JSON'
| that string will be evaluated as well because is bound by syntax rules
| of `11.1.5 Object Initialiser`. Think about it ;~)

If JSON is going to be used, then the fallback should work just the same.

The problem is that I don't have a good regexp filter to test valid JSON.

JSON.org does, but it seems flawed.

http://www.json.org/json2.js

The filter allows invalid JSON through. For example, passing in "{,,1]}"
results in a truthy value that is then passed to eval. The problem with
that is that the error that is thrown is different. Now, instead of "invalid
JSON", you'll get something like "invalid property id.".

Example:
// Input
var text = "{,,1]}";

// Code from json2.js
/^[\],:{}\s]*$/.
test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
replace(
/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]\d+)?/g,
']').
replace(/(?:^|:|,)(?:\s*\[)+/g, ''))

Result:

In browsers with no JSON support, the result is going to be true, and if
`text` is subsequently passed to Function, the result is going to be a
different error message.
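
The behaviour described above can be checked with a small sketch. `passesJson2Filter` is a hypothetical wrapper name; its body is the json2.js-style expression quoted in this thread (with the optional exponent sign), not a copy of the current json2.js source.

```javascript
// Hypothetical wrapper around the json2.js-style character filter.
function passesJson2Filter(text) {
  return /^[\],:{}\s]*$/.test(
    String(text)
      // 1. Replace escape sequences with a harmless placeholder.
      .replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, "@")
      // 2. Replace strings, literals and numbers with "]".
      .replace(/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g, "]")
      // 3. Drop opening brackets that follow the start, a colon or a comma.
      .replace(/(?:^|:|,)(?:\s*\[)+/g, "")
  );
}

var invalidText = "{,,1]}";
var filterSaysOk = passesJson2Filter(invalidText);

// The filter accepted the text, but evaluating it still fails, and the
// error comes from the script parser rather than from a JSON parser.
var evalError = null;
try {
  new Function("return(" + invalidText + ")")();
} catch (ex) {
  evalError = ex;
}
```

The filter only checks which characters remain after the replacements; it does not check that they are arranged according to the JSON grammar.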

I did some searching for isValidJSON to see if anyone wrote one. I
learned that apparently jQuery has taken the same approach I proposed.
That is: use JSON where available and, where not, use a fallback that
enforces valid JSON. Having a look:

| parseJSON: function( data ) {
| if ( typeof data !== "string" || !data ) {
| return null;
| }

What's that? An empty string becomes null? Why only allow strings?
ECMAScript Ed 5 requires that the argument be converted to a string.

Next I see:

| // Make sure the incoming data is actual JSON
| // Logic borrowed from http://json.org/json2.js

The RegExp in json2.js does not make sure the code is valid JSON.

jQuery's approach is no good, but the function's got a better name so
I'll use that. Back to the problem: I want to find a good regexp to
verify if something is JSON or not so that I can provide an equivalent
fallback.

// TODO: define validJSONExp.
var parseJSON = NATIVE_JSON_PARSE_SUPPORT ?
  function(responseText) {
    return JSON.parse(responseText);
  } :
  function(responseText) {
    if(validJSONExp.test(responseText)) {
      return new Function("return(" + responseText + ")")();
    } else {
      throw SyntaxError("JSON parse error");
    }
  };

Garrett
 

Garrett Smith

Thomas 'PointedEars' Lahn wrote:
[...]

| Don't. Let the code throw the original exception (`SyntaxError') or a
| user-defined one (e.g., `JSONError') instead. (Unfortunately, the ES5
| authors neglected to define an easily distinguishable exception type for
| JSON parsing.)

JSON was designed to throw an error with the least amount of information
possible. This is, AIUI, to thwart attackers.

If a fallback is made, the fallback will also throw a similar error. In
that case, it should be fairly clear that the interface performs
equivalently across implementations.

Getting this right and getting the code short enough for the FAQ seems
to be a challenge. Meeting those goals, the result should be valuable
and appreciated by many.

I'm also working on something else and that thing is wearing me out.

Garrett
 

Thomas 'PointedEars' Lahn

Garrett said:

| | Don't. Let the code throw the original exception (`SyntaxError') or a
| | user-defined one (e.g., `JSONError') instead. (Unfortunately, the ES5
| | authors neglected to define an easily distinguishable exception type for
| | JSON parsing.)
|
| JSON was designed to throw an error with the least amount of information
| possible. This is, AIUI, to thwart attackers.

If so, that would be nonsense. Security by obscurity is a bad idea.

| If a fallback is made, the fallback will also throw a similar error.

But it should throw an exception. Your suggestion does not.

| In that case, it should be fairly clear that the interface performs
| equivalently across implementations.

And?

| Getting this right and getting the code short enough for the FAQ seems
| to be a challenge.

Not unless one has a serious reading problem.

| Meeting those goals, the result should be valuable and appreciated by
| many.

Which part of my suggestion did you not like?

| I'm also working on something else and that thing is wearing me out.

All the more reason to consider my suggestion.


PointedEars
 

Garrett Smith

[...]

| If so, that would be nonsense. Security by obscurity is a bad idea.
|
| | If a fallback is made, the fallback will also throw a similar error.
|
| But it should throw an exception. Your suggestion does not.

Right - that one was no good and so it has to change.

| Not unless one has a serious reading problem.

I did not see an expression for validJSONExp. The goal is: If a fallback
is made, the fallback will also throw a similar error.

To do that, the input must be verified against the same rules. Those
rules are defined in ECMA-262 Ed 5.

| Which part of my suggestion did you not like?

Nothing, it's fine, but I did not see a regexp there that tests to see if
the string is valid JSON.

| All the more reason to consider my suggestion.

The suggestion to use `this` in global context is right. The suggestion
to use an object literal as the string to the argument to JSON.parse is
not any better than using "true". Both are valid JSONValue, as specified
in ES5.

Garrett
 

Thomas 'PointedEars' Lahn

Garrett said:

| Nothing, its fine but I did not see a regexp there that tests to see if
| the string is valid JSON.

There cannot be such a regular expression in ECMAScript as it does not
support PCRE's recursive matches feature. An implementation of a push-down
automaton, a parser, is required.

| [...] The suggestion to use an object literal as the string to the
| argument to JSON.parse is not any better than using "true".

But it is. It places further requirements on the capabilities of the
parser. An even better test would be a combination of all results of all
productions of the JSON grammar.

| Both are valid JSONValue, as specified in ES5.

ACK


PointedEars
 

Garrett Smith

There cannot be such a regular expression in ECMAScript as it does not
support PCRE's recursive matches feature. An implementation of a push-down
automaton, a parser, is required.

A parser would be too much for the FAQ.

The approach on json2.js would fit nicely, but is it acceptable?

The first thing that jumped out at me was that number matching allows
trailing decimal for numbers -- something that is disallowed in JSONNumber.

json2.js has the regular expression used below. I've assigned the result
`isValidJSON` to refer to it later in this message.

var text = "...";

var isValidJSON = /^[\],:{}\s]*$/.
test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
replace(
/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
']').
replace(/(?:^|:|,)(?:\s*\[)+/g, ''))

Number is defined as:
-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g

But this allows numbers like "2." so it can be changed to disallow that:
-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g

Some implementations violate the spec by allowing the trailing decimal
(Firefox, Webkit), while others do not.
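
The difference between the two number patterns can be checked directly. In this sketch the `^` and `$` anchors are added only so that each pattern must match the whole input; the variable names are illustrative.

```javascript
// json2.js-style pattern: the fraction digits are optional (\d*).
var looseNumber = /^-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?$/;

// Adjusted pattern: at least one digit must follow the decimal point.
var strictFraction = /^-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?$/;

var numberChecks = {
  looseAcceptsTrailingDot: looseNumber.test("2."),
  strictAcceptsTrailingDot: strictFraction.test("2."),
  strictAcceptsFraction: strictFraction.test("2.0"),
  // Neither pattern addresses leading zeros; "01" still matches.
  strictAcceptsLeadingZero: strictFraction.test("01")
};
```

The adjusted pattern rejects `2.` but still accepts `01`, so it narrows the gap to the JSONNumber production without closing it.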

Another problem is that the json2.js strategy allows some invalid syntax
and multiple expressions to go through to eval. For example, the values
'[],{}' or ':}' or - will have a true result when used with the regex
pattern above. The first will throw a SyntaxError when passed to eval,
or Function, with a different error message than JSON.parse and the
second will actually succeed in eval, whereas with native JSON.parse, a
SyntaxError results.

JSON.parse('[],{}');
SyntaxError

eval('[],{}');
{}

An inconsistent result. What sorts of problems can that cause and how
significant are they?
[...] The suggestion to use an object literal as the string to the
argument to JSON.parse is not any better than using "true".

But it is. It places further requirements on the capabilities of the
parser. An even better test would be a combination of all results of all
productions of the JSON grammar.

Cases that are known to be problematic can be filtered.

Every possibility of JSON Grammar cannot be checked unless either the
string is parsed or the code performs a set of feature tests that tests
every possibility. The possibilities would not be limited to valid JSON,
but would include every invalid JSON construct as well.

The alternative to that is to check for known cases where native JSON
fails and to check that to see how it fails and then if that failure is
unacceptable, to filter that out with a feature test.

Garrett
 

Garrett Smith

[...]


JSON.parse('[],{}');
SyntaxError

eval('[],{}');
{}

One way to get around that is to wrap the whole expression not in
Grouping Operator, but array literal characters '[' + text + ']'. After
evaluating that, the only valid result could be an array with length =
0; anything else should result in SyntaxError.

result = new Function("return[" + responseText + "]")();
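
A sketch of that array-wrapping check follows. `evalSingleValue` is a hypothetical name, and the sketch assumes the text has already passed a character filter, since evaluating unfiltered text would run arbitrary code. A single JSON value yields an array with exactly one element.

```javascript
// Hypothetical helper: evaluates pre-filtered text inside an array
// literal and rejects anything that does not yield exactly one element.
function evalSingleValue(text) {
  var result = new Function("return[" + text + "]")();
  if (result.length !== 1) {
    throw new SyntaxError("JSON parse error");
  }
  return result[0];
}

var single = evalSingleValue('{"a":1}');

// Two comma-separated expressions produce two elements and are rejected.
var commaRejected = false;
try {
  evalSingleValue("[],{}");
} catch (ex) {
  commaRejected = ex instanceof SyntaxError;
}

// Caveat: a trailing comma does not add an element in ES5, so "1,"
// still yields one element and is not caught by the length check.
var trailingComma = evalSingleValue("1,");
```

The length check handles comma-separated expressions, but the trailing-comma case suggests that it cannot be the only safeguard.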

I've also noticed that implementations allow some illegal characters.

| JSONStringCharacter ::
| SourceCharacter but not double-quote " or backslash \ or U+0000 thru
| U+001F
| \ JSONEscapeSequence

That specification is not hard to read so I'm not sure why
implementations are having such errors (especially considering they are
all working on the committee).

The specification even goes out of the way to state:

| Conforming implementations of JSON.parse and JSON.stringify must
| support the exact interchange format described in this specification
| without any deletions or extensions to the format. This differs
| from RFC 4627 which permits a JSON parser to accept non-JSON forms
| and extensions.

Didn't stop Firefox and IE from shipping buggy implementations.

The production for JSONStringCharacter states that a string character is
not any of: ", \, or \u0000-\u001f.

This is quite simple to test in a regexp:

var jsonString = /"[^"\\\n\r\u0000-\u001f]*"/

jsonString.test('""'); // true
jsonString.test('"\u0000"'); // false
jsonString.test('"\u001f"'); // false
jsonString.test('"\u0020"'); // true

I also noticed that json2.js does something strange, but does not
explain where the failure case exists:

| var cx =
/[\u0000\u00ad\u0600-\u0604\u070f\u17b4\u17b5\u200c-\u200f\u2028-\u202f\u2060-\u206f\ufeff\ufff0-\uffff]/g,

(that long line will inevitably wrap and become obscured when quoted)

| // Parsing happens in four stages. In the first stage, we replace
| // certain Unicode characters with escape sequences. JavaScript
| // handles many characters incorrectly, either silently deleting
| // them, or treating them as line endings.
|
| text = String(text);
| cx.lastIndex = 0;
| if (cx.test(text)) {
| text = text.replace(cx, function (a) {
| return '\\u' +
| ('0000' + a.charCodeAt(0).toString(16)).slice(-4);
| });
| }

The code comment should provide a url that describes the problem
(bugzilla, etc).

So far, I've identified four bugs in json2.js; two of these exist in
Mozilla's and on IE's implementation.

1) JSONNumber - allows trailing decimal after integer.
2) JSONString - allows invalid characters \u0000-\u001f.
3) Evaluates expressions separated by comma.
4) Does not always throw generic syntax error as specified.

#4 seems unimportant but I addressed it anyway. #1, and #2 don't seem
that substantially bad, but if the goal is to mimic the spec, then why
not do that (and it isn't that hard).

#3 is potentially bad, but unlikely. Again, just stick with the spec.

I've addressed these issues in the code below, but there might be more.

I do not know what the purpose of allowing \uFFFF into JSON.parse, but
it is allowed by the specification.

Code:

var parseJSON = function(responseText) {

  var NATIVE_JSON_PARSE_SUPPORT = typeof JSON == "object"
    && typeof JSON.parse == 'function'
    && JSON.parse('{"a":true}').a;

  function isValidJSON(text) {
    text = String(text);

    // Based on code by Douglas Crockford, from json2.js.
    // The number pattern uses \.\d+ so that a trailing decimal point,
    // as in "2.", is rejected (#1 in the list above).
    return !!text && /^[\],:{}\s]*$/
      .test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@')
      .replace(
/"[^"\\\n\r\u0000-\u001f]*"|true|false|null|-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g,
        ']')
      .replace(/(?:^|:|,)(?:\s*\[)+/g, ''));
  }

  // Replace parseJSON with the chosen implementation on first call.
  return (parseJSON = NATIVE_JSON_PARSE_SUPPORT ?
    function(responseText) {
      return JSON.parse(responseText);
    } : function(responseText) {
      var result;
      if (isValidJSON(responseText)) {
        try {
          result = new Function("return[" + responseText + "]")();
        } catch(ex) { /* fall through to the generic SyntaxError below */ }
        if (result && result.length === 1) {
          return result[0];
        }
      }
      throw SyntaxError("JSON parse error");
    })(responseText);
};

The questionable part of this strategy is relying on native support
which has been shown to be buggy. They shipped it, and it's too late, I
am sorry for that. Alternatives are to either not use JSON or to run all
input through isValidJSON function.

Garrett
 

Garrett Smith

Garrett Smith wrote:
[...]

| JSON.parse('[],{}');
| SyntaxError
|
| eval('[],{}');
| {}
|
| One way to get around that is to wrap the whole expression not in
| Grouping Operator, but array literal characters '[' + text + ']'. After
| evaluating that, the only valid result could be an array with length =
| 0; anything else should result in SyntaxError.

Correction: the only valid result could be an array with length = 1.

| This is quite simple to test in a regexp:
|
| var jsonString = /"[^"\\\n\r\u0000-\u001f]*"/

Actually simpler than that. \n and \r are redundant because they are
included in the character range that follows. I copied the first half of
that from json2.js.

var jsonString = /"[^"\\\u0000-\u001f]*"/

That can be used in the longer pattern borrowed from json2.js. Actually,
json2.js should use that.

Garrett
 

Thomas 'PointedEars' Lahn

Garrett said:
A parser would be too much for the FAQ.

Probably, although I think it could be done in not too many lines for the
purpose of validation.
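
To give a sense of scale, here is a rough sketch of a validation-only recursive-descent parser. It is not code from this thread: `isJSONText` and its helpers are hypothetical names, and the grammar is my reading of the ES5 JSON productions (whitespace limited to tab, LF, CR and space; no leading zeros; no control characters inside strings).

```javascript
// Hypothetical validator: returns true only if the whole text is one
// JSON value. It validates only and does not build a result.
function isJSONText(text) {
  text = String(text);
  var pos = 0;
  var WS = /[\t\n\r ]/;
  var NUMBER = /^-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+\-]?\d+)?/;
  var STRING = /^"(?:[^"\\\u0000-\u001f]|\\(?:["\\\/bfnrt]|u[0-9A-Fa-f]{4}))*"/;

  function skipWhitespace() {
    while (pos < text.length && WS.test(text.charAt(pos))) pos++;
  }
  // Match a token pattern at the current position (slice keeps it simple).
  function match(re) {
    var m = re.exec(text.slice(pos));
    if (!m) return false;
    pos += m[0].length;
    return true;
  }
  function literal(word) {
    if (text.substr(pos, word.length) !== word) return false;
    pos += word.length;
    return true;
  }
  function value() {
    skipWhitespace();
    var ch = text.charAt(pos), ok;
    if (ch === "{") ok = members();
    else if (ch === "[") ok = elements();
    else if (ch === '"') ok = match(STRING);
    else ok = literal("true") || literal("false") || literal("null") || match(NUMBER);
    skipWhitespace();
    return ok;
  }
  function elements() {
    pos++; // consume "["
    skipWhitespace();
    if (text.charAt(pos) === "]") { pos++; return true; }
    for (;;) {
      if (!value()) return false;
      var ch = text.charAt(pos++);
      if (ch === "]") return true;
      if (ch !== ",") return false;
    }
  }
  function members() {
    pos++; // consume "{"
    skipWhitespace();
    if (text.charAt(pos) === "}") { pos++; return true; }
    for (;;) {
      skipWhitespace();
      if (!match(STRING)) return false;
      skipWhitespace();
      if (text.charAt(pos++) !== ":") return false;
      if (!value()) return false;
      var ch = text.charAt(pos++);
      if (ch === "}") return true;
      if (ch !== ",") return false;
    }
  }
  return value() && pos === text.length;
}
```

Because the nesting is handled by recursion rather than by a single regular expression, inputs such as `[],{}`, `{,,1]}`, `2.` and `01` are rejected. Slicing the text for every token is wasteful, so this is a sketch for discussion rather than something tuned for large responses.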
The approach on json2.js would fit nicely, but is it acceptable?

The first thing that jumped out at me was that number matching allows
trailing decimal for numbers -- something that is disallowed in
JSONNumber.

json2.js has the regular expression used below. I've assigned the result
`isValidJSON` to refer to it later in this message.

var text = "...";

var isValidJSON = /^[\],:{}\s]*$/.
test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
replace(
/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
']').
replace(/(?:^|:|,)(?:\s*\[)+/g, ''))

Is this from json2.js? If yes, then it is not acceptable. To begin with,
it does not regard "\"" valid JSON even though it is.
Number is defined as:
-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g

But this allows numbers like "2." so it can be changed to disallow that:
-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g

It would still be insufficient. You simply cannot parse a context-free
non-regular language using only one application of only one non-PCRE.
[...] The suggestion to use an object literal as the string to the
argument to JSON.parse is not any better than using "true".

But it is. It places further requirements on the capabilities of the
parser. An even better test would be a combination of all results of all
productions of the JSON grammar.

Cases that are known to be problematic can be filtered.

Your point being?
Every possibility of JSON Grammar cannot be checked unless either the
string is parsed

Exactly. But that is not what I suggested.
or the code performs a set of feature tests that tests every possibility.

Do you realize that this is not possible?


PointedEars
 

Garrett Smith

Probably, although I think it could be done in not too many lines for the
purpose of validation.

That would require more code to be downloaded and more processing to run
it. What about mobile devices?

[...]
var isValidJSON = /^[\],:{}\s]*$/.
test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
replace(
/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
']').
replace(/(?:^|:|,)(?:\s*\[)+/g, ''))

Is this from json2.js? If yes, then it is not acceptable. To begin with,
it does not regard "\"" valid JSON even though it is.

The code is from json2.js:
http://www.json.org/json2.js

The character sequence "\"" is valid JSON value in ecmascript, however
in ecmascript, if enclosed in a single quote string, as - '"\""' - the
backslash would escape the double quote mark, resulting in '"""', which
is not valid JSON.

To pass a string value containing the character sequence "\"" to
JSON.parse, the backslash must be escaped. Thus, you would use:

var quoteMarkInJSONString = '"\\""';

And that works.

JSON.parse(quoteMarkInJSONString) == JSON.parse('"\\""') == "\""

Result: string value containing the single character: ".

JSON.parse(quoteMarkInJSONString) === "\""
Number is defined as:
-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g

But this allows numbers like "2." so it can be changed to disallow that:
-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g

It would still be insufficient. You simply cannot parse a context-free
non-regular language using only one application of only one non-PCRE.

The goal of json2.js's JSON.parse is not to filter out values that are
valid; it is to eliminate values that are invalid. So far, it was
noticed to fail at that in four ways and I addressed those.

A fifth way that json2.js fails is to allow digits beginning with 0.

JSON.parse("01");

That results in 1, but should throw an error. Firefox does the same thing.
[...] The suggestion to use an object literal as the string to the
argument to JSON.parse is not any better than using "true".

But it is. It places further requirements on the capabilities of the
parser. An even better test would be a combination of all results of all
productions of the JSON grammar.

Cases that are known to be problematic can be filtered.

Your point being?

My point is that instead of trying every possible valid grammar check,
known bugs -- such as allowing 1. and +1 and 01, as seen in Spidermonkey
-- could be checked.

Checking every possible input is not possible.

JSON.parse('{"x": "42"}').x == "42" - tests a benign case. It doesn't
filter out any known error cases. Any implementation that can handle
JSON.parse('true') should be able to handle parsing the JSONObject.

Allowing the native implementation to fail on invalid cases such as
parsing "2." as JSON results in an inconsistent interface whereby, in
current versions of Firefox and IE, the value `2` results, but in current
versions of Opera and Chrome, an error is thrown.

The inconsistency might seem minor but it could result in a
hard-to-track down bug. For example, consider an application that sends
a money value back to the client as {"dollars" : 2.11}. When the money
value includes a decimal cents, it runs fine, but when, say, a value
`2.` is passed, it correctly throws an error in untested browsers.

This brings me back to the idea of developing a strategy of identifying
known bugs in implementations and then devising feature tests for those
bugs and then only allowing an implementation that does not pass the
feature test to run its own native JSON.parse.
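
That strategy might be sketched roughly as follows. The helper names and the `KNOWN_BAD_INPUTS` list are hypothetical; the list only covers failure cases mentioned in this thread and is not a complete conformance test.

```javascript
// Hypothetical helper: true if native JSON.parse throws for this text.
function nativeParseRejects(text) {
  try {
    JSON.parse(text);
  } catch (ex) {
    return true;
  }
  return false;
}

// Invalid inputs that some implementations were reported to accept.
var KNOWN_BAD_INPUTS = [
  "2.",          // trailing decimal point
  "01",          // leading zero
  "+1",          // leading plus sign
  "[],{}",       // more than one value
  "\"\u0000\""   // raw control character inside a string
];

// Hypothetical feature test: only trust native JSON.parse if it parses a
// simple object and rejects every known-bad input.
function hasStrictNativeParse() {
  if (typeof JSON !== "object" || !JSON || typeof JSON.parse !== "function") {
    return false;
  }
  for (var i = 0; i < KNOWN_BAD_INPUTS.length; i++) {
    if (!nativeParseRejects(KNOWN_BAD_INPUTS[i])) {
      return false;
    }
  }
  return JSON.parse('{"a":true}').a === true;
}
```

A result of `false` would route all input through the fallback, which keeps behaviour consistent at the cost of not using a native parser that is merely lenient.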

The fallback can disallow anything not allowed by JSON Grammar.

The stage I'm at now is identifying implementation bugs and defining a
validator for, actually, isInvalidJSON.

A thorough test case for valid JSON is needed.

I'm considering porting the test cases from Opera; AFAIK, Opera's suite
is the only suite for JSON and it is not offered as a zipped download.
One must download each JS file. It would also probably be a good idea to
not use sync requests, as that test runner does. I got a freeze/crash in
IE8 with that.
Exactly. But that is not what I suggested.


Do you realize that this is not possible?

That was my point.

Garrett
 

Garrett Smith

[...]

The character sequence "\"" is valid JSON value in ecmascript, however
in ecmascript, if enclosed in a single quote string, as - '"\""' - the
backslash would escape the double quote mark, resulting in '"""', which
is not valid JSON.

To pass a string value containing the character sequence "\"" to
JSON.parse, the backslash must be escaped. Thus, you would use:

var quoteMarkInJSONString = '"\\""';

And that works.

JSON.parse(quoteMarkInJSONString) == JSON.parse('"\\""') == "\""

Paste error. Should have been just:

JSON.parse(quoteMarkInJSONString)
 

Lasse Reichstein Nielsen

Thomas 'PointedEars' Lahn said:
Is this from json2.js? If yes, then it is not acceptable. To begin with,
it does not regard "\"" valid JSON even though it is.

It didn't use to be valid.
Originally, a JSON text had to be either an object or an array, but not
a simple value.
This was changed at some point (I'm guessing during ES5 development) so
that the grammar on json.org and the one in the ES5 spec allow JSON text
to be any JSON value.
JSON2 implements the original version.
Number is defined as:
-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g

But this allows numbers like "2." so it can be changed to disallow that:
-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g

It would still be insufficient. You simply cannot parse a context-free
non-regular language using only one application of only one non-PCRE.

The idea of the regexp isn't to check that the grammar is correct, but
merely that all the tokens are valid.

It's almost enough to guarantee that a successful eval on the string
would mean that the grammar was also correct. But only almost,
e.g., '{"x":{"y":42}[37,"y"]}' uses only correct tokens.

Still, it disallows arbitrary code execution, which I guess is the main
reason, and it correctly handles all valid JSON.

/L
 

Lasse Reichstein Nielsen

Lasse Reichstein Nielsen said:
It didn't use to be valid.
Originally, a JSON text had to be either an object or an array, but not
a simple value.
This was changed at some point (I'm guessing during ES5 development) so
that the grammar on json.org and the one in the ES5 spec allow JSON text
to be any JSON value.
JSON2 implements the original version.

Silly me, answering before checking.
It actually does allow "\"" as valid JSON.

/L
 

Garrett Smith

Nope.

| Silly me, answering before checking.

You've not read my posts yet, apparently. Bug count of json2.js is up to
5 and some of those bugs exist in IE's and Firefox's native implementations.

| It actually does allow "\"" as valid JSON.

It does; you just need to make sure that if you're passing a string
value, that you take into account ecmascript string escaping rules.

The string '"\""' in ecmascript, is equivalent to '"""'. Passed to
JSON.parse, '"""', would be unparseable, as """ appears to be a valid
JSONString followed by a quote mark. A SyntaxError would be thrown.

The backslash character and quote must be escaped in a JSONString.

The backslash in the ecmascript string must be escaped before it is
passed to JSON.parse, similarly to the way one escapes a string when
passing it to the RegExp constructor.

JSON.parse('"\\"')
- Error: the character \ may not appear unescaped in a JSONString
JSON.parse('"\\\\"')
- Successfully parses a string with the single character: \
JSON.parse('"\\""')
- Successfully parses a JSONString containing the single character "
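
These cases can be checked in an environment with native `JSON.parse`. The `tryParse` helper is a hypothetical name used only to capture either the parsed value or the error.

```javascript
// Hypothetical helper: captures the outcome instead of throwing.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (ex) {
    return { ok: false, error: ex.name };
  }
}

// ECMAScript escaping is applied first, so the comments show the
// characters that actually reach JSON.parse.
var threeQuotes = tryParse('"\""');        // receives: """
var loneBackslash = tryParse('"\\"');      // receives: "\"
var escapedBackslash = tryParse('"\\\\"'); // receives: "\\"
var escapedQuote = tryParse('"\\""');      // receives: "\""
```

The first two fail to parse; the last two succeed and produce a single backslash and a single double quote respectively.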

Garrett
 

Thomas 'PointedEars' Lahn

Lasse said:
It didn't use to be valid.
Originally, a JSON text had to be either an object or an array, but not
a simple value.
This was changed at some point (I'm guessing during ES5 development) so
that the grammar on json.org and the one in the ES5 spec allow JSON text
to be any JSON value.
JSON2 implements the original version.

OK, but I don't care. A viable fallback for JSON.parse() has to accept the
same strings that JSON.parse() accepts, and only those.
Number is defined as:
-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g

But this allows numbers like "2." so it can be changed to disallow that:
-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g

It would still be insufficient. You simply cannot parse a context-free
non-regular language using only one application of only one non-PCRE.

The idea of the regexp isn't to check that the grammar is correct,

Nobody wanted to check the grammar for correctness in the first place.
We have to accept its correctness as an axiom here.
but merely that all the tokens are valid.

Which means that the string can be produced by application of the grammar.
It's almost enough to guarantee that a successful eval on the string
would mean that the grammar was also correct. But only almost,
e.g., '{"x":{"y":42}[37,"y"]}' uses only correct tokens.

Still, it disallows arbitrary code execution, which I guess is the main
reason, and it correctly handles all valid JSON.

What the heck are you talking about?


PointedEars
 

Thomas 'PointedEars' Lahn

Garrett said:
That would require more code to be downloaded and more processing to run
it. What about mobile devices?

You are confused. What good is shorter code that is not a solution?
var isValidJSON = /^[\],:{}\s]*$/.
test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
replace(
/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
']').
replace(/(?:^|:|,)(?:\s*\[)+/g, ''))

Is this from json2.js? If yes, then it is not acceptable. To begin
with, it does not regard "\"" valid JSON even though it is.

The code is from json2.js:
http://www.json.org/json2.js

Then it must be either summarily dismissed, or updated at least as follows:

/"([^"\\]|\\.)*"|.../

because *that* is the proper way to match a double-quoted string with
optional escape sequences. Refined for JSON, it must be at least

/"([^"\\^\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"|.../
The character sequence "\"" is valid JSON value in ecmascript,

That is gibberish. Either it is JSON, or it is ECMAScript.
however in ecmascript, if enclosed in a single quote string, as - '"\""' -
the backslash would escape the double quote mark, resulting in '"""',
which is not valid JSON.

You are confused.

"\"" is both an ES string literal and JSON for the string containing "

"\\\"" is both an ES string literal and JSON for the string containing \"

"\\"" is neither an ES string literal nor JSON.
To pass a string value containing the character sequence "\"" to

But that was not the purpose of the JSON string.
JSON.parse, the backslash must be escaped. Thus, you would use:

var quoteMarkInJSONString = '"\\""';

Yes, but that is not how JSON is usually being put in. That is, the
escaping backslash is _not_ escaped then, and the characters that
quoteMarkInJSONString contains are

\"

and not

"

whereas only the latter was intended.
And that works.

A JSON string *literal* may very well contain a literal backslash character,
and it may also contain a literal double quote. The expression fails to
recognize that.
JSON.parse(quoteMarkInJSONString) == JSON.parse('"\\""') == "\""

Result: string value containing the single character: ".

JSON.parse(quoteMarkInJSONString) === "\""

You miss the point.
Number is defined as:
-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g

But this allows numbers like "2." so it can be changed to disallow that:
-?\d+(?:\.\d+)?(?:[eE][+\-]?\d+)?/g

It would still be insufficient. You simply cannot parse a context-free
non-regular language using only one application of only one non-PCRE.

The goal of json2.js's JSON.parse is not to filter out values that are
valid; it is to eliminate values that are invalid. So far, it was
noticed to fail at that in four ways and I addressed those.

You are very confused.
[...] The suggestion to use an object literal as the string to the
argument to JSON.parse is not any better than using "true".

But it is. It places further requirements on the capabilities of the
parser. An even better test would be a combination of all results of
all productions of the JSON grammar.

Cases that are known to be problematic can be filtered.

Your point being?

My point is that instead of trying every possible valid grammar check,
known bugs -- such as allowing 1. and +1 and 01, as seen in Spidermonkey
-- could be checked.

The purpose of this was to provide a viable fallback for JSON.parse().
Both your suggestion and the one in json2.js fail to do that.
Checking every possible input is not possible.

Yes, it is.


PointedEars
 

Thomas 'PointedEars' Lahn

Thomas said:
Garrett said:
Thomas said:
Garrett Smith wrote:
var isValidJSON = /^[\],:{}\s]*$/.
test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
replace(
/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g,
']').
replace(/(?:^|:|,)(?:\s*\[)+/g, ''))

Is this from json2.js? If yes, then it is not acceptable. To begin
with, it does not regard "\"" valid JSON even though it is.

The code is from json2.js:
http://www.json.org/json2.js

Then it must be either summarily dismissed, or updated at least as
follows:

/"([^"\\]|\\.)*"|.../

because *that* is the proper way to match a double-quoted string with
optional escape sequences. Refined for JSON, it must be at least

/"([^"\\^\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"|.../
^
Typo, must be

/"([^"\\\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"|…/
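
The corrected string pattern can be exercised on its own. In this sketch the `^` and `$` anchors are added so the pattern must match the whole input, and `isJSONStringLiteral` is a hypothetical name.

```javascript
// The corrected pattern from above, anchored to the whole input.
var jsonStringLiteral =
  /^"([^"\\\x00-\x1F]|\\["\\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"$/;

function isJSONStringLiteral(text) {
  return jsonStringLiteral.test(text);
}

var stringChecks = {
  escapedQuote: isJSONStringLiteral('"\\""'),       // "\"" is accepted
  unicodeEscape: isJSONStringLiteral('"a\\u0041"'), // \u0041 is accepted
  hexEscape: isJSONStringLiteral('"\\x41"'),        // \x is not a JSON escape
  rawControlChar: isJSONStringLiteral('"\u0000"'),  // raw U+0000 is rejected
  strayQuote: isJSONStringLiteral('"""')            // unescaped inner quote
};
```

Unlike the simpler `"[^"\\\n\r]*"` alternative, this pattern handles escape sequences itself, so it does not depend on a separate pass that replaces escapes first.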


PointedEars
 
