Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Newline/Paragraph separators not escaped for use in JSONP #1132

Closed
tidoust opened this issue May 11, 2012 · 14 comments
Closed

Newline/Paragraph separators not escaped for use in JSONP #1132

tidoust opened this issue May 11, 2012 · 14 comments

Comments

@tidoust
Copy link

tidoust commented May 11, 2012

For better or for worse, JSON is not a true subset of JavaScript as two characters sneaked their way into JSON:

  • \u2028: line separator
  • \u2029: paragraph separator

For JavaScript, these two characters are considered to be the same as \n. That's transparent most of the time, but JSONP is all about returning JSON wrapped into JavaScript and, as such, the JSON that gets wrapped must conform to the JavaScript language. JSON.stringify does not guarantee that (well, it depends on the parser used, but the one in node.js does not).

In particular, the following code will trigger a JavaScript exception if fetched from within a Web browser:

var express = require('express');
var app = express.createServer();

app.enable('jsonp callback');
app.get('/', function (req, res) {
 res.json({ property: 'hello\u2028world' }, { 'Content-Type': 'text/javascript' });
});
app.listen(5000);

The separator characters need to be escaped before they are returned, with code such as:
body.replace(/\u2028/g, '\\u2028').replace(/\u2029/g, '\\u2029')

See JSON: The JavaScript subset that isn't for a good summary of the problem.

@tj
Copy link
Member

tj commented May 11, 2012

it uses JSON.stringify(), so I dont want to mess with anything it does, if there really is a problem it's with v8

@tj tj closed this as completed May 11, 2012
@tidoust
Copy link
Author

tidoust commented May 11, 2012

Er, not sure if I was clear enough. JSON.stringify() does the right thing: it does not escape \U+2028 and \U+2029 because they do not need to be escaped per the JSON spec. JSON.stringify() return a string that is perfectly valid in JS. That's not the problem.

The problem arises when this string is wrapped in JS code for use in JSONP. In that case, the client receives JavaScript, not JSON, and uses the JavaScript parser. When that parser bumps into the U+2028 character, it considers it's a newline and as newlines are not allowed in the middle of strings in JavaScript it throws an exception.

In short, the problem lies with the JSONP mode in Express, not with JSON.stringify().

@tj
Copy link
Member

tj commented May 11, 2012

ohhh sorry somehow I missed the P on JSON haha

@tj tj reopened this May 11, 2012
@c4milo
Copy link

c4milo commented Aug 22, 2012

I'm running into this issue as well right now @visionmedia

@tj
Copy link
Member

tj commented Aug 22, 2012

I wont have time to look at it right away, I head out of town tomorrow morning

@c4milo
Copy link

c4milo commented Aug 22, 2012

@c4milo
Copy link

c4milo commented Aug 23, 2012

But, we have the case where the JSON returned by expressjs through res.json is not escaping \u2028 and \u2029 and it's breaking other applications. I think having body.replace(/\u2028/g, '\\u2028').replace(/\u2029/g, '\\u2029'); in res.json would be pretty useful and safe.

@dougwilson
Copy link
Contributor

It looks like the "bug" will not be fixed in v8, because fixing it would violate es5. Using the replace solution from @c4milo would be the correct solution, then.

body.replace(/\u2028/g, '\\u2028').replace(/\u2029/g, '\\u2029');

Or perhaps the following (not sure which is faster):

body.replace(/[\u2028\u2029]/g, function (c) { return '\\u' + c.charCodeAt(0).toString(16); });

@c4milo
Copy link

c4milo commented Dec 28, 2012

@visionmedia any update?

@tj tj closed this as completed in 64a2349 Dec 28, 2012
@dougwilson
Copy link
Contributor

@c4milo @visionmedia the issue doesn't actually affect JSON. Those characters do not need to be escaped in JSON, only in JavaScript. Since JSONP is technically JavaScript, that's why it is a problem there.

// Works (JSON):
console.log(JSON.parse(JSON.stringify(["test\u2028ing"])))

// Doesn't work (JSONP):
eval('console.log(' + JSON.stringify(["test\u2028ing"]) + ')')

Putting the replaces in res.json is unnecessary and would just impact performance for no reason, i.m.o.

@dougwilson
Copy link
Contributor

@c4milo Your example there was not even JSON. JSON is always a string, but you gave an object as an argument to JSON.parse which is what the error is from. You meant to write JSON.parse("{\"foo\":\"h\u2028i\"}") which works, because the argument to JSON.parse is a string.

@c4milo
Copy link

c4milo commented Dec 28, 2012

LOL, yeah you are totally right. Gah, that's what happen when you work from home when two kids are visiting

@dougwilson
Copy link
Contributor

Anyway, the thing is JSON.stringify in v8 does not escape the two characters, which is correct according to JSON spec. The issue is only when the JSON is inserted into JavaScript code, which is only when performing a JSONP response (not a direct JSON response, which is what the stock res.json does). Thus only res.jsonp needs the replacement done on the JSON string. Your example res.json override above is doing double-duty with JSON and JSONP, which the stock res.json does not do, thus why the replacements are not needed within res.json.

@c4milo
Copy link

c4milo commented Dec 28, 2012

yes sir, no need for that middleware at all.

ajhyndman added a commit to ajhyndman/json-loader that referenced this issue Apr 2, 2016
I think that the Webpack JSON loader seems like the right place to handle 
this issue.

For whatever reason, the JSON and Javascript specifications disagree on
whether or not strings can contain the unicode Newline or Paragraph 
characters.

In the case that a JSON object containing one of these characters is being
printed to a script tag, we should escape them.

See also, this discussion:
expressjs/express#1132 (comment)
laughinghan pushed a commit to ajhyndman/json-loader that referenced this issue Dec 7, 2016
I think that the Webpack JSON loader seems like the right place to handle 
this issue.

For whatever reason, the JSON and Javascript specifications disagree on
whether or not strings can contain the unicode Newline or Paragraph 
characters.

In the case that a JSON object containing one of these characters is being
printed to a script tag, we should escape them.

See also, this discussion:
expressjs/express#1132 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants