
I have an index.html that sends some text to a PHP script. The PHP script forwards it by POST (curl) to a Node.js server, embedded in a UTF-8-encoded JSON message.

//Node.js server file (app.js) -- gets the json and shows it in a <script> to save it in client JS
render(index, {json:{string:"mystring"}})

//Template to render (index.ejs)
var data = <%=JSON.stringify(json)%>;

This lets me pass the variables in the JSON to data. The real JSON is much bigger than shown here; I included only the part that triggers the bug: the string it contains causes an "Invalid character" JS error. What should I do? Which encoding, decoding, or escaping should I use?

I use UTF-8 everywhere, and all my other strings work, even those with German or Arabic characters. In this particular case, it is the "mystring" value below that breaks the app:

[Image: the string, with the problematic characters circled in red]

If I remove the characters in the red circles, it works.

Here is the string as it appears in the JSON I receive:

"Otto\nTheater-, Konzert- und Gpb\n\u2028\u2028Rhoasse\u00dfe 20\u2028\n51065 K\u00f6ln\n\nTelefon: 0000-000000-0\u2028\nTelefax: 0000-000000\n\nE-Mail: [email protected]\u2028"

Because it is user-entered text, I must handle this kind of character. I don't have access to the PHP part of the code, only to the Node.js and client-side JS. How can I find and remove or convert those characters in JS?

5
  • If you want help, post the exact code, the exact result (not "HTML like this"), and the expected result. Maybe you use a non-UTF-8 encoding, maybe you forgot to add some quotes, or maybe the template engine is misconfigured; there are many possible causes. Commented Aug 1, 2014 at 9:25
  • I am pretty sure the problem comes from this string as if I remove it, it works fine. I also found which characters are causing problems, but still don't know what to do. I am going to edit the question with it Commented Aug 1, 2014 at 9:30
  • Can you supply an example "mystring" that shows the behavior? Also what node version? Commented Aug 1, 2014 at 11:13
  • stackoverflow.com/questions/2965293/… Commented Aug 1, 2014 at 11:29
  • Yes, as soon as I had found the offending character code, I found this Stack Overflow answer, which helped me. Commented Aug 1, 2014 at 12:14

2 Answers

7
<%- JSON.stringify(data).replace(/[\u0000\u00ad\u0600-\u0604\u070f\u17b4\u17b5\u200c-\u200f\u2028-\u202f\u2060-\u206f\ufeff\ufff0-\uffff]/g, "\\n") %>;

I ended up replacing the problematic Unicode characters (which are valid in JSON but not in JS code) with line breaks. This solves the problem.


3

JSON is commonly thought to be a subset of JavaScript, but historically it wasn't quite. Due to an unfortunate oversight, the raw characters U+2028 and U+2029 are permitted in JSON string literals, but were not permitted in JavaScript string literals before ES2019. Older engines treat them as line terminators, so having one in a string literal is a syntax error.

Consequently this:

var data = <%=JSON.stringify(json)%>;

isn't safe. You can make it so by manually replacing them with string-literal-escaped versions:

JSON.stringify(json).replace(/\u2028/g, '\\u2028').replace(/\u2029/g, '\\u2029')
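As a minimal sketch, the escaping can be wrapped in a small helper. The name toScriptSafeJson is hypothetical, and the sample string is a shortened version of the one in the question. The patterns are regular expressions with the g flag because a plain string pattern would replace only the first occurrence, and the sample contains several.

```javascript
// Hypothetical helper: serialize a value so the result can be pasted
// into a <script> block as a JavaScript literal.
function toScriptSafeJson(value) {
  return JSON.stringify(value)
    .replace(/\u2028/g, '\\u2028')  // LINE SEPARATOR
    .replace(/\u2029/g, '\\u2029'); // PARAGRAPH SEPARATOR
}

// Shortened version of the string from the question.
const sample = { string: 'Otto\n\u2028\u2028Rhoasse\u00dfe 20\u2028\n51065 K\u00f6ln' };
const literal = toScriptSafeJson(sample);

// No raw separators are left in the generated source text...
console.log(/[\u2028\u2029]/.test(literal)); // false
// ...and the data itself is unchanged once parsed.
console.log(JSON.parse(literal).string === sample.string); // true
```

In the template this would presumably be used with unescaped output, for example `var data = <%- toScriptSafeJson(json) %>;`, assuming the helper is made available to the view.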

Typically it's best to avoid this kind of problem, and keep code and data strictly separated, by dropping the JSON data into an HTML data- attribute. It can then be read out of the DOM from the client-side script and passed through JSON.parse. Then the only kind of escaping you have to worry about is normal HTML-escaping, which hopefully your templating language does by default.
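A rough sketch of the data- attribute approach follows. The helper names (escapeHtmlAttribute, renderDataHolder), the element id app-data, and the attribute name data-json are all assumptions made for illustration; in practice the template engine's own HTML escaping would normally replace the hand-written helper.

```javascript
// Server side (Node.js). Hypothetical helper that HTML-escapes text for use
// inside a double-quoted attribute value.
function escapeHtmlAttribute(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: build an element that carries the JSON as data,
// not as code. The id and attribute name are arbitrary choices.
function renderDataHolder(value) {
  return '<div id="app-data" data-json="' +
    escapeHtmlAttribute(JSON.stringify(value)) + '"></div>';
}

console.log(renderDataHolder({ string: 'a "quoted" \u2028 value' }));

// Client side (browser), shown as comments because it needs a DOM:
//   var holder = document.getElementById('app-data');
//   var data = JSON.parse(holder.getAttribute('data-json'));
```

Because the JSON never passes through the JavaScript parser as source text, U+2028 and U+2029 need no special handling here; JSON.parse accepts them.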

The other characters in your answer are actually okay for JS string literals, except for the control characters, which JSON also escapes.

It may well make sense to remove some of these characters anyway, as an input filtering step. It's unusual and almost always undesirable to have cruft like U+2028 in your data. You could consider filtering out the characters unsuitable for use in markup which include U+2028/9 and other bad things like bidi overrides that can mess up your page rendering.
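Such a filter might look like the sketch below. The function name and the exact character set are assumptions: it drops U+2028/U+2029 plus the bidi embedding, override, and isolate controls (U+202A to U+202E and U+2066 to U+2069), and deliberately leaves U+200E/U+200F alone, since legitimate right-to-left text may use them.

```javascript
// Hypothetical input filter; the character set is an illustrative choice,
// not an exhaustive list of characters unsuitable for markup.
const UNWANTED_CHARACTERS = /[\u2028\u2029\u202a-\u202e\u2066-\u2069]/g;

function stripUnwantedCharacters(text) {
  return text.replace(UNWANTED_CHARACTERS, '');
}

const raw = 'Rhoasse\u00dfe 20\u2028\n51065 K\u00f6ln\u202e';
console.log(JSON.stringify(stripUnwantedCharacters(raw)));
// → "Rhoasseße 20\n51065 Köln"
```

Running this on the Node.js side before rendering would keep the stored or forwarded data clean, independently of how it is later embedded in the page.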
