I'm attempting to look up the word "flower" in Google's dictionary semi-api. Source:

https://gist.github.com/DelvarWorld/0a83a42abbc1297a6687

Long story short, I'm making a JSONP request with a callback parameter, then pulling the JSON out of the response with a regex.

But it hits this snag:

undefined:1
ple","terms":[{"type":"text","text":"I stopped to buy Bridget some \x3cem\x3ef
                                                                    ^
SyntaxError: Unexpected token x
    at Object.parse (native)

Google is serving me escaped HTML characters, which is fine, but JSON.parse cannot handle them? What's weirding me out is that this works just fine:

$ node

> JSON.parse( '{"a":"\x3cem"}' )
  { a: '<em' }

I don't get why my thingle is crashing

Edit: These are all nice informational responses, but none of them help me get rid of the stack trace.

1
  • Take a look at string in json.org Commented Jul 31, 2013 at 3:59

4 Answers

2

\xHH is not part of JSON, but is part of JavaScript. It is equivalent to \u00HH. Since the built-in JSON doesn't seem to support it and I doubt you'd want to go through the trouble of modifying a non-built-in JSON implementation, you might just want to run the code in a sandbox and collect the resulting object.
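A minimal sketch of the sandbox idea, assuming Node's built-in `vm` module. The helper name `parseLooseJson` and the 100 ms timeout are hypothetical choices for illustration. Note that the Node documentation says `vm` is not a security mechanism, so treat this as a convenience for text you already mostly trust, not as real isolation.

```javascript
const vm = require('vm');

// Hypothetical helper: evaluate a "nearly JSON" object literal as JavaScript
// in a fresh context that is given no globals, with a short timeout.
function parseLooseJson(text) {
  // Wrapping in parentheses makes the text parse as an expression, not a block.
  const value = vm.runInNewContext('(' + text + ')', {}, { timeout: 100 });
  // Round-trip through JSON so the result is plain data from this context.
  return JSON.parse(JSON.stringify(value));
}

// The doubled backslashes put a literal \x3c into the string,
// the same way the response text contains it.
const raw = '{"text":"some \\x3cem\\x3eflowers"}';
console.log(parseLooseJson(raw).text); // some <em>flowers
```

Because the text is evaluated as JavaScript, the \xHH escapes are decoded by the JavaScript parser rather than by JSON.parse.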

2 Comments

Another hack for if you need to parse a “nearly JSON” structure is to replace \x with \u00 before parsing. This is slightly safer as it avoids eval'ing.
@bobince: Right; that's sort of why I included the “\xHH is equivalent to \u00HH” bit. The problem with that is that you have to be careful of other escapes, e.g., don't change \\xHH (which is the literal text \xHH) into \\u00HH (the literal text \u00HH). I, too, agree that evaling is usually undesirable, but if you do it in a sandbox without access to…almost anything, with a timeout, it should be safe.
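The replacement hack from these comments could be sketched like this. The helper name `fixHexEscapes` is hypothetical. The regex consumes every backslash pair in one pass, so an escaped backslash followed by xHH is left alone, which is the pitfall described above.

```javascript
// Hypothetical helper: rewrite \xHH as \u00HH so JSON.parse accepts the text.
function fixHexEscapes(text) {
  // Each match is a backslash plus the character it escapes. Only the
  // \xHH form captures hex digits; every other escape (including \\)
  // is returned unchanged.
  return text.replace(/\\(?:x([0-9a-fA-F]{2})|[\s\S])/g, (match, hex) =>
    hex === undefined ? match : '\\u00' + hex
  );
}

// Raw text: {"a":"\x3cem \\x41"}  (a real escape, then a literal backslash)
const raw = '{"a":"\\x3cem \\\\x41"}';
console.log(JSON.parse(fixHexEscapes(raw)).a); // <em \x41
```

This only addresses \xHH; any other non-JSON syntax in the response would still make JSON.parse throw.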
0

According to http://json.org, a character in a JSON string may be:

  • any Unicode character except " or \ or a control character
  • \"
  • \\
  • \/
  • \b
  • \f
  • \n
  • \r
  • \t
  • \u four-hex-digits

So according to that list, the "JSON" you are getting is malformed at \x3c.

0

The reason it works is that these two are equivalent:

JSON.parse( '{"a":"\x3cem"}' )

and

JSON.parse( '{"a":"<em"}' )

Your string is already decoded by the time it reaches JSON.parse: because it is a JavaScript string literal, \x3cem is actually <em.

Now, \xHH is valid in JavaScript but not in JSON. According to http://json.org/, the only characters you can have after a \ are "\/bfnrtu.
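To see the difference in the Node REPL, compare a literal whose escape is decoded by JavaScript with one where the backslash actually survives into the string. This is a small illustrative sketch; `tryParse` is a hypothetical helper that just reports the error name instead of throwing.

```javascript
// Hypothetical helper: return the parsed value, or the error's name on failure.
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    return e.name;
  }
}

// JavaScript decodes \x3c while reading the literal, so JSON.parse
// receives {"a":"<em"} with no backslash in it.
const decoded = '{"a":"\x3cem"}';
console.log(decoded.includes('\\')); // false
console.log(tryParse(decoded)); // { a: '<em' }

// Doubling the backslash keeps it in the string, which matches the
// text that arrives over the network.
const raw = '{"a":"\\x3cem"}';
console.log(raw.includes('\\')); // true
console.log(tryParse(raw)); // SyntaxError
```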

0

The answer is correct, but it needs a couple of modifications. You might want to try this one: https://gist.github.com/Selmanh/6973863
