11

Suppose I have a special object that holds a literal JSON string, and I want to use it as a field in a larger JSON object, emitted as the literal value itself (not as a string containing the JSON).

I want to write a custom encoder that can accomplish this, i.e.:

> encoder.encode({
>     'a': LiteralJson('{}')
> })
{"a": {}}

I don't believe subclassing JSONEncoder and overriding default will work, because the best I can do there is return the string, which would make the result {"a": "{}"}.
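
For illustration, here is a minimal sketch of that failure mode. It assumes a hypothetical LiteralJson class with a content attribute (the question does not show its definition), and NaiveEncoder is a made-up name:

```python
import json


class LiteralJson:
    """Hypothetical wrapper around an already-encoded JSON string."""

    def __init__(self, content):
        self.content = content


class NaiveEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, LiteralJson):
            # default() must return something the encoder can serialize;
            # a str is serialized as a JSON string, so it gets quoted.
            return obj.content
        return super().default(obj)


naive_result = NaiveEncoder().encode({'a': LiteralJson('{}')})
print(naive_result)  # {"a": "{}"}
```

The return value of default is fed back into the normal encoding path, which is why the raw text ends up quoted rather than spliced in.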

Overriding encode also appears not to work when the LiteralJson is nested somewhere inside another dictionary.

The background for this, if you are interested, is that I am storing JSON-encoded values in a cache, and it seems to me to be a waste to deserialize then reserialize all the time. It works that way, but some of these values are fairly long and it just seems like a huge waste.

The following encoder would accomplish what I want (but seems unnecessarily slow):

import json

class MagicEncoder(json.JSONEncoder):

    def default(self, obj):
        if isinstance(obj, LiteralJson):
            # Parse the cached JSON text so the encoder can re-serialize it as a value.
            return json.loads(obj.content)
        else:
            return json.JSONEncoder.default(self, obj)
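
As a self-contained check that this approach also covers the nested case, here is a sketch. The LiteralJson class below is again an assumption (only its content attribute is implied by the question):

```python
import json


class LiteralJson:
    """Hypothetical wrapper around an already-encoded JSON string."""

    def __init__(self, content):
        self.content = content


class MagicEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, LiteralJson):
            # Deserialize the cached text; the encoder then re-serializes it.
            return json.loads(obj.content)
        return super().default(obj)


# The LiteralJson sits two levels down, inside a dict and a list.
nested = {'outer': {'inner': [LiteralJson('{"x": [1, 2, 3]}')]}}
magic_result = MagicEncoder().encode(nested)
print(magic_result)  # {"outer": {"inner": [{"x": [1, 2, 3]}]}}
```

It works at any depth because default is called for every object the encoder cannot serialize itself, but each call pays for a full parse and re-encode of the cached text.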
3 Comments
  • I had exactly the same problem today, and I checked the Python 2.7 json module contents. Unfortunately this is impossible to achieve without forking the json module completely; all encoding happens in long nested functions. Commented Sep 13, 2012 at 10:10
  • And besides that, I think it prefers to use the C version when available, which would require even more forking. It's not a huge deal, we're talking about ns here, but it just feels wrong. Commented Sep 13, 2012 at 14:15
  • Well, it is an issue if, for example, with the upcoming PostgreSQL 9.2 you are storing actual JSON documents in the database and expect to serve them fast. I for one thought this was possible, but no. Commented Sep 14, 2012 at 10:19

2 Answers

6

I've just realised I had a similar question recently. The answer there suggested using a replacement token.

It's possible to integrate this logic more or less transparently using a custom JSONEncoder that generates these tokens internally using a random UUID. (What I've called "RawJavaScriptText" is the equivalent of your "LiteralJson".)

You can then use json.dumps(testvar, cls=RawJsJSONEncoder) directly.

import json
import uuid

class RawJavaScriptText:
    def __init__(self, jstext):
        self._jstext = jstext
    def get_jstext(self):
        return self._jstext

class RawJsJSONEncoder(json.JSONEncoder):
    def __init__(self, *args, **kwargs):
        json.JSONEncoder.__init__(self, *args, **kwargs)
        self._replacement_map = {}

    def default(self, o):
        if isinstance(o, RawJavaScriptText):
            key = uuid.uuid4().hex
            self._replacement_map[key] = o.get_jstext()
            return key
        else:
            return json.JSONEncoder.default(self, o)

    def encode(self, o):
        result = json.JSONEncoder.encode(self, o)
        # items() works on both Python 2 and 3 (iteritems() is Python 2 only).
        for k, v in self._replacement_map.items():
            result = result.replace('"%s"' % (k,), v)
        return result

testvar = {
   'a': 1,
   'b': 'abc',
   'c': RawJavaScriptText('{ "x": [ 1, 2, 3 ] }')
}

print(json.dumps(testvar, cls=RawJsJSONEncoder))

Result (using Python 2.6 and 2.7):

{"a": 1, "c": { "x": [ 1, 2, 3 ] }, "b": "abc"}

-2

it seems to me to be a waste to deserialize then reserialize all the time.

It is wasteful, but just in case anybody was looking for a quick fix this approach works fine.

Cribbing the example from Bruno:

import json

testvar = {
   'a': 1,
   'b': 'abc',
   # Parse the cached JSON text so it is re-serialized as a nested value.
   'c': json.loads('{ "x": [ 1, 2, 3 ] }')
}

print(json.dumps(testvar))

Result:

{"a": 1, "c": {"x": [1, 2, 3]}, "b": "abc"}

3 Comments

I think the problem with this approach is that LiteralJson('{}') is meant to represent an object that would be in memory, whereas json.loads('{ "x": [ 1, 2, 3 ] }') is the result of a callable, already deserialised.
@Bruno Yes, it is explicitly a serialize->deserialize->serialize sequence. Which is unpleasant, but certainly the simplest code that could possibly work.
I suppose it depends on the objective. I'm not sure what the OP was aiming to do ultimately. In my question (linked from my answer), I was trying to insert JavaScript code in a few places (in what was otherwise a JSON structure), so my "LiteralJson" equivalent wasn't actually valid JSON, hence the specific encoder (because json.loads wouldn't have worked). I presume the OP's objective was at least to avoid this serialize->deserialize->serialize chain.
