7

Currently I am using the following code to print a large data structure

print(json.dumps(data, indent=4))

I would like to see all the integers that get printed in hex instead of decimal. Is that possible? It seems that there is no way to override the existing encoder for integers. You can only provide a default for types not already handled by the JSONEncoder class, but no way to override how it encodes integers.

I figured out I can override the default integer printing behavior using sys.displayhook if I was running in the command line but I am not.

Just for reference, the data structure is a mixed bag of dicts, lists, strings, ints, etc., which is why I went with json.dumps(). The only other way I can think of doing it is to walk the structure myself, and then I would be rewriting the json module.

Update: So I ended up implementing it with serializing functions that just print a copy of the original data structure with all integer types converted to hex strings:

def odprint(self, hexify=False):
    """pretty print the ordered dictionary"""
    def hexify_list(data):
        _data = []
        for i,v in enumerate(data):
            if isinstance(v, (int,long)):
                _data.insert(i,hex(v))
            elif isinstance(v,list):
                _data.insert(i, hexify_list(v))
            else:
                _data.insert(i, v)
        return _data

    def hexify_dict(data):
        _data = odict()
        for k,v in data.items():
            if isinstance(v, (dict,odict)):
                _data[k] = hexify_dict(v)
            elif isinstance(v, (int, long)):
                _data[k] = hex(v)
            elif isinstance(v,list):
                _data[k] = hexify_list(v)
            else:
                _data[k] = v
        return _data

    if hexify:
        print(json.dumps(hexify_dict(self), indent=4))
    else:
        print(json.dumps(self, indent=4))

Thanks for the help. I realize that I end up making an odict from a standard dict, but it's just for printing, so it's fine for what I need.

6
  • Octal and hex forms are not allowed in JSON Commented Feb 1, 2012 at 19:10
  • note: your hexify_*() functions can lose data. If you go this road you could use something like Commented Feb 1, 2012 at 21:48
  • Can you explain how it can lose data? Commented Feb 1, 2012 at 23:15
  • The bare else: makes sure that it doesn't lose data, except that it erases the difference between an integer and a string of hex digits. I had overlooked that. But it doesn't convert data that it should convert, e.g., hexify_list() doesn't call hexify_dict(), and tuples are ignored. By the way, don't use .insert(i, item), use .append(item) Commented Feb 1, 2012 at 23:34
  • Makes sense. This code makes some assumptions about the data structure. (i.e. no dicts inside lists, no tuples). But I will make it more generic, in case someone decides to change the data structure. As far as .insert vs .append, why say "don't" use? Is it a performance thing? Commented Feb 2, 2012 at 1:32

7 Answers 7

2

A possible approach is to have a serialize function, which produces a copy of your dictionary on the fly and uses the standard json module to dump the string. A preliminary implementation looks like:

import json

def serialize(data):
    _data = {}
    for k, v in data.items():
        if isinstance(v, int):
            _data[k] = hex(v)
        else:
            _data[k] = v
    return json.dumps(_data, indent=4)


if __name__ == "__main__":
    data = {"a":1, "b":2.0, "c":3}
    print(serialize(data))

output:

{
    "a": "0x1", 
    "c": "0x3", 
    "b": 2.0
}

Notice that this preliminary implementation does not work with lists, but this is easily changed.
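One way that change might look is sketched below. This is not part of the original answer: the helper name `hexify` is made up for illustration, and the sketch assumes Python 3.7 or later, where plain dicts keep insertion order. It recurses into dicts, lists and tuples, and leaves booleans alone because `bool` is a subclass of `int`.

```python
import json


def hexify(value):
    """Return a copy of value with every int replaced by a hex string."""
    # bool is a subclass of int, so test it first to keep true/false intact.
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return hex(value)
    if isinstance(value, dict):
        # Keys are left untouched; only the values are converted.
        return {k: hexify(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples become lists, which is what json.dumps would emit anyway.
        return [hexify(v) for v in value]
    return value


def serialize(data):
    """Dump a hexified copy of data; the original is not modified."""
    return json.dumps(hexify(data), indent=4)
```

Because the hex values are emitted as quoted strings, the result is still valid JSON, at the cost of no longer being able to tell an integer from a string that happens to look like one.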

Some may claim that the approach is memory-intensive because it creates a copy of the original data. This may be the case, but if your data structure is that big, then maybe you should (a) not be using JSON, or (b) create a copy of the JSON module in your working directory and tailor it to your needs.

Cheers.

1 Comment

The memory argument is not valid in my case. So I like this approach. I am testing it out and trying to figure out how to make it work for lists and list of lists. My data structure is not huge but it is ugly :)
2

Octal and hexadecimal formats are not supported in JSON.

You could use YAML instead.

>>> import json, yaml
>>> class hexint(int):
...     def __str__(self):
...         return hex(self)
...
>>> json.dumps({"a": hexint(255)})
'{"a": 0xff}'
>>> yaml.load(_)
{'a': 255}

Or without wrapping integers:

import yaml

def hexint_presenter(dumper, data):
    return dumper.represent_int(hex(data))
yaml.add_representer(int, hexint_presenter)

print yaml.dump({"a": 255}), # -> {a: 0xff}
assert yaml.load('{a: 0xff}') == {"a": 255}

3 Comments

Yaml is not part of the Python installation on the server I am using, and I don't want to add the module locally for now. But this looks good.
@Plazgoth: you won't be able to load hexadecimal numbers as integers with json.
Ah, I understand your comment. I am not actually going to import the output of this as json. This is just an attempt to print the data structure to stdout in a human readable fashion. Thanks, I should have stated that in my question.
1

You can't override the existing encoder for integers...but there might be another way to get what you want. What about something like this:

import json
import re

data = {'test': 33, 'this': 99, 'something bigger':[1,2,3, {'a':44}]}  
s = json.dumps(data, indent=4)
print(re.sub(r'(\d+)', lambda i: hex(int(i.group(0))), s))

Results in:

{
    "test": 0x21,
    "this": 0x63,
    "something bigger": [
        0x1,
        0x2,
        0x3,
        {
            "a": 0x2c
        }
    ]
}

Note: This isn't especially "robust" (fails on numbers embedded in strings, floats, etc.), but might be good enough for what you want (You could also enhance the regex here so it would work in a few more cases).
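As a rough sketch of such an enhancement (not from the original answer; the names `_TOKEN`, `_hex_if_int` and `dumps_hex` are invented here), the pattern can match whole JSON string literals and whole number tokens, and the callback can rewrite only the tokens that are plain integers. Strings and floats then pass through unchanged.

```python
import json
import re

# Matches either a complete JSON string literal or a complete JSON number.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?')


def _hex_if_int(match):
    token = match.group(0)
    # String literals (keys and values) are returned untouched.
    if token.startswith('"'):
        return token
    # Only tokens made of an optional sign and digits are integers.
    if re.fullmatch(r'-?\d+', token):
        return hex(int(token))
    # Anything else is a float and is left as it is.
    return token


def dumps_hex(data, **kwargs):
    """json.dumps, then rewrite bare integers in the text as hex."""
    return _TOKEN.sub(_hex_if_int, json.dumps(data, **kwargs))
```

As with the original snippet, the output contains unquoted hex literals, so it is meant for reading and is not valid JSON.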

3 Comments

Thanks this looks promising, I'll digest it, test it and get back to you.
So this works, but it converts numbers even when they are part of a string: x86_64 becomes x0x56_0x40. I spent a few minutes toying with the regular expression to try and fix that but gave up :)
Your approach is a quick one! I've added a blank before the digits, like ' (\d+)', which skips keys like "test123". The downside is that the blank is missing from the output too. I'm still looking for something that only touches numbers that are not wrapped in "strings". Anyway, thanks!
1

You could always reparse the json, where you do have some control over int parsing, so that you can override int repr:

class hexint(int):
   def __repr__(self):
     return "0x%x" % self

json.loads(json.dumps(data), parse_int=hexint)

And using the data as in Gerrat's answer, the output is:

{u'test': 0x21, u'this': 0x63, u'something bigger': [0x1, 0x2, 0x3, {u'a': 0x2c}]}

1

One-liner

If you don't mind your hex strings quoted, use this one-liner:

print(json.dumps(eval(str(json.loads(json.dumps(data), parse_int=lambda i:hex(int(i))))), indent=4))

Output (using Gerrat's data again):

{
    "test": "0x21", 
    "this": "0x63", 
    "something bigger": [
        "0x1", 
        "0x2", 
        "0x3", 
        {
            "a": "0x2c"
        }
    ]
}

This is a better answer than my previous post as I've dealt with getting a pretty-printed result.
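A variant without eval() might look like the sketch below. It is an illustration under a few assumptions rather than the one-liner above: the function name `dumps_hex_quoted` is made up, and `object_pairs_hook=OrderedDict` is passed so that key order survives the round trip on interpreters where plain dicts do not keep insertion order.

```python
import json
from collections import OrderedDict


def dumps_hex_quoted(data, **kwargs):
    """Round-trip data through JSON, turning each integer into a hex string."""
    round_tripped = json.loads(
        json.dumps(data),
        # parse_int receives the digits as a string, e.g. "255" or "-255".
        parse_int=lambda digits: hex(int(digits)),
        # Build ordered mappings so the key order of the input is kept.
        object_pairs_hook=OrderedDict,
    )
    return json.dumps(round_tripped, **kwargs)
```

Calling `dumps_hex_quoted(data, indent=4)` gives the same kind of pretty-printed output as above, with the hex values quoted.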

1 Comment

This works ok, however it does not keep the order in the data which is an ordered dict.
1

This is admittedly not the cleanest or most elegant way to do this, but it was the quickest for me as I didn't have to look into JSONEncoder and JSONDecoder

import json
from typing import Any


def obj_to_hex(obj: Any):
    """Recursively convert integers to ascii hex"""
    if isinstance(obj, int):
        return '0x%.8x' % obj
    if isinstance(obj, dict):
        return {k: obj_to_hex(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [obj_to_hex(item) for item in obj]
    return obj


def obj_from_hex(obj: Any):
    """Recursively convert ascii hex values to integers"""
    # Check the type first so startswith() is only called on strings
    if isinstance(obj, str) and obj.startswith('0x'):
        return int(obj, 16)
    if isinstance(obj, dict):
        return {k: obj_from_hex(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [obj_from_hex(item) for item in obj]
    return obj


def json_dump_hex(obj, stream, **kwargs):
    return json.dump(obj_to_hex(obj), stream, **kwargs)


def json_dumps_hex(obj, **kwargs):
    return json.dumps(obj_to_hex(obj), **kwargs)


def json_load_hex(stream, **kwargs):
    return obj_from_hex(json.load(stream, **kwargs))


def json_loads_hex(buf, **kwargs):
    return obj_from_hex(json.loads(buf, **kwargs))

This gives you the following behavior

obj = {'base_address': 4096, 'base_length': 4096, 'mappings': {'text': 16384, 'bss': 65536}}
print(json_dumps_hex(obj, indent=2))
print(json.dumps(obj, indent=2))

Outputs:

{
  "base_address": "0x00001000",
  "base_length": "0x00001000",
  "mappings": {
    "text": "0x00004000",
    "bss": "0x00010000"
  }
}
{
  "base_address": 4096,
  "base_length": 4096,
  "mappings": {
    "text": 16384,
    "bss": 65536
  }
}

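If you really wanted to, you could patch the json module itself so that callers don't need the wrappers. Beware that this affects every call in the process, and that the wrappers above look up json.dump, json.dumps, json.load and json.loads at call time, so they must first be changed to call saved references to the original functions; otherwise each patched function ends up calling itself:

json.loads = json_loads_hex
json.load = json_load_hex
json.dump = json_dump_hex
json.dumps = json_dumps_hex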

You could probably make this a little cleaner using a decorator or a contextlib.contextmanager to reduce the clunkiness a bit
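A minimal sketch of the context-manager idea, assuming only json.dumps needs patching: the names `_to_hex` and `hex_json_dumps` are hypothetical, and the converter is repeated here so the block is self-contained. The original function is captured before patching, so the replacement never calls itself, and it is restored even if the body raises.

```python
import contextlib
import json


def _to_hex(obj):
    """Recursively convert integers (but not booleans) to ascii hex."""
    if isinstance(obj, bool):
        return obj
    if isinstance(obj, int):
        return '0x%.8x' % obj
    if isinstance(obj, dict):
        return {k: _to_hex(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [_to_hex(item) for item in obj]
    return obj


@contextlib.contextmanager
def hex_json_dumps():
    """Temporarily make json.dumps emit integers as hex strings."""
    original = json.dumps  # saved reference, so the patch cannot recurse

    def patched(obj, **kwargs):
        return original(_to_hex(obj), **kwargs)

    json.dumps = patched
    try:
        yield
    finally:
        json.dumps = original  # always restore, even after an exception
```

Inside `with hex_json_dumps():` every caller of json.dumps in the process sees the patched behavior, including other threads, so this is best kept to short, local blocks.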

Note: After seeing that json.load() and json.loads() support a parse_int callable kwarg, that is probably what most people want in most cases. An exception to this is where you want to emit JSON with hex so that another tool can read it (e.g. jq)

Note: Just realized this is roughly what @plazgoth said... oops

1 Comment

Thanks for still keeping this relevant 10 years later.
0

Dirty hack for Python 2.7; I wouldn't recommend using it:

import __builtin__

_orig_str = __builtin__.str

def my_str(obj):
    if isinstance(obj, (int, long)):
        return hex(obj)
    return _orig_str(obj)
__builtin__.str = my_str

import json 

data = {'a': [1,2,3], 'b': 4, 'c': 16**20}
print(json.dumps(data, indent=4))

Output:

{
    "a": [
        0x1,
        0x2,
        0x3
    ],
    "c": 0x100000000000000000000L,
    "b": 0x4
}

On Python 3 __builtin__ module is now builtins, but I can't test it (ideone.com fails with ImportError: libz.so.1 ...)
