Is there a way to replace empty strings at any position in a JSON document received from an endpoint, without knowing its structure, before inserting it into DynamoDB? DynamoDB has a known issue with floats, which must be converted to Decimals, but I can't figure out an easy way to replace an empty string such as "full_name": "" with a value like "N/A".

I'm looking for something like json.loads(json.dumps(data), parse_float=Decimal), which handles the floats, but for empty strings. Something clean and easy to use. I've seen that you can pass a custom cls class for this, but I don't quite understand how to do it properly, especially without knowing the structure of the JSON, which might vary.

JSON example:

{
  "campaign_id": "9c1c6cd7-fd4d-480b-8c80-07091cdd4103",
  "creation_date": 1530804132,
  "objects": [
     {
        "id": 12345,
        "full_name": ""
     },
     ...
  ],
  ...
}

1 Answer

You can do this by defining an object_hook to pass to json.loads.

From the docs:

object_hook is an optional function that will be called with the result of any object literal decoded (a dict). The return value of object_hook will be used instead of the dict.
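To make the quoted behaviour concrete, here is a minimal sketch showing when the hook fires. The names record, seen and doc are made up for this illustration and are not part of the question's data:

```python
import json

seen = []


def record(obj):
    # json.loads calls this once for every JSON object it finishes decoding.
    # Nested objects are completed (and passed here) before their parents.
    seen.append(obj)
    return obj


# Hypothetical sample document, only for illustration.
doc = '{"outer": {"inner": {"x": ""}}, "items": ["", {"y": ""}]}'
result = json.loads(doc, object_hook=record)

for obj in seen:
    print(obj)
```

The hook receives dicts only. The list under "items" is never passed to it directly, which is why a hook that wants to reach strings inside lists has to walk list values itself.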

Given this dict:

>>> from pprint import pprint
>>> pprint(d)
{'campaign_id': '9c1c6cd7-fd4d-480b-8c80-07091cdd4103',
 'creation_date': 1530804132,
 'float': 1.2345,
 'objects': [{'full_name': '', 'id': 12345}],
 'strs': ['', 'abc', {'a': ''}],
 'top_str': ''}

This pair of functions recurses over each dict that json.loads decodes and changes every instance of the empty string to 'N/A'.

def transform_dict(mapping=None):
    if mapping is None:
        mapping = {}
    for k, v in mapping.items():
        if v == '':
            mapping[k] = 'N/A'
        elif isinstance(v, dict):
            mapping[k] = transform_dict(v)
        elif isinstance(v, list):
            mapping[k] = transform_list(v)
        else:
            # Make it obvious that we aren't changing other values
            pass
    return mapping


def transform_list(lst):
    for i, x in enumerate(lst):
        if x == '':
            lst[i] = 'N/A'
        elif isinstance(x, dict):
            lst[i] = transform_dict(x)
        elif isinstance(x, list):
            lst[i] = transform_list(x)
        else:
            # Make it obvious that we aren't changing other values
            pass
    return lst

>>> import decimal
>>> import json
>>> res = json.loads(
...     json.dumps(d),
...     parse_float=decimal.Decimal,
...     object_hook=transform_dict,
... )
>>> pprint(res)
{'campaign_id': '9c1c6cd7-fd4d-480b-8c80-07091cdd4103',
 'creation_date': 1530804132,
 'float': Decimal('1.2345'),
 'objects': [{'full_name': 'N/A', 'id': 12345}],
 'strs': ['N/A', 'abc', {'a': 'N/A'}],
 'top_str': 'N/A'}

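Note that this approach relies on the top-level JSON value being an object ({...}): object_hook is only called for decoded objects, so empty strings that sit directly in a top-level array (or a bare top-level string) are left unchanged.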
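If the top-level value might be an array, one option is to skip object_hook and walk the decoded result once instead. This is only a sketch under that assumption: replace_empty and its placeholder parameter are hypothetical names, not something from the json module, and the sample input is made up.

```python
import json
from decimal import Decimal


def replace_empty(value, placeholder="N/A"):
    """Return a copy of value with every empty string replaced by placeholder."""
    if value == "":
        return placeholder
    if isinstance(value, dict):
        return {k: replace_empty(v, placeholder) for k, v in value.items()}
    if isinstance(value, list):
        return [replace_empty(v, placeholder) for v in value]
    # Numbers, Decimals, booleans, None and non-empty strings pass through.
    return value


# Hypothetical input whose top-level value is an array, not an object.
raw = '["", ["", 1.5], {"a": ""}]'
cleaned = replace_empty(json.loads(raw, parse_float=Decimal))
print(cleaned)
```

Unlike the hook-based version, this builds new containers rather than mutating in place, and it costs one extra pass over the data after decoding.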

1 Comment

Thanks a lot. I've managed to work around with something similar.
