1

I need to run some tests to tune a few numeric parameters in a JSON file. To keep things simple, I replaced all of these values with the string 'variable', then did the following:

import json

numeric_vals = [10, 20, 30, 40]  # numeric values to substitute, in this order
with open('mypath') as my_file:
    json_str = my_file.read()
for i in numeric_vals:
    # replace only the first remaining placeholder with the next value
    json_str = json_str.replace('"variable"', str(i), 1)
c = json.loads(json_str)  # parse the result so it can be used as a dict

This works fine, but is there a more efficient way to do this? The values that need to be replaced are at varying depths and may be inside lists, etc. My JSON file is 15 KB and I need to test many (really many!) configurations. For each test, about 200 variables need to be replaced. I am on Python 2.7, but Python 3.5 is also an option. Thanks for your help!

EDIT:

Here is a sample of my dict. Note that the real one is far longer and deeper:

 {
"1": {
    "transition": {
        "value": "variable", # edit here
        "unite": "sec"
    },
    "sequence": [
        {
            "step": "STEP",
            "name": "abc",
            "transition": {
                "value": "variable", #edit here
                "unite": "sec"
            },
            "entity": "aa",
            "_equipement": 3,
            "_value": 0
        },
        {
            "step": "FORK",
            "BRANCHES": [
                {
                    "": {
                        "sequence": [
                            {
                                "step": "STEP",
                                "name": "klm",
                                "transition": {
                                    "value": "variable", # edit here
                                    "unite": "sec"
                                },
                                "entity": "vvv",
                                "_equipement": 10,
                                "_value": 0,
                                "conditions": [
                                    [
                                        {
                                            "name": "ddd",
                                            "type": "el",
                                            "_equipement": 7,
                                            "_value": 1
                                        }
                                    ]
                                ]
                            }
                        ]
                    }
                },
                {
                    "SP": {
                        "sequence": [
                            {
                                "step": "STEP",
                                "name": "bbb",
                                "transition": {
                                    "value": "variable", # edit here
                                    "unite": "sec"
                                },
                                "entity": "bb",
                                "_equipement": 123,
                                "_value": 0,
                                "conditions": [
                                    [
                                        {
                                            "name": "abcd",
                                            "entity": "dgf",
                                            "type": "lop",
                                            "_equipement": 7,
                                            "_valeur": 0
                                        }
                                    ]
                                ]
                            }
                        ]
                    }
                }
            ]
        }
    ]
}

}

  • Can you show a small example of what the file looks like (not the whole file, but an example with variable depths)? Also, have you tried re.sub? Commented Jan 29, 2019 at 10:44
  • re.sub seems like a good idea. No, I didn't try it; thanks for the tip. Sharing the dict in a moment. Commented Jan 29, 2019 at 10:50
  • It's useful to know whether \"variable\" is a key, a value or something else, in order to tell whether string manipulation is more efficient than simply altering the dictionary directly after you load the JSON file. Commented Jan 29, 2019 at 10:51
  • It is a value. That is why I am replacing \"variable\" with str(value), so as to turn a string value into an integer one. Commented Jan 29, 2019 at 10:53
  • So just paste one example and I'll see what to do. Also, does your dictionary have lists in it, like lists of dicts or dicts of lists at variable depths? Commented Jan 29, 2019 at 10:54

2 Answers

2

It's generally a bad idea to do string operations on hierarchical/structured data as there may be many cases where you can break the structure. Since you're already parsing your JSON you can extend the decoder to specifically deal with your case during parsing, e.g.:

import json

numeric_vals = [10, 20, 30, 40]  # numeric values to replace in that order

SEARCH = 'variable'
REPLACE = iter(numeric_vals)  # turn into an iterator for sequential access

def replace_values(value):
    # called once for every decoded JSON object (dict)
    return {k: next(REPLACE) if v == SEARCH else v for k, v in value.items()}

with open('path/to/your.json') as f:
    a = json.JSONDecoder(object_hook=replace_values).decode(f.read())

This ensures you're properly parsing your JSON and that it won't replace, for example, a key that happens to be called 'variable'.

Beware, though, that it will raise a StopIteration exception if there are more "variable" values in the JSON than there are numeric_vals. You can unravel the dict comprehension in replace_values and handle that case if you expect it to occur.
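One way to unravel the comprehension, as a minimal sketch rather than a definitive implementation: the helper names make_hook and load_with_values are hypothetical, and the sketch swaps object_hook for object_pairs_hook so that the pairs of each object are seen in document order (a plain dict in Python 2.7 does not guarantee that). A fresh iterator is created on every call, so the same JSON string can be decoded again for each list of values.

```python
import json

SEARCH = 'variable'  # placeholder string used in the question


def make_hook(values):
    """Build an object_pairs_hook that consumes `values` in order (hypothetical helper)."""
    it = iter(values)

    def hook(pairs):
        result = {}
        for key, value in pairs:  # pairs of one JSON object, in document order
            if value == SEARCH:
                try:
                    value = next(it)
                except StopIteration:
                    # turn the silent iterator exhaustion into a clear error
                    raise ValueError('more placeholders than replacement values')
            result[key] = value
        return result

    return hook


def load_with_values(json_str, values):
    """Decode json_str, replacing placeholder values with `values` (hypothetical helper)."""
    return json.loads(json_str, object_pairs_hook=make_hook(values))
```

With the file read once into json_str, each configuration would then be decoded with something like c = load_with_values(json_str, numeric_vals). Two caveats apply to this sketch and to the hook approach in general: the hook runs on the innermost objects first, so a placeholder sitting directly in a parent object before a nested object is filled after the nested one; and the hook only sees objects, so a "variable" that is a direct element of a list is left untouched.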


2 Comments

Thanks. I am checking this out; it actually seems like a very good solution.
The best part of what you proposed is that your solution works on f.read(), that is, on a string. I was actually counting on opening the file only once, loading the string into RAM, and then performing the tests (on many, many lists of numeric values), so this helps a lot in limiting disk access latency.
1

You can get a performance improvement by taking the call to json.loads() outside of the loop; you only need to do this once, at the end:

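import json

numeric_vals = [10, 20, 30, 40]

with open('mypath') as my_file:
    json_str = my_file.read()
    for i in numeric_vals:
        # the count of 1 replaces only the first remaining placeholder
        json_str = json_str.replace('"variable"', str(i), 1)
    c = json.loads(json_str)  # parse once, after all replacements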

Also, prefer str.replace over re.sub, as per this post.
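A possible variation on the string approach, offered as a sketch and not taken from this answer: every replace(..., 1) call builds a new copy of the whole string, so with about 200 placeholders it may be cheaper to split the text around the placeholder once and then join the fixed parts with each list of values. The names split_template and fill_template are hypothetical, and whether this is actually faster on a given file is worth checking with timeit.

```python
import json

PLACEHOLDER = '"variable"'  # quoted placeholder as it appears in the raw text


def split_template(json_str):
    """Split the raw JSON text once around every placeholder (hypothetical helper)."""
    return json_str.split(PLACEHOLDER)


def fill_template(parts, values):
    """Interleave the fixed text parts with the values, then parse the result."""
    if len(values) != len(parts) - 1:
        raise ValueError('expected %d values, got %d' % (len(parts) - 1, len(values)))
    pieces = [parts[0]]
    for value, part in zip(values, parts[1:]):
        pieces.append(str(value))  # the value takes the place of one placeholder
        pieces.append(part)
    return json.loads(''.join(pieces))
```

Here split_template(json_str) would be called once after reading the file, and fill_template(parts, numeric_vals) once per configuration. Unlike the loop above, this sketch raises an error when the number of values does not match the number of placeholders. Like any string-based replacement, it would also match a key that happens to be named "variable".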

2 Comments

@zwer Notice the 1 parameter at the end of the call to replace(). The OP is replacing one element at a time in each iteration.
Thanks, the json.loads indentation was a typo. The 1 parameter is there because I need to replace the first variable with the first number, then the second variable (which is now the first) with the second number, etc.
