Given a template/formatting string "{foo}_{bar}", how can I programmatically extract the required formatting keys ["foo", "bar"]?

I have dicts of parameters for various experiments

[
    {"parameters": {"foo": 1, "bar": 2}, "format": "{foo}_{bar}"},
    {"parameters": {"biz": 3}, "format": "{biz}_{baz}"}
]

As you can see, the second parameter set is missing key baz. So when I do something like

"{biz}_{baz}".format(**parameters), it raises a KeyError, because baz is missing.

I want to replace all missing parameters with "NR", and fill all available parameters with their values.

The output is then:

[
    {"parameters": {"foo": 1, "bar": 2}, "format": "{foo}_{bar}", "formatted": "1_2"},
    {"parameters": {"biz": 3}, "format": "{biz}_{baz}", "formatted": "3_NR"}
]

For context: I have 100+ strings, with no consistency between the expected parameters required for that string.

  • @jonrsharpe It feels to me that this is not an exact duplicate as its intentions are a bit different or elaborate. The title is a bit unfortunate, though. Commented Dec 13, 2020 at 21:54
  • @BramVanroy the duplicate exactly answers the bolded first sentence Commented Dec 13, 2020 at 21:55
  • @jonrsharpe Exactly, but if you read further down, this user actually has a very specific end goal in mind. I believe that that is what this question revolves around as that is the part that would actually help them a lot more than merely that first question. I'm sure that if OP put in some effort to rewrite the post to emphasize the end goal, that this can be re-opened. Commented Dec 13, 2020 at 22:00
  • @BramVanroy well maybe, but voting to open prior to their doing that seems premature, and maybe they asked about the bit they were actually stuck on. Commented Dec 13, 2020 at 22:05
  • Hi guys, the bolded sentence is all that I really needed help with. It seems the correct answer to this question is in fact the accepted answer in the linked post: stackoverflow.com/questions/22830226/…. I did a Stack Overflow search before asking the question, but somehow missed that post. Thank you both for your help! Commented Dec 13, 2020 at 22:14

1 Answer

You can gather the required parameter names from the format string, then find the missing keys by taking the set difference between the required names and the keys actually present in the parameters. If any keys are missing, add them with the value "NR". Finally, use .format to fill in the string and store the result under a "formatted" key. Note that the split used below assumes every field is separated by "_".

ds = [
    {"parameters": {"foo": 1, "bar": 2}, "format": "{foo}_{bar}"},
    {"parameters": {"biz": 3}, "format": "{biz}_{baz}"}
]

for d in ds:
    # No copy is needed here: if keys are missing, a new dict is built below,
    # so d["parameters"] is never modified in place
    params = d["parameters"]
    req_keys = set(d["format"][1:-1].split("}_{"))
    missing_keys = req_keys.difference(params.keys())

    if len(missing_keys) > 0:
        params = {**params, **{key: "NR" for key in missing_keys}}

    d["formatted"] = d["format"].format(**params)

print(ds)

# [{'parameters': {'foo': 1, 'bar': 2}, 'format': '{foo}_{bar}', 'formatted': '1_2'}, {'parameters': {'biz': 3}, 'format': '{biz}_{baz}', 'formatted': '3_NR'}]
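If the goal is only the filled-in string, it may be possible to skip extracting the keys altogether. The following is a minimal sketch, not part of the answer above: it relies on str.format_map accepting a dict subclass, and the class name NRDict is a hypothetical helper chosen for illustration.

```python
class NRDict(dict):
    """Hypothetical helper: a dict that yields "NR" for any missing key."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent
        return "NR"


ds = [
    {"parameters": {"foo": 1, "bar": 2}, "format": "{foo}_{bar}"},
    {"parameters": {"biz": 3}, "format": "{biz}_{baz}"},
]

for d in ds:
    # format_map uses the mapping directly, so __missing__ is honoured;
    # .format(**mapping) would unpack into a plain dict and lose it
    d["formatted"] = d["format"].format_map(NRDict(d["parameters"]))

results = [d["formatted"] for d in ds]
print(results)
```

This does not depend on the separator between fields, but it also never tells you which keys were missing; if that list is needed, the keys still have to be extracted explicitly.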

1 Comment

I did almost exactly this, except that I replaced req_keys = set(d["format"][1:-1].split("}_{")) with req_keys = [var for _, var, _, _ in Formatter().parse(d['format']) if var], where Formatter is imported from the string module. This avoids coupling everything to "}_{", in case the separator changes in the future.
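The Formatter-based variant from this comment can be sketched as a self-contained snippet. The function names required_keys and format_with_default are hypothetical, chosen here for illustration; string.Formatter().parse is the standard-library call, yielding (literal_text, field_name, format_spec, conversion) tuples.

```python
from string import Formatter


def required_keys(template):
    """Return the named replacement fields in `template`, in order of appearance."""
    # field_name is None for trailing literal text and "" for a bare "{}",
    # so the truthiness check skips both
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def format_with_default(template, params, default="NR"):
    """Format `template` with `params`, using `default` for any missing key."""
    # Start with the default for every required key, then overlay real values
    filled = {key: default for key in required_keys(template)}
    filled.update(params)
    return template.format(**filled)


print(required_keys("{foo}_{bar}"))
print(format_with_default("{biz}_{baz}", {"biz": 3}))
```

One caveat: for fields such as {foo.attr} or {foo[0]}, parse returns the whole field name ("foo.attr", "foo[0]"), so this sketch assumes plain keyword fields like the ones in the question.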
