For input, I have a dictionary

{
"TAX:10672": "[
    {\"entity_id\":10672,\"profile_id\":20321,\"metric_type_name\":\"CAPEX\",\"metric\":null,\"perform_metric\":null},
    {\"entity_id\":10672,\"profile_id\":32583,\"metric_type_name\":\"CAPEX\",\"metric\":null,\"perform_metric\":null},
    {\"entity_id\":10672,\"profile_id\":8526,\"metric_type_name\":\"CAPEX\",\"metric\":null,\"perform_metric\":null}
]",    
"TAX:10869": "[
    {\"entity_id\":10869,\"profile_id\":20430,\"metric_type_name\":\"OPEX\",\"metric\":null,\"perform_metric\":null,},
    {\"entity_id\":10869,\"profile_id\":32692,\"metric_type_name\":\"CAPEX\",\"metric\":null,\"perform_metric\":null},
    {\"entity_id\":10869,\"profile_id\":8631,\"metric_type_name\":\"Revenue\",\"metric\":null,\"perform_metric\":null}
]"

}

In the code below, I take the dictionary values, parse each JSON string into a DataFrame, and concatenate the results into a single DataFrame with columns "entity_id", "profile_id", "metric_type_name", etc.

import pandas as pd

input_dict = ...  # sample values given above
# Parse each JSON string into its own DataFrame, then concatenate them all
temp = [pd.read_json(v) for v in input_dict.values()]
output_df = pd.concat(temp)

However, the performance is very poor: for a dictionary of 5000 entries it takes approximately 100-120 seconds.

Is there a way to improve the performance of this code further?

1 Answer

You can parse the JSON first with a fast third-party library (orjson, ujson), then pass the resulting dicts to pandas. An example using orjson:

import orjson
import pandas as pd
from itertools import chain

# Parse each JSON string once, then build a single DataFrame from all records
parsed = map(orjson.loads, input_dict.values())
output_df = pd.DataFrame(chain.from_iterable(parsed))
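
If adding a dependency is not an option, the same pattern should also work with the standard library's json module, though it will probably be slower than orjson. The likely gain in either case comes from building one DataFrame from a flat list of records instead of creating thousands of small DataFrames and concatenating them. A minimal sketch, where flatten_records and sample are hypothetical names used only for illustration and the sample data is shortened from the question:

```python
import json
from itertools import chain


def flatten_records(input_dict):
    """Parse each JSON-array string once and flatten everything into one list of row dicts."""
    parsed = map(json.loads, input_dict.values())
    return list(chain.from_iterable(parsed))


# Shortened sample in the same shape as the question's input (assumed valid JSON)
sample = {
    "TAX:10672": '[{"entity_id":10672,"profile_id":20321,"metric_type_name":"CAPEX",'
                 '"metric":null,"perform_metric":null}]',
    "TAX:10869": '[{"entity_id":10869,"profile_id":32692,"metric_type_name":"CAPEX",'
                 '"metric":null,"perform_metric":null},'
                 '{"entity_id":10869,"profile_id":8631,"metric_type_name":"Revenue",'
                 '"metric":null,"perform_metric":null}]',
}

records = flatten_records(sample)

try:
    import pandas as pd
    output_df = pd.DataFrame(records)  # a single DataFrame construction for all rows
except ImportError:
    output_df = None  # pandas is not installed; records is still a plain list of dicts
```

Swapping json.loads for orjson.loads in flatten_records gives the orjson variant without changing anything else.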

Note that each input_dict value must be a valid JSON array (no trailing commas, etc.).
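
For example, the first object under "TAX:10869" in the question's sample has a trailing comma after "perform_metric":null, which a strict parser rejects. One way to locate such entries before building the DataFrame is sketched below with the standard library's json module. The helper find_invalid_entries and the sample keys are hypothetical, and the exact error text depends on the Python version:

```python
import json


def find_invalid_entries(input_dict):
    """Return {key: reason} for every value that is not a valid JSON array."""
    problems = {}
    for key, raw in input_dict.items():
        try:
            value = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems[key] = str(exc)  # parser message, including the error position
            continue
        if not isinstance(value, list):
            problems[key] = "not a JSON array"
    return problems


# Hypothetical sample: one valid entry, one with a trailing comma, one that is not an array
sample = {
    "ok": '[{"entity_id": 1, "metric": null}]',
    "trailing_comma": '[{"entity_id": 2, "metric": null,}]',
    "not_array": '{"entity_id": 3}',
}

problems = find_invalid_entries(sample)
for key, reason in problems.items():
    print(f"{key}: {reason}")
```

Running this check once on the real input shows which keys need cleaning before the fast path is used.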
