Encoding JSON response to CSV in Python

Question

I am trying to convert JSON to CSV in Python with pandas, which is supposed to be easy, but the output isn't close to right. Example JSON:

{'energy': {'timeUnit': 'DAY', 'unit': 'Wh', 'measuredBy': 'INVERTER', 'values': [{'date': '2022-01-01 00:00:00', 'value': 322.0}, {'date': '2022-01-02 00:00:00', 'value': 12.0}, {'date': '2022-01-03 00:00:00', 'value': 0.0}]}}

With the following code:

data = r.json()

print(data)

json_object = json.dumps(data)

json_object

with open(r'\\shared\AppDev\Production\data\solaredge\import.json','w') as n:
 n.write(json_object)

df = pd.read_json(r'\\shared\AppDev\Production\data\solaredge\import.json')
df.to_csv(r'\\shared\AppDev\Production\data\solaredge\import.csv', index = None)

Produces

energy INVERTER DAY Wh [{'date': '2022-01-01 00:00:00', 'value': 322.0}, {'date': '2022-01-02 00:00:00', 'value': 12.0}, {'date': '2022-01-03 00:00:00', 'value': 0.0}]

It appears the inner portion of the JSON hasn't been parsed at all. Am I missing something obvious? I am considering stripping most of the content out manually with string functions, but it seems like there has to be an easier way.

Not an answer to your question, but you can reduce the code here by passing the JSON string directly to pandas: df = pd.read_json(json_object). There is no need to save it to a file first. See pandas.pydata.org/pandas-docs/version/1.1.3/reference/api/… for more details.
– Code-Apprentice, Commented Jan 19, 2022 at 19:39
CSV files and most DataFrames are flat, two-dimensional views of data. JSON with nested objects introduces more dimensions, which is why the nested items appear unprocessed: they cannot be represented in a flat layout as they are. You need to decide how to flatten this hierarchy.
– MYousefi, Commented Jan 19, 2022 at 19:47
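To illustrate the two comments above, here is a minimal sketch that parses the JSON in memory instead of going through a file, and shows that the nested list ends up as a single cell. The StringIO wrapper is my own addition (newer pandas releases warn when a literal JSON string is passed), and the variable names are only illustrative.

```python
import json
from io import StringIO

import pandas as pd

# Sample payload copied from the question.
data = {'energy': {'timeUnit': 'DAY', 'unit': 'Wh', 'measuredBy': 'INVERTER',
                   'values': [{'date': '2022-01-01 00:00:00', 'value': 322.0},
                              {'date': '2022-01-02 00:00:00', 'value': 12.0},
                              {'date': '2022-01-03 00:00:00', 'value': 0.0}]}}

# Serialize to a JSON string and read it back without touching the disk.
json_text = json.dumps(data)
df = pd.read_json(StringIO(json_text))

# One column ('energy') whose rows are the inner keys; the 'values' row
# holds the whole list of dicts as a single object, which is why the CSV
# shows it unparsed.
print(df)
print(type(df.loc['values', 'energy']))
```

This only reproduces the problem in fewer lines; the nested list still has to be flattened explicitly.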

MYousefi · Accepted Answer · 2022-01-19 20:15:09Z

Here's an example of something that processes the values column.

import pandas as pd
data = {'energy': {'timeUnit': 'DAY', 'unit': 'Wh', 'measuredBy': 'INVERTER', 'values': [{'date': '2022-01-01 00:00:00', 'value': 322.0}, {'date': '2022-01-02 00:00:00', 'value': 12.0}, {'date': '2022-01-03 00:00:00', 'value': 0.0}]}}
# Build the frame from the inner 'energy' dict; the scalar fields repeat on every row.
energy = pd.DataFrame(data['energy'])
# Expand each {'date', 'value'} dict into columns and swap them in for 'values'.
flat = pd.concat((energy.drop('values', axis=1), energy['values'].apply(pd.Series)), axis=1)
print(flat)

I drop the first level by building the DataFrame from the 'energy' key of the dictionary. This produces a frame in which the timeUnit, unit and measuredBy values are repeated for each {date, value} dictionary in the values column.

Next, applying pd.Series to the values column creates a new table with two columns, date and value. Finally, the old values column is dropped and replaced by the date and value columns. All of this is done with drop, apply(pd.Series) and pd.concat.
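The steps described above can also be spelled out one at a time, which makes the intermediate frames easier to inspect. This is only a sketch of the same approach; the names expanded and flat are my own.

```python
import pandas as pd

# Sample payload copied from the question.
data = {'energy': {'timeUnit': 'DAY', 'unit': 'Wh', 'measuredBy': 'INVERTER',
                   'values': [{'date': '2022-01-01 00:00:00', 'value': 322.0},
                              {'date': '2022-01-02 00:00:00', 'value': 12.0},
                              {'date': '2022-01-03 00:00:00', 'value': 0.0}]}}

# Step 1: one row per entry in 'values'; the scalar fields are broadcast.
energy = pd.DataFrame(data['energy'])

# Step 2: turn each {'date': ..., 'value': ...} dict into its own columns.
expanded = energy['values'].apply(pd.Series)

# Step 3: drop the nested column and put the expanded columns beside the rest.
flat = pd.concat([energy.drop('values', axis=1), expanded], axis=1)

print(list(flat.columns))
```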

The result should look like this:

    timeUnit    unit    measuredBy  date                    value
0   DAY         Wh      INVERTER    2022-01-01 00:00:00     322.0
1   DAY         Wh      INVERTER    2022-01-02 00:00:00     12.0
2   DAY         Wh      INVERTER    2022-01-03 00:00:00     0.0
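As a possible alternative (not part of the answer above), pandas also provides pd.json_normalize, which can flatten this kind of structure in one call. The sketch below assumes pandas 1.0 or later; note that the column order differs from the table above, with date and value coming first. Calling to_csv without a path returns the CSV text, which is used here only to avoid writing to a real location.

```python
import pandas as pd

# Sample payload copied from the question.
data = {'energy': {'timeUnit': 'DAY', 'unit': 'Wh', 'measuredBy': 'INVERTER',
                   'values': [{'date': '2022-01-01 00:00:00', 'value': 322.0},
                              {'date': '2022-01-02 00:00:00', 'value': 12.0},
                              {'date': '2022-01-03 00:00:00', 'value': 0.0}]}}

# record_path names the nested list to expand into rows;
# meta lists the parent-level fields to repeat on each row.
flat = pd.json_normalize(
    data['energy'],
    record_path='values',
    meta=['timeUnit', 'unit', 'measuredBy'],
)

# With no path argument, to_csv returns the CSV as a string.
csv_text = flat.to_csv(index=False)
print(csv_text)
```

To write the file from the question instead, pass the target path as the first argument to to_csv.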
