I have the following data:

{"links":[{"rel":"self","href":"https://api.pjm.com"},
{"rel":"next","href":"https://api.pjm.com"},{"rel":"metadata","href":"https://api.pjm.com/api/v1/ftr_cong_lmp/metadata"}],
"items":[{"effective_day":"2020-12-01T00:00:00","terminate_day":"2020-12-31T00:00:00","pnode_name":"02AMSTED138 KV  TR2","offpeak_clmp":-0.290000,"onpeak_clmp":-0.240000,"24hour_clmp":-0.270000,"lt_sim_offpeak_clmp":-0.240000,"lt_sim_onpeak_clmp":-0.220000,"lt_sim_clmp":-0.240000},{"effective_day":"2020-12-01T00:00:00","terminate_day":"2020-12-31T00:00:00","pnode_name":"02AMSTED138 KV  TR6","offpeak_clmp":-0.290000,"onpeak_clmp":-0.240000,"24hour_clmp":-0.270000,"lt_sim_offpeak_clmp":-0.240000,"lt_sim_onpeak_clmp":-0.220000,"lt_sim_clmp":-0.240000},{"effective_day":"2020-12-01T00:00:00","terminate_day":"2020-12-31T00:00:00","pnode_name":"02CPP_NH138 KV  TR2","offpeak_clmp":0.010000,"onpeak_clmp":1.530000,"24hour_clmp":0.660000,"lt_sim_offpeak_clmp":0.010000,"lt_sim_onpeak_clmp":1.520000,"lt_sim_clmp":0.660000}],"searchSpecification":{"rowCount":25,"sort":"terminate_day","order":"Desc","startRow":1,"isActiveMetadata":true,"fields":["24hour_clmp","effective_day","lt_sim_clmp","lt_sim_offpeak_clmp","lt_sim_onpeak_clmp","offpeak_clmp","onpeak_clmp","pnode_name","terminate_day"],"filters":[{"effective_day":"2020-01-01T00:00:00.0000000 to 2020-12-31T23:59:59.0000000"}]},"totalRows":163378}'

I am trying to get the above data into a dataframe, so I am trying the following:

from io import StringIO
import pandas as pd

s = str(bytes_data, 'utf-8')  # decode the raw bytes to text
data = StringIO(s)
df = pd.read_csv(data)

But it gives me an empty dataframe with the entire data in the columns.

Edit:

The information is contained here:

{"effective_day":"2020-12-01T00:00:00","terminate_day":"2020-12-31T00:00:00","pnode_name":"02AMSTED138 KV  TR2","offpeak_clmp":-0.290000,"onpeak_clmp":-0.240000,"24hour_clmp":-0.270000,"lt_sim_offpeak_clmp":-0.240000,"lt_sim_onpeak_clmp":-0.220000,"lt_sim_clmp":-0.240000}

That is, I am trying to put the above into a dataframe whose columns are the keys of this dictionary. How do I extract only these items from my original data so I can put them in a dataframe?

  • The arrays in your sample JSON are not of the same length. Please shorten the bytestring and explain how the resulting dataframe is supposed to look like when the columns have different length. Commented Jun 4, 2020 at 10:39
  • Does the above edit answer your question? Commented Jun 4, 2020 at 10:44
  • Thanks. But now it's unclear whether you have a dictionary, a bytes like object, or something else. Also, how does the final dataframe you want look like? Commented Jun 4, 2020 at 10:46
  • I have highlighted it in bold letters to show what the final output should have. Commented Jun 4, 2020 at 10:51

1 Answer

You can eval the string data to a dictionary and use this to create a dataframe:

pd.DataFrame(eval(s)['items'])

Before doing so, you need to define the name true, which appears in the expression but is not a Python name, e.g. with true = True.

Result:

         effective_day        terminate_day  ... lt_sim_onpeak_clmp  lt_sim_clmp
0  2020-12-01T00:00:00  2020-12-31T00:00:00  ...              -0.22        -0.24
1  2020-12-01T00:00:00  2020-12-31T00:00:00  ...              -0.22        -0.24
2  2020-12-01T00:00:00  2020-12-31T00:00:00  ...               1.52         0.66

For safety reasons, however, it is recommended to use ast.literal_eval instead of eval. In this case the variable definition for true doesn't work, so you'll need to replace it manually in the string. Note that this replaces every occurrence of true, including any inside string values:

import ast
pd.DataFrame(ast.literal_eval(s.replace('true','True'))['items'])
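
As an alternative to the eval-based approaches above (this is not part of the original answer), the data is JSON, so the standard library's json module can parse it directly, and it handles true without any string replacement. The sketch below is only illustrative: it assumes bytes_data is a bytes object as in the question, and uses a shortened stand-in payload with two items and three fields.

```python
import json

import pandas as pd

# Shortened stand-in for the API response in the question (assumption:
# the real bytes_data has the same top-level layout with an "items" list).
bytes_data = (
    b'{"links":[{"rel":"self","href":"https://api.pjm.com"}],'
    b'"items":['
    b'{"pnode_name":"02AMSTED138 KV  TR2","offpeak_clmp":-0.29,"onpeak_clmp":-0.24},'
    b'{"pnode_name":"02CPP_NH138 KV  TR2","offpeak_clmp":0.01,"onpeak_clmp":1.53}],'
    b'"searchSpecification":{"isActiveMetadata":true},"totalRows":163378}'
)

# Decode the bytes and parse the JSON text into ordinary Python objects;
# JSON true becomes Python True automatically.
payload = json.loads(bytes_data.decode("utf-8"))

# "items" is a list of dictionaries, so each dictionary becomes one row
# and its keys become the column names.
df = pd.DataFrame(payload["items"])
print(df)
```

Only the "items" list is passed to pd.DataFrame, so the links and searchSpecification entries do not end up in the result.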