
I have a large pandas DataFrame to convert to JSON. The standard .to_json() method does not produce a compact JSON format. How can I get JSON output like this, using pandas only?

{"index": [ 0, 1 ,3 ],
 "col1": [ "250", "1" ,"3" ],
 "col2": [ "250", "1" ,"3" ]
}

This is a much more compact form of JSON for tabular data. (I could loop over the rows, but...)

2 Answers


It seems you need to call to_dict(orient='list') first and then serialize the resulting dict with json.dumps:

import pandas as pd

df = pd.DataFrame({"index": [ 0, 1 ,3 ],
 "col1": [ "250", "1" ,"3" ],
 "col2": [ "250", "1" ,"3" ]
})
print (df)
  col1 col2  index
0  250  250      0
1    1    1      1
2    3    3      3


print (df.to_dict(orient='list'))
{'col1': ['250', '1', '3'], 'col2': ['250', '1', '3'], 'index': [0, 1, 3]}

import json

print (json.dumps(df.to_dict(orient='list')))
{"col1": ["250", "1", "3"], "col2": ["250", "1", "3"], "index": [0, 1, 3]}

The two steps are needed because to_json does not implement orient='list':

print (df.to_json(orient='list'))

ValueError: Invalid value 'list' for option 'orient'
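
For comparison, the closest built-in option is probably orient='split', one of the values to_json does accept. The sketch below assumes a small example frame. Note that 'split' stores the data as a list of rows rather than one list per column, so it is compact but not the exact layout asked for in the question.

```python
import json
import pandas as pd

# Example frame assumed for illustration.
df = pd.DataFrame({"col1": [250, 1, 3], "col2": [250, 1, 3]})

# orient="split" writes the column names once and each row as a plain list,
# so keys are not repeated for every cell.
split_json = df.to_json(orient="split")

# Parse it back to inspect the three top-level keys.
parsed = json.loads(split_json)
print(parsed["columns"])
print(parsed["index"])
print(parsed["data"])
```

If the consumer of the JSON can accept rows instead of columns, this avoids the detour through to_dict and the json module entirely.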

EDIT:

If the index is not already a column, call reset_index first:

df = pd.DataFrame({"col1": [250, 1, 3],
                   "col2": [250, 1, 3]})
print (df)
   col1  col2
0   250   250
1     1     1
2     3     3

print (df.reset_index().to_dict(orient='list'))
{'col1': [250, 1, 3], 'index': [0, 1, 2], 'col2': [250, 1, 3]}
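
The steps above can be wrapped in a small helper. This is only a sketch: the function name to_compact_json and its include_index parameter are made up for illustration and are not part of pandas. It also passes separators=(",", ":") to json.dumps, which drops the spaces after commas and colons for slightly smaller output.

```python
import json
import pandas as pd

def to_compact_json(df: pd.DataFrame, include_index: bool = True) -> str:
    """Hypothetical helper: serialize a frame as one JSON list per column."""
    # reset_index turns the index into a regular column (named "index"
    # when the index itself has no name).
    frame = df.reset_index() if include_index else df
    # separators=(",", ":") removes the default whitespace between items.
    return json.dumps(frame.to_dict(orient="list"), separators=(",", ":"))

df = pd.DataFrame({"col1": [250, 1, 3], "col2": [250, 1, 3]})
compact = to_compact_json(df)
print(compact)
```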

3 Comments

You are always faster ;-)
If your data may contain NaN values, to_dict + json.dumps cannot write them as null the way to_json does. The output will contain the bare token NaN, which is not valid JSON.
@PascalH. - then convert NaN to None so that json.dumps writes null - print (json.dumps(df.mask(df.isna(), None).to_dict(orient='list')))
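
A version-independent way to handle the NaN case from these comments is to do the replacement on the plain Python lists after to_dict, rather than inside the DataFrame. This is a sketch with an assumed example frame, not the commenter's method. It uses allow_nan=False so that json.dumps raises a ValueError instead of silently writing NaN if anything is missed.

```python
import json
import pandas as pd

# Example frame with missing values, assumed for illustration.
df = pd.DataFrame({"col1": [250.0, float("nan"), 3.0],
                   "col2": ["a", None, "c"]})

columns = df.to_dict(orient="list")

# Replace every missing value (NaN or None) with None so it becomes JSON null.
cleaned = {name: [None if pd.isna(v) else v for v in values]
           for name, values in columns.items()}

# allow_nan=False makes json.dumps fail loudly if a NaN slipped through.
result = json.dumps(cleaned, allow_nan=False)
print(result)
```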

You can use to_dict and the json module (and, if required, add the index as an extra column via assign):

import json
import pandas as pd

df = pd.DataFrame({"col1": [250, 1, 3],
                   "col2": [250, 1, 3]})

json_dict = df.assign(index=df.index).to_dict(orient="list")
print(json.dumps(json_dict))

{"col1": [250, 1, 3], "col2": [250, 1, 3], "index": [0, 1, 2]}
