I have the following nested JSON file, which I need to convert to a pandas DataFrame. The main problem is that there is only one unique item in the whole JSON, and it is very deeply nested.
I tried to solve this with the code shown further below, but it produces repeated output.
[{
    "questions": [{
            "key": "years-age",
            "responseKey": null,
            "responseText": "27",
            "responseKeys": null
        },
        {
            "key": "gender",
            "responseKey": "male",
            "responseText": null,
            "responseKeys": null
        }
    ],
    "transactions": [{
        "accId": "v1BN3o9Qy9izz4Jdz0M6C44Oga0qjohkOV3EJ",
        "tId": "80o4V19Kd9SqqN80qDXZuoov4rDob8crDaE53",
        "catId": "21001000",
        "tType": "80o4V19Kd9SqqN80qDXZuoov4rDob8crDaE53",
        "name": "Online Transfer FROM CHECKING 1200454623",
        "category": [
            "Transfer",
            "Acc Transfer"
        ]
    }],
    "institutions": [{
        "InstName": "Citizens company",
        "InstId": "inst_1",
        "accounts": [{
            "pAccId": "v1BN3o9Qy9izz4Jdz0M6C44Oga0qjohkOV3EJ",
            "pAccType": "depo",
            "pAccSubtype": "check",
            "_id": "5ad38837e806efaa90da4849"
        }]
    }]
}]
I need to convert this to a pandas DataFrame that looks like this:
id                        pAccId                                 tId
5ad38837e806efaa90da4849  v1BN3o9Qy9izz4Jdz0M6C44Oga0qjohkOV3EJ  80o4V19Kd9SqqN80qDXZuoov4rDob8crDaE53
The main problem I am facing is with the "id" column ("_id" in the JSON): it is very deeply nested, and it is the only unique key in the JSON.
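For reference, one possible way to reach the nested "_id" is pandas.json_normalize with a record_path that walks institutions and then accounts. The sketch below is only a guess at the intended result: it assumes that a transaction belongs to an account when its "accId" equals the account's "pAccId" (true for the sample above, but not something the data guarantees), and it uses a trimmed copy of the sample instead of reading sub.json.

```python
import pandas as pd

# Trimmed copy of the sample JSON above (only the fields used here).
data = [{
    "transactions": [{
        "accId": "v1BN3o9Qy9izz4Jdz0M6C44Oga0qjohkOV3EJ",
        "tId": "80o4V19Kd9SqqN80qDXZuoov4rDob8crDaE53",
    }],
    "institutions": [{
        "InstId": "inst_1",
        "accounts": [{
            "pAccId": "v1BN3o9Qy9izz4Jdz0M6C44Oga0qjohkOV3EJ",
            "pAccType": "depo",
            "_id": "5ad38837e806efaa90da4849",
        }],
    }],
}]

# One row per account: descend into institutions, then into accounts.
accounts = pd.json_normalize(data, record_path=["institutions", "accounts"])

# One row per transaction.
transactions = pd.json_normalize(data, record_path=["transactions"])

# ASSUMPTION: accId on a transaction refers to pAccId on an account.
df = (
    accounts.merge(transactions, left_on="pAccId", right_on="accId", how="left")
    .rename(columns={"_id": "id"})[["id", "pAccId", "tId"]]
)
print(df)
```

With how="left", accounts that have no matching transaction are kept with a missing tId; whether that is wanted depends on the real data.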
Here is my code:
import pandas as pd
import json

with open('sub.json') as f:
    data = json.load(f)

csv = ''
for k in data:
    for t in k.get("institutions"):
        # Each field is read from the first institution/account/transaction
        csv += k['institutions'][0]['accounts'][0]['_id']
        csv += "\t"
        csv += k['institutions'][0]['accounts'][0]['pAccId']
        csv += "\t"
        csv += k['transactions'][0]['tId']
        csv += "\t"
        csv += "\n"

# Write the tab-separated text out
with open("new_sub.csv", "w") as text_file:
    text_file.write(csv)
I hope the above code makes sense, as I am new to Python.
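As a point of comparison, here is a minimal sketch of the same loop that uses the loop variables instead of a fixed index 0, so each account is visited once. It assumes a transaction's "accId" matches an account's "pAccId" (an assumption based on the sample, not a documented rule). The helper name build_rows and the second institution in the sample are hypothetical and exist only to show that rows are not repeated.

```python
import csv
import io


def build_rows(data):
    """Return one (id, pAccId, tId) tuple per account/transaction pair."""
    rows = []
    for record in data:
        # Group transaction ids by the account they reference.
        # ASSUMPTION: accId on a transaction equals pAccId on an account.
        tids_by_account = {}
        for txn in record.get("transactions", []):
            tids_by_account.setdefault(txn["accId"], []).append(txn["tId"])

        # Walk every institution and every account, not just index 0.
        for inst in record.get("institutions", []):
            for acc in inst.get("accounts", []):
                # An account without transactions still yields one row (tId None).
                for tid in tids_by_account.get(acc["pAccId"], [None]):
                    rows.append((acc["_id"], acc["pAccId"], tid))
    return rows


# Sample: the first account is from the question; the second is hypothetical.
sample = [{
    "transactions": [{
        "accId": "v1BN3o9Qy9izz4Jdz0M6C44Oga0qjohkOV3EJ",
        "tId": "80o4V19Kd9SqqN80qDXZuoov4rDob8crDaE53",
    }],
    "institutions": [
        {"InstId": "inst_1", "accounts": [{
            "pAccId": "v1BN3o9Qy9izz4Jdz0M6C44Oga0qjohkOV3EJ",
            "_id": "5ad38837e806efaa90da4849",
        }]},
        {"InstId": "inst_2", "accounts": [{
            "pAccId": "hypothetical-acc-2",
            "_id": "hypothetical-id-2",
        }]},
    ],
}]

rows = build_rows(sample)

# Write tab-separated output to an in-memory buffer; a real file handle works the same way.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(["id", "pAccId", "tId"])
writer.writerows(rows)
print(buffer.getvalue())
```

The list of tuples can also be passed to pd.DataFrame(rows, columns=["id", "pAccId", "tId"]) if a DataFrame is needed instead of a file.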