Here is a JSON string that contains a list of objects with each having another list embedded.
[
{
"name": "Alice",
"hobbies": [
"volleyball",
"shopping",
"movies"
]
},
{
"name": "Bob",
"hobbies": [
"fishing",
"movies"
]
}
]
Using pandas.read_json() this turns into a DataFrame like this:
name hobbies
--------------------------------------
1 Alice [volleyball, shopping, movies]
2 Bob [fishing, movies]
However, I would like to flatten the lists into Boolean columns like this:
name volleyball shopping movies fishing
----------------------------------------------------
1 Alice True True True False
2 Bob False False True True
I.e. when the list contains a value, the field in the corresponding column is filled with a Boolean True, otherwise with False.
I have also looked into pandas.io.json.json_normalize(), but that does not seem support this idea either. Is there any built-in way (either Python3, or pandas) to do this?
(PS. I realize that you can cook up your own code to 'normalize' the dictionary objects before loading the whole list into a DataFrame, but I might be reinventing the wheel with that and probably in a very inefficient way).