How to create a sparse DataFrame from a list of dicts

Question

I create a DataFrame from a list of dicts like this:

import pandas as pd

pd.DataFrame([{"id": "a", "v0": 3, "v2": "foo"},
              {"id": "b", "v1": 1, "v4": "ouch"}]).set_index(
                  "id", verify_integrity=True)
     v0   v2   v1    v4
id                    
a   3.0  foo  NaN   NaN
b   NaN  NaN  1.0  ouch

Alas, for some inputs I run out of RAM in the DataFrame constructor, and I wonder if there is a way to make pandas produce a sparse DataFrame from the list of dicts.

Chris Watts · Accepted Answer · 2024-02-29 14:15:30Z

5

I suggest using dtype='Sparse' for this.

If all elements are numbers, you can use dtype='Sparse', dtype='Sparse[int]', or dtype='Sparse[float]':

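import pandas as pd

data = [{"id": "a", "v0": 3, "v2": 6},
        {"id": "b", "v1": 1, "v4": 7}]
# pop() removes "id" from each dict in place, so only value columns remain
index = [item.pop("id") for item in data]
pd.DataFrame(data, index=index, dtype="Sparse")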
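To check that a frame really is sparse, the `.sparse` accessor and `memory_usage` can help. The sketch below is only an illustration: it converts a small dense frame with `astype` (an assumption made for the demo, not the constructor call above) and then inspects the result. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

rows = [{"v0": 3, "v2": 6},
        {"v1": 1, "v4": 7}]

# Small dense frame; missing keys become NaN, so the columns are float64.
dense = pd.DataFrame(rows, index=["a", "b"])

# Convert every column to a sparse float dtype with NaN as the fill value.
sparse = dense.astype(pd.SparseDtype("float", np.nan))

# Every column should now report a SparseDtype.
all_sparse = all(isinstance(t, pd.SparseDtype) for t in sparse.dtypes)

# Fraction of cells that are actually stored (not equal to the fill value).
density = sparse.sparse.density

# Byte counts for comparison; on tiny frames the sparse overhead can dominate.
dense_bytes = int(dense.memory_usage(deep=True).sum())
sparse_bytes = int(sparse.memory_usage(deep=True).sum())

print(all_sparse, density, dense_bytes, sparse_bytes)
```

The same accessors should apply to a frame built with dtype="Sparse" directly; savings only show up when most cells are missing.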

If any value is a string, you have to use dtype='Sparse[str]':

data = [{"id":'a',"v0":3,"v2":'foo'},
        {"id":'b',"v1":1,"v4":'ouch'}]
df = pd.DataFrame(data, dtype="Sparse[str]").set_index("id",verify_integrity=True)
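If the constructor itself is where memory runs out, passing a sparse dtype may not be enough, since the constructor may still build dense intermediate arrays. One possible workaround, sketched here under that assumption, is to assemble the frame one column at a time so that at most one dense column exists at any moment. The helper name `sparse_frame_from_records` and its parameters are hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd


def sparse_frame_from_records(records, index_key="id"):
    """Build a sparse DataFrame column by column (illustrative sketch)."""
    index = []
    cells = {}  # column name -> (row positions, values)
    for pos, rec in enumerate(records):
        index.append(rec[index_key])
        for key, value in rec.items():
            if key == index_key:
                continue
            rows, vals = cells.setdefault(key, ([], []))
            rows.append(pos)
            vals.append(value)

    idx = pd.Index(index, name=index_key)
    if not idx.is_unique:
        # Mirrors set_index(..., verify_integrity=True) from the question.
        raise ValueError("duplicate values in index column")

    n = len(idx)
    columns = {}
    for name, (rows, vals) in cells.items():
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in vals
        )
        # Only this one column is dense at a time.
        dense = np.full(n, np.nan, dtype=float if numeric else object)
        for r, v in zip(rows, vals):
            dense[r] = v
        columns[name] = pd.arrays.SparseArray(dense, fill_value=np.nan)
    return pd.DataFrame(columns, index=idx)


df = sparse_frame_from_records(
    [{"id": "a", "v0": 3, "v2": "foo"},
     {"id": "b", "v1": 1, "v4": "ouch"}]
)
print(df.dtypes)
```

Numeric columns end up as sparse floats and the others as sparse objects, both with NaN as the fill value. The intermediate Python lists still hold every non-missing value, so this reduces peak memory rather than eliminating it.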

edited Feb 29, 2024 at 14:15

Chris Watts

answered Feb 23, 2021 at 15:58

mosc9575

1 Comment

sds Over a year ago

Alas, not all my data is numeric (please see edit)
