I'm processing a large number of files in Python and need to write the output (one DataFrame for each input file) to HDF5 directly.
I am wondering what the best way is to write a pandas DataFrame from my script to HDF5 directly in a fast way. I am not sure if any Python module like hdf5 or hadoopy can do this. Any help in this regard will be appreciated.
- matthewrocklin.com/blog/work/2016/02/22/dask-distributed-part-2 – Nehal J Wani, Aug 12, 2016 at 10:50
- Nickil suggested an edit to change HDFS to HDF5 (and then answered based on this), but both HDFS and HDF5 seem to make sense in the context of your question... which did you mean? – Foon, Aug 12, 2016 at 11:43
1 Answer
It's difficult to give you a good answer to this rather generic question.
It's not clear how you are going to use (read) your HDF5 files - do you want to select data conditionally (using the where parameter)?
First of all, you need to open a store object:
import pandas as pd

store = pd.HDFStore('/path/to/filename.h5')
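As a side note, here is a minimal sketch (using the same hypothetical path as above) showing that the compression settings can also be passed once when opening the store, with a with block handling the close for you:

# complevel/complib set at store level apply to everything written to this store;
# the `with` block closes the store automatically
with pd.HDFStore('/path/to/filename.h5', complevel=5, complib='blosc') as store:
    # append DataFrames here exactly as in the loop below; no explicit close() needed
    ...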
Now you can write (or append) to the store (I'm using blosc compression here - it's pretty fast and efficient). Besides that, I will use the data_columns parameter to specify the columns that must be indexed, so you can use these columns in the where parameter later when you read your HDF5 file:
for f in files:
    # read or process each file in/into a separate `df`
    store.append('df_identifier_AKA_key', df,
                 data_columns=[list_of_indexed_cols],
                 complevel=5, complib='blosc')
store.close()
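To show why indexing those columns pays off, here is a minimal read-back sketch; some_indexed_col is a hypothetical column name assumed to be in your data_columns list:

# later: pull back only the rows matching a condition on an indexed column,
# instead of loading the whole table into memory
store = pd.HDFStore('/path/to/filename.h5')
subset = store.select('df_identifier_AKA_key', where='some_indexed_col > 100')
store.close()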