I am reading data from image files and want to append the data from each file into a single HDF5 file. Here is my code:
import os
import pandas as pd

datafile = pd.HDFStore(os.path.join(path, 'imageData.h5'))

for file in fileList:
    # ... extract the measurements for this image ...
    data = {'X Position': pd.Series(xpos, index=index1),
            'Y Position': pd.Series(ypos, index=index1),
            'Major Axis Length': pd.Series(major, index=index1),
            'Minor Axis Length': pd.Series(minor, index=index1),
            'X Velocity': pd.Series(xVelocity, index=index1),
            'Y Velocity': pd.Series(yVelocity, index=index1)}
    df = pd.DataFrame(data)
    datafile['df'] = df      # overwrites the previous iteration's data

datafile.close()
This is obviously incorrect, since each pass through the loop overwrites the previous data and only the last file's data survives.
If instead of datafile['df'] = df, I use
datafile.append('df',df)
OR
df.to_hdf(os.path.join(path, 'imageData.h5'), 'df', format='table', append=True)
I get the error:
ValueError: Can only append to Tables
I have looked through the documentation and other SO questions, to no avail.
So, I am hoping someone can explain why this isn't working and how I can successfully append all the data to one file. I am willing to use a different method (perhaps PyTables) if necessary.
Any help would be greatly appreciated.
df.to_hdf(..., format="table", append=True) is actually the right one. Have you tried using that (without all of the HDFStore stuff) with a fresh file? HDFStore uses the fixed format by default, which doesn't allow appending. The table format is the one used by PyTables. With format="table", to_hdf should allow appending by using PyTables internally, so there is no need to do that yourself. You might want to update pandas, though. What does "and it worked" mean? Would you update the question?
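A minimal sketch of what the comment describes, assuming a fresh output file and stand-in data (the loop bound, the dummy columns, and the file name imageData_table.h5 are placeholders, not from the question): writing with format='table' from the first call lets every later call append rows instead of overwriting.

import os
import pandas as pd
import numpy as np

out_path = 'imageData_table.h5'          # hypothetical output name for this sketch
if os.path.exists(out_path):
    os.remove(out_path)                  # start fresh so no 'fixed'-format node named 'df' exists

for frame in range(3):                   # stand-in for looping over fileList
    index1 = pd.RangeIndex(5)            # stand-in for the per-image index
    df = pd.DataFrame({
        'X Position': pd.Series(np.random.rand(5), index=index1),
        'Y Position': pd.Series(np.random.rand(5), index=index1),
    })
    # format='table' stores a PyTables Table node, which is the only format that supports appending
    df.to_hdf(out_path, key='df', format='table', append=True)

print(pd.read_hdf(out_path, 'df').shape)  # rows from all iterations accumulate

If the existing imageData.h5 was already written once with the default fixed format, the 'df' node in it cannot be appended to; deleting the file (or writing to a new one) before the loop avoids the "Can only append to Tables" error.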