Sometimes I get the following arrays from my HDF5 file:

val1 = {ndarray} [<HDF5 object reference> <HDF5 object reference> <HDF5 object reference>]

If I try to dereference an element with the HDF5 file object:

f[val1[0]]

I get an error:

Argument 'ref' has incorrect type (expected h5py.h5r.Reference, got numpy.object_)

1 Answer

I've come across this question while trying to answer what turned out to be basically the same question in another form. A dataset containing references to other objects is a bit of an awkward situation in HDF5, but you can actually read them in a pretty straightforward way. The idea is to get the name of the referenced object, and then just read that object directly from the file.

Given a single HDF5 reference, ref, and a file, file, you can return the name of the referenced dataset by doing:

>>> name = h5py.h5r.get_name(ref, file.id)

Then just read the actual dataset itself, as usual:

>>> data = file[name][()]  # ndarray with the data in it.

So to read all the referenced datasets, just map this process across the whole dataset of references.
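As a minimal sketch of that mapping step: the helper name `read_referenced` below is hypothetical (it is not part of h5py), and it assumes every reference points at a dataset small enough to load into memory. It applies the two calls shown above to each element of the reference array.

```python
import h5py
import numpy as np


def read_referenced(file, refs):
    """Return a list holding the data of every object referenced in `refs`.

    `file` is an open h5py.File and `refs` is an array of HDF5 object
    references, e.g. the result of reading a reference dataset.
    (Hypothetical helper, written for illustration only.)
    """
    out = []
    for ref in np.asarray(refs).ravel():
        # Resolve the reference to the path of the object it points at.
        name = h5py.h5r.get_name(ref, file.id)
        # Read the whole referenced dataset into an ndarray.
        out.append(file[name][()])
    return out
```

With the array from the question, this would be called as `read_referenced(f, val1)`. A multi-dimensional reference array is flattened first, so the results come back in row-major order.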

1 Comment

It might be nice to include some searchable terms. For me, I was trying to unpack MATLAB .mat files, specifically the large ones that cannot be loaded via scipy.io.
