I am trying to create an array with dtype='U' and save it with numpy.save(). However, when I try to load the saved file with numpy.memmap, I get an error about the size not being a multiple of the 'U3' item size.

I am working with Python 3.5.2. In the code below I create an empty array and a second array with three entries, each three characters long, concatenate them, and save the result to file1.npy.

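import numpy as np

# Start from an empty unicode array, then append three 3-character strings.
arr = np.empty((1, 0), dtype='U')
arr2 = np.array(['111', '222', '333'], dtype='U')
# axis=None flattens both inputs before joining them.
arr = np.concatenate((arr, arr2), axis=None)
print(arr)
np.save('file1', arr)  # np.save appends the .npy extension

# This is the call that raises the ValueError quoted below.
rArr = np.memmap('file1.npy', dtype='U3', mode='r')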

However, when I try to load the file with numpy.memmap, I get the following error: ValueError: Size of available data is not a multiple of the data-type size.

Is there a way to load string data into a numpy.memmap? I feel I am missing something simple.

  • Possible duplicate of NumPy mmap: "ValueError: Size of available data is not a multiple of data-type size." Commented May 15, 2019 at 7:03
  • I saw that question, but I am not sure how it is a duplicate. The saved file is a binary file. I believe the error has another cause, such as extra data in the file that I am not aware of. Commented May 15, 2019 at 7:15
  • Have you tried np.load with mmap_mode? Commented May 15, 2019 at 7:17
  • @hpaulj, this actually worked. I can't believe I missed it. I was trying different things and forgot to test np.load with mmap_mode. As for my original question, I believe I found the answer as well. The file written by numpy.save seems to have a header. If I delete that header (the first line in the file) and use numpy.memmap, I am able to load the data properly. So I am guessing memmap is meant for files saved manually, without numpy.save. Commented May 15, 2019 at 7:29

2 Answers

The files used by numpy.memmap are raw binary files, not NPY-format files. If you want to read a memory-mapped NPY file, use numpy.load with the argument mmap_mode='r' (or whatever other value is appropriate).

After creating 'file1.npy' like you did, here's how it can be memory-mapped with numpy.load:

In [16]: a = np.load('file1.npy', mmap_mode='r')

In [17]: a
Out[17]: memmap(['111', '222', '333'], dtype='<U3')
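
To illustrate why the plain numpy.memmap call fails on an NPY file, here is a minimal sketch that reads the NPY header with the helpers in numpy.lib.format and then passes the resulting byte offset to numpy.memmap. The temporary path and the variable names (path, offset, raw) are assumptions made for this example, and it assumes the file was written by numpy.save as in the question. In practice np.load with mmap_mode does this bookkeeping for you.

```python
import os
import tempfile

import numpy as np

# Hypothetical location for the example file.
path = os.path.join(tempfile.mkdtemp(), 'file1.npy')
np.save(path, np.array(['111', '222', '333'], dtype='U3'))

with open(path, 'rb') as fp:
    # The file starts with a magic string and a format version ...
    version = np.lib.format.read_magic(fp)
    # ... followed by a header describing shape, memory order and dtype.
    if version == (1, 0):
        shape, fortran_order, dtype = np.lib.format.read_array_header_1_0(fp)
    else:
        shape, fortran_order, dtype = np.lib.format.read_array_header_2_0(fp)
    # The raw array data begins right after the header.
    offset = fp.tell()

print('header bytes:', offset, 'file bytes:', os.path.getsize(path))

# Skipping the header lets memmap interpret only the array data.
raw = np.memmap(path, dtype=dtype, mode='r', offset=offset, shape=shape)
print(raw)
```

Without the offset argument, memmap treats the header bytes as array data, which is why the total file size no longer has to be a multiple of the 'U3' item size.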
2 Comments

Thanks, this confirmed my suspicion about memmap. So I will have to manually save the binary data that will be loaded later. np.load() works fine.
Alternatively, you can save a raw binary file with array.tofile() and then load it with memmap.
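
A minimal sketch of the tofile() route mentioned in the comment above, assuming the same three-string array as in the question. The file name file1.raw and the temporary directory are hypothetical. A raw file stores no dtype or shape, so both have to be known when the file is mapped again.

```python
import os
import tempfile

import numpy as np

arr = np.array(['111', '222', '333'], dtype='U3')

# Hypothetical file name; tofile() writes only the array bytes, no header.
raw_path = os.path.join(tempfile.mkdtemp(), 'file1.raw')
arr.tofile(raw_path)

# The dtype must match what was written; with no shape given, memmap
# infers a 1-D length from the file size.
rArr = np.memmap(raw_path, dtype='U3', mode='r')
print(rArr)
```

The trade-off is that the raw file is not self-describing, whereas an NPY file loaded with np.load and mmap_mode carries its own dtype and shape.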
Looks like np.load is your friend here.

Doc

Issue

The following snippet works for me:

rArr = np.load('file1.npy', mmap_mode='r')
