As the title says, I'm seeing a big difference in the memory usage of a numpy array between Windows and Ubuntu.
Here's a simple snippet that reproduces the issue:
import numpy as np
import joblib
a = [1]*1000
b = [a for i in range(1000)]
np_arr = np.array(b)
joblib.dump(np_arr, 'arr.h5')
If I run this code on a Windows 10 machine, arr.h5's size is 3907KB.
But if I run it on Ubuntu 18.04, it's 7812KB.
The main issue is that I'm dealing with large datasets: my code runs fine on a Windows machine with 16GB of RAM, but I'm getting Memory Errors on Ubuntu with 32GB.
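One thing that may be worth checking (an assumption, not something confirmed above) is whether the array gets a different default integer dtype on each machine. The sketch below prints the dtype numpy picks for the same data, then builds the array explicitly as int32 and int64 to compare the raw buffer sizes. The `sizes_kb` name is just for illustration.

```python
import numpy as np

a = [1] * 1000
b = [a for _ in range(1000)]

# Let numpy choose the dtype, as in the snippet above, and report what it picked.
np_arr = np.array(b)
print("default dtype:", np_arr.dtype, "| bytes per element:", np_arr.itemsize)

# Build the same array with explicit 32-bit and 64-bit integers and
# compare the size of the underlying data buffer in KB.
sizes_kb = {}
for dtype in (np.int32, np.int64):
    arr = np.array(b, dtype=dtype)
    sizes_kb[np.dtype(dtype).name] = arr.nbytes / 1024
    print(np.dtype(dtype).name, "->", sizes_kb[np.dtype(dtype).name], "KB")
```

The int32 buffer is 3906.25 KB and the int64 buffer is 7812.5 KB, which is close to the two file sizes reported above. If the default dtype does turn out to differ, passing `dtype=` explicitly to `np.array` would make the size the same on both systems.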