
I am trying to use the MNIST data for my research work. The dataset description is:

The training_data is returned as a tuple with two entries. The first entry contains the actual training images. This is a numpy ndarray with 50,000 entries. Each entry is, in turn, a numpy ndarray with 784 values, representing the 28 * 28 = 784 pixels in a single MNIST image.

The second entry in the training_data tuple is a numpy ndarray containing 50,000 entries. Those entries are just the digit values (0...9) for the corresponding images contained in the first entry of the tuple.

Now I am converting the training data like this:

In particular, training_data is a list containing 50,000 2-tuples (x, y), where x is a 784-dimensional numpy.ndarray containing the input image and y is a 10-dimensional numpy.ndarray representing the unit vector corresponding to the correct digit for x. The code for that is:

def load_data_nn():
    training_data, validation_data, test_data = load_data()
    #print training_data[0][1]
    #inputs = [np.reshape(x, (784, 1)) for x in training_data[0]]
    inputs = [np.reshape(x, (784,1)) for x in training_data[0]]
    print inputs[0]
    results = [vectorized_result(y) for y in training_data[1]]
    training_data = zip(inputs, results)
    test_inputs = [np.reshape(x, (784, 1)) for x in test_data[0]]
    return (training_data, test_inputs, test_data[1])
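The helper `vectorized_result` used above is not shown in the question; a minimal sketch of what it is assumed to do (build a (10, 1) one-hot column vector for a digit label) would be:

```python
import numpy as np

def vectorized_result(j):
    """Return a (10, 1) column vector with 1.0 at index j and 0.0 elsewhere.

    Assumed behavior only: this matches the description above, where y is
    'the unit vector corresponding to the correct digit'.
    """
    e = np.zeros((10, 1))
    e[j] = 1.0
    return e
```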

Now I want to write the inputs into a text file, so that one row is inputs[0], the next row is inputs[1], the values inside each row are space separated, and no ndarray brackets are present. For example:

 0 0.45 0.47 0.76

 0.78 0.34 0.35 0.56

Here one row in the text file is inputs[0]. How do I convert the ndarray to the above format in a text file?
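For reference, one direct way to get that layout is numpy's own `savetxt`. A sketch, assuming `inputs` is a list of (784, 1) column vectors as built above (a tiny hypothetical stand-in list is used here so the snippet is self-contained):

```python
import numpy as np

# Hypothetical stand-in for `inputs`: a short list of column vectors.
inputs = [np.array([[0.0], [0.45], [0.47], [0.76]]),
          np.array([[0.78], [0.34], [0.35], [0.56]])]

# Stack the column vectors side by side and transpose, giving one row
# per image; for the real data this would be shape (50000, 784).
rows = np.hstack(inputs).T

# %g drops trailing zeros (so 0.0 is written as "0"); values are
# space separated with no brackets.
np.savetxt('inputs.txt', rows, fmt='%g', delimiter=' ')
```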

1 Answer

Since the answer to your question seems quite easy, I guess your real problem is speed. Fortunately, we can use multiprocessing here. Try this:

from multiprocessing import Pool

def joinRow(row):
    return ' '.join(str(cell) for cell in row)

def inputsToFile(inputs, filepath):
    # in python3 you can do:
    # with Pool() as p:
    #     lines = p.map(joinRow, inputs, chunksize=1000)
    # instead of code from here...
    p = Pool()
    try:
        lines = p.map(joinRow, inputs, chunksize=1000)
    finally:
        p.close()
    # ...to here. But this works for both.

    with open(filepath,'w') as f:
        f.write('\n'.join(lines)) # joining already created strings goes fast

Still takes a while on my laptop, but it is a lot faster than just `'\n'.join(' '.join(str(cell) for cell in row) for row in inputs)`.

By the way, you can speed up the rest of your code as well:

def load_data_nn():
    training_data, validation_data, test_data = load_data()
    # training_data[0] is already (50000, 784) per the description above;
    # only the trailing axis needs to be added. -1 lets numpy infer the
    # number of rows, so this also works for the smaller test set.
    inputs = training_data[0].reshape((-1, 784, 1))
    print inputs[0]
    # create identity matrix and use entries of training_data[1] to
    # index the corresponding unit vectors; the trailing reshape turns
    # each row into a (10, 1) column vector, matching vectorized_result
    results = np.eye(10)[training_data[1]].reshape((-1, 10, 1))
    training_data = zip(inputs, results)
    test_inputs = test_data[0].reshape((-1, 784, 1))
    return (training_data, test_inputs, test_data[1])
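The `np.eye(10)[...]` trick works because indexing the identity matrix with an integer array selects one row per label, and row j of the identity matrix is exactly the unit vector for digit j. A quick self-contained check (Python 3 syntax here, unlike the Python 2 code above):

```python
import numpy as np

labels = np.array([3, 0, 9])     # example digit labels
one_hot = np.eye(10)[labels]     # shape (3, 10): one unit vector per label

# Each row has a single 1.0 at the position given by the label.
print(one_hot[0])
```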