I am looking for the right way to implement next_batch in TensorFlow. My training data is train_X = 10000x50, where 10000 is the number of samples and 50 is the size of the feature vector, and train_Y = 10000x1. I use a batch size of 128. This is the function I use to get a training batch during training:
import numpy as np

def next_batch(num, data, labels):
    '''
    Return a total of `num` random samples and labels.
    '''
    # Shuffle all indices, then keep only the first `num` of them
    idx = np.arange(0, data.shape[0])
    np.random.shuffle(idx)
    idx = idx[:num]
    data_shuffle = [data[i, :] for i in idx]
    labels_shuffle = [labels[i] for i in idx]
    return np.asarray(data_shuffle), np.asarray(labels_shuffle)
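For example, because each call reshuffles the full index set, two consecutive calls are drawn independently and can overlap (a minimal check, assuming train_X and train_Y are NumPy arrays):

X1, Y1 = next_batch(128, train_X, train_Y)
X2, Y2 = next_batch(128, train_X, train_Y)
# Rows of X1 and X2 can coincide, so some samples may be drawn
# twice before every sample has been seen once.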
And this is my training loop (n_epochs and init are defined earlier in my script):

n_samples = 10000
batch_size = 128

with tf.Session() as sess:
    sess.run(init)
    n_batches = int(n_samples / batch_size)
    for i in range(n_epochs):
        for j in range(n_batches):
            X_batch, Y_batch = next_batch(batch_size, train_X, train_Y)
With the function above, I found that the shuffle runs for every batch, which is not the behavior I want: within one epoch, all shuffled samples should be scanned exactly once before the data is shuffled again for the next epoch. Am I right? How can I fix this in TensorFlow? Thanks.
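To make the intended behavior concrete, here is a minimal NumPy sketch of what I am after (the generator epoch_batches is just my illustration, not code I already have): shuffle the indices once per epoch, then yield consecutive, non-overlapping slices until the data is exhausted.

def epoch_batches(data, labels, batch_size):
    '''Shuffle once, then yield consecutive non-overlapping batches.'''
    idx = np.arange(data.shape[0])
    np.random.shuffle(idx)  # one shuffle per epoch
    for start in range(0, data.shape[0], batch_size):
        batch_idx = idx[start:start + batch_size]
        yield data[batch_idx], labels[batch_idx]

The training loop would then look something like:

for i in range(n_epochs):
    for X_batch, Y_batch in epoch_batches(train_X, train_Y, batch_size):
        ...  # run one training step on this batch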