Tensorflow: classification only based on first input

Question

Getting to know TensorFlow, I built a toy network for classification. It has 15 input nodes whose features are identical to the one-hot encoding of the corresponding class label (with indexing beginning at 1), so the data loaded from an input CSV may look like this:

1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,2
...
0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,15
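
For reference, a file of this shape can be generated with a few lines of plain Python. This is only a sketch: the helper name `make_toy_rows` is hypothetical and not part of the original code, and an in-memory buffer stands in for the real CSV file.

```python
import csv
import io


def make_toy_rows(number_of_classes=15):
    """Return rows of [one-hot features..., 1-based class label]."""
    rows = []
    for label in range(1, number_of_classes + 1):
        features = [1 if position == label else 0
                    for position in range(1, number_of_classes + 1)]
        rows.append(features + [label])
    return rows


rows = make_toy_rows()

# Write the rows as CSV text (an in-memory buffer stands in for a file here).
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
```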

The network has only one hidden layer and an output layer, the latter containing the probabilities for each class. Here's my problem: during training, the network assigns a growing probability to whatever was fed in as the very first input.

Here are the relevant lines of code (some lines are omitted):

# number_of_p : number of samples
# number_of_a : number of attributes (features) -> 15
# number_of_s : number of styles (labels) -> 15

# function for generating hidden layers
# nodes is a list of nodes in each layer (len(nodes) = number of hidden layers)

def hidden_generation(nodes):

    hidden_nodes = [number_of_a] + nodes + [number_of_s]
    number_of_layers = len(hidden_nodes) - 1

    print(hidden_nodes)
    hidden_layer = list()
    for i in range (0,number_of_layers):
        hidden_layer.append(tf.zeros([hidden_nodes[i],batch_size]))

    hidden_weights = list()
    for i in range (0,number_of_layers):
        hidden_weights.append(tf.Variable(tf.random_normal([hidden_nodes[i+1], hidden_nodes[i]])))

    hidden_biases = list()
    for i in range (0,number_of_layers):
        hidden_biases.append(tf.Variable(tf.zeros([hidden_nodes[i+1],batch_size])))

    return hidden_layer, hidden_weights, hidden_biases

#loss function
def loss(labels, logits):
    cross_entropy = tf.losses.softmax_cross_entropy(
        onehot_labels = labels, logits = logits)
    return tf.reduce_mean(cross_entropy, name = 'xentropy_mean')

hidden_layer, hidden_weights, hidden_biases = hidden_generation(hidden_layers)

with tf.Session() as training_sess:

    training_sess.run(tf.global_variables_initializer())
    training_sess.run(a_iterator.initializer, feed_dict = {a_placeholder_feed: training_set.data})
    current_a = training_sess.run(next_a)   
    training_sess.run(s_iterator.initializer, feed_dict = {s_placeholder_feed: training_set.target})
    current_s = training_sess.run(next_s) 

    s_one_hot = training_sess.run(tf.one_hot((current_s - 1), number_of_s))

    for i in range (1,len(hidden_layers)+1):
        hidden_layer[i] = tf.tanh(tf.matmul(hidden_weights[i-1], (hidden_layer[i-1])) + hidden_biases[i-1])

    output = tf.nn.softmax(tf.transpose(tf.matmul(hidden_weights[-1],hidden_layer[-1]) + hidden_biases[-1]))

    optimizer = tf.train.GradientDescentOptimizer(learning_rate = 0.1)
    # using the AdamOptimizer does not help, nor does choosing a much bigger and smaller learning rate
    train = optimizer.minimize(loss(s_one_hot, output))

    training_sess.run(train)

    for i in range (0, (number_of_p)):

        current_a = training_sess.run(next_a)
        current_s = training_sess.run(next_s)
        s_one_hot = training_sess.run(tf.transpose(tf.one_hot((current_s - 1), number_of_s)))
        # (no idea why I have to declare those twice for the datastream to move)

        training_sess.run(train)

I assume the loss function is being declared in the wrong place and always references the same vectors. However, replacing the loss function has not helped so far. I will gladly provide the rest of the code if anyone is kind enough to help me.
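
One detail in the listing above may be worth checking independently of where the loss is declared: `output` has already been passed through `tf.nn.softmax`, while `tf.losses.softmax_cross_entropy` expects unscaled logits and applies softmax itself. The NumPy sketch below (not TensorFlow code; the logit values are made up for illustration) shows how a second softmax flattens the distribution:

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    shifted = z - np.max(z)
    exp = np.exp(shifted)
    return exp / np.sum(exp)


logits = np.array([4.0, 1.0, 0.0, -2.0])  # made-up example values

once = softmax(logits)   # what a loss expecting logits computes internally
twice = softmax(once)    # what happens if probabilities are passed as logits

print("softmax once: ", np.round(once, 3))
print("softmax twice:", np.round(twice, 3))
```

If this applies, passing the pre-softmax tensor to the loss and keeping `tf.nn.softmax` only for reading out probabilities would avoid the double application.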

EDIT: I've already discovered and fixed one major (and dumb) mistake: weights go before node values in tf.matmul.
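
The ordering follows from the shapes used in `hidden_generation`: weights are `[nodes_out, nodes_in]` and layer values are `[nodes_in, batch_size]`. A NumPy sketch of the same shapes (the hidden size 10 and batch size 4 are assumed here purely for illustration):

```python
import numpy as np

number_of_a = 15   # input features, as in the question
hidden_size = 10   # assumed for illustration
batch_size = 4     # assumed for illustration

weights = np.random.randn(hidden_size, number_of_a)   # [nodes_out, nodes_in]
values = np.zeros((number_of_a, batch_size))          # [nodes_in, batch_size]

# Weights first: (10, 15) @ (15, 4) -> (10, 4)
activations = np.tanh(weights @ values)
print(activations.shape)

# Values first: (15, 4) @ (10, 15) has mismatched inner dimensions
try:
    values @ weights
    order_is_interchangeable = True
except ValueError:
    order_is_interchangeable = False
print(order_is_interchangeable)
```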

major_hart · Accepted Answer · 2018-03-09 21:10:23Z

You do not want to declare the training op over and over again. That is unnecessary and, as you pointed out, slower. You are also not feeding current_a into the neural net, so you are never going to get new outputs. In addition, the way you are using the iterators isn't correct, which could also be the cause of the problem.

with tf.Session() as training_sess:

    training_sess.run(tf.global_variables_initializer())
    training_sess.run(a_iterator.initializer, feed_dict = {a_placeholder_feed: training_set.data})
    current_a = training_sess.run(next_a)   
    training_sess.run(s_iterator.initializer, feed_dict = {s_placeholder_feed: training_set.target})
    current_s = training_sess.run(next_s) 

    s_one_hot = training_sess.run(tf.one_hot((current_s - 1), number_of_s))

    for i in range (1,len(hidden_layers)+1):
        hidden_layer[i] = tf.tanh(tf.matmul(hidden_weights[i-1], (hidden_layer[i-1])) + hidden_biases[i-1])

    output = tf.nn.softmax(tf.transpose(tf.matmul(hidden_weights[-1],hidden_layer[-1]) + hidden_biases[-1]))

    optimizer = tf.train.GradientDescentOptimizer(learning_rate = 0.1)
    # using the AdamOptimizer does not help, nor does choosing a much bigger and smaller learning rate
    train = optimizer.minimize(loss(s_one_hot, output))

    training_sess.run(train)

    for i in range (0, (number_of_p)):

        current_a = training_sess.run(next_a)
        current_s = training_sess.run(next_s)
        s_one_hot = training_sess.run(tf.transpose(tf.one_hot((current_s - 1), number_of_s)))
        # (no idea why I have to declare those twice for the datastream to move)

        training_sess.run(train)
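
To see why an unfed input produces exactly the symptom described in the question, here is a NumPy sketch (not the original TensorFlow code). It assumes a single softmax layer, an all-zero input standing in for the `tf.zeros` layer that never receives `current_a`, and a target frozen at the first label that was drawn; the class index 3 is arbitrary.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    exp = np.exp(z - np.max(z))
    return exp / np.sum(exp)


number_of_s = 15
rng = np.random.default_rng(0)
weights = rng.normal(size=(number_of_s, number_of_s))
biases = np.zeros(number_of_s)

fixed_input = np.zeros(number_of_s)     # the input layer that is never fed
fixed_target = np.eye(number_of_s)[3]   # first label drawn, frozen as a constant

learning_rate = 0.1
for _ in range(200):
    probabilities = softmax(weights @ fixed_input + biases)
    grad_logits = probabilities - fixed_target   # d(cross-entropy)/d(logits)
    weights -= learning_rate * np.outer(grad_logits, fixed_input)
    biases -= learning_rate * grad_logits

probabilities = softmax(weights @ fixed_input + biases)
print(probabilities.argmax(), probabilities[3])
```

Because the input and target never change, every step pushes the output toward the same class, whatever data the iterators produce in the meantime.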

Here is some pseudocode to help you get the correct data flow. I would do the one-hot encoding before this step, just to make loading the data during training easier.

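import tensorflow as tf  # TensorFlow 1.x API

# inputs, targets, batch_size, num_epochs and learning_rate are defined elsewhere
train_dataset = tf.data.Dataset.from_tensor_slices((inputs, targets))

train_dataset = train_dataset.batch(batch_size)
train_dataset = train_dataset.repeat(num_epochs)
iterator = train_dataset.make_one_shot_iterator()

# One get_next() call yields inputs and targets together, so they stay aligned
next_inputs, next_targets = iterator.get_next()

# Define the training procedure once, outside the training loop
global_step = tf.Variable(0, name="global_step", trainable=False)
# Neural_net_function is a placeholder for code that builds the network and returns its loss
loss = Neural_net_function(next_inputs, next_targets)
optimizer = tf.train.AdamOptimizer(learning_rate)
grads_and_vars = optimizer.compute_gradients(loss)
train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)

with tf.Session() as training_sess:
    training_sess.run(tf.global_variables_initializer())
    # Each run of train_op consumes one batch; stop when the dataset is exhausted
    try:
        while True:
            training_sess.run(train_op)
    except tf.errors.OutOfRangeError:
        pass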

major_hart to the rescue! I implemented your suggestions in my code and now the network is way faster. Also, how could I have missed not feeding the input into the network? I noticed that you must be very careful when using next_inputs and next_targets. Both must be used an equal number of times when defining other variables and especially during training, or else the data flow will be shifted between inputs and targets. Or, well, maybe I still don't get iterators. Either way: the training session runs fast and the classification isn't bad. Thank you very much!
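
The shift described here can be reproduced with an ordinary Python generator, used below only as an analogy for a single `get_next()` op: each separate pull advances the stream, much as each separate `session.run` on a tensor derived from `get_next()` advances the iterator. The helper name `paired_stream` and the sample data are made up for illustration.

```python
def paired_stream(features, labels):
    """Yield (feature, label) pairs, like one iterator over a zipped dataset."""
    for pair in zip(features, labels):
        yield pair


features = ["a1", "a2", "a3", "a4"]
labels = [1, 2, 3, 4]

# Misaligned: two separate pulls, one for the input and one for the target.
stream = paired_stream(features, labels)
shifted_input, _ = next(stream)    # consumes pair 1
_, shifted_target = next(stream)   # consumes pair 2
print(shifted_input, shifted_target)

# Aligned: a single pull returns the input and its own target together.
stream = paired_stream(features, labels)
aligned_input, aligned_target = next(stream)
print(aligned_input, aligned_target)
```

In the TensorFlow code, fetching both tensors in one call, for example `training_sess.run([next_inputs, next_targets])`, or using both inside the same `train_op`, keeps them paired.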

Raute · Accepted Answer · 2018-03-09 15:22:18Z

Solved it! Backpropagation works properly when the training procedure is redeclared for every new dataset.

for i in range(0, number_of_p):

    current_a = training_sess.run(next_a)
    current_s = training_sess.run(next_s)
    s_one_hot = training_sess.run(tf.transpose(tf.one_hot((current_s - 1), number_of_s)))

    optimizer = tf.train.GradientDescentOptimizer(learning_rate = 0.1)
    train = optimizer.minimize(loss(s_one_hot, output))

    training_sess.run(train)

...makes training considerably slower, but it works.
