Always same output for tensorflow autoencoder

Question

I am currently trying to build an autoencoder for time series data in TensorFlow. I have nearly 500 days of data, where each day has 24 data points. Since this is my first attempt, my architecture is very simple: after the input of size 24, the hidden layers have sizes 10, 3, and 10, followed by an output of size 24 again. I normalized the data (data points are in the range [-0.5; 0.5]) and use the sigmoid activation function and the RMSPropOptimizer.

After training (loss function in picture), the output is the same for every day of data I feed into the network. Does anyone know the reason for that? Is it possible that my dataset class is the issue (code below)?

import numpy as np


class TimeDataset:
    def __init__(self, data):
        self._index_in_epoch = 0
        self._epochs_completed = 0
        self._data = data
        self._num_examples = data.shape[0]

    @property
    def data(self):
        return self._data

    def next_batch(self, batch_size, shuffle=True):
        start = self._index_in_epoch

        # first call: shuffle the whole data set once
        if start == 0 and self._epochs_completed == 0:
            idx = np.arange(0, self._num_examples)  # get all possible indexes
            np.random.shuffle(idx)  # shuffle indexes
            self._data = self.data[idx]  # reorder the samples randomly

        if start + batch_size > self._num_examples:
            # not enough samples left -> finish this epoch and start the next one
            self._epochs_completed += 1
            rest_num_examples = self._num_examples - start
            data_rest_part = self.data[start:self._num_examples]
            idx0 = np.arange(0, self._num_examples)  # get all possible indexes
            np.random.shuffle(idx0)  # shuffle indexes
            self._data = self.data[idx0]  # reorder the samples randomly

            start = 0
            # handle the case where the number of samples is not an integer multiple of batch_size
            self._index_in_epoch = batch_size - rest_num_examples
            end = self._index_in_epoch
            data_new_part = self._data[start:end]
            return np.concatenate((data_rest_part, data_new_part), axis=0)
        else:
            # get next batch
            self._index_in_epoch += batch_size
            end = self._index_in_epoch
            return self._data[start:end]

*edit: here are some examples of the output (red original, blue reconstructed):

**edit: I just saw an autoencoder example with a more complicated loss function than mine. Does anyone know if the loss function self.loss = tf.reduce_mean(tf.pow(self.X - self.decoded, 2)) is sufficient?

***edit: some more code to describe my training. This is my AutoEncoder class:

import tensorflow as tf  # TensorFlow 1.x API


class AutoEncoder():
    def __init__(self):
        # Training Parameters
        self.learning_rate = 0.005
        self.alpha = 0.5

        # Network Parameters
        self.num_input = 24  # one day as input
        self.num_hidden_1 = 10  # 1st layer num features
        self.num_hidden_2 = 3  # 2nd layer num features (the latent dim)

        self.X = tf.placeholder("float", [None, self.num_input])

        self.weights = {
            'encoder_h1': tf.Variable(tf.random_normal([self.num_input, self.num_hidden_1])),
            'encoder_h2': tf.Variable(tf.random_normal([self.num_hidden_1, self.num_hidden_2])),
            'decoder_h1': tf.Variable(tf.random_normal([self.num_hidden_2, self.num_hidden_1])),
            'decoder_h2': tf.Variable(tf.random_normal([self.num_hidden_1, self.num_input])),
        }
        self.biases = {
            'encoder_b1': tf.Variable(tf.random_normal([self.num_hidden_1])),
            'encoder_b2': tf.Variable(tf.random_normal([self.num_hidden_2])),
            'decoder_b1': tf.Variable(tf.random_normal([self.num_hidden_1])),
            'decoder_b2': tf.Variable(tf.random_normal([self.num_input])),
        }

        self.encoded = self.encoder(self.X)
        self.decoded = self.decoder(self.encoded)

        # Define loss and optimizer, minimize the squared error
        self.loss = tf.reduce_mean(tf.pow(self.X - self.decoded, 2))

        self.optimizer = tf.train.RMSPropOptimizer(self.learning_rate).minimize(self.loss)

    def encoder(self, x):
        # sigmoid, tanh, relu
        en_layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, self.weights['encoder_h1']),
                                          self.biases['encoder_b1']))

        en_layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(en_layer_1, self.weights['encoder_h2']),
                                          self.biases['encoder_b2']))

        return en_layer_2

    def decoder(self, x):
        de_layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, self.weights['decoder_h1']),
                                          self.biases['decoder_b1']))

        de_layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(de_layer_1, self.weights['decoder_h2']),
                                          self.biases['decoder_b2']))

        return de_layer_2

And this is how I train my network (the input data has shape (number_days, 24)):

model = autoencoder.AutoEncoder()

num_epochs = 3
batch_size = 50
num_batches = 300

display_batch = 50
examples_to_show = 16

loss_values = []

with tf.Session() as sess:

    sess.run(tf.global_variables_initializer())

    # training
    for e in range(1, num_epochs+1):
        print('starting epoch {}'.format(e))
        for b in range(num_batches):
            # get next batch of data
            batch_x = dataset.next_batch(batch_size)

            # Run optimization op (backprop) and cost op (to get loss value)
            l = sess.run([model.loss], feed_dict={model.X: batch_x})
            sess.run(model.optimizer, feed_dict={model.X: batch_x})

            # Display logs
            if b % display_batch == 0:
                print('Epoch {}: Batch ({}) Loss: {}'.format(e, b, l))
                loss_values.append(l)

    # testing
    test_data = dataset.next_batch(batch_size)
    decoded_test_data = sess.run(model.decoded, feed_dict={model.X: test_data})

Have you found a solution to this? I have run into the same bug: same output regardless of input.
– Ambareesh, Commented Jun 25, 2019 at 4:19

Matthieu Brucher · Accepted Answer · 2018-11-07 15:24:45Z

3

Just a suggestion: I have had some issues with autoencoders using the sigmoid function.

I switched to tanh or relu, and those improved the results. An autoencoder basically learns to recreate its input at the output by encoding and then decoding it. If you mean the output is the same as the input, then you are getting what you want: it has learned the data set.

Ultimately you can compare by reviewing the Mean Squared Error between the input and output and see if it is exactly the same. If you mean that the output is exactly the same regardless of the input, that isn't something I've run into. I guess if your input doesn't vary much from day to day, then I could imagine that would have some impact. Are you looking for anomalies?
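The check described above can be sketched roughly as follows. This is only an illustration in NumPy: `reconstruction_report` is a hypothetical helper, the random `days` array stands in for the real data, and `collapsed` imitates a network that returns the same day for every input.

```python
import numpy as np


def reconstruction_report(inputs, outputs):
    """Return the per-sample MSE and the largest per-feature spread of the outputs.

    Both arrays are expected to have shape (n_samples, n_features).
    """
    inputs = np.asarray(inputs, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    per_sample_mse = np.mean((inputs - outputs) ** 2, axis=1)
    # A spread near 0 means every input produces (almost) the same output.
    output_spread = float(np.max(np.std(outputs, axis=0)))
    return per_sample_mse, output_spread


rng = np.random.default_rng(0)
days = rng.uniform(-0.5, 0.5, size=(5, 24))     # stand-in for the real data (assumption)
collapsed = np.tile(days.mean(axis=0), (5, 1))  # imitates "same output for every input"

mse, spread = reconstruction_report(days, collapsed)
print(mse.shape, spread)
```

A non-zero MSE together with a spread near zero would point to the "same output regardless of input" case rather than to a perfect reconstruction.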

Also, if you have a time series for training, I wouldn't shuffle the data in this particular case. If the temporal order is significant, you introduce data leakage (basically introducing future data into the training set) depending on what you are trying to achieve.
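If the temporal order does matter, one common alternative to shuffling is a chronological split. The sketch below is an assumption about how that could look; `chronological_split` and the 80/20 ratio are illustrative choices, not something from the question.

```python
import numpy as np


def chronological_split(data, train_fraction=0.8):
    """Split rows in their original order: earliest rows train, latest rows test."""
    data = np.asarray(data)
    split = int(len(data) * train_fraction)
    return data[:split], data[split:]


# 10 hypothetical days with 24 values each, numbered so that order is visible.
days = np.arange(10 * 24).reshape(10, 24)
train, test = chronological_split(days)
print(train.shape, test.shape)
```

With this split, no day in the training set comes after any day in the test set.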

Ah, I didn't initially see your post with the graph results. Thanks for adding them.

edited Nov 7, 2018 at 15:24

Matthieu Brucher

answered Nov 7, 2018 at 14:38

CJP

4 Comments

Marvin K Over a year ago

Thank you for your advice! I've tried some other activation functions, but the result is still the same. Do you think I just don't have enough data? I didn't really understand why shuffling the data is not a good idea, but at the moment the order of the data is not important, so I don't think this will help with my problem. For now I just want the autoencoder to rebuild the data after compressing it down to two or three dimensions. I've added a comment about my loss function. Do you have any experience with that?

CJP Over a year ago

I've used h2o and keras autoencoders for anomaly detection, used the encoding to verify the output, and applied MSE analysis. Whether shuffling matters depends on your solution and what you're doing with it in real time. Basically, if you're trying to use it to predict something about the future, then shuffling the training data inserts future data into it, or at least puts it out of order, which may matter if your real-time solution is intended to process data as it comes in. It's a red herring for your current problem, but something to think about later.

CJP Over a year ago

You might not have enough data, I am not sure. You can test that out by generating more data, if you don't have enough, by adding in some random variation to data you already have. I think I have seen this issue before, but I'll have to look back at some of my notebooks to make sure.
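A minimal sketch of that augmentation idea, assuming Gaussian noise is an acceptable kind of "random variation" for this data. The helper name, `copies`, and `noise_std` are hypothetical and would need tuning against the real value range.

```python
import numpy as np


def augment_with_noise(data, copies=3, noise_std=0.01, seed=0):
    """Return the original rows followed by `copies` noisy variants of them."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    noisy = [data + rng.normal(0.0, noise_std, size=data.shape) for _ in range(copies)]
    return np.concatenate([data] + noisy, axis=0)


days = np.zeros((500, 24))           # placeholder for the ~500 real days (assumption)
augmented = augment_with_noise(days)
print(augmented.shape)
```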

Marvin K Over a year ago

Thank you for your help. Later on I want to find Anomalies in the dataset. At the moment I just want to run a functioning autoencoder. I think I will try it next with more data. It's good to know that no one has the opinion that the model can't work.

Dharman · Accepted Answer · 2021-04-08 18:58:18Z

2

I know this is a very old post, so this is just an attempt to help whoever wanders here again with the same problem. If the autoencoder is converging to the same encoding for all the different instances, there may be a problem in the loss function. Check the size and shape of what the loss function returns, as it may be evaluating the wrong tensors (i.e. you may need to transpose something somewhere). Basically, assuming you are using the autoencoder to encode M features of N training instances, your loss function should return N values: the size of your loss tensor should be the number of instances in your training set. I found that out the hard way.
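The shape check suggested here can be illustrated in NumPy. This is a sketch, not the code from the question: `per_instance_mse` is a hypothetical helper, and the last lines show one way a shape mismatch can slip through silently via broadcasting.

```python
import numpy as np


def per_instance_mse(x, reconstructed):
    """Mean squared error per training instance: (N, M) inputs give N loss values."""
    x = np.asarray(x, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    if x.shape != reconstructed.shape:
        raise ValueError("shape mismatch: {} vs {}".format(x.shape, reconstructed.shape))
    return np.mean((x - reconstructed) ** 2, axis=1)


x = np.ones((4, 3))                       # N = 4 instances, M = 3 features
loss = per_instance_mse(x, np.zeros((4, 3)))
print(loss.shape)                         # one value per instance

# Broadcasting pitfall: a column vector minus a flat vector becomes a full matrix.
silent_broadcast = np.zeros((4, 1)) - np.ones(4)
print(silent_broadcast.shape)
```

Averaging such an accidentally broadcast difference still yields a number, which is why checking the shape before the reduction is useful.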

edited Apr 8, 2021 at 18:58

Dharman♦

answered Apr 8, 2021 at 18:52

Juan Esteban Florez

1 Comment

Konrad Over a year ago

Could you give more information on how to implement it? I added the reduce param, torch.nn.MSELoss(reduce=False), and replaced loss.backward() with loss.backward(reconstructed).

Matthieu Brucher · Accepted Answer · 2018-11-07 15:14:30Z

1

The sigmoid output is floored at 0, so it cannot reproduce your data that is below 0.

If you want to use a sigmoid output, then rescale your data between ]0;1[ (0 and 1 excluded).
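To make the range argument concrete, here is a small NumPy sketch. `rescale_to_open_unit_interval` and its `margin` are illustrative assumptions (any mapping into the open interval would do), and the helper assumes the data is not constant.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def rescale_to_open_unit_interval(data, margin=0.05):
    """Min-max scale `data` into [margin, 1 - margin], strictly inside ]0;1[."""
    data = np.asarray(data, dtype=float)
    lo, hi = data.min(), data.max()   # assumes hi > lo
    unit = (data - lo) / (hi - lo)    # now in [0, 1]
    return margin + (1.0 - 2.0 * margin) * unit


z = np.linspace(-10.0, 10.0, 101)
outputs = sigmoid(z)                  # always strictly between 0 and 1

days = np.linspace(-0.5, 0.5, 24).reshape(1, 24)  # stand-in for data in [-0.5; 0.5]
scaled = rescale_to_open_unit_interval(days)
print(outputs.min() > 0, scaled.min(), scaled.max())
```

The same minimum and maximum would have to be kept to map reconstructions back to the original scale.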

answered Nov 7, 2018 at 15:14

Matthieu Brucher

5 Comments

Marvin K Over a year ago

Thank you for your help. Sadly still no improvement

Matthieu Brucher Over a year ago

Can you show the result, now that your input data is not scaled between (-0.5;0.5)? Obviously, the graphs will be very different now.

Marvin K Over a year ago

I've updated the picture in the question. You can see that the reconstruction (blue line) is still the same for every output. I have changed the y axis to the interval [0.3;0.7] so you can see the changes in the line.

Matthieu Brucher Over a year ago

You now need to give us more information about how you train your model and how you create these graphs. As the question is currently posed, anything could be the problem.

Marvin K Over a year ago

I've added two more code snippets that might help you.
