I'm working on a small TensorFlow model using TensorFlow 2.3.0. In the model I use several tf.while_loops and TensorArrays. The model is not working as expected. I have tried to troubleshoot the issue, but unfortunately not all TensorFlow behavior is documented, and I'm not sure whether there is a bug in my model or it is TensorFlow behavior I'm unaware of. For example, in my model I multiply my data by some weights inside a tf.while_loop and then store the result in a TensorArray. The TensorArray content is used again in the same fashion, until finally I minimize the loss to train the model.
My problem is that the model does not train as expected. I suspect that TensorFlow is freezing the weights and not updating them as I would expect.
How can I make sure that the content of the last TensorArray remains trainable, since it is produced from data and trainable weight variables? I'm trying to avoid the issue mentioned here, but I'm not sure whether I have.
Below is a simple example (a dummy model) just to clarify what I'm doing:
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()
import numpy as np
size = 20
randvalues = np.random.choice(a=[0.0, 1.0], size=(size, 10, 10), p=[.5, 1 - .5])
x = tf.constant(randvalues, tf.float32)
y = tf.constant(randvalues, tf.float32)
init_range = np.sqrt(6.0 / 20)
initial = tf.random_uniform([10, 10], minval=-init_range,
                            maxval=init_range, dtype=tf.float32)
weights = tf.Variable(initial, name="testWeight1")
w1_summ = tf.summary.histogram("testWeight1", weights)
init_range = np.sqrt(6.0 / 15)
initial2 = tf.random_uniform([10, 5], minval=-init_range,
                             maxval=init_range, dtype=tf.float32)
weights_tied = tf.Variable(initial2, name="tiedWeight")
w2_summ = tf.summary.histogram("tiedWeight", weights_tied)
ta = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True, clear_after_read=False, infer_shape=False)
ta2 = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True, clear_after_read=False, infer_shape=False)
def condition(counter, ta1):
    return counter < size

def body(counter, ta1):
    with tf.name_scope("firstloop"):
        operation1 = tf.matmul(x[counter], weights)
        operation2 = tf.nn.relu(operation1)
        operation3 = tf.matmul(operation2, weights_tied)
        operation4 = tf.matmul(operation3, tf.transpose(weights_tied))
        ta1 = ta1.write(counter, tf.reshape(operation4, [-1]))
    return counter + 1, ta1

runloop, array1 = tf.while_loop(condition, body, [0, ta], back_prop=True)
def condition2(counter, ta1, array1):
    return counter < 1

def body2(counter, ta2, array1):
    with tf.name_scope("secondloop"):
        operation = array1.stack()
        operation4 = tf.nn.relu(operation)
        ta2 = ta2.write(counter, tf.reshape(operation4, [-1]))
    return counter + 1, ta2, array1

runloop2, array2, _ = tf.while_loop(condition2, body2, [0, ta2, array1], back_prop=True)
predictions = array2.stack()
loss = tf.nn.weighted_cross_entropy_with_logits(logits=tf.reshape(predictions, [-1]), targets=tf.reshape(y, [-1]), pos_weight=1)
cost = tf.reduce_mean(loss)
optimizer = tf.train.AdamOptimizer(learning_rate=.001)
gvs = optimizer.compute_gradients(cost)
additional_summaries = [tf.summary.histogram("GRAD" + str(g[1]), g[0]) for g in gvs]
opt_op = optimizer.apply_gradients(gvs)
merge = tf.summary.merge([w2_summ, w1_summ] + additional_summaries)
sess = tf.Session()
summ_writer = tf.summary.FileWriter('C:\\Users\\USER\\Documents\\Projects\\MastersEnv\\GraphAutoEncoder\\gae\\summaries', sess.graph)
sess.run(tf.global_variables_initializer())
for cc in range(1000):
    a, b = sess.run([runloop2, opt_op])
    c = sess.run(merge)
    summ_writer.add_summary(c, cc)
    print(cc)
print('done')
In the above example the content of array2 should be the predictions. How can I make sure that the loops and the TensorArrays did not affect the trainability of my variables? And if what I did was incorrect, what is the best approach to achieve the same thing while keeping my results trainable?
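For reference, the kind of sanity check I had in mind is sketched below (it is only a rough sketch of my own, not something from the TensorFlow docs); it reuses the gvs and sess names from the dummy model above and verifies that every variable actually receives a gradient through the two loops:

# Sanity-check sketch: every trainable variable should get a non-None gradient
# through both tf.while_loops; a None gradient would mean the variable is
# disconnected from the loss.
for grad, var in gvs:
    assert grad is not None, "no gradient flows to {}".format(var.name)

# Evaluate the gradients once and print their norms; all-zero norms would
# suggest the weights cannot move even though a gradient tensor exists.
grad_values = sess.run([g for g, _ in gvs])
for (g, v), val in zip(gvs, grad_values):
    print(v.name, "gradient norm:", np.linalg.norm(val))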
Update:
OK, so I ran my model a number of times and monitored the loss, accuracy, and other scalar metrics, and found a general trend of the loss decreasing and the accuracy and other accuracy-related metrics increasing. The accuracy and loss also move inversely to each other; to my understanding, this indicates that the updates are not random and that the model is learning something.
Additionally, I monitored the weights and gradients, and their distributions are changing, which confirms that the variables are being trained.
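A simpler direct check along the same lines (again just a rough sketch, using the weights, weights_tied, runloop2, opt_op, and sess names from the dummy model above) would be to snapshot the weights before and after one training step and compare them:

# Sketch: confirm the variables actually change after one optimizer step.
w_before, wt_before = sess.run([weights, weights_tied])
sess.run([runloop2, opt_op])
w_after, wt_after = sess.run([weights, weights_tied])
print("testWeight1 updated:", not np.allclose(w_before, w_after))
print("tiedWeight updated:", not np.allclose(wt_before, wt_after))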
Can someone please confirm my conclusion and understanding?
Thanks in advance for your help.
