Getting mean and covariance matrix for multivariate normal from keras model

Question

I have a dataset with 6 input features and 5 output features. I want to use a Keras sequential model to estimate the mean vector and covariance matrix for any row of input features, assuming the output features follow a multivariate normal distribution.

That is, for any row of 6 input features in my dataset, I want to get a mean vector of 5 values and a 5×5 covariance matrix.

sample=pd.DataFrame({'X1':[1,2,3,4,5,6],
              'X2':[1,3,1,5,2,7],
              'X3':[3,0,0,7,5,0],
              'X4':[0,4,3,2,5,8],
              'X5':[9,7,0,2,4,5],
              'X6':[1,1,8,7,0,0],
              'Y1':[0.5,1.2,6.3,4.5,1.5,6.6],
              'Y2':[6.1,4.3,2.1,1.5,4.2,8.7],
              'Y3':[0,0,3.2,3.7,5.5,0.2],
              'Y4':[0.5,1.4,8.3,5.2,1.5,1.8],
              'Y5':[2.9,1.7,6.3,5.2,9.4,1.5]})
sample

    X1  X2  X3  X4  X5  X6  Y1  Y2  Y3  Y4  Y5
0   1   1   3   0   9   1   0.5 6.1 0.0 0.5 2.9
1   2   3   0   4   7   1   1.2 4.3 0.0 1.4 1.7
2   3   1   0   3   0   8   6.3 2.1 3.2 8.3 6.3
3   4   5   7   2   2   7   4.5 1.5 3.7 5.2 5.2
4   5   2   5   5   4   0   1.5 4.2 5.5 1.5 9.4
5   6   7   0   8   5   0   6.6 8.7 0.2 1.8 1.5

For the loss function I am using the following, which returns the mean negative log probability, so minimizing it maximizes the log probability.

def lossF(y_true, mu, cov):

  dist = tfp.distributions.MultivariateNormalTriL(loc=mu, scale_tril=tf.linalg.cholesky(cov))
  return tf.reduce_mean(-dist.log_prob(y_true))

I am trying something like the code below, but I am getting stuck partway through.

#X_train has 6 values in each row
#y_train has 5 values in each row
#y_pred should be either a distribution function or mu & cov for each row

opt = Adam(learning_rate=0.001)
inputs = Input(shape=(6,))
layer1 = Dense(24, activation='relu')(inputs)
layer2 = Dense(12, activation='relu')(layer1)
predictions = ???
model = Model(inputs=???, outputs=???)
model.compile(optimizer=opt, loss=loss_fn)
model.fit(X_train, y_train, epochs=100, batch_size=100)
y_pred=model.predict(X_test)

Note: instead of getting mu and cov separately, if it's possible to get a distribution object as the output, that would be helpful too.

ML models learn from a huge number of instances. What you are trying to do doesn't really make sense. – Lucas Morin, Commented Dec 3, 2020 at 23:46
@lcrmorin I do have a huge number of instances. The dataset in the question is just an example of what the data looks like. – Tanzin Farhat, Commented Dec 5, 2020 at 4:53

Said Obakrim · Accepted Answer · 2021-04-30 07:00:05Z

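Since the covariance matrix has to be positive definite, parameterizing it through its Cholesky decomposition is a good way to solve this problem. The network outputs the mean vector mu and the entries of the lower-triangular Cholesky factor (denoted T here), which is what MultivariateNormalTriL expects as scale_tril; the covariance matrix is then T multiplied by its transpose. The diagonal elements of T must be positive, which the exponential activation enforces (they are not the standard deviations themselves; the variances are the diagonal of T times its transpose):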

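from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model

p = y_train.shape[1] # dimension of the covariance matrix
inputs = Input(shape=(6,))
layer1 = Dense(24, activation='relu')(inputs)
layer2 = Dense(12, activation='relu')(layer1)
mu = Dense(p, activation="linear")(layer2)
T1 = Dense(p, activation="exponential")(layer2) # diagonal of T, kept positive
T2 = Dense(p*(p-1)//2, activation="linear")(layer2) # entries of T below the diagonal
outputs = Concatenate()([mu, T1, T2])
model = Model(inputs=inputs, outputs=outputs)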
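
To see why this parameterization always gives a valid covariance matrix, here is a minimal NumPy sketch. It is not part of the original answer, and the random values only stand in for network outputs. Any lower-triangular matrix with a strictly positive diagonal, multiplied by its own transpose, is symmetric and positive definite:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5

# Stand-ins for the two network heads: exp() keeps the diagonal positive,
# while the off-diagonal entries are unconstrained.
diag = np.exp(rng.normal(size=p))
off_diag = rng.normal(size=p * (p - 1) // 2)

T = np.diag(diag)
T[np.tril_indices(p, k=-1)] = off_diag  # fill the strictly lower triangle

cov = T @ T.T
eigenvalues = np.linalg.eigvalsh(cov)
print("smallest eigenvalue:", eigenvalues.min())
print("symmetric:", np.allclose(cov, cov.T))
```

Because the Cholesky factor with a positive diagonal is unique, np.linalg.cholesky(cov) recovers T, so nothing is lost by predicting T instead of the covariance itself.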

Now let's define the loss function. Firstly, let's define the function that will extract the outputs of the network:

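import tensorflow as tf

def mu_sigma(output):
    # Split every row of the (batch, 2p + p(p-1)/2) output, not only the first one
    mu = output[:, 0:p]
    T1 = output[:, p:2*p]   # diagonal of T (positive)
    T2 = output[:, 2*p:]    # entries of T below the diagonal
    # Row/column indices of the strictly lower triangle, in row-major order
    ones = tf.ones((p,p), dtype=tf.float32)
    mask_a = tf.linalg.band_part(ones, -1, 0)  # lower triangle including the diagonal
    mask_b = tf.linalg.band_part(ones, 0, 0)   # diagonal only
    mask = tf.subtract(mask_a, mask_b)
    zero = tf.constant(0, dtype=tf.float32)
    non_zero = tf.not_equal(mask, zero)
    indices = tf.where(non_zero)
    def to_dense(row):
        # Place one sample's off-diagonal entries into a (p, p) matrix
        sparse = tf.sparse.SparseTensor(indices,row,dense_shape=tf.cast((p,p),
                 dtype=tf.int64))
        return tf.sparse.to_dense(sparse)
    T2 = tf.map_fn(to_dense, T2, fn_output_signature=tf.float32)
    T1 = tf.linalg.diag(T1)  # batched: (batch, p) -> (batch, p, p)
    # sigma is the lower-triangular Cholesky factor T, not the covariance itself;
    # the covariance is tf.matmul(sigma, sigma, transpose_b=True)
    sigma = T1 + T2
    return mu, sigma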
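
Once the model is trained, the same split can be applied to the array returned by model.predict to obtain the mean vector and covariance matrix the question asks for. The following is a NumPy sketch, not part of the original answer: the helper name unpack_prediction is made up, and it assumes the output layout is mean, then diagonal, then the strictly lower-triangular entries row by row.

```python
import numpy as np

def unpack_prediction(pred, p):
    """Split rows of length 2p + p(p-1)/2 into means and covariance matrices."""
    pred = np.atleast_2d(pred)
    mu = pred[:, :p]
    diag = pred[:, p:2 * p]
    off_diag = pred[:, 2 * p:]

    # Rebuild the lower-triangular factor T for every row
    T = np.zeros((pred.shape[0], p, p))
    idx = np.arange(p)
    T[:, idx, idx] = diag
    rows, cols = np.tril_indices(p, k=-1)  # row-major order of the lower triangle
    T[:, rows, cols] = off_diag

    # Covariance of each row is T times its transpose
    cov = T @ np.transpose(T, (0, 2, 1))
    return mu, cov

# Hypothetical output row for p = 2: mean (1, 2), diagonal (2, 3), off-diagonal 0.5
mu, cov = unpack_prediction(np.array([1.0, 2.0, 2.0, 3.0, 0.5]), p=2)
print(mu[0])
print(cov[0])
```

For the question's data, p would be 5, so each predicted row would give a mean of 5 values and a 5×5 covariance matrix.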

Now for the loss function:

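import tensorflow as tf
from tensorflow_probability import distributions as tfd

def gnll_loss(y, pred):
    mu, sigma = mu_sigma(pred)  # sigma is the Cholesky factor passed as scale_tril
    gm = tfd.MultivariateNormalTriL(loc=mu, scale_tril=sigma)
    log_likelihood = gm.log_prob(y)
    # Negative log-likelihood, summed over the batch
    return - tf.math.reduce_sum(log_likelihood)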
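
For reference, the negative log-density behind log_prob can be reproduced by hand from the mean and the Cholesky factor T. The sketch below is a NumPy illustration, not part of the original answer, and the function name mvn_nll is made up. For one observation y of dimension p it evaluates 0.5 * (p * log(2*pi) + log det(T T^T) + |T^-1 (y - mu)|^2), where log det(T T^T) is twice the sum of the logs of the diagonal of T:

```python
import numpy as np

def mvn_nll(y, mu, T):
    """Negative log-density of N(mu, T T^T) at y, with T lower triangular."""
    p = y.shape[0]
    # Whitened residual T^{-1}(y - mu); a general solver is used here for brevity
    z = np.linalg.solve(T, y - mu)
    log_det = 2.0 * np.sum(np.log(np.diag(T)))  # log det(T T^T)
    return 0.5 * (p * np.log(2.0 * np.pi) + log_det + z @ z)

# Standard normal in two dimensions, evaluated at its mean
nll = mvn_nll(np.zeros(2), np.zeros(2), np.eye(2))
print(nll)  # equals log(2*pi), about 1.8379
```

The log-determinant term is what stops the network from inflating the covariance to explain every residual, and the quadratic term is what stops it from shrinking the covariance to zero.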

In tf.sparse.SparseTensor(indices,sigma2,dense_shape=tf.cast((10,10), dtype=tf.int64)), could you please tell me what sigma2 is, and why dense_shape=tf.cast((10,10), dtype=tf.int64)? Thanks for your answer. – Tanzin Farhat, Commented Apr 29, 2021 at 18:27
Sorry, it's tf.sparse.SparseTensor(indices,T2,dense_shape=tf.cast((p,p), dtype=tf.int64)). – Said Obakrim, Commented Apr 30, 2021 at 6:59
