Using R, I am trying to compute the likelihoods of a vector of values. Some of these values lie far out in the tails of the distributions. The results appear to underflow to zero, which makes my code throw an error when drawing samples.

I looked into using the Rmpfr package to get higher precision, but this changes my variable type. I also considered rewriting the expression in log form to avoid the multiplication. The expression would then become: exp(log(w[X]) + dnorm(y, x[X], sigma[X], log = TRUE))

This still returns zero because of the final exponentiation.

Would there be a way to formulate this problem with logs, allowing for high numerical precision? What I want in the end is for the following probabilities to be correctly computed, summing to one.

  liks <- lapply(1:3, function(X) {
    w[X] * dnorm(y, x[X], sigma[X])
  })

  probs1 <- liks[[1]]/(liks[[1]] + liks[[2]] + liks[[3]])
  
  probs2 <- liks[[2]]/(liks[[1]] + liks[[2]] + liks[[3]])
  
  probs3 <- liks[[3]]/(liks[[1]] + liks[[2]] + liks[[3]])

Edit: Adding Numerical Example of Log Code

Here are some parameter and sample values for which the log code starts to fail ($mu corresponds to x in the code above). The results are not NaN; instead, each of the three entries is zero:

$mu
[1] 0.7323412910 0.7742235621 0.4863889347

$w
[1] 0.008464 0.083536 0.908000

$sigma
[1] 0.08209500030 0.08166088502 0.09168991045

Observation values:
c(4.667935371,  5.654500961,  4.383309364,  4.396201611,  4.452524185,  4.441100597,  4.890487194,  4.416962624,  5.241273880,  4.347382069,  4.867616177,  4.895996094,  4.592288494, -3.612523079,  4.817468166,  4.783963203,  4.541391850,  4.709537983,  5.227987289,  5.585811138,  4.497674942,  4.989979267,  4.489729881)

All the observation values are rather extreme / outliers in my dataset. This would explain why they are assigned probabilities so small that they are rounded to zero.

  • Example of an extreme value becoming zero: 0.09554102 * dnorm(-3.612523, 0.8133402, 0.03580487) Commented Mar 25, 2023 at 14:48
  • Why do you need to exponentiate? Avoid that. Commented Mar 25, 2023 at 15:08
  • @Roland I would rather avoid it, but then the computations of the probabilities make no sense anymore Commented Mar 25, 2023 at 15:10

1 Answer

You should work with logs throughout; don't exponentiate so early. For example, here's a change:

logliks <- lapply(1:3, function(X) {
  log(w[X]) +  dnorm(y, x[X], sigma[X], log = TRUE)
 })

Now, to evaluate expressions like probs1, you want to divide numerator and denominator by the biggest of the liks values, i.e. compute

(liks[[1]]/biggest)/(liks[[1]]/biggest + liks[[2]]/biggest + liks[[3]]/biggest)

but do it all on the log scale:

logbiggest <- max(as.numeric(logliks))
logprobs1 <- (logliks[[1]] - logbiggest) - 
  log( exp( logliks[[1]] - logbiggest ) 
    + exp(logliks[[2]] - logbiggest) 
    + exp(logliks[[3]] - logbiggest) )

and similarly for logprobs2 and logprobs3. Since logbiggest is equal to one of the logliks, one of those exponentials will equal 1.0, and then it doesn't matter if the other ones underflow.
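If y is a vector rather than a single observation, each element of logliks is itself a vector, so the maximum has to be taken per observation rather than over the whole list. Below is a minimal vectorised sketch of the same shift-by-the-largest idea. The helper name normalise_logliks is hypothetical (it is not part of the original answer), and the parameter values are the ones given in the question.

```r
# Hypothetical helper (not part of the original answer): turn a list of
# log-likelihood vectors, one per mixture component, into a matrix of
# probabilities whose rows sum to one.
normalise_logliks <- function(logliks) {
  ll <- do.call(cbind, logliks)   # rows = observations, columns = components
  rowmax <- apply(ll, 1, max)     # per-observation "logbiggest"
  shifted <- ll - rowmax          # the largest entry in each row becomes 0
  exp(shifted - log(rowSums(exp(shifted))))
}

x <- c(0.7323412910, 0.7742235621, 0.4863889347)
w <- c(0.008464, 0.083536, 0.908000)
sigma <- c(0.08209500030, 0.08166088502, 0.09168991045)
y <- c(4.667935371, 5.654500961, -3.612523079)  # three of the observations

logliks <- lapply(1:3, function(X) {
  log(w[X]) + dnorm(y, x[X], sigma[X], log = TRUE)
})

probs <- normalise_logliks(logliks)
rowSums(probs)  # each row should sum to 1 up to floating-point error
```

Because the shift makes one exponential in every row exactly 1, the denominator can never be zero, so no row can come out as all zeros or NaN.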

Edited to add: numerical example

You added data to your question. Here is the full calculation using your data. I don't get any zero probabilities, but most of them are very small:

x <- c(0.7323412910, 0.7742235621, 0.4863889347)
w <- c(0.008464, 0.083536, 0.908000)
sigma <- c(0.08209500030, 0.08166088502, 0.09168991045)

y <- c(4.667935371,  5.654500961,  4.383309364,  4.396201611,  4.452524185,  4.441100597,  4.890487194,  4.416962624,  5.241273880,  4.347382069,  4.867616177,  4.895996094,  4.592288494, -3.612523079,  4.817468166,  4.783963203,  4.541391850,  4.709537983,  5.227987289,  5.585811138,  4.497674942,  4.989979267,  4.489729881)

probs <- matrix(NA, length(y), 3)

for (i in seq_along(y)) {
  logliks <- lapply(1:3, function(X) {
    log(w[X]) +  dnorm(y[i], x[X], sigma[X], log = TRUE)
  })
  
  logbiggest <- max(as.numeric(logliks))
  
  logprobs1 <- (logliks[[1]] - logbiggest) - 
    log( exp( logliks[[1]] - logbiggest ) 
         + exp(logliks[[2]] - logbiggest) 
         + exp(logliks[[3]] - logbiggest) )
  
  logprobs2 <- (logliks[[2]] - logbiggest) - 
    log( exp( logliks[[1]] - logbiggest ) 
         + exp(logliks[[2]] - logbiggest) 
         + exp(logliks[[3]] - logbiggest) )
  
  logprobs3 <- (logliks[[3]] - logbiggest) - 
    log( exp( logliks[[1]] - logbiggest ) 
         + exp(logliks[[2]] - logbiggest) 
         + exp(logliks[[3]] - logbiggest) )
  
  probs[i, ] <- exp(c(logprobs1, logprobs2, logprobs3))
}

head(probs)
#>              [,1]         [,2] [,3]
#> [1,] 4.006975e-50 9.059815e-44    1
#> [2,] 1.964023e-93 2.172199e-87    1
#> [3,] 6.103586e-40 1.274113e-33    1
#> [4,] 2.221979e-40 4.668064e-34    1
#> [5,] 2.538854e-42 5.467829e-36    1
#> [6,] 6.336084e-42 1.358276e-35    1

Created on 2023-03-26 with reprex v2.0.2

The last column is equal to 1 because of rounding; the other two columns hold very small numbers. This makes sense: the y values are so far from the means, and the 3rd distribution has the largest variance, so the model predicts that all those outliers were most likely drawn from that distribution. Multiply sigma by 10 and you'll get less extreme probabilities.
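Since the original goal was to draw samples, here is a rough sketch of how each row of such a probability matrix could be passed to sample(). The small probs matrix is purely illustrative: its first row is copied from the printed output above and its second row is invented.

```r
set.seed(42)  # only for reproducibility of the illustration

# Illustrative matrix: row 1 is taken from the printed output above,
# row 2 is an invented, less extreme case.
probs <- rbind(c(4.006975e-50, 9.059815e-44, 1),
               c(0.2, 0.3, 0.5))

# Draw one component label (1, 2 or 3) per row. sample() accepts tiny
# weights; a row consisting only of zeros would raise an error.
labels <- apply(probs, 1, function(p) {
  sample(seq_along(p), size = 1, prob = p)
})
labels
```

With weights as lopsided as those in the first row, the draw is effectively always component 3.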

11 Comments

Thank you for this suggestion. I am still trying to implement it correctly. As of yet, it still seems to be producing rows where prob1, prob2, and prob3 are all equal to zero. This then causes an error in the sample() function.
You should put a numerical example of one of those into your question. Just show us w, x, y and sigma.
I added a numerical example with values for which your log code gives zeros. Hopefully this clarifies things. Again, thank you for the help. Also, I added the following three lines to your code to convert back to the non-log scale. I hope that makes sense: probs1 <- exp(logprobs1); probs2 <- exp(logprobs2); probs3 <- exp(logprobs3)
Thank you, my code must have contained a mistake. When I run yours, I no longer generate any rows with only zeroes. One last thing: are you sure about the "max(as.numeric(logliks))"? My understanding is that you want the biggest log-likelihood per row. Wouldn't the max() function get the largest value of the entire list?
In my loop, it is only taking the max of 3 values. If you rearrange it to compute all of them in a dataframe, then you do need to take the row-wise max.