Using R, I am trying to compute the likelihoods of a vector of values. Some of these values lie far in the tails of the distributions, and underflow causes the computed densities to be rounded to zero, which makes my code throw an error when drawing samples.
I looked into using the Rmpfr package to get higher precision, but that changes my variable type. I also considered rewriting the expression in log form to avoid the multiplication, so that it becomes:
exp(log(w[X]) + dnorm(y, x[X], sigma[X], log = TRUE))
This still returns zero, because the final exponentiation underflows.
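To illustrate the scale of the problem: the log-densities involved here are on the order of -1000, far below what a double can represent after exponentiation (this is just a demonstration of the IEEE-754 limit, not my actual code):

```r
# The smallest positive double is about 4.9e-324, i.e. exp() underflows
# to exactly zero once its argument drops much below about -745
.Machine$double.xmin   # smallest normalized positive double, ~2.2e-308
exp(-700)              # still representable
exp(-1500)             # underflows to exactly 0, like my log-densities
```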
Is there a way to formulate this problem in terms of logs that retains numerical precision? What I want in the end is for the following probabilities to be computed correctly, summing to one:
liks <- lapply(1:3, function(X) {
  w[X] * dnorm(y, x[X], sigma[X])
})
probs1 <- liks[[1]]/(liks[[1]] + liks[[2]] + liks[[3]])
probs2 <- liks[[2]]/(liks[[1]] + liks[[2]] + liks[[3]])
probs3 <- liks[[3]]/(liks[[1]] + liks[[2]] + liks[[3]])
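For concreteness, this is the kind of log-space reformulation I have in mind (a sketch of the log-sum-exp trick, with `y`, `w`, `x`, and `sigma` as above; I have not verified this is the right approach):

```r
# Stay in log space: log-weight plus log-density for each component,
# giving an n-by-3 matrix of log-likelihood contributions
logliks <- sapply(1:3, function(X) {
  log(w[X]) + dnorm(y, x[X], sigma[X], log = TRUE)
})

# Subtract each observation's maximum before exponentiating, so the
# largest term becomes exp(0) = 1 and the denominator cannot underflow
m <- apply(logliks, 1, max)
probs <- exp(logliks - m) / rowSums(exp(logliks - m))
```

The shift by `m` cancels in the ratio, so `probs` should equal the intended `liks[[k]] / sum(liks)` exactly in theory, while each row sums to one numerically.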
Edit: Adding Numerical Example of Log Code
Here are some parameter and sample values that start throwing errors with the log code. They do not produce NaNs; instead, all three entries are zero:
$mu
[1] 0.7323412910 0.7742235621 0.4863889347
$w
[1] 0.008464 0.083536 0.908000
$sigma
[1] 0.08209500030 0.08166088502 0.09168991045
Observation values:
c(4.667935371, 5.654500961, 4.383309364, 4.396201611, 4.452524185, 4.441100597, 4.890487194, 4.416962624, 5.241273880, 4.347382069, 4.867616177, 4.895996094, 4.592288494, -3.612523079, 4.817468166, 4.783963203, 4.541391850, 4.709537983, 5.227987289, 5.585811138, 4.497674942, 4.989979267, 4.489729881)
All of these observation values are extreme outliers in my dataset, which would explain why their densities are so small that they underflow to zero.
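For reference, the failure can be reproduced directly from these values (a minimal sketch using the first observation; note I write `mu` here where the earlier code had `x`):

```r
mu    <- c(0.7323412910, 0.7742235621, 0.4863889347)
w     <- c(0.008464, 0.083536, 0.908000)
sigma <- c(0.08209500030, 0.08166088502, 0.09168991045)
y <- 4.667935371  # first observation value above

# The observation is roughly 45-50 standard deviations from each mean,
# so each log-density is around -1000 and exp() underflows to zero
liks <- sapply(1:3, function(X) {
  exp(log(w[X]) + dnorm(y, mu[X], sigma[X], log = TRUE))
})
liks  # all three entries are exactly 0, so normalizing gives 0/0
```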