
I am currently using Approximate Bayesian Computation (ABC) to calibrate the uncertain parameters of a model so that it reproduces observed patterns of disease outbreak presence and absence. To do this, I am using the abc package. However, when I run the abc function to estimate the posterior distributions of the parameters with neural network regression, I obtain negative values even though all parameters are strictly positive. I don't know whether it is related, but I also get this warning:

Warning messages:
1: In density.default(x, weights = weights) :
  Selecting bandwidth *not* using 'weights'

Initially, I used the True Skill Statistic (TSS; the difference between the true presence rate and the false presence rate, so it ranges from -1 to 1) as the summary statistic, and in a second step I used the root mean squared error (RMSE) between the simulated TSS and the observed TSS (which equals 1) instead. In both cases I get negative estimates, so any help understanding what is causing them would be greatly appreciated.
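For clarity, the TSS is computed from the confusion matrix of predicted versus observed outbreak presence; here is a minimal sketch (tss_from_counts is only an illustrative helper, not part of my model code):

## TSS = true presence rate - false presence rate
##     = sensitivity + specificity - 1
tss_from_counts <- function(tp, fn, fp, tn) {
  tpr <- tp / (tp + fn)  # true presence rate (sensitivity)
  fpr <- fp / (fp + tn)  # false presence rate (1 - specificity)
  tpr - fpr              # ranges from -1 (total disagreement) to 1 (perfect)
}
tss_from_counts(tp = 40, fn = 10, fp = 5, tn = 45)  # 0.8 - 0.1 = 0.7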

Here is the output:

> summary(abc_neuralnet)
Call: 
abc::abc(target = observed_summary_statistics, param = simulated_parameters, 
    sumstat = simulated_summary_statistics, tol = 1, method = "neuralnet")
Data:
 abc.out$adj.values (10000 posterior samples)
Weights:
 abc.out$weights

                               V1         V2         V3         V4         V5         V6         V7         V8         V9        V10        V11        V12        V13        V14        V15        V16
Min.:                     14.0275    -5.5947   475.6561 -2004.7401 -3410.4807    -0.0041    -0.2877    -0.4942    -0.0403    -0.6970    -0.0524    -0.3172    -0.1729    -0.2109     0.4302     0.0319
Weighted 2.5 % Perc.:     21.0431     1.5006   879.1652  2940.7780  -255.8276     0.2331    -0.1156    -0.1222    -0.0179     0.3437     0.0189    -0.1189     0.0248    -0.1586     0.4668     0.0605
Weighted Median:         130.9660    10.4621  5857.8215 18011.2848  2652.1901     0.4609     0.2799     0.5944     0.3044     0.7649     0.5835     0.5960     0.5098     0.5346     0.7495     0.1958
Weighted Mean:           138.2207    10.6428  5880.5932 18186.6218  2798.3743     0.5053     0.2919     0.5823     0.3040     0.7255     0.5612     0.5943     0.5130     0.5410     0.7519     0.1986
Weighted Mode:            41.2750    15.1685  2046.0809  7565.2865   473.4871     0.2850     0.0511     0.6869     0.3914     0.8491     0.9260     0.8935     0.2530     0.4488     0.5668     0.0947
Weighted 97.5 % Perc.:   283.2321    20.3251 10985.0459 34015.6282  6562.8585     0.9261     0.7165     1.2972     0.6245     0.9763     1.0293     1.2934     1.0046     1.2514     1.0480     0.3448
Max.:                    307.9988    20.4851 14114.8565 45242.1735  7852.7231     1.0047     1.1997     1.4287     0.6768     1.0072     1.1691     1.3752     1.0913     1.3801     1.0690     0.4321
                              V17        V18
Min.:                     -0.0444    -0.7655
Weighted 2.5 % Perc.:      0.1018    -0.2778
Weighted Median:           0.5202     0.4043
Weighted Mean:             0.5449     0.4108
Weighted Mode:             0.1790    -0.0694
Weighted 97.5 % Perc.:     1.0907     1.1251
Max.:                      1.2888     2.0197

Here are the data: https://www.dropbox.com/scl/fi/cdse5nhbo60tnv47yjear/Test_abc.csv?rlkey=59umwgnk5juv392te51016t6u&st=924wq05k&dl=0

Here is the code:

library(Metrics)
library(abc)

data <- read.csv("C:/Users/Test_abc.csv")

## Define the observed summary statistics
## The observed TSS is 1, so the RMSE between it and itself is 0
observed_summary_statistics <- c(RMSE_TSS_C1 = 0, RMSE_TSS_C2 = 0)

## Compute the root mean squared error (RMSE) between simulated and observed variables
data$RMSE_TSS_C1 <- sapply(data$TSS_C1, FUN = function(x){Metrics::rmse(1, x)})
data$RMSE_TSS_C2 <- sapply(data$TSS_C2, FUN = function(x){Metrics::rmse(1, x)})
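## Note: with a single value per call, Metrics::rmse(1, x) reduces to abs(1 - x)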
## summary(data) 

## Retrieve the simulated summary statistics
simulated_summary_statistics <- data[, c("RMSE_TSS_C1", "RMSE_TSS_C2")]
## summary(simulated_summary_statistics)

## Retrieve the simulated parameters
simulated_parameters <- data[, paste0("V", 1:18)]
## summary(simulated_parameters)

## Run the "cv4abc" function
cv_rejection <- abc::cv4abc(param = simulated_parameters, sumstat = simulated_summary_statistics, nval = 5, tols = c(0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1), method = "rejection", transf = "none")
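## Inspect the cross-validation output (a suggested check, not in my original
## script): the prediction errors help choose a tolerance; note that tol = 1
## accepts every simulation, i.e. no rejection at all
summary(cv_rejection)  # prediction error per parameter and tolerance
plot(cv_rejection)     # estimated vs. true parameter values per tolerance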

## Run the "abc" function
abc_neuralnet <- abc::abc(target = observed_summary_statistics, param = simulated_parameters, sumstat = simulated_summary_statistics, tol = 1, method = "neuralnet")
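One variant I have considered but not fully explored (a sketch only, though transf is a documented argument of abc::abc): log-transform the parameters before the regression adjustment, so that the adjusted values are back-transformed with exp() and therefore stay positive:

## Log-transform all parameters before the neural network regression;
## adjusted values are exponentiated back, so they remain positive
abc_neuralnet_log <- abc::abc(target = observed_summary_statistics,
                              param = simulated_parameters,
                              sumstat = simulated_summary_statistics,
                              tol = 1, method = "neuralnet",
                              transf = "log")
summary(abc_neuralnet_log)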
    I suggest you produce density plots of the posteriors (and of the priors, if possible). I don't see any negative medians in your output. If negative values are not allowed (i.e., impossible), you usually encode this in a Bayesian analysis as a prior with zero probability for negative values. Alternatively, you could use transformations, or even reject negative estimates. However, I'm not familiar with ABC. Commented Oct 7, 2024 at 8:28
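Following that suggestion, a weighted posterior density plot can be drawn as below (a sketch; V1 is just an example column, and passing a numeric bw sidesteps the bandwidth warning above, since density() otherwise selects the bandwidth without using the weights):

## Weighted posterior density for one parameter (V1 as an example)
x <- abc_neuralnet$adj.values[, "V1"]
w <- abc_neuralnet$weights / sum(abc_neuralnet$weights)  # weights must sum to 1
bw <- bw.nrd0(x)  # pick the bandwidth explicitly to avoid the warning
plot(density(x, weights = w, bw = bw),
     main = "Weighted posterior density of V1")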

