library(serosv)
#> Warning: replacing previous import 'magrittr::extract' by 'tidyr::extract' when
#> loading 'serosv'Parametric Bayesian framework
Currently, serosv only has models under parametric
Bayesian framework
Proposed approach
Prevalence has a parametric form \(\pi(a_i, \alpha)\) where \(\alpha\) is a parameter vector
One can constraint the parameter space of the prior distribution \(P(\alpha)\) in order to achieve the desired monotonicity of the posterior distribution \(P(\pi_1, \pi_2, ..., \pi_m|y,n)\)
Where:
- \(n = (n_1, n_2, ..., n_m)\) and \(n_i\) is the sample size at age \(a_i\)
- \(y = (y_1, y_2, ..., y_m)\) and \(y_i\) is the number of infected individual from the \(n_i\) sampled subjects
Farrington
Refer to Chapter 10.3.1
Proposed model
The model for prevalence is as followed
\[ \pi (a) = 1 - exp\{ \frac{\alpha_1}{\alpha_2}ae^{-\alpha_2 a} + \frac{1}{\alpha_2}(\frac{\alpha_1}{\alpha_2} - \alpha_3)(e^{-\alpha_2 a} - 1) -\alpha_3 a \} \]
For likelihood model, independent binomial distribution are assumed for the number of infected individuals at age \(a_i\)
\[ y_i \sim Bin(n_i, \pi_i), \text{ for } i = 1,2,3,...m \]
The constraint on the parameter space can be incorporated by assuming truncated normal distribution for the components of \(\alpha\), \(\alpha = (\alpha_1, \alpha_2, \alpha_3)\) in \(\pi_i = \pi(a_i,\alpha)\)
\[ \alpha_j \sim \text{truncated } \mathcal{N}(\mu_j, \tau_j), \text{ } j = 1,2,3 \]
The joint posterior distribution for \(\alpha\) can be derived by combining the likelihood and prior as followed
\[ P(\alpha|y) \propto \prod^m_{i=1} \text{Bin}(y_i|n_i, \pi(a_i, \alpha)) \prod^3_{i=1}-\frac{1}{\tau_j}\text{exp}(\frac{1}{2\tau^2_j} (\alpha_j - \mu_j)^2) \]
-
Where the flat hyperprior distribution is defined as followed:
\(\mu_j \sim \mathcal{N}(0, 10000)\)
\(\tau^{-2}_j \sim \Gamma(100,100)\)
The full conditional distribution of \(\alpha_i\) is thus \[ P(\alpha_i|\alpha_j,\alpha_k, k, j \neq i) \propto -\frac{1}{\tau_i}\text{exp}(\frac{1}{2\tau^2_i} (\alpha_i - \mu_i)^2) \prod^m_{i=1} \text{Bin}(y_i|n_i, \pi(a_i, \alpha)) \]
Fitting data
To fit Farrington model, use
hierarchical_bayesian_model() and define
type = "far2" or type = "far3" where
type = "far2"refers to Farrington model with 2 parameters (\(\alpha_3 = 0\))type = "far3"refers to Farrington model with 3 parameters (\(\alpha_3 > 0\))
df <- mumps_uk_1986_1987
model <- hierarchical_bayesian_model(df, type="far3")
#>
#> SAMPLING FOR MODEL 'fra_3' NOW (CHAIN 1).
#> Chain 1: Rejecting initial value:
#> Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
#> Chain 1: Stan can't start sampling from this initial value.
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.000126 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.26 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 5000 [ 0%] (Warmup)
#> Chain 1: Iteration: 500 / 5000 [ 10%] (Warmup)
#> Chain 1: Iteration: 1000 / 5000 [ 20%] (Warmup)
#> Chain 1: Iteration: 1500 / 5000 [ 30%] (Warmup)
#> Chain 1: Iteration: 1501 / 5000 [ 30%] (Sampling)
#> Chain 1: Iteration: 2000 / 5000 [ 40%] (Sampling)
#> Chain 1: Iteration: 2500 / 5000 [ 50%] (Sampling)
#> Chain 1: Iteration: 3000 / 5000 [ 60%] (Sampling)
#> Chain 1: Iteration: 3500 / 5000 [ 70%] (Sampling)
#> Chain 1: Iteration: 4000 / 5000 [ 80%] (Sampling)
#> Chain 1: Iteration: 4500 / 5000 [ 90%] (Sampling)
#> Chain 1: Iteration: 5000 / 5000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 14.91 seconds (Warm-up)
#> Chain 1: 14.326 seconds (Sampling)
#> Chain 1: 29.236 seconds (Total)
#> Chain 1:
#> Warning: There were 1236 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 1.07, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
model$info
#> mean se_mean sd 2.5%
#> alpha1 1.400750e-01 5.772746e-04 5.061637e-03 1.291531e-01
#> alpha2 1.971363e-01 6.891601e-04 6.915543e-03 1.855918e-01
#> alpha3 7.289266e-03 1.518418e-03 7.518513e-03 1.120555e-04
#> tau_alpha1 4.356277e-01 1.212644e-01 9.969059e-01 8.370028e-06
#> tau_alpha2 1.047710e+00 3.895075e-01 2.035456e+00 1.255557e-05
#> tau_alpha3 2.040321e+00 1.247945e+00 3.455228e+00 6.342945e-06
#> mu_alpha1 -3.309940e+00 2.884706e+00 3.864842e+01 -8.296621e+01
#> mu_alpha2 6.802868e-01 1.689830e+00 2.588370e+01 -6.616912e+01
#> mu_alpha3 2.083459e+00 1.746417e+00 2.930201e+01 -5.886563e+01
#> sigma_alpha1 6.371450e+01 1.307429e+01 3.332064e+02 5.015577e-01
#> sigma_alpha2 8.334825e+01 4.038113e+01 1.560984e+03 3.719014e-01
#> sigma_alpha3 8.502171e+01 2.868993e+01 1.157427e+03 3.159379e-01
#> lp__ -2.534185e+03 6.823333e-01 3.897737e+00 -2.541944e+03
#> 25% 50% 75% 97.5%
#> alpha1 1.367395e-01 1.412622e-01 1.437116e-01 1.490064e-01
#> alpha2 1.929334e-01 1.950358e-01 2.016050e-01 2.122837e-01
#> alpha3 1.336679e-03 4.927735e-03 1.129148e-02 2.722506e-02
#> tau_alpha1 1.060576e-03 4.558585e-03 2.647481e-01 3.975193e+00
#> tau_alpha2 1.928452e-03 1.281223e-02 8.139368e-01 7.230101e+00
#> tau_alpha3 2.614862e-03 5.170448e-02 2.966259e+00 1.001836e+01
#> mu_alpha1 -1.703190e+01 -9.766189e-01 1.329607e+00 9.839007e+01
#> mu_alpha2 -1.427125e+00 4.241891e-01 1.024570e+01 5.114096e+01
#> mu_alpha3 -1.369344e+00 -2.821753e-02 1.753812e+00 7.802320e+01
#> sigma_alpha1 1.943496e+00 1.481106e+01 3.070642e+01 3.456672e+02
#> sigma_alpha2 1.108421e+00 8.834616e+00 2.277171e+01 2.822211e+02
#> sigma_alpha3 5.806246e-01 4.397813e+00 1.955597e+01 3.970900e+02
#> lp__ -2.536565e+03 -2.534811e+03 -2.531149e+03 -2.526663e+03
#> n_eff Rhat
#> alpha1 76.880659 1.020363
#> alpha2 100.696016 1.002847
#> alpha3 24.517797 1.010750
#> tau_alpha1 67.583632 1.001489
#> tau_alpha2 27.308103 1.001907
#> tau_alpha3 7.665891 1.125965
#> mu_alpha1 179.498324 1.004001
#> mu_alpha2 234.620692 1.010656
#> mu_alpha3 281.513284 1.001463
#> sigma_alpha1 649.516789 1.001963
#> sigma_alpha2 1494.307920 1.000572
#> sigma_alpha3 1627.527210 1.000915
#> lp__ 32.631107 1.019529
plot(model)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.
Log-logistic
Proposed approach
The model for seroprevalence is as followed
\[ \pi(a) = \frac{\beta a^\alpha}{1 + \beta a^\alpha}, \text{ } \alpha, \beta > 0 \]
The likelihood is specified to be the same as Farrington model (\(y_i \sim Bin(n_i, \pi_i)\)) with
\[ \text{logit}(\pi(a)) = \alpha_2 + \alpha_1\log(a) \]
- Where \(\alpha_2 = \text{log}(\beta)\)
The prior model of \(\alpha_1\) is specified as \(\alpha_1 \sim \text{truncated } \mathcal{N}(\mu_1, \tau_1)\) with flat hyperprior as in Farrington model
\(\beta\) is constrained to be positive by specifying \(\alpha_2 \sim \mathcal{N}(\mu_2, \tau_2)\)
The full conditional distribution of \(\alpha_1\) is thus
\[ P(\alpha_1|\alpha_2) \propto -\frac{1}{\tau_1} \text{exp} (\frac{1}{2 \tau_1^2} (\alpha_1 - \mu_1)^2) \prod_{i=1}^m \text{Bin}(y_i|n_i,\pi(a_i, \alpha_1, \alpha_2) ) \]
And \(\alpha_2\) can be derived in the same way
Fitting data
To fit Log-logistic model, use
hierarchical_bayesian_model() and define
type = "log_logistic"
df <- rubella_uk_1986_1987
model <- hierarchical_bayesian_model(df, type="log_logistic")
#>
#> SAMPLING FOR MODEL 'log_logistic' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 5.2e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.52 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 5000 [ 0%] (Warmup)
#> Chain 1: Iteration: 500 / 5000 [ 10%] (Warmup)
#> Chain 1: Iteration: 1000 / 5000 [ 20%] (Warmup)
#> Chain 1: Iteration: 1500 / 5000 [ 30%] (Warmup)
#> Chain 1: Iteration: 1501 / 5000 [ 30%] (Sampling)
#> Chain 1: Iteration: 2000 / 5000 [ 40%] (Sampling)
#> Chain 1: Iteration: 2500 / 5000 [ 50%] (Sampling)
#> Chain 1: Iteration: 3000 / 5000 [ 60%] (Sampling)
#> Chain 1: Iteration: 3500 / 5000 [ 70%] (Sampling)
#> Chain 1: Iteration: 4000 / 5000 [ 80%] (Sampling)
#> Chain 1: Iteration: 4500 / 5000 [ 90%] (Sampling)
#> Chain 1: Iteration: 5000 / 5000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 4.335 seconds (Warm-up)
#> Chain 1: 2.113 seconds (Sampling)
#> Chain 1: 6.448 seconds (Total)
#> Chain 1:
#> Warning: There were 830 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 1.05, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
model$type
#> [1] "log_logistic"
plot(model)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.