Skip to contents

Parametric Bayesian framework

Currently, serosv only has models under parametric Bayesian framework

Proposed approach

Prevalence has a parametric form \(\pi(a_i, \alpha)\) where \(\alpha\) is a parameter vector

One can constraint the parameter space of the prior distribution \(P(\alpha)\) in order to achieve the desired monotonicity of the posterior distribution \(P(\pi_1, \pi_2, ..., \pi_m|y,n)\)

Where:

  • \(n = (n_1, n_2, ..., n_m)\) and \(n_i\) is the sample size at age \(a_i\)
  • \(y = (y_1, y_2, ..., y_m)\) and \(y_i\) is the number of infected individual from the \(n_i\) sampled subjects

Farrington

Refer to Chapter 10.3.1

Proposed model

The model for prevalence is as followed

\[ \pi (a) = 1 - exp\{ \frac{\alpha_1}{\alpha_2}ae^{-\alpha_2 a} + \frac{1}{\alpha_2}(\frac{\alpha_1}{\alpha_2} - \alpha_3)(e^{-\alpha_2 a} - 1) -\alpha_3 a \} \]

For likelihood model, independent binomial distribution are assumed for the number of infected individuals at age \(a_i\)

\[ y_i \sim Bin(n_i, \pi_i), \text{ for } i = 1,2,3,...m \]

The constraint on the parameter space can be incorporated by assuming truncated normal distribution for the components of \(\alpha\), \(\alpha = (\alpha_1, \alpha_2, \alpha_3)\) in \(\pi_i = \pi(a_i,\alpha)\)

\[ \alpha_j \sim \text{truncated } \mathcal{N}(\mu_j, \tau_j), \text{ } j = 1,2,3 \]

The joint posterior distribution for \(\alpha\) can be derived by combining the likelihood and prior as followed

\[ P(\alpha|y) \propto \prod^m_{i=1} \text{Bin}(y_i|n_i, \pi(a_i, \alpha)) \prod^3_{i=1}-\frac{1}{\tau_j}\text{exp}(\frac{1}{2\tau^2_j} (\alpha_j - \mu_j)^2) \]

  • Where the flat hyperprior distribution is defined as followed:

    • \(\mu_j \sim \mathcal{N}(0, 10000)\)

    • \(\tau^{-2}_j \sim \Gamma(100,100)\)

The full conditional distribution of \(\alpha_i\) is thus \[ P(\alpha_i|\alpha_j,\alpha_k, k, j \neq i) \propto -\frac{1}{\tau_i}\text{exp}(\frac{1}{2\tau^2_i} (\alpha_i - \mu_i)^2) \prod^m_{i=1} \text{Bin}(y_i|n_i, \pi(a_i, \alpha)) \]

Fitting data

To fit Farrington model, use hierarchical_bayesian_model() and define type = "far2" or type = "far3" where

  • type = "far2" refers to Farrington model with 2 parameters (\(\alpha_3 = 0\))

  • type = "far3" refers to Farrington model with 3 parameters (\(\alpha_3 > 0\))

df <- mumps_uk_1986_1987
model <- hierarchical_bayesian_model(age = df$age, pos = df$pos, tot = df$tot, type="far3")
#> 
#> SAMPLING FOR MODEL 'fra_3' NOW (CHAIN 1).
#> Chain 1: Rejecting initial value:
#> Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
#> Chain 1:   Stan can't start sampling from this initial value.
#> Chain 1: Rejecting initial value:
#> Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
#> Chain 1:   Stan can't start sampling from this initial value.
#> Chain 1: Rejecting initial value:
#> Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
#> Chain 1:   Stan can't start sampling from this initial value.
#> Chain 1: Rejecting initial value:
#> Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
#> Chain 1:   Stan can't start sampling from this initial value.
#> Chain 1: Rejecting initial value:
#> Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
#> Chain 1:   Stan can't start sampling from this initial value.
#> Chain 1: Rejecting initial value:
#> Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
#> Chain 1:   Stan can't start sampling from this initial value.
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0.000131 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.31 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 5000 [  0%]  (Warmup)
#> Chain 1: Iteration:  500 / 5000 [ 10%]  (Warmup)
#> Chain 1: Iteration: 1000 / 5000 [ 20%]  (Warmup)
#> Chain 1: Iteration: 1500 / 5000 [ 30%]  (Warmup)
#> Chain 1: Iteration: 1501 / 5000 [ 30%]  (Sampling)
#> Chain 1: Iteration: 2000 / 5000 [ 40%]  (Sampling)
#> Chain 1: Iteration: 2500 / 5000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 3000 / 5000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 3500 / 5000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 4000 / 5000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 4500 / 5000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 5000 / 5000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 16.968 seconds (Warm-up)
#> Chain 1:                214.855 seconds (Sampling)
#> Chain 1:                231.823 seconds (Total)
#> Chain 1:
#> Warning: There were 136 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: There were 772 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
#> https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
#> Warning: Examine the pairs() plot to diagnose sampling problems

model$info
#>                       mean      se_mean           sd          2.5%
#> alpha1        1.399786e-01 2.194875e-04 6.202144e-03  1.282681e-01
#> alpha2        1.989102e-01 3.054403e-04 8.229676e-03  1.842053e-01
#> alpha3        8.332895e-03 2.211557e-04 6.788916e-03  3.036523e-04
#> tau_alpha1    8.148141e+00 9.406602e-01 2.344040e+01  2.893148e-06
#> tau_alpha2    5.968980e+00 6.527110e-01 2.148066e+01  3.457587e-06
#> tau_alpha3    4.108423e+00 5.274325e-01 1.202170e+01  8.001462e-06
#> mu_alpha1    -6.639966e-01 1.589536e+00 3.995751e+01 -1.018009e+02
#> mu_alpha2    -2.306983e-01 1.968753e+00 3.552652e+01 -8.230304e+01
#> mu_alpha3     1.363740e+00 1.021062e+00 3.264235e+01 -6.984844e+01
#> sigma_alpha1  7.575109e+02 6.708429e+02 1.502736e+04  1.096794e-01
#> sigma_alpha2  6.952148e+01 1.159740e+01 3.170528e+02  1.287552e-01
#> sigma_alpha3  1.155990e+02 5.083666e+01 2.491292e+03  1.494615e-01
#> lp__         -2.532985e+03 3.597988e-01 4.419623e+00 -2.542501e+03
#>                        25%           50%           75%         97.5%     n_eff
#> alpha1        1.357323e-01  1.399721e-01  1.442216e-01  1.523069e-01  798.4798
#> alpha2        1.931849e-01  1.984240e-01  2.041394e-01  2.165614e-01  725.9601
#> alpha3        2.891279e-03  6.645977e-03  1.214402e-02  2.472649e-02  942.3338
#> tau_alpha1    1.289415e-03  7.498103e-02  2.854412e+00  8.313444e+01  620.9612
#> tau_alpha2    1.074517e-03  3.976950e-02  1.293098e+00  6.032163e+01 1083.0619
#> tau_alpha3    1.690312e-03  5.572060e-02  1.412039e+00  4.476567e+01  519.5152
#> mu_alpha1    -1.811954e+00  1.512750e-01  2.278709e+00  9.070072e+01  631.9115
#> mu_alpha2    -2.632213e+00  1.794434e-01  2.609694e+00  8.602647e+01  325.6290
#> mu_alpha3    -2.258013e+00  3.198256e-02  2.455614e+00  8.454328e+01 1022.0182
#> sigma_alpha1  5.918909e-01  3.651949e+00  2.784863e+01  5.879190e+02  501.7919
#> sigma_alpha2  8.793965e-01  5.014499e+00  3.050680e+01  5.377918e+02  747.3807
#> sigma_alpha3  8.415516e-01  4.236394e+00  2.432297e+01  3.535213e+02 2401.5691
#> lp__         -2.535889e+03 -2.532823e+03 -2.529846e+03 -2.525212e+03  150.8868
#>                   Rhat
#> alpha1       1.0006350
#> alpha2       0.9999682
#> alpha3       0.9997171
#> tau_alpha1   0.9997314
#> tau_alpha2   0.9999287
#> tau_alpha3   1.0008507
#> mu_alpha1    1.0009338
#> mu_alpha2    1.0061554
#> mu_alpha3    1.0002310
#> sigma_alpha1 1.0017315
#> sigma_alpha2 1.0062792
#> sigma_alpha3 0.9999964
#> lp__         0.9999222
plot(model)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.

Log-logistic

Proposed approach

The model for seroprevalence is as followed

\[ \pi(a) = \frac{\beta a^\alpha}{1 + \beta a^\alpha}, \text{ } \alpha, \beta > 0 \]

The likelihood is specified to be the same as Farrington model (\(y_i \sim Bin(n_i, \pi_i)\)) with

\[ \text{logit}(\pi(a)) = \alpha_2 + \alpha_1\log(a) \]

  • Where \(\alpha_2 = \text{log}(\beta)\)

The prior model of \(\alpha_1\) is specified as \(\alpha_1 \sim \text{truncated } \mathcal{N}(\mu_1, \tau_1)\) with flat hyperprior as in Farrington model

\(\beta\) is constrained to be positive by specifying \(\alpha_2 \sim \mathcal{N}(\mu_2, \tau_2)\)

The full conditional distribution of \(\alpha_1\) is thus

\[ P(\alpha_1|\alpha_2) \propto -\frac{1}{\tau_1} \text{exp} (\frac{1}{2 \tau_1^2} (\alpha_1 - \mu_1)^2) \prod_{i=1}^m \text{Bin}(y_i|n_i,\pi(a_i, \alpha_1, \alpha_2) ) \]

And \(\alpha_2\) can be derived in the same way

Fitting data

To fit Log-logistic model, use hierarchical_bayesian_model() and define type = "log_logistic"

df <- rubella_uk_1986_1987
model <- hierarchical_bayesian_model(age = df$age, pos = df$pos, tot = df$tot, type="log_logistic")
#> 
#> SAMPLING FOR MODEL 'log_logistic' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 6.1e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.61 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 5000 [  0%]  (Warmup)
#> Chain 1: Iteration:  500 / 5000 [ 10%]  (Warmup)
#> Chain 1: Iteration: 1000 / 5000 [ 20%]  (Warmup)
#> Chain 1: Iteration: 1500 / 5000 [ 30%]  (Warmup)
#> Chain 1: Iteration: 1501 / 5000 [ 30%]  (Sampling)
#> Chain 1: Iteration: 2000 / 5000 [ 40%]  (Sampling)
#> Chain 1: Iteration: 2500 / 5000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 3000 / 5000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 3500 / 5000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 4000 / 5000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 4500 / 5000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 5000 / 5000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 4.031 seconds (Warm-up)
#> Chain 1:                2.507 seconds (Sampling)
#> Chain 1:                6.538 seconds (Total)
#> Chain 1:
#> Warning: There were 951 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess

model$type
#> [1] "log_logistic"
plot(model)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.