Meta-Analysis

A common paradigm is to treat studies as either complete garbage or completely flawless: include or exclude. Examples: education, elections.

Robert Grant.
Ciaran O Cathain: biomechanics, strain injuries.
David Dann: German, IT, industry.
Afroditi Kanellopoulou: Greek, biostats, Cochrane.

Two books:

Scoping review: https://doi.org/10.1111/jebm.12326
Scoping review: https://doi.org/10.11124/JBIES-23-00430

Book: Network Meta-Analysis for Decision Making, https://routledge.com/9781032451893
https://bayesian-ma.net

Always publish your code. Publish the source data if possible. Do a sensitivity analysis to test how the model depends on its assumptions.

Fixed effects. Common effects. Overall effect \(\theta\). Heterogeneity \(\tau^2\). Our model: \(\hat{y}_j \sim N(\theta + u_j, s_j)\) with \(u_j \sim N(0, \tau^2)\); treat the \(s_j\) as known values for simplicity. For m studies, this leads to m + 2 parameters: the \(u_j\), \(\theta\), and \(\tau^2\).

Allow flexibility in the priors. Rule out impossible values.

Updating an old meta-analysis: use the results of the previous meta-analysis as priors for \(\theta\) and \(\tau^2\). The standard error of \(\tau^2\) is often not reported -- rerun the meta-analysis. The distribution is sometimes asymmetric -- use a logarithmic transformation.

Caveats

If your prior for \(\theta\) is based on expert opinion, your posterior is just updated expert opinion. A meta-analysis has to be independent. Watch for different numbers in the text and the tables -- propagate that uncertainty into the model. Continue with the previous style of the team.

Software

From easy to learn but restrictive to hard to learn but flexible:

BUGS/JAGS: dnorm(mean, precision), where precision = 1 / s^2 = 1 / variance -- not the standard deviation!
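For contrast, a minimal Stan sketch of the same prior in both parameterizations (the JAGS line is shown as a comment):

parameters {
  real theta;
}

model {
  // Stan: normal(mu, sigma) takes the standard deviation
  theta ~ normal(0, 10);
  // BUGS/JAGS equivalent: theta ~ dnorm(0, 0.01), since precision = 1 / 10^2
}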

Linear Regression

Frequentist

model <- lm(y ~ x, data)

Stan

For linear regression, all the assumptions of the linear model are explicit in the code:

\(y_i \sim N(\beta_0 + \beta_1 x_i, \sigma)\)

data {
  int<lower=1> N;
  array[N] real x;
  array[N] real y;
}

parameters {
  real beta0;
  real beta1;
  real<lower=0> sigma;
}

model {
  array[N] real mu;
  for (i in 1:N) {
    mu[i] = beta0 + beta1 * x[i];
    y[i] ~ normal(mu[i], sigma);
  }
}

Hierarchical model

Frequentist: lmer, mixed-effects models

model <- lmer(formula = y ~ x + (1|random_effect1) + (1|random_effect2), data = df)

Stan

data {
  int<lower=1> N;
  array[N] real x;
  array[N] real y;
  int<lower=1> J1;                     // number of levels of random_effect1
  int<lower=1> J2;                     // number of levels of random_effect2
  array[N] int<lower=1, upper=J1> g1;  // level of random_effect1 per observation
  array[N] int<lower=1, upper=J2> g2;  // level of random_effect2 per observation
}

parameters {
  real beta0;
  real beta1;
  vector[J1] random_effect1;
  vector[J2] random_effect2;
  real<lower=0> sigma;                 // residual sd
  real<lower=0> sigma1;                // sd of random_effect1
  real<lower=0> sigma2;                // sd of random_effect2
}

model {
  random_effect1 ~ normal(0, sigma1);
  random_effect2 ~ normal(0, sigma2);
  for (i in 1:N) {
    y[i] ~ normal(beta0 + beta1 * x[i] + random_effect1[g1[i]] + random_effect2[g2[i]], sigma);
  }
}

Meta-Analysis

\(\theta\) -- overall effect ("global" effect, but this term may be misleading because of selection bias). \(\tau\) -- heterogeneity. Each study has a true effect \(\theta_j\); sampling gives the estimate \(\hat{\theta}_j\) with standard error \(SE_j\).

Shrinkage effect: if the overall \(\theta\) is in the middle, \(\hat{\theta}_j\) is more likely to be pulled toward the middle. Priors: \(\theta \sim\) prior, \(\tau \sim\) prior, \(u_j \sim\) prior.

\(\theta_j = \theta + u_j\), with \(u_j \sim N(0, \tau^2)\); \(\hat{\theta}_j \sim N(\theta_j, SE_j)\)
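A worked form of the shrinkage (a standard normal-normal result, added here for concreteness): conditional on \(\theta\) and \(\tau\), the posterior mean of \(\theta_j\) is a precision-weighted average of the study estimate and the overall effect:

\[ E[\theta_j \mid \hat{\theta}_j, \theta, \tau] = \frac{\hat{\theta}_j / SE_j^2 + \theta / \tau^2}{1 / SE_j^2 + 1 / \tau^2} \]

The smaller the study (the larger \(SE_j\)), the more weight moves to \(\theta\).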

How to get standard error

Doctors often dichotomize patients into responders and non-responders. You can approximate the standard error and use a different prior for the standard error.
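For example (a standard approximation, added for concreteness): if a study reports only the 2x2 responder table with cell counts a, b, c, d, the log odds ratio and its standard error are approximately

\[ \widehat{\log OR} = \log\frac{a\,d}{b\,c}, \qquad SE \approx \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} \]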

Publication bias:

Still garbage in, garbage out. Sometimes it is a good decision not to do a meta-analysis at all, even though meta-analysis sits at the top of the evidence pyramid -- and that can be quite frustrating. But most of the time it is fun to be a statistician who works with imperfect data. Slightly tricky is fun.

Megastudies.

\(u_j\) gets a prior. Some of the studies could need a different prior. That leads to subgroup analysis (\(s_j\)) and meta-regression: \(\theta_j = \theta + u_j + s_j + \beta x_j\). Meta-regression: covariate terms \(\beta x_j\).
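A minimal Stan sketch of meta-regression with one study-level covariate x (data names and priors are illustrative assumptions, in the style of the models below):

data {
  int m;
  array[m] real md;       // mean difference in study j
  array[m] real se_md;    // its standard error, treated as known
  array[m] real x;        // study-level covariate
}

parameters {
  real theta;
  real beta;              // meta-regression coefficient
  real<lower=0> tau;
  array[m] real u;
}

model {
  theta ~ normal(0, 2);
  beta ~ normal(0, 2);
  tau ~ student_t(1, 0, 1);
  u ~ normal(0, tau);
  for (j in 1:m) {
    md[j] ~ normal(theta + u[j] + beta * x[j], se_md[j]);
  }
}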

Different time points: if all studies report at 3, 9, and 12 months, that leads to \(\theta_{j,3}\), \(\theta_{j,9}\), \(\theta_{j,12}\). But commonly results are reported at different time points. You could then suppose there is a time-response curve, as in the sketch below.
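A minimal sketch, assuming a linear time-response (a real analysis might prefer a spline or another curve); observations carry a study index and a time point, and all names are hypothetical:

data {
  int m;                                 // number of studies
  int N;                                 // number of (study, time) observations
  array[N] int<lower=1, upper=m> study;
  array[N] real time;                    // time point in months
  array[N] real md;
  array[N] real se_md;
}

parameters {
  real theta;
  real beta_t;                           // slope of the time-response curve
  real<lower=0> tau;
  array[m] real u;
}

model {
  theta ~ normal(0, 2);
  beta_t ~ normal(0, 1);
  tau ~ student_t(1, 0, 1);
  u ~ normal(0, tau);
  for (i in 1:N) {
    md[i] ~ normal(theta + u[study[i]] + beta_t * time[i], se_md[i]);
  }
}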

My questions

Frequentist Meta-Analysis -- DerSimonian-Laird Method

library(meta)
derSimonianLaird <- metacont(
         n_treatment,
         mean_treatment,
         sd_treatment,
         n_control,
         mean_control,
         sd_control,
         data=df,
         comb.random=TRUE,
         comb.fixed = FALSE,
         method.tau='DL'
)
summary(derSimonianLaird)
forest(derSimonianLaird)

Stan for a Bayesian DerSimonian-Laird model

data {
  int m;
  array[m] real md;     // mean difference in study j (estimate of theta_j)
  array[m] real se_md;  // its standard error, treated as known
}

parameters {
  real<lower=0> tau;
  real theta;
  array[m] real u;
}

model {
  theta ~ normal(0,2);
  tau ~ student_t(1, 0, 1);
  u ~ normal(0, tau);
  for (j in 1:m) {
    //DerSimonian Laird method:
    md[j] ~ normal(theta + u[j], se_md[j]);
    //better: md[j] ~ student_t(df[j], theta + u[j], se_md[j]); with study degrees of freedom df[j] -- Sidik-Jonkman model
  }
}

generated quantities {
  array[m] real theta_u;
  for (j in 1:m) {
    theta_u[j] = theta + u[j];
  }
}

Bayesian Sidik-Jonkman model

\(md[j]\) is based on an estimate of \(se_{md}[j]\), not the real \(se_{md}\), so it is better to use a student_t rather than a normal distribution. It makes a difference only in studies with few participants, but it's better to use student_t by default.

data {
  int N;
  array[N] real mean_difference;
  array[N] real standard_error_of_mean_difference;
  array[N] int n_treatment;
  array[N] int n_control;
}

parameters {
  real theta;
  real<lower=0> tau;
  array[N] real u;
}

model {
  // same priors as in the model above
  theta ~ normal(0, 2);
  tau ~ student_t(1, 0, 1);
  u ~ normal(0, tau);
  for (j in 1:N) {
    // t likelihood; degrees of freedom come from the study sample sizes
    mean_difference[j] ~ student_t(n_treatment[j] + n_control[j] - 2,
                                   theta + u[j],
                                   standard_error_of_mean_difference[j]);
  }
}


Updating an old meta-analysis based on new data

data {
  int N;
  array[N] real mean_difference;
  array[N] real standard_error_of_mean_difference;
}
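A minimal sketch of the rest of the model, assuming the previous meta-analysis reported posterior summaries for \(\theta\) and \(\tau\); the numbers below are hypothetical placeholders, and the lognormal prior on \(\tau\) reflects the logarithmic transformation mentioned earlier:

parameters {
  real theta;
  real<lower=0> tau;
  array[N] real u;
}

model {
  // hypothetical posterior summaries from the previous meta-analysis
  theta ~ normal(0.3, 0.15);      // previous posterior for theta
  tau ~ lognormal(-1.2, 0.5);     // previous posterior for tau, on the log scale
  u ~ normal(0, tau);
  for (j in 1:N) {
    mean_difference[j] ~ normal(theta + u[j], standard_error_of_mean_difference[j]);
  }
}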

Report the correlation between \(\theta\) and \(\tau\). They should be uncorrelated.

posterior_draws <- fit$draws()  # cmdstanr: iterations x chains x variables
cor(
    c(posterior_draws[, 1, 2], posterior_draws[, 2, 2]),  # theta (variable no. 2) for chains 1 and 2
    c(posterior_draws[, 1, 3], posterior_draws[, 2, 3])   # tau (variable no. 3) for chains 1 and 2
)

Stan

https://mc-stan.org/docs/

Half distributions: declaring real<lower=0> truncates the distribution at zero, e.g. half-normal or half-Cauchy:

real<lower=0> theta;
theta ~ normal(0, 1);  // half-normal

real<lower=0> theta;
theta ~ cauchy(0, 1);  // half-Cauchy

Strong typing: int and real are distinct types (beware, int/int division truncates). For converting real to int, use the function to_int(). For converting int to real, multiply by 1.0.
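A minimal sketch of the conversions (assumes a recent Stan with to_int()):

transformed data {
  int n = 7;
  real x = n * 1.0;      // int -> real: multiply by 1.0
  int k = to_int(3.9);   // real -> int: truncates toward zero
}

model {
}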

Eliciting priors

Combined prior:

// mixture of the experts' Beta priors on the bias parameter
vector[n_experts] lp;
for (i in 1:n_experts) {
  lp[i] = beta_lpdf(bias | alpha[i], beta[i]);
}
// log of the summed densities; the constant 1/n_experts does not affect sampling
target += log_sum_exp(lp);

Different scales

Network Meta-Analysis

Contrast-based vs arm-based: https://pubmed.ncbi.nlm.nih.gov/31583750/

One treatment is an anchor, and we calculate contrasts to this anchor. A programming trick for the anchor is to fix it very close to zero:

d[1] ~ normal(0, 0.001); //quasi-degenerate, a programming trick

Then you define all other contrasts:

for (i in 2:k) {
  d[i] ~ normal(0, 10);
}
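Putting the pieces together, a minimal fixed-effect, contrast-based sketch for two-arm studies (data names are hypothetical assumptions):

data {
  int m;                              // number of two-arm studies
  int k;                              // number of treatments; 1 = anchor
  array[m] int<lower=1, upper=k> t1;  // baseline arm
  array[m] int<lower=1, upper=k> t2;  // comparator arm
  array[m] real y;                    // observed contrast, t2 vs t1
  array[m] real se;
}

parameters {
  array[k] real d;                    // treatment effects vs the anchor
}

model {
  d[1] ~ normal(0, 0.001);            // quasi-degenerate anchor
  for (i in 2:k) {
    d[i] ~ normal(0, 10);
  }
  for (j in 1:m) {
    y[j] ~ normal(d[t2[j]] - d[t1[j]], se[j]);
  }
}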

Unreported Statistics

Separate calculations for complete and for missing statistics. Join them by linking the effects in each block.

Troubleshooting

Divergences -- funnel problem: a variance parameter drops toward zero. Gamma and chi-square distributions are classical priors for standard deviations and variances, but the team should be on the same page about using an informative prior. A common remedy is sketched below.
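One standard remedy for the funnel (a common Stan technique, added here for reference) is the non-centered parameterization of the random effects, shown for the meta-analysis model above:

data {
  int m;
  array[m] real md;
  array[m] real se_md;
}

parameters {
  real theta;
  real<lower=0> tau;
  vector[m] u_raw;            // standard-normal innovations
}

transformed parameters {
  vector[m] u = tau * u_raw;  // implies u ~ normal(0, tau)
}

model {
  theta ~ normal(0, 2);
  tau ~ student_t(1, 0, 1);
  u_raw ~ std_normal();
  for (j in 1:m) {
    md[j] ~ normal(theta + u[j], se_md[j]);
  }
}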

Treedepth.

Priors as a computing aid. But think about the assumptions, and do a sensitivity analysis with different priors.

Rare events: Poisson likelihoods, as in the sketch below.
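A minimal sketch of a rare-event meta-analysis with a Poisson likelihood (event counts with a person-time offset; all names and priors are illustrative assumptions):

data {
  int m;
  array[m] int events;           // event count in study j
  array[m] real person_time;     // exposure in study j
}

parameters {
  real theta;                    // overall log event rate
  real<lower=0> tau;
  array[m] real u;
}

model {
  theta ~ normal(0, 2);
  tau ~ student_t(1, 0, 1);
  u ~ normal(0, tau);
  for (j in 1:m) {
    // log rate = log exposure (offset) + study-specific log rate
    events[j] ~ poisson_log(log(person_time[j]) + theta + u[j]);
  }
}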

Report