Perform a Bayesian test for differences between two (sub)groups on multiple binary outcomes using multilevel multinomial logistic regression, as described in Kavelaars et al. (2023) .
Usage
bglmm(
data,
grp,
grp_a = NULL,
grp_b = NULL,
id_var,
x_var,
y_vars,
x_method = c("Empirical", "Analytical", "Value"),
x_def = c(-Inf, Inf),
test = c("right_sided", "left_sided"),
rule = c("All", "Any", "Comp"),
w = NULL,
n_burn = 10000,
n_it = 50000,
start = c(0.5, 1),
fixed = NULL,
random = NULL,
b_mu0 = NULL,
b_sigma0 = NULL,
g_mu0 = NULL,
g_sigma0 = NULL,
nu0 = NULL,
tau0 = NULL,
n_chain = 2,
return_thinned = TRUE,
n_thin = 10,
return_diagnostics = TRUE,
return_diagnostic_plots = FALSE,
return_samples = FALSE
)Arguments
- data
Data frame containing the data.
- grp
Character string. Name of the grouping variable.
- grp_a
Value of
grpindicating first group (will be determined from factor levels if NULL).- grp_b
Value of
grpindicating second group (will be determined from factor levels if NULL).- id_var
Character string. Name of cluster/ID variable.
- x_var
Character string. Name of covariate variable.
- y_vars
Character vector. Names of outcome variables (currently supports 2 outcomes).
- x_method
Character. Method for handling covariate. Default is "Empirical".
- x_def
Numeric. Defines subpopulation. Default is
c(-Inf, Inf).- test
Character. Direction of test: "left_sided" for P(A>B) or "right_sided" for P(B>A). Default is "right_sided".
- rule
Character. Decision rule: "All" (all outcomes favor hypothesis), "Any" (at least one outcome favors hypothesis), or "Comp" (weighted combination). Default is "All".
- w
Numeric vector. Weights for compensatory rule. Only used if rule = "Comp". If NULL and rule = "Comp", equal weights are used. Default is NULL.
- n_burn
Integer. Number of burn-in iterations. Default is 10000.
- n_it
Integer. Number of MCMC iterations. Default is 50000 (takes long running time!).
- start
Numeric vector. Starting values for chains. Default is
c(0.5, 1).- fixed
Character vector. Names of fixed effect variables. Default is c(x_var, grp_x_var).
- random
Character vector. Names of random effect variables. Default is c("Intercept", grp).
- b_mu0
Numeric vector. Prior means for fixed effects. Default is
rep(0, length(fixed)).- b_sigma0
Matrix. Prior covariance for fixed effects. Default is
diag(0.1, length(fixed)).- g_mu0
Numeric vector. Prior means for random effects. Default is
rep(0, length(random)).- g_sigma0
Matrix. Prior covariance for random effects. Default is
diag(0.1, length(random)).- nu0
Numeric. Prior degrees of freedom for inverse-Wishart. Default is
length(random).- tau0
Matrix. Prior scale matrix of dimension
length(random) x length(random)for inverse-Wishart. Default isdiag(1e-1, length(random)).- n_chain
Integer. Number of MCMC chains. Default is 2.
- return_thinned
Logical. Return thinned chains? Default is TRUE.
- n_thin
Integer. Thinning interval. Default is 10.
- return_diagnostics
Logical. Return MCMC diagnostics? Default is TRUE.
- return_diagnostic_plots
Logical. Should MCMC chains for diagnostic plots (traceplots, autocorrelation, density) be returned? Default is
FALSE. IfTRUE, diagnostics are returned by default.- return_samples
Logical. Return posterior samples? Default is FALSE.
Value
An object of class bglmm, a list containing:
- estimates
A list with posterior means and standard deviations of group probabilities (
mean_a,mean_b,sd_a,sd_b). If estimated, posterior means and standard deviations of fixed effects (b,b_sd) and random effects and variance components (g,g_sd,tau,tau_sd) are included.- sample_sizes
A list with group sample sizes (
n_a,n_b) and the number of clusters (J).- delta
A list with posterior mean differences (
mean_delta), posterior standard errors (se_delta), posterior probability of the hypothesis (pop), and, ifrule = "Comp", the weighted difference (w_delta).- info
A list with prior specifications, model structure (fixed and random effects), test settings, group labels, covariate handling method, and subpopulation definition.
- diags
If diagnostics are requested, a list with MCMC diagnostic results for fixed effects, random effects, and variance components.
- samples
If
return_samples = TRUE, a list containing posterior draws of group probabilities, differences, fixed effects, random effects, and variance components (if applicable).
References
Kavelaars X, Mulder J, Kaptein M (2023). “Bayesian multilevel multivariate logistic regression for superiority decision-making under observable treatment heterogeneity.” BMC Medical Research Methodology, 23(1). doi:10.1186/s12874-023-02034-z .
Examples
# \donttest{
# Example with simulated data
# Generate data
set.seed(123)
J <- 20 # No. clusters
nJ <- 15 # Sample size per cluster
# Generate random intercepts
uj_1 <- rnorm(J)
uj_2 <- rnorm(J)
data <- data.frame(
id = factor(rep(1:J, each = nJ)),
group = rep(rep(c("A", "B"), each = J/2), each = nJ),
x = rnorm(J * nJ),
stringsAsFactors = FALSE
)
p1 <- p2 <- rep(NA, J * nJ)
for (i in 1:(J * nJ)) {
j <- as.numeric(data$id[i])
grpB <- ifelse(data$group[i] == "B", 1, 0)
p1[i] <- plogis(-0.50 + 0.75 * grpB + 0.10 * data$x[i] + 0.20 * grpB * data$x[i] + uj_1[j])
p2[i] <- plogis(-0.50 + 0.80 * grpB + 0.05 * data$x[i] + 0.15 * grpB * data$x[i] + uj_2[j])
data$y1[i] <- rbinom(1, 1, p1[i])
data$y2[i] <- rbinom(1, 1, p2[i])
}
# Analyze
result <- bglmm(
data = data,
grp = "group",
grp_a = "A",
grp_b = "B",
id_var = "id",
x_var = "x",
y_vars = c("y1", "y2"),
x_method = "Empirical",
x_def = c(-Inf, Inf),
fixed = c("group", "x", "group_x"),
random = c("Intercept"), # Random intercept model
test = "right_sided",
rule = "All",
n_burn = 100, # Too low for proper MCMC sampling
n_it = 500 # Too low for proper MCMC sampling
)
#> Warning: 20 cluster(s) have observations from only one group. This may affect estimation.
#> Warning: Low effective sample size for some parameters. Results may be unreliable.
#> Warning: Low effective sample size for some parameters. Results may be unreliable.
#> Warning: Low effective sample size for some parameters. Results may be unreliable.
#> Warning: MCMC chains may not have converged (MPSRF > 1.1). Consider increasing n_burn or n_it.
print(result) # Warnings due to low number of MCMC iterations (n_burn and n_it)
#>
#> Bayesian Multilevel Multivariate Logistic Regression
#> ======================================================
#>
#> Group mean y1 mean y2
#> A 0.448 0.341
#> B 0.631 0.623
#> J = 20 clusters n(A) = 150 n(B) = 150
#>
#> Posterior probability P(B > A) [All rule]: 1.000
#> Marginalization: Empirical over [-Inf, Inf]
#> MPSRF: fixed = 1.1516 random = 1.1371 variance = 1.0614
#>
#> Use summary() for full coefficient tables, priors and MCMC diagnostics.
#>
# }