Note: All shown code can be copied by simply clicking the top right icon of the code block. The code blocks from the main presentation can also be copied by the same icon, which is visible when hovering over the respective block with your mouse.
For the course a RStudio server has been setup by the University of Göttingen: http://134.76.20.1:8787
To run exercises on your hardware please follow the instructions below.
Please first install the cmdstanr
R package following the instructions from their Getting started with CmdStanR article. This also covers installation of the necessary compiler and accompanying C++ toolchain as well as cmdstan
itself.
An alternative is to use rstan
as described on the wiki page for rstan
to get started. In this case, please use as backend for brms
the setting rstan
. Note that the use of the cmdstanr
backend is recommended over rstan
allowing to use more recent versions of Stan. Moreover, the instructions for installing a C++ toolchain is better described as part of the respective cmdstan guide chapter.
cmdstanr
can automatically install the RTools
based C++ toolchain viamacrtools
provides a comprehensive and simple installation of a full toolchain as used by CRAN https://mac.thecoatlessprofessor.com/macrtools/Please also ensure to have all R packages installed as listed on the next preliminary code section.
To download the full web-site to complement the material you may do so using the following R code snippet:
bamdd_zip <- tempfile(fileext=".zip")
download.file("https://github.com/Novartis/bamdd/archive/refs/heads/main.zip", bamdd_zip)
## extracts web-site into the users home
unzip(bamdd_zip, exdir=normalizePath("~"))
browseURL(normalizePath(file.path("~", "bamdd-main")))
# to install all dependencies needed to build the web-site, please run
source(file.path("~", "bamdd-main", "src", "install_dependencies.R"))
First load the required packages and set some options as described in the case study.
# uncomment below and set to the filename of you R file for the
# exercises
# here::i_am("your-exercise-file.R")
library(here)
library(ggplot2)
library(dplyr)
library(knitr)
library(brms)
library(posterior)
library(tidybayes)
library(bayesplot)
library(RBesT)
library(ggrepel)
library(patchwork)
library(ggdist)
options(
# how many processor cores would you like to use?
mc.cores = 4,
# how would you like to access Stan?
brms.backend = "cmdstanr",
# cache model binaries
cmdstanr_write_stan_file_dir=here::here("_brms-cache"),
# no need to normalize likelihoods
brms.normalize = FALSE
)
dir.create(here::here("_brms-cache"), FALSE)
set.seed(254335)
# in case rstan is used as backend, consider to enable model caching
# by enabling the following line
# rstan_options(auto_write = TRUE)
# Set defaults for ggplot2 ----
theme_set(theme_bw(base_size=18))
scale_colour_discrete <- function(...) {
# Alternative: ggsci::scale_color_nejm(...)
# scale_colour_brewer(..., palette="Set1")
ggthemes::scale_colour_colorblind(...)
}
scale_fill_discrete <- function(...) {
# Alternative: ggsci::scale_fill_nejm(...)
#scale_fill_brewer(..., palette="Set1")
ggthemes::scale_fill_colorblind(...)
}
scale_colour_continuous <- function(...) {
scale_colour_viridis_c(..., option="turbo")
}
update_geom_defaults("point", list(size=2))
update_geom_defaults("line", list(size=1.5))
Create the region dataset used in the example:
withr::with_seed(654873,
AS_region <- bind_cols(RBesT::AS, region=sample(c("asia", "europe", "north_america"), 8, TRUE)))
kable(AS_region)
study | n | r | region |
---|---|---|---|
Study 1 | 107 | 23 | europe |
Study 2 | 44 | 12 | asia |
Study 3 | 51 | 19 | europe |
Study 4 | 39 | 9 | europe |
Study 5 | 139 | 39 | asia |
Study 6 | 20 | 6 | north_america |
Study 7 | 78 | 9 | europe |
Study 8 | 35 | 10 | asia |
And fit the model as shown in the case study:
model <- bf(r | trials(n) ~ 1 + (1 | study), family=binomial)
model_prior <- prior(normal(0, 2), class=Intercept) +
prior(normal(0, 1), class=sd, coef=Intercept, group=study)
map_mc_brms <- brm(model, AS_region, prior = model_prior,
seed = 4767, control=list(adapt_delta=0.99),
silent = 2, refresh = 0)
map_mc_brms
Family: binomial
Links: mu = logit
Formula: r | trials(n) ~ 1 + (1 | study)
Data: AS_region (Number of observations: 8)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Multilevel Hyperparameters:
~study (Number of levels: 8)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.37 0.20 0.04 0.83 1.01 938 1082
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -1.10 0.18 -1.45 -0.74 1.00 1416 1550
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Create a posterior predictive check based on the predictive distribution for the response rate.
Steps:
posterior_predict
to create samples from the predictive distribution of outcomes per trial.sweep(predictive, 2, AS_region$n, "/")
to convert these samples from the predictive distribution of the outcome counts to samples from the predictive distribution for responder rates.ppc_intervals
from bayesplot
to create a plot showing your results. Consider using the with
function to refer more easily to the data columns of the analysis data set.tidybayes
package the add_predicted_rvars
function adding the predictions directly to the analysis data set and then redo task 2 and 3. To convert a rvars
column to the respective draws format of bayesplot
you can use the as_draws_matrix
from the posterior
package.Redo the analysis with region, but treat region as a fixed effect. Evaluate the informativeness of the obtained MAP priors. The model formula for brms
should look like
Steps:
Consider the prior for the region fixed effect first. The reference region is included in the intercept. The reference region is implicitly defined by the first level of the variable region when defined as factor
.
asia
to be the reference region in the example. Also include a level other
in the set of levels.Obtain the MAP prior for each region by using the AS_region_all
data frame defined below and apply posterior_linpred
as shown in the case study.
Convert the MCMC samples from the MAP prior distribution into mixture distributions with the same code as in the case study.
Calculate the effective sample size (ESS) for each prior distribution with the ess
function from the RBesT
R package.
Run the analysis for the normal endpoint in the crohn
data set of RBesT
. Refer to the RBesT
vignette for a normal endpoint on more details and context.
Steps:
family=gaussian
and use the se
response modifier in place of trials
to specify a known standard error. More details on the additional response information specifiers can be found for the documentation of brmsformula.RBesT
.RBesT
and brms
.pp_count_AS_region <- posterior_predict(map_mc_brms)
pp_rate_AS_region <- sweep(pp_count_AS_region, 2, AS_region$n, "/")
with(AS_region, ppc_intervals(r / n, pp_rate_AS_region)) +
scale_x_continuous("Study", breaks=1:nrow(AS_region), labels=AS_region$study) +
ylab("Responder Rate") +
coord_flip() +
theme(legend.position="right",
# suppress vertical grid lines for better readability of intervals
panel.grid.major.y = element_blank())
pp_AS_region <- AS_region |>
add_predicted_rvars(map_mc_brms, value="pp_r") |>
mutate(pp_rate=pp_r/n)
with(pp_AS_region, ppc_intervals(r / n, as_draws_matrix(pp_rate))) +
scale_x_continuous("Study", breaks=1:nrow(AS_region), labels=AS_region$study) +
ylab("Responder Rate") +
coord_flip() +
theme(legend.position="right",
# suppress vertical grid lines for better readability of intervals
panel.grid.major.y = element_blank())
AS_region_all <- data.frame(region=c("asia", "europe", "north_america", "other")) |>
mutate(study=paste("new_study", region, sep="_"), r=0, n=6)
## to get brms to include the other factor level in the model, we have
## to add a fake row with region "other" and n=0
AS_region_2 <- mutate(bind_rows(AS_region, mutate(AS_region_all, n=0)[4,]),
region=factor(region, levels=c("asia", "europe", "north_america", "other")))
str(AS_region_2)
'data.frame': 9 obs. of 4 variables:
$ study : chr "Study 1" "Study 2" "Study 3" "Study 4" ...
$ n : num 107 44 51 39 139 20 78 35 0
$ r : num 23 12 19 9 39 6 9 10 0
$ region: Factor w/ 4 levels "asia","europe",..: 2 1 2 2 1 3 2 1 4
model_fixed <- bf(r | trials(n) ~ 1 + region + (1|study), family=binomial)
get_prior(model_fixed, AS_region_2)
prior class coef group resp dpar nlpar lb ub
(flat) b
(flat) b regioneurope
(flat) b regionnorth_america
(flat) b regionother
student_t(3, 0, 2.5) Intercept
student_t(3, 0, 2.5) sd 0
student_t(3, 0, 2.5) sd study 0
student_t(3, 0, 2.5) sd Intercept study 0
source
default
(vectorized)
(vectorized)
(vectorized)
default
default
(vectorized)
(vectorized)
model_fixed_prior <- prior(normal(0, 2), class=Intercept) +
prior(normal(0, 1), class=sd, coef=Intercept, group=study) +
prior(normal(0, log(2)/1.96), class=b)
fixed_mc_brms <- brm(model_fixed, AS_region_2, prior=model_fixed_prior,
seed=4767,
silent = 2, refresh = 0, control=list(adapt_delta=0.99))
fixed_mc_brms
Family: binomial
Links: mu = logit
Formula: r | trials(n) ~ 1 + region + (1 | study)
Data: AS_region_2 (Number of observations: 9)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Multilevel Hyperparameters:
~study (Number of levels: 9)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.38 0.22 0.04 0.90 1.00 1159 1208
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -1.02 0.24 -1.51 -0.56 1.00 1957 1971
regioneurope -0.17 0.26 -0.67 0.35 1.00 2020 2371
regionnorth_america 0.04 0.31 -0.56 0.64 1.00 3893 2962
regionother -0.00 0.36 -0.69 0.70 1.00 3953 2920
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
post_regions <- posterior_linpred(fixed_mc_brms,
newdata=AS_region_all,
transform=TRUE,
allow_new_levels=TRUE,
sample_new_levels="gaussian")
head(post_regions)
[,1] [,2] [,3] [,4]
[1,] 0.30404914 0.1931906 0.3161137 0.2367646
[2,] 0.07039955 0.2106194 0.1659046 0.3036233
[3,] 0.21368669 0.1590512 0.3322744 0.2683783
[4,] 0.37494614 0.2432188 0.3392217 0.2402818
[5,] 0.20217845 0.2834231 0.1709087 0.3674437
[6,] 0.26282384 0.3563941 0.2888155 0.2961940
colnames(post_regions) <- AS_region_all$region
map_region <- list()
for(r in AS_region_all$region) {
map_region[[r]] <- mixfit(post_regions[,r], type="beta", Nc=3, constrain_gt1=TRUE)
}
kable(bind_rows(lapply(map_region, summary), .id="MAP"), digits=3)
MAP | mean | sd | 2.5% | 50.0% | 97.5% |
---|---|---|---|---|---|
asia | 0.278 | 0.096 | 0.111 | 0.269 | 0.522 |
europe | 0.244 | 0.090 | 0.105 | 0.232 | 0.486 |
north_america | 0.286 | 0.108 | 0.109 | 0.276 | 0.549 |
other | 0.282 | 0.116 | 0.096 | 0.268 | 0.561 |
asia europe north_america other
28.94846 34.15330 19.51994 15.76238
crohn <- RBesT::crohn
crohn_sigma <- 88
crohn$y.se <- crohn_sigma/sqrt(crohn$n)
library(RBesT)
set.seed(1234)
rbest_normal_map_mcmc <- gMAP(cbind(y, y.se) ~ 1 | study,
weights=n, data=crohn,
family=gaussian,
beta.prior=cbind(0, crohn_sigma),
tau.dist="HalfNormal",tau.prior=cbind(0,crohn_sigma/2))
model_normal <- bf(y | se(y.se) ~ 1 + (1 | study), family=gaussian())
prior_normal <- prior(normal(0, 88), class=Intercept) +
prior(normal(0, 88/2), class=sd, coef=Intercept, group=study)
brms_normal_map_mcmc <- brm(model_normal, crohn, prior=prior_normal,
seed=4767,
silent = 2, refresh = 0, control=list(adapt_delta=0.99))
## comparing the outputs we see that the random effect posterior
## matches...
rbest_normal_map_mcmc
Generalized Meta Analytic Predictive Prior Analysis
Call: gMAP(formula = cbind(y, y.se) ~ 1 | study, family = gaussian,
data = crohn, weights = n, tau.dist = "HalfNormal", tau.prior = cbind(0,
crohn_sigma/2), beta.prior = cbind(0, crohn_sigma))
Exchangeability tau strata: 1
Prediction tau stratum : 1
Maximal Rhat : 1
Estimated reference scale : 88
Between-trial heterogeneity of tau prediction stratum
mean sd 2.5% 50% 97.5%
14.50 9.74 1.33 12.60 39.60
MAP Prior MCMC sample
mean sd 2.5% 50% 97.5%
-49.7 19.7 -91.4 -48.3 -10.7
Family: gaussian
Links: mu = identity; sigma = identity
Formula: y | se(y.se) ~ 1 + (1 | study)
Data: crohn (Number of observations: 6)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Multilevel Hyperparameters:
~study (Number of levels: 6)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 14.50 9.80 1.57 39.37 1.00 928 1522
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -49.80 8.77 -68.88 -33.19 1.00 977 971
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.00 0.00 0.00 0.00 NA NA NA
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
brms_normal_map <- posterior_epred(brms_normal_map_mcmc,
newdata=data.frame(study="new", n=1, y=0, y.se=88),
allow_new_levels=TRUE,
sample_new_levels="gaussian")
## ... and the MAP prior is also the same
summarise_draws(brms_normal_map, mean, sd, ~quantile2(., probs = c(0.025, 0.5, 0.975)))
# A tibble: 1 × 6
variable mean sd q2.5 q50 q97.5
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ...1 -49.4 18.7 -88.7 -48.3 -10.1
This exercises accompanies the dose finding case study.
In a hypothetical phase 2b trial children and adolescents with peanut allergy were randomly assigned to placebo or 5 doses (5 to 300 mg) of a new drug.
After 26 weeks the patients underwent a double-blind placebo-controlled food challenge and the primary endpoint was the proportion of patients that could ingest 600 mg or more of peanut protein without experiencing dose-limiting symptoms.
The plots below show an overview of the data. Note that no placebo patients were considered “responders” per the primary endpoint definition.
peanut <- tibble(
TRT01P = structure(c(
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
),levels = c("PBO", "5 mg", "15 mg", "50 mg", "150 mg", "300 mg"), class = "factor"),
dose = c(
300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L,
300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L,
300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 300L, 150L, 150L, 150L, 150L, 150L, 150L, 150L,
150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L,
150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L,
150L, 150L, 150L, 150L, 150L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L,
50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L,
50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L,
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L,
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L),
AVAL = c(
1000L, 100L, 0L, 600L, 600L, 300L, 300L, 10L, 10L, 100L, 1000L, 300L, 10L, 1000L, 1000L, 1000L, 3L, 3L, 600L, 300L,
1000L, 1000L, 100L, 1000L, 600L, 600L, 1000L, 1000L, 1000L, 300L, 600L, 1000L, 1000L, 1000L, 1000L, 1000L, 300L,
1000L, 600L, 1000L, 300L, 1000L, 600L, 600L, 1000L, 1000L, 600L, 1000L, 1000L, 1000L, 1000L, 600L, 1000L, 1000L,
1000L, 0L, 600L, 300L, 1000L, 1000L, 1000L, 100L, 1000L, 1000L, 600L, 1000L, 1000L, 1000L, 600L, 300L, 600L, 600L,
100L, 600L, 1000L, 300L, 10L, 3L, 1000L, 300L, 300L, 1000L, 300L, 300L, 10L, 1000L, 1000L, 10L, 10L, 600L, 3L, 10L,
600L, 600L, 600L, 1000L, 100L, 1000L, 1000L, 10L, 1000L, 3L, 10L, 1000L, 100L, 1000L, 100L, 10L, 300L, 10L, 100L,
1000L, 0L, 1000L, 100L, 3L, 10L, 100L, 100L, 300L, 1000L, 1000L, 1000L, 10L, 1000L, 3L, 3L, 600L, 600L, 10L, 3L,
600L, 600L, 300L, 300L, 1000L, 3L, 3L, 1000L, 10L, 1000L, 1000L, 600L, 100L, 300L, 600L, 10L, 100L, 0L, 100L, 3L,
0L, 10L, 3L, 600L, 300L, 300L, 300L, 600L, 300L, 100L, 3L, 0L, 10L, 600L, 300L, 10L, 300L, 600L, 1000L, 600L, 600L,
0L, 0L, 600L, 600L, 600L, 0L, 0L, 300L, 100L, 0L, 10L, 300L, 1000L, 300L, 600L, 600L, 300L, 10L, 600L, 100L, 100L,
300L, 3L, 3L, 300L, 1000L, 10L, 3L, 100L, 3L, 100L, 100L, 300L, 3L, 3L, 600L, 300L, 3L, 3L, 3L, 300L, 3L, 0L, 10L,
3L, 300L, 10L, 10L, 600L, 0L, 300L, 600L, 0L, 0L, 100L, 100L, 10L, 100L, 10L, 100L, 600L, 0L, 600L, 0L, 10L, 100L,
0L, 0L, 0L, 0L, 3L, 10L, 3L, 300L, 600L, 0L, 0L, 300L, 10L, 10L, 100L, 300L, 0L, 0L, 0L, 3L, 0L, 0L, 10L, 0L, 300L,
3L, 0L, 0L, 0L, 0L, 100L, 3L, 0L, 3L, 10L, 10L, 3L, 0L, 3L, 10L, 3L, 0L, 0L, 3L, 100L, 0L, 0L, 300L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 3L, 0L, 3L, 0L, 0L, 0L, 0L),
PARAM = structure(
c(
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
levels = "Tolerated amount of peanut protein (mg)", class = "factor"),
USUBJID = c(
129L, 103L, 104L, 185L, 34L, 15L, 140L, 151L, 7L, 144L, 200L, 23L, 138L, 115L, 255L, 147L, 224L, 101L, 156L, 284L,
136L, 248L, 179L, 298L, 168L, 295L, 289L, 241L, 36L, 27L, 123L, 266L, 11L, 291L, 236L, 130L, 173L, 195L, 203L, 160L,
274L, 167L, 290L, 94L, 9L, 269L, 122L, 135L, 64L, 26L, 95L, 10L, 5L, 234L, 161L, 299L, 88L, 69L, 35L, 233L, 286L,
85L, 91L, 189L, 80L, 152L, 223L, 287L, 244L, 57L, 108L, 18L, 62L, 157L, 300L, 283L, 164L, 243L, 89L, 220L, 24L, 271L,
166L, 118L, 201L, 127L, 121L, 41L, 267L, 213L, 49L, 73L, 202L, 134L, 112L, 25L, 227L, 29L, 251L, 273L, 119L, 132L,
74L, 270L, 83L, 37L, 181L, 258L, 253L, 48L, 120L, 54L, 277L, 176L, 65L, 264L, 107L, 171L, 262L, 162L, 187L, 272L,
288L, 294L, 245L, 109L, 172L, 204L, 275L, 22L, 66L, 186L, 247L, 17L, 149L, 141L, 177L, 280L, 216L, 40L, 75L, 263L,
246L, 14L, 81L, 260L, 153L, 45L, 237L, 8L, 117L, 86L, 296L, 146L, 154L, 116L, 38L, 100L, 191L, 175L, 92L, 158L, 192L,
180L, 256L, 254L, 125L, 222L, 145L, 261L, 155L, 159L, 3L, 206L, 278L, 19L, 31L, 63L, 208L, 55L, 259L, 218L, 111L,
226L, 33L, 2L, 44L, 297L, 53L, 87L, 16L, 28L, 90L, 207L, 56L, 137L, 128L, 178L, 142L, 143L, 148L, 193L, 229L, 265L,
97L, 252L, 205L, 150L, 165L, 188L, 52L, 99L, 93L, 1L, 221L, 124L, 210L, 6L, 232L, 21L, 211L, 163L, 96L, 60L, 183L,
190L, 242L, 42L, 46L, 67L, 126L, 209L, 72L, 194L, 238L, 184L, 39L, 105L, 249L, 61L, 113L, 30L, 77L, 12L, 4L, 51L,
139L, 20L, 268L, 215L, 292L, 217L, 199L, 32L, 276L, 47L, 225L, 230L, 79L, 71L, 98L, 50L, 13L, 76L, 231L, 250L, 58L,
68L, 239L, 198L, 293L, 212L, 110L, 59L, 182L, 133L, 170L, 43L, 282L, 281L, 131L, 114L, 196L, 214L, 285L, 70L, 102L,
279L, 106L, 197L, 257L, 228L, 84L, 235L, 78L, 169L, 240L, 219L, 174L, 82L),
CRIT1 = structure(c(
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
levels = "Tolerating >=600 mg of peanut protein without dose-limiting symptoms", class = "factor"),
CRIT1FL = structure(c(
2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L,
1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
levels = c("N", "Y"), class = "factor"),
CRIT2 = structure(c(
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
levels = "Tolerating >=1000 mg of peanut protein without dose-limiting symptoms", class = "factor"),
CRIT2FL = structure(c(
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L,
2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
levels = c("N", "Y"), class = "factor" ) )
peanut |>
ggplot(aes(x=factor(AVAL), fill=TRT01P)) +
geom_bar( position=position_dodge(preserve = "single")) +
scale_fill_manual(values=c("royalblue", "#fb6a4a", "#ef3b2c", "#cb181d", "#a50f15", "#67000d")) +
xlab("Peanut protein tolerated without dose-limiting symptoms (mg)") +
ylab("Patients") +
theme(legend.position="right")
peanut |>
group_by(TRT01P) |>
summarize(proportion = sum(CRIT1FL=="Y") / n()) |>
ggplot(aes(x=TRT01P, y=proportion, fill=TRT01P)) +
geom_bar(stat="identity", position=position_dodge()) +
geom_text(aes(label=scales::percent(proportion)), size=7, vjust=c(0, rep(1.5, 5))) +
scale_fill_manual(values=c("royalblue", "#fb6a4a", "#ef3b2c", "#cb181d", "#a50f15", "#67000d")) +
scale_y_continuous(breaks=seq(0, 0.7, 0.1), labels=scales::percent) +
xlab("Tolerating >=600 mg of peanut protein without dose-limiting symptoms") +
ylab("Percentage of patients")
Fit a sigmoid Emax logistic regression model (i.e. family=binomial(link="logit")
). To accelerate likelihood evaluations, first summarize data as “responders” y
out of n
patients per dose. We tell brms
to use this information using y | trials(n) ~ ...
.
We expect the true placebo proportion to be around 0.05 (-2.944 on the logit scale) with much more than 0.1 or much less than 0.02 considered unlikely. It is a-priori at least possible that there is a huge treatment effect such as 95% versus 5% responders (difference on the log-scale close to 6), but we are at least mildly skeptical.
The dose response is a-priori somewhat likely to follow a Emax curve (with Hill parameter near 1), but we wish to allow for the possibility of a steeper or shallower curve. The dose with half the effect (ED50) might be near 15 mg, but have considerable uncertainty around that.
model1 <- bf( #---- YOUR CODE HERE --- # ~ E0 + Emax * dose^h/(dose^h + ED50^h),
family= #---- YOUR CODE HERE --- # ,
nlf(h ~ exp(logh)), nlf(ED50 ~ exp(logED50)),
E0 ~ 1, logED50 ~ 1, logh ~ 1, Emax ~ 1,
nl=TRUE)
priors1 <- prior(normal(-2.944, 0.38), nlpar=E0) +
prior(normal(0, 1), nlpar=logh) +
prior(normal(0, 6), nlpar=Emax) +
prior(normal(2.7, 2.5), nlpar=logED50)
Now fit the model
Now use
to obtain predicted proportions for every dose from 0 to 300 mg.
Then, plot the curve of predicted proportions for each of these doses using ggplot2
and ggdist
(using ydist=.epred
in the aesthetics and the stat_lineribbon()
geom from the ggdist
package).
If you prefer, replace the default fill colors e.g. with shades of blue using + scale_fill_brewer(palette="Blues")
Next obtain the posterior distribution for the difference in proportions for each of these dose levels compared with placebo.
Try to perform a more efficient statistical analysis by either using an ordinal outcome or treating the data as interval censored.
Try treating the data about the logarithm of the amount of protein tolerated as interval censored (-\infty to < \log(3), >= \log(3) to < \log(10), …) using logAVAL | cens("interval", logAVALnext)
as endpoint and with family = gaussian()
. For the latter approach you might have to set log(0)
to a very low number such as -28 (approximate \log(\text{weight}) of one single peanut protein molecule) and assume the upper interval end when a patient could tolerate 1000 mg to, say, \log(30000) (approximately 100 peanut kernels). This achieves approximately the same thing as setting these to -\infty and \infty, but is handled better by brms
/Stan
.
How would you change the prior distributions in this case?
Now use an ordinal outcome via family = cumulative(link = "logit", threshold = "flexible")
. Details of this model are described in a journal article, for which there is also a PsyArXiv preprint in case you do not have access to the journal.
Note that the placebo response is described by a series of K-1 ordered thresholds \theta_1, \ldots, \theta_{K-1} on the logit scale for K ordered categories. For the placebo group, the probability of being in categories k=1, \ldots, K-1 is be given by \text{logit}^{-1} \theta_k and for category K by 1-\text{logit}^{-1} \theta_{K-1}. In this case, these parameters will appear in summary(brmfit3)
as Intercept[1]
, …, Intercept[6]
.
How would you change the model and the prior distributions in this case?
Note, if we had a baseline amount of protein tolerated, we could treat this as a monotonic covariate e.g. using mo(BASE)
or assume a particular functional form.
model1 <- bf( y | trials(n) ~ E0 + Emax * dose^h/(dose^h + ED50^h),
family=binomial(link="logit"),
nlf(h ~ exp(logh)), nlf(ED50 ~ exp(logED50)),
E0 ~ 1, logED50 ~ 1, logh ~ 1, Emax ~ 1,
nl=TRUE)
priors1 <- prior(normal(-2.944, 0.38), nlpar=E0) +
prior(normal(0, 1), nlpar=logh) +
prior(normal(0, 6), nlpar=Emax) +
prior(normal(2.7, 2.5), nlpar=logED50)
brmfit1 <- brm(
formula = model1,
prior = priors1,
data = summarize(peanut, y=sum(CRIT1FL=="Y"), n=n(), .by=c(TRT01P, dose)),
refresh=0
)
Running MCMC with 4 parallel chains...
Chain 1 finished in 0.2 seconds.
Chain 2 finished in 0.2 seconds.
Chain 3 finished in 0.2 seconds.
Chain 4 finished in 0.2 seconds.
All 4 chains finished successfully.
Mean chain execution time: 0.2 seconds.
Total execution time: 0.5 seconds.
Family: binomial
Links: mu = logit
Formula: y | trials(n) ~ E0 + Emax * dose^h/(dose^h + ED50^h)
h ~ exp(logh)
ED50 ~ exp(logED50)
E0 ~ 1
logED50 ~ 1
logh ~ 1
Emax ~ 1
Data: summarize(peanut, y = sum(CRIT1FL == "Y"), n = n() (Number of observations: 6)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
E0_Intercept -3.22 0.32 -3.87 -2.59 1.00 1810 2055
logED50_Intercept 3.69 1.23 1.99 6.56 1.00 973 1049
logh_Intercept -0.62 0.38 -1.31 0.16 1.00 994 1295
Emax_Intercept 5.56 1.63 3.52 9.65 1.00 885 1004
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
tibble(dose = seq(0, 300, 1), n=1) |>
tidybayes::add_epred_rvars(brmfit1) |>
ggplot(aes(x=dose, ydist=.epred)) +
stat_lineribbon() +
scale_fill_brewer(palette="Blues") +
ylab("Model predicted proportion") +
xlab("Dose [mg]")
pr <- tibble(dose = seq(0, 300, 1), n=1) |>
tidybayes::add_epred_rvars(brmfit1)
left_join(pr, filter(pr, dose == 0)|>
dplyr::select(-dose) |>
rename(.pbo=.epred),
by="n") |>
mutate(diff = .epred - .pbo) |>
ggplot(aes(x=dose, ydist=diff)) +
stat_lineribbon() +
scale_fill_brewer(palette="Blues") +
ylab("Model predicted difference\nin proportion vs. placebo") +
xlab("Dose [mg]")
model2 <- bf( logAVAL | cens("interval", logAVALnext) ~ E0 + Emax * dose^h/(dose^h + ED50^h),
family = gaussian(),
nlf(h ~ exp(logh)), nlf(ED50 ~ exp(logED50)),
E0 ~ 1, logED50 ~ 1, logh ~ 1, Emax ~ 1,
nl=TRUE )
priors2 <- prior(normal(0, 5), nlpar=E0) + # centered on 1 mg, but
# very uncertain
prior(normal(0, 1), nlpar=logh) + # No particular reason
# to change prior (dose
# response could be
# just as steep here as
# for binomial outcome)
prior(normal(0, 6.5), nlpar=Emax) + # decent prior probability
# for >= 1000 mg, still conceivable could reach 300000 No
# particular reason to change prior here (ED50 could be same as
# for binomial outcome)
prior(normal(2.7, 2.5), nlpar=logED50) + # Additional parameter: residual SD, clearly a decent amount of
# variability, want to
# allow large SD on
# the log-scale
prior(lognormal(5, 10), class=sigma)
brmfit2 <- brm(
formula = model2,
prior = priors2,
data = peanut |>
mutate(logAVAL = ifelse(AVAL==0, -28, log(AVAL)),
logAVALnext = case_when(AVAL==0 ~ log(3),
AVAL==3 ~ log(10),
AVAL==10 ~ log(100),
AVAL==100 ~ log(300),
AVAL==300 ~ log(600),
AVAL==600 ~ log(1000),
TRUE ~ log(30000))),
refresh=0
)
Running MCMC with 4 parallel chains...
Chain 4 finished in 4.6 seconds.
Chain 3 finished in 4.6 seconds.
Chain 2 finished in 4.8 seconds.
Chain 1 finished in 5.1 seconds.
All 4 chains finished successfully.
Mean chain execution time: 4.8 seconds.
Total execution time: 5.2 seconds.
Family: gaussian
Links: mu = identity; sigma = identity
Formula: logAVAL | cens("interval", logAVALnext) ~ E0 + Emax * dose^h/(dose^h + ED50^h)
h ~ exp(logh)
ED50 ~ exp(logED50)
E0 ~ 1
logED50 ~ 1
logh ~ 1
Emax ~ 1
Data: mutate(peanut, logAVAL = ifelse(AVAL == 0, -28, lo (Number of observations: 300)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
E0_Intercept 0.90 0.39 0.12 1.66 1.00 2378 2397
logED50_Intercept 3.13 1.04 1.74 5.63 1.00 1002 879
logh_Intercept -0.64 0.33 -1.24 0.04 1.00 1027 1405
Emax_Intercept 7.66 1.80 5.33 11.97 1.00 942 795
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 2.42 0.12 2.20 2.66 1.00 2488 2235
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# Note that we omit E0 from the model, because the expected placebo outcome
# is already described by the intercept parameters of the distribution
model3 <- bf(AVAL ~ Emax * dose^h/(dose^h + ED50^h),
family = cumulative(link = "logit", threshold = "flexible"),
nlf(h ~ exp(logh)),
nlf(ED50 ~ exp(logED50)),
logED50 ~ 1,
logh ~ 1,
Emax ~ 1,
nl=TRUE)
priors3 <- prior(normal(0, 1), nlpar=logh) + # No particular reason to
# change prior here
prior(normal(0, 6), nlpar=Emax) + # Prior could be
# different (odds from
# one category to the
# next not necessarily
# the same as for top two
# vs. rest)
prior(normal(2.7, 2.5), nlpar=logED50) + # No particular
# reason to change
# prior here
prior(student_t(3, 0, 2.5), class=Intercept) +
prior(student_t(3, -0.2, 1), class=Intercept, coef=1) +
prior(student_t(3, 2.197, 1), class=Intercept, coef=5)
brmfit3 <- brm(
formula = model3,
prior = priors3,
data = mutate(peanut, AVAL = ordered(AVAL)),
refresh=0
)
Running MCMC with 4 parallel chains...
Chain 2 finished in 13.6 seconds.
Chain 1 finished in 16.0 seconds.
Chain 3 finished in 16.0 seconds.
Chain 4 finished in 17.6 seconds.
All 4 chains finished successfully.
Mean chain execution time: 15.8 seconds.
Total execution time: 17.9 seconds.
Family: cumulative
Links: mu = logit; disc = identity
Formula: AVAL ~ Emax * dose^h/(dose^h + ED50^h)
h ~ exp(logh)
ED50 ~ exp(logED50)
logED50 ~ 1
logh ~ 1
Emax ~ 1
Data: mutate(peanut, AVAL = ordered(AVAL)) (Number of observations: 300)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] 0.07 0.24 -0.41 0.55 1.00 3580 2784
Intercept[2] 1.03 0.26 0.54 1.55 1.00 4218 2715
Intercept[3] 1.75 0.28 1.23 2.31 1.00 3980 2849
Intercept[4] 2.30 0.29 1.75 2.87 1.00 4128 2761
Intercept[5] 3.05 0.30 2.48 3.66 1.00 4472 2738
Intercept[6] 4.04 0.32 3.42 4.68 1.00 4340 3034
logED50_Intercept 3.95 1.23 2.25 7.04 1.01 1275 1009
logh_Intercept -0.64 0.30 -1.18 -0.02 1.00 1372 2008
Emax_Intercept 5.67 1.71 3.52 10.29 1.01 1155 982
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc 1.00 0.00 1.00 1.00 NA NA NA
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
S. Weber, L. Widmer - Exercises