Use of copula to model within‐study association in bivariate meta‐analysis of binomial data at the aggregate level: A Bayesian approach and application to surrogate endpoint evaluation

Bivariate meta‐analysis provides a useful framework for combining information across related studies and has been utilized to combine evidence from clinical studies to evaluate treatment efficacy on two outcomes. It has also been used to investigate surrogacy patterns between treatment effects on the surrogate endpoint and the final outcome. Surrogate endpoints play an important role in drug development when they can be used to measure treatment effect early compared to the final outcome and to predict clinical benefit or harm. The standard bivariate meta‐analytic approach models the observed treatment effects on the surrogate and the final outcome outcomes jointly, at both the within‐study and between‐studies levels, using a bivariate normal distribution. For binomial data, a normal approximation on log odds ratio scale can be used. However, this method may lead to biased results when the proportions of events are close to one or zero, affecting the validation of surrogate endpoints. In this article, we explore modeling the two outcomes on the original binomial scale. First, we present a method that uses independent binomial likelihoods to model the within‐study variability avoiding to approximate the observed treatment effects. However, the method ignores the within‐study association. To overcome this issue, we propose a method using a bivariate copula with binomial marginals, which allows the model to account for the within‐study association. We applied the methods to an illustrative example in chronic myeloid leukemia to investigate the surrogate relationship between complete cytogenetic response and event‐free‐survival.


A.2 Bootstrap method to estimate the within-study association parameter of BRMA-BC
Another bootstrap method was used to estimate the dependence parameters ρ A , ρ B of the joint density made with the normal copula. Similarly as in the previous bootstrap method 3000 bootstrap samples with replacement were drawn from IPD. Summary data were calculated for each bootstrap sample and then dependence parameters of the bivariate normal copula were estimated by using a optimiser such as nlminb in R.
. We used these sets as empirical prior distributions for these parameters.

A.7 Dependence parameters of the generation process
Data of the simulation study were generated using the R package copula. To simulate IPD with low, moderate and high association, we used joint densities constructed with normal copula of Bernoulli marginal distributions in both arms with dependence parameters ρ A , ρ B . The following table presents the values of the dependence parameters for each scenario. A.8 Scenarios generated from independent Bernoulli distribution (no within-study association) In this section we considered a set of scenarios with zero within-study association simulating IPD from independent Bernoulli distributions instead of copula dependent Bernoulli distributions. This allows us to assess the robustness of the estimates of BRMA-BC to different distributional assumptions of the data generation process. Table 2 presents the empirical distribution of the within-study association parameters estimated with bootstrapping.
It can be seen that the bootstrap methods estimated the within-study associations quite accurately and precisely across the three different sets of scenarios. Table 3 presents the results of the estimates of between-studies correlation across the three sets of scenarios following the same format as in the simulation study. BRMA-BC performed better than BRMA and equally well as BRMA-IB in terms of RMSE, coverage probabilities and average bias in the scenarios with high proportions of events. This suggests that BRMA-BC gave robust estimates under a set of scenarios where no within-study association was present.
Therefore we can conclude that it can efficiently model the within-study variability even when IPD are simulated under different distributional assumptions.

A.9 Convergence plots of the data example -CML
This section presents proof of convergence of the three models (Figures 1-6) for the CML data example. Convergence was reached for all the parameters of BRMA, BRMA-IB and BRMA-BC models as there is significant overlap between the density plots across chains and traceplots indicate good mixing of the chains as they look like a random scatter and significantly overlap across chains. 1000 iterations were carried out). In such cases the particular this iteration was excluded for the Bayesian analysis.
The convergence status of the within-study association parameters ρ A , ρ B for each simulation iteration was evaluated by checking the convergence argument of the nlminb optimiser in R. Whenever the optimiser failed to converge, IPD were re-simulated until the optimiser provided a reliable solution. Table 7 provides information about the number of simulation iterations where the optimiser returned a convergence error and IPD were re-simulated until the obtimiser was able to provide a converged solution for both of the dependence parameters. It can be seen that convergence problems were quite rare as the total the number of studies that IPD were re-simulated is very small compared to the total number of studies across all simulation iterations (in total 30 studies × 1000 simulation replication = 30000 studies were simulated). Across the simulation study the largest number was 258 studies i.e., it occurs approximately 8 times every 1000 studies.
Moderate within-study association High within-study association Small study size Average Proportion of events = 0.5 Average Proportion of events = 0.95

Strength of association Parameter
Number of simulation iterations with R >1.01

Number of simulation iterations
with R > 1.01 Low within-study association Moderate within-study association High within-study association Moderate within-study association High within-study association Small study size Average Proportion of events = 0.5 Average Proportion of events = 0.95

Strength of association Parameter
Number of simulation iterations with R >1.01

Number of simulation iterations
with R > 1.01 Low within-study association Moderate within-study association High within-study association Moderate within-study association High within-study association Small study size Average Proportion of events = 0.5 Average Proportion of events = 0.95

Strength of association Parameter
Number of simulation iterations with R >1.01

Number of simulation iterations
with R > 1.01 Low within-study association Moderate within-study association High within-study association