DOI: 10.6084/m9.figshare.7228559.v1
Authors: Sai Dharmarajan, Joo Yeon Lee, Rima Izem
Title: Dataset for: Sample size estimation for case-crossover studies
Publisher: Wiley
Year: 2018
Subjects: case-crossover; matched case-control; correlation in exposure; sample size formula; Statistics; Medicine
Date: 2018-11-12 12:25:55
Type: Dataset
URL: https://wiley.figshare.com/articles/dataset/Dataset_for_Sample_size_estimation_for_case-crossover_studies/7228559

Case-crossover study designs are observational studies used to assess the post-market safety of medical products (e.g., vaccines or drugs). Because a case-crossover study is self-controlled, it controls for all time-invariant confounding, whether measured or unmeasured, and it is potentially more feasible because only data from subjects who experience an event (the cases) are required. However, self-matching also introduces correlation between case and control periods within a subject or matched unit. To estimate sample size in a case-crossover study, investigators currently use Dupont’s formula (Biometrics 1988; 43:1157-1168), which was originally developed for matched case-control studies. This formula is relevant because it accounts for the correlation in exposure between cases and controls, which is expected to be high in self-controlled studies. However, in our study, we show that Dupont’s formula and other currently used methods for determining sample size in case-crossover studies may be inadequate. Specifically, over a range of values in the parameter space, these formulae tend to underestimate the true required sample size as determined through simulations. We present mathematical derivations to explain where some currently used methods fail, and we propose two new sample size estimation methods that estimate the true required sample size more accurately.
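
To illustrate why correlation in exposure matters for sample size, the following is a minimal sketch for a 1:1 design (one case window and one control window per subject). It is not Dupont's formula and not either of the methods proposed by the authors. It uses the textbook discordant-pair (McNemar-type) approximation: only subjects whose exposure differs between the two windows are informative, so the required number of subjects is the required number of discordant pairs divided by the probability of discordance. The function names and the parameters `p0` (exposure probability in the control window), `psi` (conditional odds ratio) and `phi` (exposure correlation between windows) are assumptions introduced for this example.

```python
import math
from statistics import NormalDist


def discordant_probs(p0, psi, phi, tol=1e-12):
    """Return (p10, p01) for a 1:1 case-window / control-window design.

    p10 = P(exposed in the case window only)
    p01 = P(exposed in the control window only)
    Assumed inputs: 0 < p0 < 1, psi > 0 with psi = p10 / p01, 0 <= phi < 1.
    """
    q0 = 1.0 - p0

    def corr(p01):
        # Marginal exposure probability in the case window implied by p01.
        p1 = p0 + (psi - 1.0) * p01
        # Covariance of the two exposure indicators: p11 - p1 * p0.
        cov = p0 * q0 - p01 * (1.0 + p0 * (psi - 1.0))
        return cov / math.sqrt(p1 * (1.0 - p1) * p0 * q0)

    # corr(lo) = 1 (no discordance) and corr(hi) = 0 (independent windows),
    # so bisection brackets the p01 whose correlation equals phi.
    lo, hi = 0.0, p0 * q0 / (1.0 + p0 * (psi - 1.0))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if corr(mid) > phi:
            lo = mid
        else:
            hi = mid
    p01 = 0.5 * (lo + hi)
    return psi * p01, p01


def required_subjects(p0, psi, phi, alpha=0.05, power=0.80):
    """Approximate number of subjects for a two-sided McNemar-type test."""
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    p10, p01 = discordant_probs(p0, psi, phi)
    p_disc = p10 + p01
    # Among discordant subjects, probability that exposure is in the case window.
    pi = psi / (1.0 + psi)
    # Discordant pairs needed to distinguish pi from 1/2.
    m = (z_a / 2.0 + z_b * math.sqrt(pi * (1.0 - pi))) ** 2 / (pi - 0.5) ** 2
    return math.ceil(m / p_disc)


if __name__ == "__main__":
    for phi in (0.0, 0.3, 0.6):
        print(phi, required_subjects(p0=0.3, psi=2.0, phi=phi))
```

With `p0` and `psi` held fixed, a larger `phi` lowers the probability of discordance and therefore raises the number of subjects this approximation calls for. The sketch only shows the direction of that effect; it makes no claim about how well any such formula matches the simulation-based sample sizes reported with this dataset.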