figshare

Speeding Up MCMC by Efficient Data Subsampling

Version 2 2018-07-16, 15:15
Version 1 2018-03-14, 20:50
journal contribution
posted on 2018-07-16, 15:15 authored by Matias Quiroz, Robert Kohn, Mattias Villani, Minh-Ngoc Tran

We propose subsampling Markov chain Monte Carlo (MCMC), an MCMC framework where the likelihood function for n observations is estimated from a random subset of m observations. We introduce a highly efficient unbiased estimator of the log-likelihood based on control variates, such that the computing cost is much smaller than that of the full log-likelihood in standard MCMC. The likelihood estimate is bias-corrected and used in two dependent pseudo-marginal algorithms to sample from a perturbed posterior, for which we derive the asymptotic error with respect to n and m, respectively. We propose a practical estimator of the error and show that the error is negligible even for a very small m in our applications. We demonstrate that subsampling MCMC is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and that it outperforms other subsampling methods for MCMC proposed in the literature. Supplementary materials for this article are available online.
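As a rough illustration of the kind of estimator the abstract describes, the sketch below estimates a log-likelihood from m of n observations using control variates. Everything specific here is an assumption made for illustration and is not taken from the abstract: the one-parameter logistic model, the second-order Taylor control variates around a reference value `theta_star`, sampling with replacement, and all function and class names. The bias correction and the pseudo-marginal samplers are not shown.

```python
import numpy as np


def loglik_terms(theta, x, y):
    """Per-observation log-likelihood of a one-parameter logistic model (assumed)."""
    z = x * theta
    return y * z - np.logaddexp(0.0, z)


class TaylorControlVariates:
    """Second-order Taylor expansion of each term around theta_star (assumed choice)."""

    def __init__(self, theta_star, x, y):
        self.theta_star = theta_star
        p = 1.0 / (1.0 + np.exp(-x * theta_star))
        self.l0 = loglik_terms(theta_star, x, y)   # value at theta_star
        self.g = x * (y - p)                       # first derivative in theta
        self.h = -x**2 * p * (1.0 - p)             # second derivative in theta
        # One-off O(n) pass; afterwards the full sum of control variates is O(1).
        self.sums = (self.l0.sum(), self.g.sum(), self.h.sum())

    def q(self, theta, idx):
        d = theta - self.theta_star
        return self.l0[idx] + self.g[idx] * d + 0.5 * self.h[idx] * d**2

    def q_sum(self, theta):
        d = theta - self.theta_star
        a, b, c = self.sums
        return a + b * d + 0.5 * c * d**2


def estimate_loglik(theta, x, y, cv, idx):
    """Difference estimator: sum of control variates plus a scaled subsample correction."""
    n, m = len(y), len(idx)
    diff = loglik_terms(theta, x[idx], y[idx]) - cv.q(theta, idx)
    ell_hat = cv.q_sum(theta) + n * diff.mean()
    var_hat = n**2 * diff.var(ddof=1) / m          # estimated variance of ell_hat
    return ell_hat, var_hat


# Synthetic data, purely for demonstration.
rng = np.random.default_rng(0)
n, m = 10_000, 100
x = rng.normal(size=n)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-0.5 * x))).astype(float)

cv = TaylorControlVariates(theta_star=0.4, x=x, y=y)
idx = rng.integers(0, n, size=m)                   # uniform draws with replacement
ell_hat, var_hat = estimate_loglik(0.45, x, y, cv, idx)
```

With indices drawn uniformly, the correction term has expectation equal to the total gap between the true terms and their control variates, so the estimate is unbiased for the full log-likelihood. Its variance shrinks as the control variates track the individual terms more closely, which is why a small m can suffice when `theta` stays near `theta_star`.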

Funding

Matias Quiroz and Robert Kohn were partially supported by Australian Research Council Center of Excellence grant CE140100049. Quiroz was also partially supported by VINNOVA grant 2010-02635. Mattias Villani was partially supported by the Swedish Foundation for Strategic Research (Smart Systems: RIT 15-0097). Minh-Ngoc Tran was partially supported by a Business School Pilot Research grant. Quiroz carried out part of the work while affiliated with Sveriges Riksbank, Linköping University, and Stockholm University.
