Critical values improvement for the standard normal homogeneity test by combining Monte Carlo and regression approaches
The distribution of the test statistic of a homogeneity test is often unknown, so its critical values must be estimated through Monte Carlo (MC) simulations. Computing the critical values at low α, especially when the distribution of the statistic changes with the series length (sample cardinality), requires a considerable number of simulations to reach a reasonable precision of the estimates (i.e. 10^6 simulations or more for each series length). If, in addition, the test itself is computationally demanding, estimating the critical values may require unacceptably long runtimes.
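The need for so many simulations at low α can be illustrated with a short calculation. The sketch below is not taken from the paper: it relies only on the fact that, among N independent simulations, the number of statistics exceeding the true critical value follows a Binomial(N, α) distribution. The function name `exceedance_relative_sd` is hypothetical, and the quantity it returns is the relative spread of the tail count, which is only a proxy for the coefficient of variation of the critical value itself.

```python
import math


def exceedance_relative_sd(alpha: float, n_sim: int) -> float:
    """Relative standard deviation of the count of simulated statistics
    that exceed the true critical value, a Binomial(n_sim, alpha) variable:
    sqrt(n_sim * alpha * (1 - alpha)) / (n_sim * alpha)."""
    return math.sqrt((1.0 - alpha) / (n_sim * alpha))


if __name__ == "__main__":
    n_sim = 10**6
    for alpha in (0.1, 0.01, 0.001, 0.0001):
        expected_tail = n_sim * alpha  # expected number of exceedances
        rel_sd = exceedance_relative_sd(alpha, n_sim)
        print(f"alpha={alpha:g}  expected tail count={expected_tail:g}  "
              f"relative sd={rel_sd:.4f}")
```

With 10^6 simulations, only about 100 simulated statistics fall beyond the α = 0.0001 critical value, so the tail is sampled far more sparsely than at α = 0.1.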
To overcome this problem, the paper proposes a regression-based refinement of an initial MC estimate of the critical values, which also allows the achieved improvement to be approximated. Moreover, the paper applies the method to two tests: SNHT (standard normal homogeneity test, widely used in climatology) and SNH2T (a variant of SNHT with quadratic computational complexity). For both, the paper reports the critical values for α ranging between 0.1 and 0.0001 (useful for p-value estimation) and for series lengths ranging from 10 (a size widely adopted in the climatological change-point detection literature) to 70,000 elements (nearly the length of a daily time series spanning 200 years), estimated with coefficients of variation within 0.22%. For SNHT, our results are also compared with approximate, theoretically derived critical values; we suggest adopting the theoretical values for series exceeding 70,000 elements.
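A minimal sketch of the initial MC estimate for SNHT may help fix ideas. It assumes the classical single-shift form of the statistic, T0 = max over k of k·z̄1² + (n − k)·z̄2², where z is the standardized series and z̄1, z̄2 are the means of its first k and last n − k values. It also assumes standardization with the n − 1 sample standard deviation and i.i.d. standard normal series under the null hypothesis. The function names, the default number of simulations, and the seed are illustrative choices, not the paper's settings, and the regression-based refinement is not reproduced here.

```python
import numpy as np


def snht_statistic(x):
    """SNHT statistic T0 of a 1-D series (assumed single-shift form)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Standardize with the sample standard deviation (ddof=1 is an assumption).
    z = (x - x.mean()) / x.std(ddof=1)
    k = np.arange(1, n)                # candidate change points 1..n-1
    head_sum = np.cumsum(z)[:-1]       # sum of the first k standardized values
    tail_sum = z.sum() - head_sum      # sum of the remaining n-k values
    mean1 = head_sum / k
    mean2 = tail_sum / (n - k)
    t = k * mean1**2 + (n - k) * mean2**2
    return float(t.max())


def mc_critical_values(n, alphas, n_sim=10_000, seed=0):
    """Estimate critical values as empirical (1 - alpha) quantiles of T0
    over n_sim simulated standard normal series of length n."""
    rng = np.random.default_rng(seed)
    stats = np.array([snht_statistic(rng.standard_normal(n))
                      for _ in range(n_sim)])
    return {a: float(np.quantile(stats, 1.0 - a)) for a in alphas}


if __name__ == "__main__":
    # Small illustrative run; far fewer simulations than the 10^6 per
    # series length discussed in the text.
    print(mc_critical_values(50, [0.1, 0.05, 0.01], n_sim=2000))
```

Repeating such an estimate over many series lengths yields a noisy curve of critical values against n for each α; this is the kind of initial estimate that the regression step is meant to refine.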