Objective priors for common correlation coefficient in bivariate normal populations

Abstract In this paper, various objective priors are developed for the common correlation coefficient in several bivariate normal populations. The proposed approach relies on asymptotically matching the coverage probabilities of Bayesian credible intervals with the corresponding frequentist ones. We focus on several matching criteria, including quantile matching, distribution function matching, highest posterior density matching, and matching via inversion of test statistics. In addition, we consider reference priors for different groups of ordering. The proposed priors are compared with one another in terms of frequentist coverage probability through a simulation study and are illustrated with two real data examples.


Introduction
A common research practice is to collect samples from several bivariate distributions and then investigate the relationship between the random variables. For example, the relationship between diastolic and systolic blood pressure (Miall and Oldham 1955) and the relationship between diastolic blood pressure and weight (Tishler and Donner 1977) were analyzed for the age groups of interest (see also Zar 1999). Paul (1989) demonstrated that there was no significant difference between the three correlation coefficients corresponding to the age groups in the study of Tishler and Donner (1977), and Tian and Wilding (2008) concluded that the correlations between diastolic and systolic blood pressure for the three age groups in the study of Miall and Oldham (1955) were the same. Recently, Kazemi (2021) investigated confidence intervals for the common correlation coefficient using the confidence distribution approach.
For several bivariate normal populations, Donner and Rosner (1980) and Paul (1989) developed and compared methods for estimating and testing the common correlation coefficient. Paul (1989) introduced various test statistics to check the equality of the correlation coefficients. Tian and Wilding (2008) proposed a generalized pivot for the common correlation coefficient and, based on it, introduced two generalized confidence intervals. Later, Jafari and Kazemi (2017) applied the parametric bootstrap approach to test the equality of correlation coefficients. However, little effort has been made to investigate objective Bayesian inference for the common correlation coefficient in several bivariate distributions. For non-informative priors in a single bivariate normal population, Berger and Sun (2008) introduced the reference and quantile matching priors for a variety of parameters. Ghosh, Santra, and Kim (2008) developed matching priors using various matching criteria for several parameters of interest, such as regression coefficients and generalized variances. Kim, Kang, and Lee (2009) proposed applying the matching and reference priors to the common mean (see also Ghosh et al. 2010). Therefore, in the present study, we develop objective Bayesian inference for the common correlation coefficient in several bivariate normal populations, allowing for possibly different means and variances.
The remainder of the paper is organized as follows. In Section 2, we introduce the orthogonal reparameterization of nuisance parameters, which greatly facilitates Bayesian inference. To define matching priors, we focus on several matching criteria, including quantile matching, distribution function matching, highest posterior density matching, and matching via inversion of the likelihood ratio test statistics. We demonstrate that the developed matching priors under the considered matching criteria have different forms. Then, we derive the reference priors for different groups of ordering. In Section 3, we discuss the condition for the propriety of the posterior distribution based on the general priors, including the proposed matching and reference ones. In Section 4, the proposed methods are illustrated in terms of the frequentist coverage probability through a simulation study and discussion of two real data examples.

Matching prior via distribution functions
We consider a prior $\pi$ that achieves matching via distribution functions of particular standardized variables. More specifically, let $\theta_1$ be the parameter of interest, $(\theta_2, \ldots, \theta_{k+1})^T$ be the vector of nuisance parameters, and $\hat{\theta}_1$ be the maximum likelihood estimate (MLE) of $\theta_1$ with $n^{-1} I^{11}$ as its asymptotic variance. Then, we define the prior $\pi$ achieving the asymptotic matching as follows: where $y = \sqrt{n}(\theta_1 - \hat{\theta}_1)/(I^{11})^{1/2}$, and $P^{\pi}$ denotes the posterior of $y$ given the data $X$ under the prior $\pi$. Considering the orthogonality of $\theta_1$ with $(\theta_2, \ldots, \theta_{k+1})$, the prior $\pi$ has the form $I_{11}^{1/2} d(\theta_2, \ldots, \theta_{k+1})$ and satisfies the following two differential equations: In addition, concerning the matching prior $\pi(\theta) \propto (1 - \theta_1^2)^{-1} d(\theta_2, \ldots, \theta_{k+1})$, the differential equation (13) can be simplified as follows:
$$\frac{\partial}{\partial \theta_1}\left\{ -\frac{6\theta_1\, d(\theta_2, \ldots, \theta_{k+1})}{\sum_{i=1}^{k} n_i} \right\} = -\frac{6\, d(\theta_2, \ldots, \theta_{k+1})}{\sum_{i=1}^{k} n_i} \neq 0.$$
Therefore, we conclude that the quantile matching priors (6) do not satisfy the second-order distribution function (DF) matching criterion, and consequently, the second-order quantile matching priors (10) are not second-order DF matching priors.

Highest posterior density (HPD) matching priors
In general, if $\theta$ is considered as the parameter of interest, then the highest posterior density (HPD) region has the form $\{\theta : \pi(\theta \mid X) \geq k\}$, where $\pi(\theta \mid X)$ is the posterior of $\theta$ given the data $X$ under the prior $\pi$. Here, we consider priors ensuring that the HPD regions corresponding to the probability level $1 - \alpha$ have asymptotically the same frequentist coverage probability with the error of approximation equal to $o(n^{-1})$. Considering the orthogonality of $\theta_1$ with $\theta_2, \ldots, \theta_{k+1}$, the prior $\pi$ must satisfy the following condition: which can be simplified as follows: Therefore, the solution of (16) has the following form: where $h(\cdot, \ldots, \cdot)$ corresponds to any smooth positive function, and $\theta_{-(i+1)} = (\theta_{2i}, \theta_{4i}, \theta_{5i})$, $i = 1, \ldots, k$. Therefore, the priors (17) correspond to the HPD matching ones (see details in DiCiccio and Stern 1994; Ghosh and Mukerjee 1995). Concerning the matching prior $\pi(\theta) \propto (1 - \theta_1^2)^{-1} \left(\prod_{i=1}^{k} \theta_{3i}^{-1}\right) h(\theta_{-2}, \ldots, \theta_{-(k+1)})$, the differential equation (15) can be simplified as follows:
$$\sum_{i=1}^{k} n_i\, h(\theta_{-2}, \ldots, \theta_{-(k+1)}) \neq 0.$$
Therefore, the second-order quantile matching priors (10) cannot be the HPD matching ones.
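As a rough numerical illustration of a region of the form $\{\theta : \pi(\theta \mid X) \geq k\}$, the following Python sketch computes an HPD interval on a grid over $(-1, 1)$. The density `toy_density` is a hypothetical unimodal stand-in chosen only for illustration and is not one of the posteriors derived in this paper; the function name `hpd_interval`, the grid size, and the 0.95 level are likewise assumptions.

```python
def toy_density(t):
    """Hypothetical unnormalized density on (-1, 1); NOT a posterior from the paper."""
    return (1.0 + t) ** 8 * (1.0 - t) ** 3


def hpd_interval(density, level=0.95, m=20001):
    """Grid approximation of the HPD region {t : density(t) >= k}.

    Grid points are added in order of decreasing density until their
    normalized mass reaches `level`. For a unimodal density the kept
    points form one interval, returned together with the threshold k
    (expressed on the normalized density scale).
    """
    step = 2.0 / (m - 1)
    grid = [-1.0 + i * step for i in range(m)]
    dens = [density(t) for t in grid]
    total = sum(dens)
    order = sorted(range(m), key=lambda i: dens[i], reverse=True)
    mass, kept = 0.0, []
    for i in order:
        kept.append(i)
        mass += dens[i] / total
        if mass >= level:
            break
    threshold = dens[kept[-1]] / (total * step)
    return grid[min(kept)], grid[max(kept)], threshold


lo, hi, k_level = hpd_interval(toy_density)
print(lo, hi, k_level)
```

The two endpoints have (nearly) equal density, which is what distinguishes an HPD interval from an equal-tailed one when the density is skewed.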

Matching priors via inversion of test statistics
One possible way to derive confidence intervals is to invert certain test statistics, such as the likelihood ratio statistic, Rao's score statistic, or the Wald statistic. As they are first-order equivalent (namely, up to $o(n^{-1/2})$), in the present study, we consider only the likelihood ratio test. Let $\theta = (\theta_1, \ldots, \theta_p)$, and let $l(\theta)$ denote the usual log-likelihood. The corresponding profile log-likelihood for $\theta_1$ is defined as $l^{*}(\theta_1) = l(\theta_1, \hat{\theta}_2(\theta_1), \ldots, \hat{\theta}_p(\theta_1))$, where $\hat{\theta}_j(\theta_1)$ is the MLE of $\theta_j$ for a given $\theta_1$, $j = 2, \ldots, p$. Then, the likelihood ratio statistic for $\theta_1$ can be defined as follows: According to the orthogonality of $\theta_1$ with $\theta_2, \ldots, \theta_{k+1}$, a likelihood ratio (LR) matching prior $\pi$ can be obtained by solving: which can be simplified as follows: Therefore, the solution of (21) has the form: $\cdots\, h(\theta_{-2}, \ldots, \theta_{-(k+1)})$, where $h(\cdot, \ldots, \cdot)$ corresponds to any smooth positive function, and $\theta_{-(i+1)} = (\theta_{2i}, \theta_{4i}, \theta_{5i})$, $i = 1, \ldots, k$. Therefore, the priors (22) are the LR matching ones (see details in Yin and Ghosh 1997).

Reference priors
Finally, we derive the reference priors for different groups of ordering corresponding to $(\theta_1, \theta_2, \ldots, \theta_{k+1})$. Then, considering the orthogonality of the parameters, the reference priors can be defined as follows (see Bernardo 1979; Berger and Bernardo 1992; Datta and Ghosh 1995).
In the case of bivariate normal models (1), if $\theta_1$ is the parameter of interest, then the reference prior distribution corresponding to the group of ordering $\{(\theta_1, \theta_2, \ldots, \theta_{k+1})\}$ is defined as follows: Concerning the groups of ordering $\{\theta_1, (\theta_2, \ldots, \theta_{k+1})\}$, the reference prior is defined as follows: In the groups of ordering $\{\theta_1, \theta_{21}, \theta_{31}, \theta_{41}, \theta_{51}, \ldots, \theta_{2k}, \theta_{3k}, \theta_{4k}, \theta_{5k}\}$ and $\{\theta_1, \theta_{21}, \theta_{31}, (\theta_{41}, \theta_{51}), \ldots, \theta_{2k}, \theta_{3k}, (\theta_{4k}, \theta_{5k})\}$, the reference prior is specified according to the following formula: The detailed derivation of the one-at-a-time reference prior $\pi_3$ is provided in the Supplemental Appendix. It should be noted that the Jeffreys' prior $\pi_1$ does not satisfy the first-order quantile matching criterion; however, it is an LR matching prior. The reference priors $\pi_2$ and $\pi_3$ satisfy the first-order quantile matching criterion, and $\pi_3$ is the second-order quantile matching prior.
Concerning the matching priors defined in Equations (10), (17), and (22), the coverage probability does not actually seem to improve according to the function $h$. By selecting $\prod_{i=1}^{k} \theta_{2i}^{-1}$ as the function $h$, the quantile matching prior can be defined as a reference prior having the form of: Accordingly, the HPD and LR matching priors are defined as follows: and $\pi_{LR}(\theta_1, \theta_2, \ldots)$ respectively.

Propriety of posterior distributions
In this section, we focus on the propriety of posterior distributions for a general class of priors including the matching priors (27), (28), and (29) and the reference priors (24) and (25). First, we consider a general class of priors having the following form: $\pi(\theta_1, \theta_2, \ldots)$ where $-1 < a < 1$, $b > 0$, and $c > 0$. Then, the following theorem can be proved.
In addition, the marginal posterior density of $\theta_1$ under the prior (30) is defined as follows:

Simulation studies
In this section, we evaluate the frequentist coverage probability of the credible interval from the marginal posterior density of $\theta_1$ under the proposed prior $\pi$ for the following configurations: $(\mu_{xi}, \mu_{yi}, \sigma_{xi}, \sigma_{yi}, \rho)$, $i = 1, \ldots, k$, and $(n_1, \ldots, n_k)$. Tables 1-6 provide numerical values of the frequentist coverage probabilities corresponding to the 0.05 (0.95) posterior quantiles for the proposed priors. Using pre-specified values $(\mu_{xi}, \mu_{yi}, \sigma_{xi}, \sigma_{yi}, \rho)$, $i = 1, \ldots, k$, and $\alpha$, the frequentist coverage probability can be derived as follows: where $\theta_1^{\alpha}(\pi; X, Y)$ is the $\alpha$th posterior quantile of $\theta_1$, given $(X, Y)$. In particular, for fixed $(\mu_{xi}, \mu_{yi}, \sigma_{xi}, \sigma_{yi}, \rho)$, $i = 1, \ldots, k$, 10,000 independent random samples of $(X, Y)$ are generated using the model (1). Without loss of generality, we set $(\mu_{xi}, \mu_{yi}) = (0, 0)$, $i = 1, \ldots, k$. From Tables 1-6, it can be seen that the one-at-a-time reference prior $\pi_3$ is more closely aligned with the target coverage probability than the Jeffreys' prior $\pi_1$ and the two-group reference prior $\pi_2$. In addition, the two-group reference prior $\pi_2$ performs better than the Jeffreys' prior $\pi_1$ in terms of matching the target coverage probability. Moreover, we derive the following conclusions: the one-at-a-time reference prior $\pi_3$ satisfies the second-order matching criterion; the two-group reference prior $\pi_2$ satisfies the first-order matching criterion; the Jeffreys' prior $\pi_1$ does not satisfy the first-order matching criterion. Specifically, we note that the results for the one-at-a-time reference prior are not sensitive to changes in the values of $(\mu_{xi}, \mu_{yi}, \sigma_{xi}, \sigma_{yi}, \rho)$, $i = 1, \ldots, k$, or to changes in the number of populations $k$.
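A minimal Monte Carlo sketch of this kind of coverage calculation is given below in Python. Because the posterior quantiles require the priors derived earlier, the sketch uses a stand-in, `fisher_z_bound`, a one-sided large-sample bound based on the pooled Fisher Z transform with weights $n_i - 3$; this stand-in is an assumption made for illustration and is not the Bayesian procedure of this paper. Any function returning an $\alpha$th quantile of $\theta_1$ could be passed as `bound` instead. The means are set to zero and the standard deviations to one, since the sample correlation is invariant to location and scale; the number of replications and the seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist


def sample_corr(n, rho, rng):
    """Draw n pairs from a standard bivariate normal with correlation rho
    and return the sample correlation coefficient."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_bound(rs, ns, alpha):
    """Stand-in for the alpha-th posterior quantile (an assumption):
    one-sided large-sample bound from the pooled Fisher Z transform."""
    w = [n - 3 for n in ns]
    zbar = sum(wi * math.atanh(r) for wi, r in zip(w, rs)) / sum(w)
    return math.tanh(zbar + NormalDist().inv_cdf(alpha) / math.sqrt(sum(w)))


def coverage(rho, ns, alpha, bound=fisher_z_bound, reps=2000, seed=1):
    """Proportion of simulated data sets in which rho <= bound(...)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        rs = [sample_corr(n, rho, rng) for n in ns]
        if rho <= bound(rs, ns, alpha):
            hits += 1
    return hits / reps


# k = 3 populations with n_i = 10 and common correlation 0.5.
cov_05 = coverage(0.5, (10, 10, 10), 0.05)
cov_95 = coverage(0.5, (10, 10, 10), 0.95)
print(cov_05, cov_95)
```

A procedure that matches well returns values close to the nominal 0.05 and 0.95; the gap between the estimated and nominal values is the quantity compared across priors in the tables.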
Next, a simulation study is performed to compare the coverage probabilities of the quantiles and the confidence intervals obtained by the generalized variable (GV) approach, the large sample (LS) approach based on Fisher's Z transformation (Tian and Wilding 2008; Jafari and Kazemi 2017), and the developed matching priors. The results are given in Tables 7 and 8.
From Table 7, it can be seen that the one-at-a-time reference prior $\pi_3$ and the GV approach are more closely aligned with the target coverage probability than the HPD matching prior $\pi_{HPD}$, the LR matching prior $\pi_{LR}$, and the LS approach for the quantiles. In addition, the LS method does not match the target coverage probability at all, and the one-at-a-time reference prior $\pi_3$ performs slightly better than the GV approach in terms of matching the target coverage probability. Moreover, the HPD matching prior $\pi_{HPD}$ and the LR matching prior $\pi_{LR}$ do not satisfy the first-order matching criterion. From Table 8, for the confidence intervals, it can be seen that the one-at-a-time reference prior $\pi_3$ and the GV approach are more closely aligned with the target coverage probability than the HPD matching prior $\pi_{HPD}$, the LR matching prior $\pi_{LR}$, and the LS approach. In addition, the LS method seems to perform relatively well for the intervals, but the results in Table 7 show that it does not match the target coverage probability for the quantiles.

Table 3. Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles and 90% (95%) credible intervals for $\theta_1$ when $k = 6$.
Columns: $(\sigma_{xi}, \sigma_{yi})$; $n_1, \ldots, n_6$; $\rho$; posterior quantile under $\pi_1$, $\pi_2$, $\pi_3$; credible interval under $\pi_1$, $\pi_2$, $\pi_3$.
(1, 1); 5, ..., 5; −0.9; 0.006 (0.590); 0.054 (...

Real data analysis
Next, we consider two examples discussed in Tian and Wilding (2008). The first example corresponds to the case of a small sample size, and the second example concerns the case of a relatively large sample size.
Example 1. We utilized the dataset constructed by Miall and Oldham (1955), which included measurements of blood pressure and was analyzed in the context of making inference on interclass and intraclass correlations. We focused on the relationship between diastolic and systolic blood pressure and, possibly, their relationship with age. The data comprised the values of diastolic and systolic blood pressure measured for proposita girls in the age groups of 6-8, 9-11, and 12-14 years with sample sizes of 7, 6, and 7, respectively. The sample correlations corresponding to the three age groups were 0.7454, 0.6391, and 0.7379, respectively. In addition, as the p-value of the $\chi^2$ test based on Fisher's Z transform was 0.96 (Tian and Wilding 2008), it was concluded that the correlations between diastolic and systolic blood pressure for the three age groups were the same.
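The homogeneity check quoted above can be reproduced approximately from the reported sample correlations and sample sizes alone. The Python sketch below assumes the usual form of the $\chi^2$ test based on Fisher's Z transform, with weights $n_i - 3$ and $k - 1 = 2$ degrees of freedom; this form is an assumption about the test behind the quoted p-value, and the variable names are illustrative.

```python
import math

# Sample correlations and sample sizes reported for the three age groups.
r = [0.7454, 0.6391, 0.7379]
n = [7, 6, 7]

# Fisher Z transform of each correlation, weighted by n_i - 3 (assumed weights).
z = [math.atanh(ri) for ri in r]
w = [ni - 3 for ni in n]
zbar = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)

# Homogeneity statistic; approximately chi-square with k - 1 = 2 df under equality.
chi2 = sum(wi * (zi - zbar) ** 2 for wi, zi in zip(w, z))

# With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
p_value = math.exp(-chi2 / 2.0)
print(round(chi2, 4), round(p_value, 2))  # p-value rounds to 0.96
```

The closed-form survival function is available only because there are two degrees of freedom; for a different number of groups a general chi-square tail routine would be needed.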
To evaluate the common correlation coefficient, the pooled estimate (PE) based on Fisher's Z transformation and the maximum likelihood estimate (MLE) were provided along with the 95% confidence intervals obtained using the generalized variable (GV), large sample (LS), and parametric bootstrap (PB) approaches, as presented in Table 9 (Tian and Wilding 2008; Jafari and Kazemi 2017). The corresponding Bayes estimates and Bayesian credible intervals under the Jeffreys' prior (JP), the two-group reference prior (TR), and the one-at-a-time reference prior (OR) were also derived, as provided in Table 9.
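For orientation, a pooled estimate and a large-sample interval of this type can be sketched in Python from the summary statistics of Example 1. The sketch assumes that the pooled estimate is the back-transformed weighted mean of the Fisher Z values with weights $n_i - 3$, and that the 95% interval is obtained by back-transforming $\bar{z} \pm z_{0.975} / \sqrt{\sum_i (n_i - 3)}$. These formulas are assumptions, so the output need not coincide exactly with the PE and LS entries of Table 9.

```python
import math
from statistics import NormalDist

# Sample correlations and sample sizes for the three age groups in Example 1.
r = [0.7454, 0.6391, 0.7379]
n = [7, 6, 7]

# Weighted mean on the Fisher Z scale (weights n_i - 3 are an assumption).
w = [ni - 3 for ni in n]
zbar = sum(wi * math.atanh(ri) for wi, ri in zip(w, r)) / sum(w)

# Half-width of the 95% interval on the Z scale.
half = NormalDist().inv_cdf(0.975) / math.sqrt(sum(w))

# Back-transform to the correlation scale.
pooled_estimate = math.tanh(zbar)
ls_interval = (math.tanh(zbar - half), math.tanh(zbar + half))
print(pooled_estimate, ls_interval)
```

With total weight 11, the interval is wide on the correlation scale, which reflects the small sample sizes in this example.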
From Table 9, it can be seen that the results obtained by the frequentist and Bayesian methods are slightly different. The estimate under JP is similar to the MLE and PE but differs from those under TR and OR. The confidence intervals obtained by the LS, GV, and PB methods are similar, while those of JP, TR, and OR differ from each other. In addition, the credible interval based on the one-at-a-time reference prior is slightly wider than those based on the two-group reference and Jeffreys' priors.

Table 5. Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles and 90% (95%) credible intervals for $\theta_1$ when $k = 3$.
Columns: $(\sigma_{xi}, \sigma_{yi})$; $n_1, n_2, n_3$; $\rho$; posterior quantile under $\pi_1$, $\pi_2$, $\pi_3$; credible interval under $\pi_1$, $\pi_2$, $\pi_3$.
(1, 1); 5, 5, 10; −0.9; 0.016 (0.782); 0.062 (...

Example 2. We considered the dataset provided by Tishler and Donner (1977). In this example, we analyzed the relationship between diastolic blood pressure and weight. The samples including 30 boys from the age groups of 6-8, 9-11, and 12-14 years were employed for investigation. The sample correlations of the three age groups were 0.422, 0.388, and 0.569, respectively. Paul (1989) demonstrated that there was no significant difference between these correlations.
As presented in Table 9, we obtained the PE based on Fisher's Z transformation and the MLE; then, we derived 95% confidence intervals using the GV and LS approaches, along with the Bayes estimates and Bayesian credible intervals under JP, TR, and OR. From Table 9, it can be seen that the results obtained in Example 2 are similar to those obtained in Example 1.

Concluding remarks
In the present study, we developed objective priors for Bayesian inference in several bivariate normal distributions with a common correlation coefficient, allowing for possibly different means and variances. Relying on orthogonal reparameterization, we considered several matching criteria, including quantile matching, distribution function matching, highest posterior density matching, and matching via inversion of test statistics. We observed that the Jeffreys' prior (an LR matching prior) was highly sensitive to the number of populations and did not satisfy the first-order quantile matching criterion. The HPD matching prior is also not a first-order matching prior.

Table 6. Frequentist coverage probabilities of 0.05 (0.95) posterior quantiles and 90% (95%) credible intervals for $\theta_1$ when $k = 6$.
Columns: $(\sigma_{xi}, \sigma_{yi})$; $n_1, \ldots, n_6$; $\rho$; posterior quantile under $\pi_1$, $\pi_2$, $\pi_3$; credible interval under $\pi_1$, $\pi_2$, $\pi_3$.
(1, 1); 5, 5, 5, 10, 10, 10; −0.9; 0.008 (0.722); 0.053 (...
In addition, we demonstrated that the two-group reference prior satisfied the first-order quantile matching criterion, and the one-at-a-time reference prior was the second-order quantile matching prior. The GV method shows very good performance, but the LS method does not match the target coverage probability at all for the quantiles. Through the simulation study and the two real data examples, we illustrated that the one-at-a-time reference prior achieved the most appropriate results in terms of the asymptotic frequentist coverage property.