On partial and conditional association measures for ordinal contingency tables

Abstract This paper proposes new non-model based partial/conditional association measures for ordinal contingency table where the main interest is the association between two ordinal variables of interest, adjusting for the effect of the covariate(s). We first develop a new type of scores for the ordinal variables, subcopula scores, taking into account the ordering of the categories of the variables and derive the subcopula regressions describing the relationships between the ordinal variables and the covariate(s). The proposed subcopula scores and subcopula regressions are obtained from the subcopula which uniquely links the joint distribution for the ordinal variables with their marginal distributions. We then define the new partial/conditional association measures between two ordinal variables to be the correlations between two sets of subcopula scores for them adjusted by the subcopula regressions of two response variables on the covariate(s). We theoretically investigate the properties of the proposed partial/conditional measures and their corresponding estimators. Finally, the performance of the proposed measures are evaluated by simulation studies and real-data examples.


Introduction
Various non-model based measures were proposed in literature for describing associations between the variables in the ordinal contingency table. For a two-way ordinal contingency table, there are two types of association measure. The first one is a measure based on concordant and discordant pairs of observations and the second one is a correlation measure based on the scores assigned to the categories of each variable. The commonly used concordance and discordance measures include the gamma index (Goodman and Kruskal 1954), Kendall's tau-b (Kendall 1938) and Somers' d (Somers 1962), but they cannot detect nonmonotone relationships such as a Ushaped association (Agresti 2010, 187). For the correlation association measure, Kendall (1948) proposed an analog of Spearman's rho for contingency table which is the ordinary correlation based on the ridit scores (Bross 1985) of the ordinal variable. Note that Kendall's tau-b is the correlation coefficient obtained by assigning the sign scores to concordant and discordant pairs.
For three-way contingency tables with two ordinal variables of interest and a covariate, one can define the association measures using two approaches described above (i.e., concordance/discordance measure and correlation measure). To this end, it is important to adjust the association between the two ordinal variables for the effect of the covariate and two ways have been proposed in the literature, partial and conditional association measures (Lawrance 1976;Agresti 1977;Liu et al. 2018).
The partial association measure summarizes the relationship between two variables of interest after removing the effect of covariate. The commonly used partial association measures include the partial Kendall's correlation (Agresti 1977) and the partial Spearman's rank correlation using probability-scale residuals .
The conditional association is the measure of dependence between the variables of interest across the levels of the covariate, and thus it is a function of the covariate, unlike the partial association measure. When the underlying association between the variables differs across the levels of the covariate, the conditional association will provide valuable information on the nature of the dependence structure. The frequently used conditional association measures include Goodman and Kruskal's conditional gamma (Davis 1967), tau type measures (Kendall 1948;Agresti 1977), and conditional Spearman's rank correlation using probability-scale residuals .
In this paper we propose the new partial and conditional association measures designed to assess association between the ordinal variables of interest in the ordinal contingency tables while adjusting for the influence of categorical covariate(s). The proposed association measures are constructed based on two tools: subcopula scores and subcopula regressions. The subcopula score, as a new type of data-driven scores for an ordinal variable, can take into account the fact that the ordering of the categories of an ordinal variable is informative. The subcopula regression, first proposed by Wei and Kim (2017) to study the asymmetric association for the two-way contingency table, can yield useful information on dependence structure between the ordinal variables of interest and the covariate(s). In this paper we further develop the subcopula regression to evaluate the potential interactions between the ordinal variables of interest conditional on the covariate(s) in the ordinal contingency table. Note that the proposed subcopula scores and the subcopula regressions are obtained from the subcopula which uniquely links the joint distribution for the ordinal variables with their marginal distributions.
Utilizing the subcopula scores and the subcopula regressions, we propose new partial/conditional association measures between two ordinal variables, the correlations between two sets of subcopula scores for them adjusted by the subcopula regressions of two response variables on the covariate(s). The proposed measures can describe not only linear but also monotone nonlinear association between the ordinal variables. Moreover, through simulation studies, we also show that the proposed measures can identify a certain type of nonmonotone nonlinear dependence structure such as a U-shaped relationship.
To clearly present the proposed methods, we focus on the three-way contingency table with two ordinal variables of interest and a covariate. The article is structured as follows. In Sec. 2, we propose the subcopula scores and the subcopula regression for three-way ordinal contingency tables. Section 3 proposes the subcopula regression based partial and conditional measures to describe the association between two ordinal variables, adjusting for the effect of the covariate. Note that in Sec. 3.4 we also extend the proposed partial and conditional measures for the multiway contingency table with multiple categorical covariates. In Sec. 4, we present the estimators of the proposed partial and conditional association measures and derive their asymptotic distributions. Section 5 illustrates the proposed measures using simulation studies and real data analysis. Section 6 completes the paper with conclusions and future work.

Subcopula score and subcopula regression for ordinal contingency table
In this section, we briefly review the subcopula for the ordinal three way contingency tables. We then propose a new type of data-driven scores for the ordinal variables, subcopula score, and introduce the subcopula regression designed to capture dependence between two ordinal variables of interest and a covariate.

Subcopula score
Suppose that Y, Z, and X are ordinal variables with J, K, and I categories, having the supports S y ¼ fy 1 < Á Á Á < y J g, S z ¼ fz 1 < Á Á Á < z K g and S x ¼ fx 1 < Á Á Á < x I g, respectively, where "<" represents the natural ordering of the categories of an ordinal variable. Consider an ordinal three-way contingency table with the probability mass function (p.m.f. ::, J, k ¼ 1, :::, K, i ¼ 1, :::, I, and P J j¼1 P K k¼1 P I i¼1 p jki ¼ 1: Let p j , p k and p i be the j-th column, the k-th tube, and i-th row marginal p.m.f.s for Y, Z, and X, respectively, and p jk , p j i , and p ki be the two-way marginal p.m.f.s for (Y, Z), (Y, X), and (Z, X), respectively. In the rest of the paper, we will denote the ordinal variables Y and Z as the variables whose association is of interest and X as the covariate which may influence Y and Z.
Let the marginal distributions of Y, Z, and X to be F Y ðyÞ, F Z ðzÞ, and F X ðxÞ, respectively, and define the random variables V, W, and U as V¼F Y ðYÞ, W¼F Z ðZÞ, and U¼F X ðXÞ, respectively. Then, the ranges of V, W and U are defined to be D By Sklar's theorem (Sklar 1959), for the joint distribution H of Y, Z, and X, there exists a unique subcopula C (Nelsen 2006) on the domain D v Â D w Â D u which links the joint distribution H to its marginal distributions, that is and satisfies the three properties: (a) D v Â D w Â D u is the domain of C where D v , D w , and D u are subsets of ½0, 1; (b) For every ðv, w, uÞ 2 D v Â D w Â D u , Cðv, 1, 1Þ ¼ v, Cð1, w, 1Þ ¼ w, Cð1, 1, uÞ ¼ u, and Cðv, w, uÞ ¼ 0 if at least one of v, w, or u is zero; (c) Cðv 0 , w 0 , u 0 Þ À Cðv 0 , w, u 0 Þ À Cðv, w 0 , u 0 Þ À Cðv 0 , w 0 , uÞ þ Cðv, w, u 0 Þ þ Cðv 0 , w, uÞ þ Cðv, w 0 , uÞ À Cðv, w, uÞ ! 0, where ½v, v 0 Â ½w, w 0 Â ½u, u 0 is a rectangle with all vertices in D v Â D w Â D u : In addition, the joint p.m.f. of the subcopula C and the conditional p.m.f. of (V, W) given U are, respectively, When the assessment of the association between the ordinal variables is of main interest, it is important to take into account the informative ordering on the the categories of ordinal variables. The ranges of V, W and U in Eq. (2.1), D v , D w , and D u , reflect the ordering of the categories of Y, Z, and X, respectively: v 1 < ::: < v j < ::: < v J ¼ 1, w 1 < ::: < w k < ::: < w K ¼ 1 and u 1 < ::: < u i < ::: < u I ¼ 1: Moreover, D v , D w , and D u consist of the elements of the domains on which P ¼ fp jki g of X, Y and Z is uniquely connected to their marginals. Thus, we propose using D v , D w , and D u as a new set of scores for the ordinal variables Y, Z and X, and call them the subcopula scores.

Subcopula regression
In this section, we first review the subcopula regressions (Wei and Kim 2017) for two-way marginal contingency tables obtained from the three-way contingency table. We then further develop  the subcopula regression to evaluate the interactions between two response variables of interest  given a covariate in the three-way contingency table. First, we consider the two-way marginal tables of (Y, X) and (Z, X). Then, the conditional p.m.f.s of V given U, and W given U are given by, respectively, The subcopula regressions defined in Definition 2.1 can detect the relationships for two pairs (Y, X) and (Z, X), respectively.
Definition 2.1. (Wei and Kim 2017) The subcopula regressions of V on U and of W on U are defined to be We view r C VjU and r C WjU in Eqs. (2.3)-(2.4) as the mean subcopula score for Y and Z, weighted by their conditional distributions, in the i-th category of X, respectively.
To take into account the potential interactions between Y and Z (denoted as YZ) conditional on X, we define the subcopula regression of the product variable V Â W (denoted as VW) on U. Let D vw be the range of VW such that D vw ¼ fd 0 ¼ 0, d 1 , :::, d t , :::, d I vw ¼ 1g, (2.5) where I vw denote the number of distinct elements, d 0 ¼ 0 < d 1 < ::: < d t < ::: < d I vw ¼ 1 and t ¼ 0, :::, I vw : Let c VWjU be the conditional p.m.f. of VW given U such that Then, we define the subcopula regression of VW on U to capture the dependence between YZ and X: Definition 2.2. The subcopula regression function of VW on U is defined by, Note that D vw in Eq. (2.5) is interpreted as the subcopula score for YZ and the subcopula regression r C VWjU in Eq. (2.7) can be viewed as the mean subcopula score for YZ weighted by the joint conditional distribution at the i-th category of X.
For better understanding of the subcopula scores and subcopula regressions proposed in this section, we give a simple artificial example below.
Example 2.1. Table 1 is an artificial 3 Â 3 Â 3 table for an ordinal random vector (Y, Z, X) with the joint p.m.f. P ¼ fp jki g where j, k, i ¼ 1, 2, 3 and the supports of Y, Z and X are S y ¼ fy 1 < y 2 < y 3 g, S z ¼ fz 1 < z 2 < z 3 g and S x ¼ fx 1 < x 2 < x 3 g, respectively. Here, we treat Y and Z as the variables of interest and X as the covariate influencing Y and Z.
We see that the associations between Y and Z are heterogeneous, depending on the level of X. When X¼x 1 and x 3 , Y and Z have monotone patterns (concavely increasing for X¼x 1 and convexly increasing for X¼x 3 ), and when X¼x 2 , their relationship is perfectly negative linear. The marginal distributions for Y, Z, and X are fp j g¼f1=3, 1=3, 1=3g, fp k g¼f1=3, 1=3, 1=3g, and fp i g¼f1=3, 1=3, 1=3g: The probabilities in the marginal table of (Y, Z) are uniformly distributed (p jk ¼ 2=18 for all (j, k)), and the margin tables for (Y, X) and (Z, X) show positive and negative weak linear patterns, respectively.
We see that the subcopula regressions of V and W linearly increase positively and negatively (by 1/3) with u i (subcopula score of X), respectively, which indicates positive and negative weak linear patterns for (Y, X) and (Z, X), respectively. The subcopula regression of VW given U shows nonlinear pattern and its values differ depending on the value of u i , indicating heterogeneous pattern supported by Table 1.

Subcopula regression based partial and conditional association measures
When the association between two ordinal variables in the contingency table is of interest, it is desirable to adjust the pairwise association between them for the effect of the covariates. In this section, we propose new partial and conditional association measures for three-way ordinal contingency table with a single covariate and then extend them for multi-way contingency table with several categorical covariates. To this end, we first propose a new association measure for the two-way marginal table associated with the three-way contingency table.
Definition 3.1. The subcopula score based pairwise correlations of (Y, X), (Z, X) and (Y, Z) are defined by, respectively, where r uu , r vv , r ww , r vu , r vw , and r wu are given in Eq. (3.8).
The pairwise correlation in Eq. (3.9) can be viewed as an analog of Spearman's rho for continuous variables because it is the ordinary correlation applied to the subcopula scores of two ordinal variables and their two-way marginal p.m.f.s. Since the subcopula scores are the (relative) rank-type scores derived from the marginal distribution function of an ordinal variable and it is well known that the rank transformation tends to linearize monotonic nonlinear relationships (Conover and Iman 1981;Conover 1999), the proposed pairwise correlations in Eq. (3.9) can determine monotonic (including linear) association between two ordinal variables.
When the two ordinal variables are independent, the proposed correlation in Eq. (3.9) is zero (e.g., when p jk ¼ p j p k for every pair of (j, k), q YZ ¼ 0), but the converse does not in general hold. The zero value of the proposed correlation indicates that there is neither linear nor monotone nonlinear association between the two ordinal variables. The range of the proposed correlation is between À1 and 1. The larger (in absolute value) the pairwise correlation is, the stronger the two variables are (either linearly or monotonically) associated.
Example 2.1 (continued). We calculate the pairwise correlations in Eq. (3.9) for Table 1 This result is confirmed by the patterns in the two-way marginal tables for (Y, Z), (Y, X), and (Z, X).

Subcopula regression based partial correlation
We now propose a new partial association measure for the three-way contingency table to assess the association between the two ordinal variables once the influence of the covariate on them has been removed. To capture a general form of influence of the covariate on the two ordinal variables, we utilize the subcopula regression introduced in Sec. 2.2.
Definition 3.2. The subcopula regression based partial variance-covariance matrix for Y and Z given X is defined to be where r C VjU U ð Þ and r C WjU U ð Þ are the subcopula regressions defined in Definition 2.1, The subcopula regression based partial correlation of Y and Z given X is defined to be When Y and Z are conditionally independent on X, as will be shown in Theorem 3.1(1), q YZ X in Eq. (3.11) is equal to 0. The zero value of q YZX indicates that Y and Z are not only linearly uncorrelated, but also monotonically uncorrelated after removing the effects of X using subcopula regression. The range of q YZ X is from À1 to 1, and the larger (in absolute value) value of q YZ X represents the stronger (linearly/monotonically) association between Y and Z, with the effect of X eliminated. Note that q YZ X is not a function of X.
The partial variance-covariance in Eq. (3.10) is the variance-covariance between the residuals of the projections of V and W on the space spanned by the subcopula regressions of U. The Proposition 3.1 below shows that when the copula regressions r C VjU ðUÞ and r C WjU ðUÞ are linear functions of U, the partial variance-covariance in Eq. (3.10) can be viewed as the variance-covariance between residuals of the projections of V and W on the linear space spanned by U and thus the proposed partial correlation in Eq. (3.11) can be written as a function of the three pairwise correlations in Eq. (3.9).
Proposition 3.1. Suppose that the r C VjU ðUÞ and r C WjU ðUÞ are linear functions of U: where l v , l w , l u , r uu , r vu , r wu , q Y X and q ZX are given in Eqs. (3.8) and (3.9). Then, the subcopula regression based partial variance-covariance in Eq. (3.10) is where R 11 , R 12 , R 21 , and R 22 are given in Eq. (3.8), and the subcopula regression based partial correlation in Eq. (3.11) is where q Y X , q ZX , and q YZ are given in Eq. (3.9).
Proof. The proof is given in the Sec. S1 of the supplementary material.
w Example 2.1 (continued). Using the subcopula scores and subcopula regressions obtained from Table 1, we compute the partial correlation measure proposed in Eq. (3.11): q YZ X ¼ 1=8: This indicates that after removing the effect of X on Y and Z, the association between Y and Z becomes (weakly) positive. We also check if the proposed partial correlation can be computed from the three proposed pairwise correlations for (Y, Z), (Y, X), and (Z, X), as given in Proposition 3.1. To this end, we first compute the right hand sides of Eqs. (3.12) and (3.13): Then we evaluate these two regression functions at the subcopula scores for X: (5/9, 6/9, 7/9) for r C VjU ðU ¼ u i Þ and (7/9, 6/9, 5/9) for r C WjU ðU ¼ u i Þ at u i ¼ ð1=3, 2=3, 1Þ: Since these values are exactly the same as those obtained in Table 2, r C VjU ðUÞ and r C WjU ðUÞ are linear functions of U for Table 1 and thus the proposed partial correlation can be obtained from three pairwise correlations, q YZ , q YX and q ZX in Eq. (3.9): q YZ X ¼ 0Àð1=3ÞðÀ1=3Þ ffiffiffiffiffiffiffiffiffiffiffiffiffi ffi Remark 3.1. The partial association measure q YZÁX in Eq. (3.11) can be viewed as the pairwise correlation between Y and Z in Eq. (3.9) equipped with two adjusted subcopula scores, D Ã ¼ f Ã 1 ðiÞ, :::, Ã J ðiÞji ¼ 1, :::, Ig and D Ã w ¼ fw Ã 1 ðiÞ, :::, w Ã K ðiÞji ¼ 1, :::, Ig, and the joint probability of (Y, Z, X), fp jki g, where v Ã j ðiÞ ¼ v j À r VjU ðu i Þ and w Ã k ðiÞ ¼ w k À r WjU ðu i Þ are the subcopula scores for Y and Z adjusted by the subcopula regressions of V on U in Eq. (2.3) and of W on U in Eq.
(2.4), respectively. Note that D Ã v and D Ã w are the pooled scores across the levels of X. This means that q YZÁX approximates non-monotone nonlinear (such as U-shaped) association between Y and Z by way of a weighted linear regression between D Ã v and D Ã w with weights equal to fp jki g in the space of adjusted subcopula scores. Note the slope for the weighted linear regression of where r vvÁu and r wwÁu are given in Eq. (3.10).

Subcopula regression based conditional correlation
This section proposes a new conditional association measure to assess the dynamic nature of the dependence between two ordinal variables across the levels of the covariate.
Definition 3.3. The subcopula regression based conditional variance-covariance matrix of Y and Z given X is defined to be and r C VjU ðUÞ, r C WjU ðUÞ, and r C VWjU ðUÞ are subcopula regressions defined in Sec. 2.2. The subcopula regression-based conditional correlation of Y and Z given X is defined to be The q YZjX of Eq. (3.16) computed at the i-th category of X is As will be shown in Theorem 3.1(1), q YZjX¼x i is zero under the conditional independence between Y and Z on X ¼ x i : The zero value of q YZjX¼x i means that Y and Z are linearly/monotonically uncorrelated at X ¼ The range of q YZjX¼x i is from À1 to 1, and the larger (in absolute value) value of q YZjX¼x i represents the stronger (linearly/monotonically) conditional association between Y and Z given X ¼ x i : Example 2.1 (continued). Using the computed subcopula scores and subcopula regressions for Table 1, we compute the conditional correlation measure of Eq. (3.17): q YZjX¼x i ¼ 0.8 for X¼x 1 , -1 for X¼x 2 and 0.8 for X¼x 3 . This result supports the heterogeneous associations depending on the levels of X shown in Table 1: strong positive monotonic or perfect negative linear dependence.
Remark 3.2. We can view the conditional association measure q YZjX¼x i in Eq. (3.17) as the pairwise correlation between Y and Z in Eq. (3.9) equipped with two adjusted subcopula scores, D Ã ðiÞ ¼ f Ã 1 ðiÞ, :::, Ã J ðiÞg and D Ã w ðiÞ ¼ fw Ã 1 ðiÞ, :::, w Ã K ðiÞg, and the conditional joint probability of Y and Z given X ¼ This means that q YZjX¼x i approximates the nonmonotone nonlinear relationship between Y and Z given X ¼ x i using a weighted linear regression with the weights equal to fp jkji g attached to D Ã v ðiÞ and D Ã w ðiÞ in the space of adjusted subcopula scores. That is, the slope for the weighted linear regression of Theorem 3.1 below shows the properties of the proposed subcopula regression based partial/ conditional correlations given in Definition 3.2 and 3.3.
Theorem 3.1. Suppose that Y, Z, and X are the ordinal variables in an J Â K Â I contingency table with the joint p.m.f. P ¼ fp jki g. Let V, W, and U be the marginal distributions of Y, Z and X (i.e., V ¼ F Y ðYÞ, W ¼ F Z ðZÞ, U ¼ F X ðXÞ). Then (1) If Y and Z are conditionally independent given X, then q YZ X ¼ 0 and q YZjX¼x i ¼ 0 for each category of X, x i .
( (3) If Y is binary with support fy 1 , y 2 g andỸ is the variable with support fy 2 , y 1 g, then qỸ Z X ¼ Àq YZ X and qỸ ZjX ¼ Àq YZjX : Similarly, if Z is binary with support fz 1 , z 2 g andZ is the variable with support fz 2 , z 1 g, then q YZ X ¼ Àq YZX and q YZ jX ¼ Àq YZjX : Therefore, if both Y and Z are binary, then qỸZ X ¼ q YZ X and qỸZ jX ¼ q YZjX : (4) q YZX is invariant over the permutation of the categories of X.
Proof. The proof is given in the Sec. S2 of the supplementary material. w By Theorem 3.1(1), the conditional independence implies that the proposed partial and conditional correlation measures are both zero. Theorem 3.1(2) implies that a necessary and sufficient condition for the equivalence between the proposed partial variance-covariance in Eq. (3.10) and the expectation of the proposed conditional variance-covariance in Eq. (3.15) is that the average subcopula scores of Y and Z are linear in the subcopula score of X with respect to their corresponding conditional distributions.
Theorem 3.1(3) indicates that when one of the variables of interest is binary, the absolute values of the proposed partial and conditional correlations are invariant with respect to the permutation of the category of the binary variable. When both variables of interest are binary, the proposed measures are unchanged for the permutations of the categories of both binary variables. From Theorem 3.1(4) we see that the proposed partial correlation can be used even for a nominal covariate. Note that the proposed conditional correlation can be also used for a nominal covariate, as it is computed at each level of the covariate.

Subcopula regression based partial and conditional correlations for multiple categorical covariates
The proposed partial and conditional association measures in Secs. 3.2 and 3.3 are defined mainly for the ordinal three-way contingency tables with a single categorical covariate. It would be straightforward to extend the proposed association measures to the multi-way contingency table where there are more than one categorical covariate affecting two ordinal variables of interest. Consider a ðd þ 2Þ-way contingency table formed by the ordinal variables of interest Y and Z, and a d-dimensional covariate vector, X ¼ ðX 1 , :::, X d Þ T where each X ' in X is a categorical variable with I ' categories x ' 1 , :::, x ' I ' and ' ¼ 1, :::, d: Let the joint p.m.f. of ðY, Z, X T Þ T be p jki ¼ p jki 1 :: Given the covariate vector X, we can define the subcopula regression based partial correlation and the subcopula regression based conditional correlation between Y and Z, respectively: where 1 ¼ ð1, :::, 1Þ and I ¼ ðI 1 , :::, I d Þ are the index tuples of length d, and p jji ¼ p j i =p i , and p kji ¼ p ki =p i :

Estimation
This section presents the estimators for the subcopula scores, the subcopula regressions and the association measures proposed in Secs. 2-3. We also derive the asymptotic distributions for the estimators of the proposed association measures and their Fisher transformations which help to obtain the estimators for the asymptotic variances of the proposed estimators by the plug-in principle and then construct the asymptotic confidence intervals. Let fn jki g, j¼1, :::, J, k¼1, :::, K, i¼1, :::, I denote the cell counts in a J Â K Â I contingency table obtained by classifying n(¼ P J j¼1 P K k¼1 P I i¼1 n jki ) cases with respect to J, K and I categories of Y, Z and X. The one-and two-dimensional margin sums are denoted as Then, we define the estimators for p j , p k , p i , p jk , p j i , and p ki bŷ The subcopula scores given in Eq. (2.1) for Y, Z, and X are estimated byD v ¼ fv j g,D w ¼ fŵ k g, Using the estimators above, we obtain the estimators for the subcopula regressions of V on U, of W on U, and of VW on U in Eqs. (2.3), (2.4), and (2.7): wherep jji ¼p ji =p i ,p kji ¼p ki =p i , andp jkji ¼p jki =p i :

Estimation of subcopula score based pairwise correlation
We first give the estimators for the subcopula score based pairwise correlations of (Y, X), (Z, X), and (Y, Z) given in Definition 3.1: wherer vv ,r ww ,r uu ,r vw ,r vu , andr wu are given bŷ The theorem below presents the asymptotic distributions of the estimators in Eq. (4.20).
Proof. The proof is given in the Sec. S3 of the supplementary material.

Estimation of subcopula regression based partial correlation
We give the estimator for the subcopula regression based partial correlation of Y and Z given X of Eq. (3.11):q Theorem 4.2 presents the asymptotic distribution of the estimator in Eq. (4.21).

Estimation of subcopula regression based conditional correlation
Below we give the estimator of the subcopula regression based conditional correlation of Y and Z given X of Eq. (3.17): The following theorem presents the asymptotic distribution of the estimator in Eq. (4.22).
w In practice, estimating the large sample distributions of the Fisher transformation ofq YZ ,q YZ X , andq YZjX¼x i in Eqs. (4.20)-(4.22) tends to lead to faster convergence to normality and more reliable confidence intervals of the proposed measures. In the real data analysis of Sec. 5.3, we will obtain the confidence intervals for the proposed measures by the asymptotic distributions of the Fisher transformation ofq YZ ,q YZ X , andq YZjX¼x i : Remark 4.1. The partial and conditional correlation measures for multiple categorical covariates given in Eqs. (3.18) and (3.19) can be estimated in a similar fashion using the cell counts fn jki 1 :::i d g in a ðd þ 2Þ-way contingency table obtained by classifying n subjects with respect to J, K, I 1 :::, I d , categories of Y, Z and d covariates X 1 , :::, X d where j ¼ 1, :::, J, k ¼ 1, :::, K, i ' ¼ 1, :::, I ' and ' ¼ 1, :::, d: Remark 4.2. For small and moderate sized data sets, one can also use the bootstrap method (Efron and Tibshirani 1994) to construct the confidence intervals for the proposed (partial/conditional) measures. In the data analysis section (Sec. 5.3) we will also compute the bootstrap biascorrected and accelerated (BCa) confidence interval using 10,000 bootstrap samples of size n that are simulated from the saturated log-linear model fitted to the observed contingency table.

Numerical examples
In this section, the performances of the proposed subcopula regression-based association measures are illustrated with simulation studies and real data sets.

Simulation study
We conduct the simulation study where the proposed partial correlation in Eq. (3.11) and conditional correlation in Eq. (3.17) will be evaluated in the J Â K Â I contingency table with two ordinal variables, Y with J levels and Z with K levels, and a covariate X with I levels. We will also compare the proposed measures with two partial correlation measures and three conditional association/correlation measures commonly used in the literature: Goodman and Kruskal's conditional gamma (Davis 1967), conditional/partial Kendall's tau-b (Kendall 1938;Agresti 1977), and conditional/partial Spearman's rank correlations using probability-scale residuals .
The conditional gamma c YZjX¼x i for the i-th level of X is given by The partial Kendall's tau-b s YZ X and the conditional Kendall's tau-b s YZjX¼x i for the i-th level of X are defined by where s YZ , s YX , and s ZX represent the pairwise Kendall's taus for the corresponding two way marginal tables, and P cji ¼ 2 k 0 <k p j 0 k 0 ji Þ are the conditional concordance and discordance probabilities on the i-level of X.
The partial Spearman's rank correlation r s YZX and conditional Spearman's rank correlation r s YZjX¼x i for i-th level of X are given as r s YZX ¼ corrðrðY, F YjX Þ, rðZ, G ZjX ÞÞ, (5.25) where F YjX and G ZjX denote the conditional distribution functions of Y given X and Z given X, respectively, and rðx, FÞ ¼ FðxÀÞ þ FðxÞ À 1 denotes the probability-scale residual of X ¼ x for an ordinal random variable X with the distribution F.

Simulation design
To investigate the performance of the proposed partial/conditional association measures and compare them with the commonly used partial/conditional measures mentioned above, we consider the four simulation scenarios regarding the relationship between two ordinal variables Y and Z given a covariate X: [I] Conditional independence: Y and Z are conditionally independent at each level of X; [II] Linear pattern: the score in Z increases linearly as the score in Y increases linearly at each level of X; [III] Monotone nonlinear pattern: the score in Z increases exponentially as the score in Y increases linearly at each level of X; [IV] Nonmonotone nonlinear (U-shaped) pattern: the score in Z increases and decreases parabolically as the score in Y increases linearly at each level of X.
To simulate the three-way contingency tables under the scenarios [I]-[III], we use the homogeneous linear-by-linear association log-linear model (Agresti 2010): (5.27) where fl jki g is the expected count in the (j, k, i) cell of the J Â K Â I table and the parameter constraints are k Y 1 ¼k Z 1 ¼k X 1 ¼0 and k YX j1 ¼k YX 1i ¼k ZX k1 ¼k ZX ki ¼0 for all j, k and i. Note that exp ðbÞ gives the conditional local odds ratios describing local YZ association at a value of X.
For the scenario (I) (conditional independence), we simulate the J Â K Â I tables from Eq. (5.27) with b ¼ 0. For the scenario (II) (linear pattern), we use the unit-spaced scores for both Y and Z (a j ¼ 1, :::, J and b k ¼ 1, :::, K) and consider the three values of b representing weak, moderate, strong linear association between Y and Z. For the third scenario (III) (monotone nonlinear pattern), we choose the unit-spaced scores for Y (a j ¼ 1, :::, J) and the quadratic spaced scores for Z (b k ¼ 1 2 , :::, K 2 ).
For the scenario (IV) (U-shaped pattern), we use the homogeneous column effect log-linear model with quadratic term (Agresti 2010): (5.28) where the unit-spaced scores for Y is used (a j ¼ 1, :::, J), and x k and g k are the parameter scores for Z under constraints x 1 ¼ g 1 ¼ 0: The detailed information on the parameter values of the models (Eqs. (5.27) and (5.28)) used in the simulation study is given in the Sec. S6.1 of the supplementary material. Under each scenario described above, we consider two additional factors, the sizes of the contingency tables (the values of J, K and I) and sample sizes (n): 5 Â 5 Â 2 and 3 Â 3 Â 2 tables (J, K ¼ 3, 5, and I ¼ 2) and n ¼ ð1000, 5000Þ: Finally, under each combination of the simulation factors, we simulate 500 J Â K Â I contingency tables of size n and calculate the following three partial and four conditional association measures: the proposed subcopula regression based partial correlation q YZ X in Eq. (3.11), the partial Spearman's rank correlation r s YZX in Eq. (5.25), the partial Kendall's tau-b s YZX in Eq. (5.24); the proposed subcopula regression based conditional correlation q YZjX¼x i in Eq. (3.17), the conditional Spearman's rank correlation r s YZjX¼x i in Eq. (5.26), the conditional gamma c YZjX¼x i in Eq. (5.23), the conditional Kendall's tau-b s YZjX¼x i in Eq. (5.24). Note that the existing partial and conditional measures above were calculated using R packages devtools (Wickham, Hester, and Chang 2018), DescTools (Signorell 2019), and PResiduals (Dupont et al. 2018).
In order to effectively compare the performances of the proposed measures and the existing association measures, we present the simulation results using the boxplots. For each combination of the simulation factors (scenario for association pattern, magnitude of association under the scenario (II), size of the contingency table and sample size), the four box plots for the conditional association measures at each level of X (ordered as subcopula regression based correlation, Spearman correlation, Gamma and Tau) and three box plots for the partial association measures (ordered as subcopula regression based correlation, Spearman correlation, and Tau) are presented. The true value of each association measure is represented by the black dot on each boxplot. Note that due to the limited space, the paper will give only the results of the two simulation scenarios [III] and [IV]. For the results of [I] and [II], see the Sec. S6.2 of the supplementary material. Figures 1 and 2 show the boxplots of the conditional/partial association measures for 3 Â 3 Â 2 and 5 Â 5 Â 2 tables with n ¼ ð1000, 5000Þ and a monotonic non-linear pattern. Note that Figures 1 and 2 include the results for the four conditional measures and the three partial measures, respectively. The conditional Gamma measure is much larger than the other three conditional association measures and the conditional/partial tau measure is smaller than the remaining conditional/partial measures. The subcopula regression-based correlation and the Spearman correlation perform similarly. Note that the biases and variabilities of all four (conditional/partial) measures decrease as n increases. Figures 3 and 4 are the boxplots of the conditional/partial association measures for 3 Â 3 Â 2 and 5 Â 5 Â 2 tables with n ¼ ð1000, 5000Þ and a U-shaped pattern. The major distinction between the proposed subcopula regression based correlations and the other measures is that the proposed measures are able to identify the existence of a U-shaped relationship in the simulated contingency table, unlike the other measures. The true values of proposed subcopula regression based correlation measure are far from zero (around 0.5), while the true values of the other association measures are very close to zero. The sampling distribution of each measure is around its true value and the variabilities of the proposed measures are smaller than those of the other measures, regardless of the size of the table (the values of (J,K)) and the sample size, n.

Further investigation on the U-shaped associations
The simulation results of Sec. 5.1.3 showed that the proposed association measures are able to detect the U-shaped association simulated from the homogeneous column effect log-linear model with quadratic term. As a reviewer pointed out, it would be insightful to further assess the performance of the proposed association measures for different types of U-shaped associations.
In order to flexibly simulate contingency tables for two ordinal variables Y and Z with different U-shaped patterns at each level of binary covariate X, we employ the quadratic latent variable regression approach based on the underlying continuous variables Y Ã and Z Ã for Y and Z. First, we generate nð¼ 10 6 Þ values of ðY Ã , Z Ã , XÞ from the four quadratic latent variable regression models, each describing different relationship between Y Ã and Z Ã : Given nð¼ 10 6 Þ observations of ðY Ã , Z Ã , XÞ simulated from each quadratic regression model, we then discretize Y Ã and Z Ã to construct three 5 Â 5 Â 2 contingency tables of sample size n, each with one of three U-shaped patterns: (a) smooth, (b) pointy and (c) flat patterns. Note that we choose suitable discrete marginal distributions for Y and Z to generate an intended U-shaped association in the constructed contingency table. The detailed information on the parameter values of the quadratic latent variable regression model and the discrete marginal distributions is provided in Sec. S6.3 of the supplementary material. Figure 5 presents twelves 5 Â 5 Â 2 tables with different types of U-shaped associations and the corresponding estimated values of the partial correlation measure and the conditional correlation measures at each level of X. The U-shaped patterns shown in Figure 5a are smoother than those shown in Figure 5b,c, and the U-shaped patterns in Figure 5b,c are more pointy and flatter than the other patterns, respectively. Under each of three U-shaped patterns, we present the four contingency tables that differ in the relationship between two ordinal variables (Y is a function of Z and vice versa) and the ways in which the larger cell counts are located along the U-shaped pattern. Note that, in each contingency table the darker the gray colors, the larger the cell counts.
From the results given in Figure 5, we have made the following observations. First, the estimated values of the proposed association measures are all non-zero for the three types of Ushaped associations. Note that the existing association measures considered in Sec. 5.1 were all very close to 0. Second, the proposed association measures tend to take higher values (in absolute  value) for smooth U-shaped patterns than for pointy and flat U-shaped patterns. This is because, for the table with smooth U-shaped patterns, the category of Z(Y) at which the largest count occurs has only one category of Y(Z) with non-zero count. Third, for each U-shaped pattern, the four contingency tables simulated from four simulation regression models have similar estimated values of the association measures (in absolute value), except their signs. The signs of the proposed measures are determined by the direction of the pattern of larger (joint/conditional) cell counts (weights in the weighted linear regression used to approximate U-shaped associations between Y and Z) in the space of adjusted subcopula scores.

Data analysis
In this section, we demonstrate the utility of the proposed (pairwise/partial/conditional) correlation measures using two real data sets, one from the patient satisfaction study and the other from the study on toxemia of pregnancy. Table 3, obtained from the patient satisfaction study in a hospital in Naples, shows the responses of 1050 patients according to three ordinal variables, overall Satisfaction (S), hospital Cleanliness (C) and Quality management (Q). They are measured on a scale from 1 (Low) to 4 (High): S1 -S4 for Satisfaction, C1 -C4 for Cleanliness, Q1 -Q4 for Quality. Beh, Simonetti, and D'Ambra (2007) analyzed Table 3 using the Marcotorchino index (Marcotorchino 1984a(Marcotorchino , 1984b(Marcotorchino , 1985 to understand the dependence structure where Satisfaction is treated as the response variable and Cleanliness and Quality as the predictors. To assess association between the response variable and one predictor, with the effect of the other predictor removed, we compute the proposed pairwise and partial correlation for this data. Table 4 shows the estimates of the pairwise and partial correlation measures and the corresponding 95% (asymptotic/bootstrap BCa) confidence intervals for the data in Table 3. We first see that all the correlation measures are positive and they are statistically significant because none of the confidence intervals include zero. This means that both Cleanliness and Quality are  statistically significant in influencing Satisfaction. The magnitudes of both pairwise and partial associations indicate that Quality appears to be more influential in the level of Satisfaction than Cleanliness (i.e.,q SC <q SQ andq SCQ <q SQC ). In addition, the association between Cleanliness and Quality is statistically positive. Note that the interpretation of the results given above are consistent with those from the analysis by Beh, Simonetti, and D'Ambra (2007).

Patient satisfaction data
We also observe that the influence of one predictor on the response variable is overestimated without removing the effect of the other predictor. That is, the 18.43% reduction in association strength results from removing the effect of Quality in the assessment of the association between Satisfaction and Cleanliness (i.e.,q SC ¼ 0.5211 >q SCQ ¼ 0.4250). For the evaluation of the association between Satisfaction and Quality, the Cleanliness accounts for 15.17% of the total correlation (i.e.,q SQ ¼ 0.5804 >q SQC ¼ 0.4923). Note that the bootstrap confidence intervals for two partial correlations tend to be shorter than the asymptotic intervals. Table 5 shows a 2 Â 2 Â 5 Â 3 table concerned with the occurrence of signs of toxemia (hypertension and protein urea) among 13,384 expectant mothers in Bradford (England) in their first pregnancy (Brown, Stone, and Smith 1983). The mothers were classified by social class (I-V) and by the number of cigarettes smoked per day during pregnancy (0, 1-19, 20þ) and so there are two response variables (hypertension (H) and protein urea (P)) and two ordinal explanatory variables (social class (C) and number of cigarettes smoked (S)). One of the main questions of interest here is how the association between hypertension and protein urea is related to the level of smoking and social class. Table 6 presents the estimates of the proposed pairwise correlations for Table 5. From the pairwise correlation results, we first see that hypertension and protein urea are moderately positively associated (the corresponding confidence interval does not include zero), indicating that one symptom is more likely to occur with the other symptom. We also observe that the prevalence of hypertension is slightly positively associated with the level of smoking and the prevalence of protein urea may be associated with neither of two covariates. Note that the bootstrap BCa interval for q HC does not include zero, unlike the corresponding asymptotic interval.  Table 7 shows the estimates of the proposed conditional correlations between hypertension and protein urea, controlling for both level of smoking and social class simultaneously. We see that the occurrence of these symptoms has a rather complex relationship with level of smoking and social class. First, the associations between hypertension and protein urea appear to be greater for nonsmokers (S ¼ 0) than smokers with "1-19," particularly in social classes I-III. Second, the associations between two symptoms for the heavy smokers (S ¼ 20þ) vary depending on social class, and their patterns are quite different from those for the mothers with S ¼ "0" and "1-19." Note that the conditional correlations for two combinations of the categories of level of smoking and social class (20þ, I) and (20þ, II) may be poorly estimated due to small sample size.

Discussion
In this paper, we propose the new partial/conditional association measures between two ordinal variables of interest, adjusting for the covariate(s). They are constructed based on the subcopula scores and the subcopula regressions to describe the dependence between two ordinal variables and the covariate(s) in a non-model based fashion. In particular, we investigate statistical properties, including asymptotic distributions, of the proposed association measures and their estimators for ordinal three-way contingency table with a single covariate. It is also shown by simulation that the proposed measures can describe not only the linear/monotone association but also Ushaped associations.
As explained in Remark 3.1 and 3.2, the proposed association measures approximate the Ushaped association by the weighted linear regression between two ordinal variables of interest given a covariate in the space of adjusted subcopula scores. This indicates that the proposed measures cannot take the value of 1 for the tables showing an exactly quadratic pattern. However, we think that nonzero values of the proposed measures can be considered as an indication of the association between two ordinal categorical variables given covariates especially when existing association measures are all very close to 0 as shown in Sec. 5.1. As a reviewer suggested, it is important to further investigate the ability of the proposed measures to detect non-monotonic associations (such as U-shaped association) and understand how to interpret the values of the proposed measures.
In the real data analysis, we used the bootstrap approach to quantify the uncertainty associated with the proposed association measures adjusting for the multiple categorical covariates. An extension of this work would be to investigate the asymptotic distributions of the estimators for the proposed measures controlling for the multiple categorical covariates and derive the corresponding asymptotic confidence intervals.
The proposed partial and conditional association measures are defined mainly for the ordinal contingency tables with categorical covariates. In practice, there may be several continuous or mixed covariates to be adjusted for the assessment of association between two ordinal variables. Future work will consider the development of the subcopula regression based partial and conditional association measures that would permit the adjustment of continuous or mixed covariates.