Joint estimation of disease-specific sensitivities and specificities in reader-based multi-disease diagnostic studies of paired organs

Binocular data typically arise in ophthalmology where pairs of eyes are evaluated, through some diagnostic procedure, for the presence of certain diseases or pathologies. Treating eyes as independent and adopting the usual approach in estimating the sensitivity and specificity of a diagnostic test ignores the correlation between fellow eyes. This may consequently yield incorrect estimates, especially of the standard errors. The paper is concerned with diagnostic studies wherein several diagnostic tests, or the same test read by several readers, are administered to identify one or more diseases. A likelihood-based method of estimating disease-specific sensitivities and specificities via hierarchical generalized linear mixed models is proposed to meaningfully delineate the various correlations in the data. The efficiency of the estimates is assessed in a simulation study. Data from a study on diabetic retinopathy are analyzed to illustrate the methodology.


Introduction
Subjecting patients to several tests or to the same test on several occasions is a common protocol in many diagnostic studies in medicine and health. This is especially true in situations where test outcomes are reader-based assessments, in which case two or more readers are necessary to minimize, if not eliminate, so-called reader bias. Simply assuming that the test results are independent is unwise, since associations among these observations often exist; for example, test results from the same patient are frequently correlated. The proper accounting of these associations, which intuitively exist between measurements from a patient, is an interesting statistical question. Failure to account for such correlations by treating measurements from the patient as independent may consequently yield incorrect inferences. Because valid information contained in correlated observations is less than expected from independent observations, standard errors (SEs) calculated by incorrectly assuming correlated observations to be independent tend to underestimate the true sampling variability, consequently yielding inflated type I errors of significance tests [18,22]. This paper is motivated by a reader-based diagnostic study of diabetic retinopathy [28] in Alberta, Canada, in which at least two readers were used to diagnose the presence or absence of certain pathologies (e.g. clinically significant macular edema (CSME), hard exudates (HEX), etc.) among diabetic patients who suffer from treatable diabetic retinopathy. In Canada, where a disproportionate number of diabetic patients are First Nations Canadians living on reserves in farflung rural areas, sending ophthalmologists on remote clinics can be costly and inefficient. Due to advances in digital imaging in recent years, a possible alternative is distance evaluation wherein patients undergo stereoscopic digital photography using a high-resolution digital camera. In this approach, digital images of patients' eyes are read by at least two opthalmologists and patients are diagnosed as either positive (i.e. disease is present) or negative (i.e. disease is absent) for the pathologies. This cost-effective tele-ophthalmologic technique has the potential to increase rural accessibility to specialist eye care [20], allowing for early detection and treatment of diabetic retinopathy. Only patients who need treatments would have to travel to a specialist; the transportation cost is thus also reduced. However, before wide implementation of any potential new diagnostic methodology, its accuracy must first be examined. The purpose of the study was thus to determine whether diabetic retinopathy can be identified with high-resolution stereoscopic digital photography and whether this identification correlates well with the accepted gold standard of clinical examination. In such reader-based diagnostic studies, it is often of interest to estimate the overall sensitivity and specificity, independently of readers. Making the usual assumption that the test results for different patients are independent, there is a complex correlation structure to a patient's diagnostic data that needs to be incorporated in the analysis.
The accuracy of a diagnostic or screening test can be described by several measures, the most common of which are given by the test's sensitivity and specificity with respect to the true disease status as determined by the 'gold standard'. Sensitivity is the probability of a diseased patient having a positive test result and specificity is the probability of a non-diseased patient having a negative test result. There has been previous work on the estimation of sensitivity and specificity and the calculation of the estimates' SEs in the context of clustered binary diagnostic data. These include simple adjustments to SEs introduced by Rao and Scott [24] and Donner and Klar [4] to account for the intra-cluster correlation, and a weighted estimate proposed by Lee and Dubin [12] for handling unbalanced cluster sizes. A similar approach based on weighting was discussed by Leite and Nicolosi [15] in the context of logistic regression analysis of binocular ophthalmologic data. Traditional estimation methods that ignore correlations in clustered diagnostic data have been shown to be inadequate as they underestimate SEs and lead to incorrect inferences [6]. Methods that account for these correlations are thus needed. Approaches to handling this problem include the generalized estimating equations (GEE) approach of Smith and Hadgu [29] (see also Sternberg and Hadgu [31] and Leisenring et al. [14]) and those based on likelihood methods studied by Hujoel et al. [11] and Rosner [27], among others (see also Sutradhar and Das [32] and Lefkopoulou et al. [13]). More recently, Hsiao et al. [8] discussed the assessment of inter-and intra-rater reliability for data structures with more than one level of nesting from a Bayesian point of view. One shortcoming of their model is that associations are restricted to be positive. A number of authors have also discussed estimating diagnostic accuracy without a gold standard; a latent class approach is common in this situation (see Hui and Walter [9], Hui and Zhou [10] and Zhou et al. [36]). Such an approach assumes that multiple readers/tests are independent, conditional on the true disease status. However, this assumption cannot always be justified and may not be valid in practice. Alternatively, Qu et al. [23] developed a general latent class model with normally distributed subject-specific random effects to model the conditional dependence among multiple diagnostic tests. In a further extension, Wang and Zhou [34] recently incorporated normal subject-specific random effects while assuming fixed effects for the raters. Recent references on correlated ophthalmologic data include de Leon et al. [16,17].
A generalized linear mixed model (GLMM) for estimating disease-specific sensitivities and specificities using correlated binocular binary data from reader-based multi-disease diagnostic studies is discussed in this paper. Such an approach allows for the flexible incorporation of the various correlations in the data and the resulting methodology can be readily implemented using standard statistical packages and software. The development of the model via a latent variable formulation of the binary outcomes is presented in Section 2; a brief discussion of the advantages of the model is also included. Model estimation via pairwise likelihood (PL) is then proposed in Section 3 as a means of alleviating the computational demands of a full likelihood analysis. In Section 4, the bias and efficiency of PL estimates are studied and compared against those of GEE estimates. An application to data from the diabetic retinopathy study described earlier is used to illustrate the methodology in Section 5. A brief summary concludes the paper in Section 6.

Correlated probit model
Assume that N patients are evaluated by K readers for the presence or absence of V pathologies in their left and right eyes. Define Y ijkv as the binary outcome representing the assessment of pathology v = 1, . . . , V , by reader k = 1, . . . , K, for eye j = L (left), R (right), of patient i = 1, . . . , N, with Y ijkv = 1 or Y ijkv = 0, accordingly as whether the test result is positive or negative, respectively. For patient i, let D ijv denote the disease status as determined by the gold standard, and define x ij as a vector of known covariates for eye j of patient i. We assume that the (observed) binary outcome Y ijkv is associated with an underlying latent continuous variable Y * ijkv by the threshold model Y ijkv = I{Y * ijkv > 0}, where I{·} is the indicator function. The underlying linear mixed model (LMM) is given by for all i, j, k and v, with γ the vector of regression coefficients, β v0 and β v1 the pathology-specific fixed effects, . . .
where B (p) i0 and B (p) i1 are respective pathology-specific random intercepts and slopes for disease status, B (r) i represents reader-specific random effects, and B (e) i represents eye-specific random effects and where we assumed independence among the different random effects. We also have * ijkv iid ∼ N(0, 1) as the errors, assumed to be independent of the random effects. It follows that the Y * ijkv 's have a joint latent Gaussian distribution, which implies a multivariate probit model for the binary outcomes. That is, the model given by Equation (1) yields a probit mixed model for Y ijkv given by where (·) is the standard Gaussian cumulative distribution function (CDF). Note that we assumed the cutpoints or thresholds to be all 0 for identifiability, since we included the intercept β v0 in the GLMM given by Equation (2). In addition, we made the conventional assumption that var(Y * ijkv |B i1 of true disease status induce between-pathology correlations via their covariance matrix (p) . The random effects B (r) i represent between-reader heterogeneity and thus, account for correlations between readers arising from the fact that the readers base their pathology diagnoses on the same image of a patient's eye, while the random effects B (e) i account for correlations between fellow eyes. Observe that the associations incorporated in the model given by Equation (1) are not restricted only to positive correlations.
Define the stacked vector . The multivariate LMM given by Equation (1) (i.e. the multivariate GLMM for the binary outcomes) then takes the form where the stacked errors are iid according to an M -dimensional centered Gaussian distribution (i.e. with zero means) and covariance matrix = I M , the M × M identity matrix. Here, X i is the matrix containing the disease status and eye-specific covariates for patient i, and β = (γ |β 10 , β 11 | · · · |β V 0 , β V 1 ) ; Z 1i , Z 2i and Z 3i are the design matrices for the respective random effects B i and B (e) i , representing pathology-specific, reader-specific and eye-specific random effects, respectively; note that Z 1i contains the disease statuses D ijv . In addition, B i and * i are assumed to be independent. It is then easy to see that the marginal distribution of Note that although * i varies with i, its elements do not depend on i. We thus get a (marginal) multivariate probit model for Y i given by where φ (c) M (·; * i ) is the centered M -dimensional Gaussian density with covariance matrix * i , with the interval A ih either (−∞, 0] or (0, +∞) accordingly as whether y ih is 0 or 1, and with D i the vector of true disease statuses for patient i.
Suppressing the indices, it follows that We thus have eye-specific sensitivities Sen jv and specificities Spc jv for pathology v as for v = 1, . . . , V and j = L, R. In many applications, the covariates x L and x R are usually measured at the patient level, so that x L = x R = x. This implies that Sen Lv = Sen Rv = Sen v and Spc Lv = Spc Rv = Spc v . That is, the sensitivity and specificity of a diagnostic test for a pathology are independent of the particular eye under consideration. Note, however, that even in the absence of eye-level covariates, it is still possible to obtain eye-specific sensitivities and specificities by varying the parameters β v0 and β v1 with the eyes. Finally, because ophthalmologists are generally interested in measures of accuracy that are independent of the particular reader conducting the diagnostic procedures, Sen v and Spc v are conveniently free of k.
It can be shown that for any diagonal matrix G with positive diagonal elements, This implies that the variances in the covariance matrix * cannot be estimated based on the likelihood function [35] and hence only the correlations are identifiable (see appendix for a detailed discussion). 1 For this reason, we assume that (r) and (e) in Equation (3) are correlation matrices. In this setting, by including different random effects for eyes and readers, fixing variances to unity does not unduly constrain the model. By allowing for an unstructured (p) , the researcher has the flexibility of having different correlations for different pathologies. A noteworthy strength of the joint model given by Equation (3) lies in the flexibility in the way it accounts for and delineates the various correlations in the data through the introduction of reader-specific, pathology-specific and eye-specific random effects. The following section discusses the various marginal correlations induced by our model. As discussed in Section 3, the joint model also lends itself to computationally straightforward estimation of its parameters via PL.

Marginal correlations
Under the Gaussian latent specification of the binary outcomes in Y i , the model in Equation (3) yields a correlated probit model for Y i . This model is very flexible in accounting for the correlation structure of clustered multiple binary variables because of the underlying multivariate LMM for the Gaussian latent vector Y * i . Based on this, the associations among the binary outcomes are measured by the correlations between the corresponding latent continuous variables. These correlations are called the tetrachoric correlations between the binary outcomes and are commonplace in psychometrics. Because test results for different patients are assumed to be independent, the associations between the outcomes for a specific patient are captured solely by the three different random effects. For simplicity, we assume in what follows that only random intercepts B (p) i0 for the pathologies are included in the joint model. Using corr(Y * h , Y * h ) = σ * hh / σ * hh σ * h h , the marginal tetrachoric correlations are then found to be where ρ (p) vv , ρ (r) kk and ρ (e) are the correlations between the pathology-specific random effects (i.e. the off-diagonal elements of (p) ), the reader-specific random effects (i.e. the off-diagonal elements of (r) ) and the eye-specific random effects (i.e. the off-diagonal element of (e) ), respectively, with σ iv ) (i.e. the diagonal elements of (p) ). In practice, we may assume a common correlation between the reader-specific random effects (i.e. exchangeability of B (r) i1 , . . . , B (r) iK ), in which case we have ρ (r) = ρ (r) kk , for all k = k . Based on Equations (8)- (14), it is clear that our model provides a flexible specification of the dependence structure in the data.

Likelihood estimation
Marginally, Y * i follows an M -dimensional Gaussian distribution with mean μ * i and covariance matrix * i . As shown in Section 2.1, * can accommodate a complex correlation structure for the data. However, although the full joint model for Y * i is completely specified, the evaluation of the corresponding multivariate Gaussian orthant probabilities is difficult in practice. For instance, the case of two readers and two pathologies necessitates the evaluation of 2 × 2 × 2 = 8 multivariate Gaussian orthant probabilities. To obviate the computational complexity involved, an alternative is to use a pseudo-or composite likelihood function obtained from low-dimensional margins [1,3]. Selection of low-dimensional margins is based on computational savings. Here we adopt the PL method outlined in [25,26].

PL estimation
The PL contribution of subject i is obtained by replacing the full likelihood function (i.e. the density of Y * i ) by the product of the pairwise probabilities for the pairs Y * ih and Y * ih , for h < h.
for any pair h < h, then the contribution p i of patient i to the log-PL can be written as (δ (11) ihh log p (11) ihh + δ (10) ihh log p (10) where δ ( r) ihh = 1, if Y ih = and Y ih = r, and δ ( r) ihh = 0, otherwise, with Note that where 2 (·; τ ) is the standardized bivariate Gaussian CDF with correlation τ , and ρ hh is the pairwise correlation between Y * ih and Y * ih , obtained by selecting the appropriate 2 × 2 submatrix of * i . The PL estimateˆ of , which contains β and the unique parameters in (p) , (r) and (e) , is obtained by maximizing log PL = N i=1 p i with respect to . This is accomplished by solving the pairwise score equations 0 = U p ( ) = ∂ log PL/∂ . Since the pairwise score is a linear combination of valid likelihood score functions, then its unbiasedness follows under the usual regularity conditions. See also Fieuws and Verbeke [5] for a variant of this method.

Calculation of SEs ofˆ
Following Renard et al. [26], the variances of the PL estimates can be obtained from the inverse of the Godambe information matrix G( ) = H( )J −1 ( )H( ) (also known as the sandwich information matrix; see, e.g. Song [30]), where For clustered data, H( ) and J( ) are estimated by where U p i ( ) = ∂p i /∂ is the pairwise score equation for patient i. The SEs for the estimates inˆ are obtained from the estimated covariance matrixˆ Under standard regularity conditions,ˆ is consistent and has an asymptotic multivariate Gaussian distribution [2] with mean and covariance matrix G −1 ( ). The respective estimates Sen v and Spc v of Sen v and Spc v are directly obtained by the plug-in principle, and their SEs are given by the delta method. That is, cov Sen

Simulation study
The goal of this simulation study is to examine the PL approach to estimation for the correlated probit model based on Equation (3) in terms of the finite-sample properties of the resulting estimates, including those for the sensitivities and specificities of the pathologies. Further, we also compare the estimates of pathology-specific sensitivities and specificities based on the PL method with those obtained by GEE, with a user-defined marginal working-correlation structure (similar to that for the correlated probit model) as well as with the 'independence' workingcorrelation matrix, which we refer to as the 'crude' method. The performance of the estimatesθ g of θ g were evaluated based on their relative bias RB = 100 × {(mean ofθ g ) − θ g }/θ g and their relative efficiency RE = {mean of SEs}/{empirical SD ofθ g }, where SD denotes standard deviation. We also investigated the empirical coverage properties of the estimates by constructing the 95% confidence intervals (CIs)θ g ± 1.96 × SE(θ g ) for each parameter θ g , and obtaining the proportion of simulation repeats for which the corresponding CI captured θ g . For the simulations, we considered the case of two readers and two pathologies (i.e. V = K = 2). The underlying LMMs from which the data were simulated are given by with * ijkv iid ∼ N(0, 1), and  (20) and (21). Those for the sensitivities and specificities were obtained by delta method.
Simulation results are presented in Table 1. These results generally suggest that PL estimates for the model perform quite well in finite samples. The RBs generally decreased with increasing sample size N; even for the relatively small and moderate sample sizes N = 50 and N = 100, the RBs generally vary in an acceptable range, indicating that the PL method generally yields reasonably unbiased estimates. Some bias can be seen in the correlation estimates; however, this improved with increasing sample size. In addition, all the relative efficiencies are relatively close to one (which improved with increasing sample size), implying that the PL method yields robust SEs reflecting the true sampling variabilities of the estimates. Empirical coverage rates (CP) for most parameters for N = 50 (and a few for N = 100) appear to be inflated; however, upon comparison with the Monte Carlo margin of error ±1.96 √ 0.95 × 0.05/R = ±0.0191, only the coverage rates for β 22 for N = 50 and for ρ (r) for N = 50, 100, failed to enclose the nominal 95% level. Additionally, the lengths of CIs (not reported in Table 1) generally became shorter with increasing sample size. These results appear to confirm those previously reported by Renard et al. [26], who showed that PL estimates yield only slight efficiency loss compared to maximum likelihood estimates in random intercept models for clustered data. Overall, it appears that PL estimation provides reasonably good estimates with little efficiency loss.
We next compared the estimated pathology-specific sensitivities and specificities based on PL approach for the joint model based on Equation (23), against those from the crude method and from GEE, with a working-correlation structure similar to (but not the same as) the marginal correlation structure in the joint model; see appendix for details. The robust SEs of the GEE estimates of β = (β 10 , β 20 , β 11 , β 21 ) can be obtained using the so-called sandwich variance estimates [21,29]; these have the significant advantage over the model-based or naive estimate of being robust in the sense that they give consistent SEs even when the correlation structure is misspecified, provided that the marginal mean model is correctly specified. Note that, for the GEE approach, Sen v = (β v0 + β v1 ) and Spc v = 1 − (β v0 ). Estimates of Sen v and Spc v are thus obtained directly by the plug-in method, and corresponding SEs are computed via the delta method. The GEE approach was implemented in R using the package geese in the geepack library [7]. Interestingly, the package geese can accommodate the userdefined working-correlation matrix R i given in the appendix. Results from the crude method were obtained using the GEE approach with independence working-correlation structure (i.e. R i is the identity matrix), with SEs calculated from the model-based variance estimate [19].
Note that the GEE method [29,31] is a non-likelihood-based approach that does not rely on any distributional assumption regarding the binocular data. Furthermore, GEEs work most naturally for models specified marginally; in contrast, GLMMs are specified conditionally, given random effects. In the former, we marginally regress a reader's diagnosis on the corresponding disease status, using the probit link function, as P(Y ijkv = 1; D ijv ) = −1 (β v0 + β v1 D ijv ), for j = L, R, k = 1, 2, v = 1, 2 and i = 1, . . . , N. Note that the estimates from the marginal model (under GEE) and those from a GLMM such as Equation (23), are different, since the marginal probit model in Equation (5) obtained by averaging Equation (23) with respect to the random effects is not the same marginal probit model under GEE. As a consequence, estimates obtained under GEE (with a marginal probit model) and under the joint model based on Equation (23) are different and not comparable (i.e. β v0 and β v1 are different for the two models). However, the pathology-specific sensitivities and specificities are comparable. Results are presented in Table 2. Observe that the estimates from the three approaches are relatively close to the true values, with CP all close to the nominal 95% level. This is not surprising in light of the robustness of GEE estimates to misspecification of the marginal covariance structure. Not surprisingly, a significant efficiency loss can be observed for estimates from the crude method. The implication of ignoring the between-eyes, between-readers and betweenpathologies correlations is thus clear: failure to account for these correlations in any statistical analysis may lead to potentially incorrect inferences.
Note that while the GEE estimates compare favorably with the PL estimates for the joint model, this is accomplished with a careful specification of the working-correlation matrix R i , which is hardly known in practice. Moreover, this necessitates the specification of an 8 × 8 matrix; in general, the GEE method requires specifying an M × M working-correlation matrix, which renders the GEE method infeasible in applications involving many readers (i.e. large K) and many pathologies (i.e. large V ).

Application to diabetic retinopathy data
We now illustrate the proposed methodology described in Section 2 on data from the diabetic retinopathy study [16,28] described in Section 1. The study involved N = 94 diabetic patients in Alberta, Canada, who were referred to a comprehensive retina practice in Edmonton. The study protocol required that patients be clinically examined on the same day they underwent digital photography by a trained ophthalmic photographer using a high-resolution digital camera. The digital images were stored uncompressed and then graded by experienced ophthalmologists at least two months after they were taken. They were assessed in random order, with a minimum of two months in between review of the left and right eye images to minimize reader recall. In order to evaluate treatable diabetic retinopathy among the patients, a number of pathologies were identified as either present (positive) or absent (negative). The pathologies identified included CSME, microaneurysms, intra-retinal haemorrhage, HEX, and other diseases of note. Contact lens biomicroscopy, the clinical examination considered to be the 'gold standard' for most, but not all, of the pathologies considered, was performed on all patients by retinal specialists to determine disease status. Digital images of the patients' eyes were graded by two ophthalmologists and patients were diagnosed as either positive or negative for the pathologies.
In what follows, we consider the pathologies CSME and HEX for which 'gold standard' values are available. The summarized data concerning the presence (+) or absence (−) of CSME and HEX in the left and right eyes of the patients as evaluated by two readers along with the true disease status are presented in Table 3. Pathology CSME pertains to the thickening and swelling of an eye's macula due to fluid and protein deposits while pathology HEX involves the leakage of fluid and lipoprotein into the retina of the eye.
With K = V = 2, we adopt the same joint model from Equation (23), with only random intercepts for the pathologies. Furthermore, to ensure that the estimate of (p) is positive definite, we maximize the pairwise log-likelihood function indirectly with respect to the Cholesky factors of (p) instead of its direct elements [33]; estimates of its direct elements are then obtained by  the plug-in method and the corresponding SEs are computed by the delta method. In addition, Fisher's z-transformation is used for the correlation parameters in (r) and (e) , as discussed in Section 4. Table 4 displays the estimates and their (large-sample) SEs based on the PL method. The results for the corresponding accuracy measures are given in Table 5, both based on the PL approach and the GEE method with three different working-correlation structures, namely, independence (which is equivalent to the crude method), unstructured and that based on the marginal correlation structure of the joint model. Note that the estimates for sensitivities and specificities are slightly different according to the working-correlation structure assumed under GEE.
From Table 5, the crude method estimates for sensitivities and specificities are slightly closer to those from the PL method, with the discrepancy between the estimates lying within 2%, a negligible difference in practice. The other two sets of estimates (i.e. GEE with unstructured and structured working correlations) are considerably different from those from the PL method, with a maximum discrepancy between the estimates of at most 8%.
Comparing the individual values for the accuracy measures, we note that the sensitivity of HEX is slightly higher than that of CSME, while the reverse is true for the specificities. The estimated sensitivities and specificities for both pathologies range between 73% and 93%. It should be noted that the crude method yields comparatively smaller SEs. This can be misleading since Table 5. PL estimates for joint model based on Equation (23) of sensitivities and specificities and their SEs for pathologies CSME (v = 1) and HEX (v = 2) along with those obtained by GEE, with independence (i.e. crude method), unstructured, and structured working correlations (similar to that in joint model based on Equation (23)).  Table 6. Estimated marginal tetrachoric correlations and their SEs for joint model based on Equation (23) for pathologies CSME (v = 1) and HEX (v = 2). correlated observations contain less information than independent data, and when the former are incorrectly assumed to be the latter, underestimation of SEs may result, possibly leading to large Wald test statistics and inflated type I error rates [27]. This is evident from Table 2.
Estimates of the marginal tetrachoric correlations from the joint model based on Equation (23) are shown in Table 6. Note that these correlations pertain to the correlations between the underlying latent variables. The SEs of the estimated marginal correlations were computed via the delta method. Not surprisingly, the largest estimated correlations correspond to the case with different readers diagnosing the same eye for the same pathology, given by 0.7784 for CSME and 0.7765 for HEX. This implies that the between-readers agreement is quite strong. The estimated correlation 0.5771 between CSME and HEX for the same reader and the same eye is moderate (i.e. 0.4) while that between fellow eyes for the same reader and the same pathology is not.
Finally, note that the estimated marginal tetrachoric correlations are all positive although the correlation between the eye-specific random effects is negative (see Table 4).

Discussion
This paper focused on an important issue arising in reader-based multi-disease binocular diagnostic studies: how to account for correlations between readers' diagnoses and between diseases/ pathologies, while at the same time incorporating the intrinsic correlation between patients' fellow eyes. The general approach taken in the paper was a model-based one that relies on specifying a multivariate probit model for the joint distribution of the binary outcomes. The latent variable formulation of the binary diagnostic data should be appealing to ophthalmologists, as it provides a natural description of the biological process underlying retinopathy-related pathologies. A PL approach is adopted to overcome the computational complexities in computing multivariate orthant probabilities required in a full likelihood analysis of the probit model. Estimates derived from the approach are shown, empirically via simulations, to be more efficient than those from the crude method. While GEE estimates yielded comparable performance in terms of bias and efficiency, they require the specification of a working-correlation matrix whose dimension increases with the numbers of readers and pathologies. This can become impractical, even infeasible, in practice.
The methodology outlined in the paper is illustrated with data from a study concerning the diagnoses of two retinopathy-related pathologies among diabetic patients as evaluated by two ophthalmologists based on digital images of the eyes. Compared to GEE, our model has the advantage of being able to delineate inferences about various correlations -such as the binocular correlation between fellow eyes, the same-eye/different-pathologies correlation and the differentreaders/same-pathology/same-eye correlation -without the need to specify a large working correlation matrix. Such a matrix can be cumbersome to construct when many pathologies and many readers exist.