Tehran, Iran.

This paper proposes a class of estimators for the population correlation coefficient in two-phase sampling when the population mean and population variance of one of the variables are not available but these parameters are known for another (auxiliary) variable, and analyzes the properties of the class. The optimum estimator in the class is identified, together with its variance formula. The estimators of the class involve unknown constants whose optimum values depend on unknown population parameters.


Introduction
Consider a finite population $U = \{1, 2, \ldots, i, \ldots, N\}$. Let $y$ and $x$ be the study and auxiliary variables taking values $y_i$ and $x_i$, respectively, for the $i$th unit. The correlation coefficient between $y$ and $x$ is defined by

$$\rho_{yx} = \frac{S_{yx}}{S_y S_x},$$

where $S_{yx}$ is the population covariance between $y$ and $x$, and $S_y^2$ and $S_x^2$ are the population variances of $y$ and $x$. Based on a simple random sample of size $n$ drawn without replacement, $(x_i, y_i)$, $i = 1, 2, \ldots, n$, the usual estimator of $\rho_{yx}$ is the corresponding sample correlation coefficient:

$$r = \frac{s_{yx}}{s_x s_y},$$

where

$$s_{yx} = (n-1)^{-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}), \quad s_x^2 = (n-1)^{-1}\sum_{i=1}^{n}(x_i - \bar{x})^2, \quad s_y^2 = (n-1)^{-1}\sum_{i=1}^{n}(y_i - \bar{y})^2,$$

$$\bar{y} = n^{-1}\sum_{i=1}^{n} y_i, \quad \bar{x} = n^{-1}\sum_{i=1}^{n} x_i.$$

The problem of estimating $\rho_{yx}$ has earlier been taken up by various authors, including Gupta and Singh (1989), Gupta et al. (1978, 1979), Koop (1970), Rana (1989), Singh et al. (1996) and Wakimoto (1971), in different situations. Srivastava and Jhajj (1986) have further considered the problem of estimating $\rho_{yx}$ in situations where information on the auxiliary variable $x$ is available for all units in the population. In such situations, they have suggested a class of estimators for $\rho_{yx}$ which utilizes the known values of the population mean $\bar{X}$ and the population variance $S_x^2$ of the auxiliary variable $x$.
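As an illustration of the conventional estimator, the following minimal sketch computes the sample correlation coefficient $r$ from a simple random sample drawn without replacement. The population values are synthetic and purely hypothetical (they are not data from this paper), and the function name `sample_correlation` is introduced only for this example.

```python
import math
import random


def sample_correlation(xs, ys):
    """Sample correlation coefficient r = s_yx / (s_x * s_y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_yx = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    s_y2 = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    return s_yx / math.sqrt(s_x2 * s_y2)


# Hypothetical synthetic population of N units (assumption, for illustration).
rng = random.Random(0)
N, n = 500, 40
x_pop = [rng.gauss(50.0, 10.0) for _ in range(N)]
y_pop = [2.0 * x + rng.gauss(0.0, 8.0) for x in x_pop]

# Simple random sample of size n drawn without replacement.
units = rng.sample(range(N), n)
r = sample_correlation([x_pop[i] for i in units], [y_pop[i] for i in units])

# Population value: the divisor cancels in the ratio, so the same function applies.
rho = sample_correlation(x_pop, y_pop)
print(f"population rho_yx = {rho:.4f}, sample r = {r:.4f}")
```

The common divisor of the covariance and the variances cancels in the ratio, so $r$ does not depend on whether $n$ or $n - 1$ is used.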
In this paper, using a two-phase sampling mechanism, a class of estimators for $\rho_{yx}$ is considered that uses the available knowledge ($\bar{Z}$ and $S_z^2$) on a second auxiliary variable $z$ when the population mean $\bar{X}$ and population variance $S_x^2$ of the main auxiliary variable $x$ are not known.

The suggested class of estimators
In many situations of practical importance, it may happen that no information is available on the population mean $\bar{X}$ and population variance $S_x^2$; we then seek to estimate the population correlation coefficient $\rho_{yx}$ from a sample $s$ obtained through a two-phase selection. Allowing simple random sampling without replacement in each phase, the two-phase sampling scheme is as follows:
(1) The first-phase sample $s^*$ ($s^* \subset U$) of fixed size $n_1$ is drawn to observe only $x$, in order to furnish good estimates of $\bar{X}$ and $S_x^2$.
(2) Given $s^*$, the second-phase sample $s$ ($s \subset s^*$) of fixed size $n$ is drawn to observe $y$ only.
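The two-phase selection described above can be sketched as follows. This is only an illustration under stated assumptions: the population is synthetic, and the sizes `N`, `n1` and `n` are hypothetical values chosen for the example.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population U = {0, ..., N-1} with synthetic x and y values.
N, n1, n = 1000, 200, 50
x_pop = [rng.gauss(100.0, 15.0) for _ in range(N)]
y_pop = [0.5 * x + rng.gauss(0.0, 5.0) for x in x_pop]

# Phase 1: simple random sample s* of size n1 without replacement; only x is observed.
s_star = rng.sample(range(N), n1)
x_star = [x_pop[i] for i in s_star]
x_bar_star = statistics.mean(x_star)      # first-phase estimate of the population mean of x
s2_x_star = statistics.variance(x_star)   # first-phase estimate of the population variance of x

# Phase 2: simple random subsample s of size n drawn from s*; y is observed on s.
s = rng.sample(s_star, n)
pairs = [(x_pop[i], y_pop[i]) for i in s]

print(f"first-phase mean of x: {x_bar_star:.3f}, variance: {s2_x_star:.3f}")
print(f"second-phase sample size: {len(pairs)}")
```

Drawing the second-phase sample from `s_star` rather than from the whole population is what guarantees the nesting of $s$ within $s^*$.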
We write $u = \bar{x}/\bar{x}^*$ and $v = s_x^2/s_x^{*2}$, where $\bar{x}^*$ and $s_x^{*2}$ are the sample mean and sample variance of $x$ based on the first-phase sample $s^*$. Whatever be the sample chosen, let $(u, v)$ assume values in a bounded closed convex subset $R$ of the two-dimensional real space containing the point $(1, 1)$. Let $h(u, v)$ be a function of $u$ and $v$ such that

$$h(1, 1) = 1$$

and such that it satisfies the following conditions:
(1) The function $h(u, v)$ is continuous and bounded in $R$.
(2) The first and second partial derivatives of $h(u, v)$ exist and are continuous and bounded in $R$.
Now one may consider the class of estimators of $\rho_{yx}$ defined by

$$\hat{r}_{hd} = r\,h(u, v), \qquad (2.2)$$

which is the double sampling version of the class of estimators $r\,h(u^*, v^*)$ suggested by Srivastava and Jhajj (1986), where $u^* = \bar{x}/\bar{X}$, $v^* = s_x^2/S_x^2$ and $\bar{X}$, $S_x^2$ are known.
Sometimes, even if the population mean $\bar{X}$ and population variance $S_x^2$ of $x$ are not known, information on a cheaply ascertainable variable $z$, closely related to $x$ but, compared to $x$, remotely related to $y$, is available on all units of the population. This type of situation has been briefly discussed by, among others, Chand (1975) and Kiregyera (1980, 1984). Following Chand (1975), one may define a chain ratio-type estimator $\hat{r}_{1d}$ for $\rho_{yx}$ as in equation (2.3), where the population mean $\bar{Z}$ and population variance $S_z^2$ of the second auxiliary variable $z$ are known, and

$$\bar{z}^* = n_1^{-1}\sum_{i \in s^*} z_i, \qquad s_z^{*2} = (n_1 - 1)^{-1}\sum_{i \in s^*}(z_i - \bar{z}^*)^2$$

are the sample mean and sample variance of $z$ based on the preliminary large sample $s^*$ of size $n_1$ ($> n$). The estimator $\hat{r}_{1d}$ in equation (2.3) may be generalized as in equation (2.4), where the $\alpha_i$'s ($i = 1, 2, 3, 4$) are suitably chosen constants. Many other generalizations of $\hat{r}_{1d}$ are possible. We have, therefore, considered a more general class of estimators of $\rho_{yx}$, from which a number of estimators can be generated.
The proposed generalized class of estimators for the population correlation coefficient $\rho_{yx}$ is defined by

$$\hat{r}_{td} = r\,t(u, v, w, a), \qquad (2.5)$$

where $w = \bar{z}^*/\bar{Z}$, $a = s_z^{*2}/S_z^2$ and $t(u, v, w, a)$ is a function such that $t(P) = 1$, satisfying the following conditions:
(1) Whatever be the samples ($s^*$ and $s$) chosen, $(u, v, w, a)$ assumes values in a closed convex subset $S$ of the four-dimensional real space containing the point $P = (1, 1, 1, 1)$.
(2) In $S$, the function $t(u, v, w, a)$ is continuous and bounded.
(3) The first and second order partial derivatives of t(u, v, w, a) exist and are continuous and bounded in S.
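A member of this class can be computed from two-phase data as in the sketch below. It assumes that the estimator has the product form $r\,t(u, v, w, a)$ and that $w$ and $a$ are the ratios of the first-phase mean and variance of $z$ to their known population values; both are assumptions of this example. The function names, the toy data and the exponents in `power_t` are hypothetical.

```python
import math
import statistics


def corr(xs, ys):
    """Sample correlation coefficient of paired observations."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_yx = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_yx / math.sqrt(s_xx * s_yy)


def r_td(x2, y2, x1, z1, z_mean, z_var, t):
    """Estimator of the assumed form r * t(u, v, w, a).

    x2, y2        : x and y observed on the second-phase sample s
    x1, z1        : x and z observed on the first-phase sample s*
    z_mean, z_var : known population mean and variance of z
    t             : function of (u, v, w, a) with t(1, 1, 1, 1) = 1
    """
    u = statistics.mean(x2) / statistics.mean(x1)
    v = statistics.variance(x2) / statistics.variance(x1)
    w = statistics.mean(z1) / z_mean       # assumed definition of w
    a = statistics.variance(z1) / z_var    # assumed definition of a
    return corr(x2, y2) * t(u, v, w, a)


def power_t(u, v, w, a):
    """Hypothetical power-type choice of t; the exponents are illustrative only."""
    return (u ** -0.5) * (v ** 0.1) * (w ** 0.5) * (a ** -0.1)


# Hypothetical toy data (not from the paper).
x1 = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]   # x on s*
z1 = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]     # z on s*
x2 = [2.0, 6.0, 10.0, 12.0]             # x on s, a subset of s*
y2 = [1.0, 2.5, 4.0, 6.5]               # y on s
print(r_td(x2, y2, x1, z1, z_mean=3.2, z_var=3.0, t=power_t))
```

Passing a function that is identically 1 as `t` recovers the conventional estimator $r$.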
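To find the bias and variance of $\hat{r}_{td}$, we express the sample quantities in terms of their relative errors (the $e$'s) and, ignoring the finite population correction terms, write the expected values of these errors and of their products to the first degree of approximation in terms of the quantities $\delta_{pqm}$, $(p, q, m)$ being non-negative integers.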
To find the expectation and variance of $\hat{r}_{td}$, we expand $t(u, v, w, a)$ about the point $P = (1, 1, 1, 1)$ in a second-order Taylor's series and express this value and the value of $r$ in terms of the $e$'s. Expanding in powers of the $e$'s and retaining terms up to the second power shows that the bias of $\hat{r}_{td}$ is of order $n^{-1}$, so that, up to order $n^{-1}$, the mean square error and the variance of $\hat{r}_{td}$ are the same. Expanding $(\hat{r}_{td} - \rho_{yx})^2$, retaining terms up to the second power in the $e$'s, taking expectation and using the above expected values, we obtain the variance of $\hat{r}_{td}$ to the first degree of approximation as

$$
\begin{aligned}
\operatorname{Var}(\hat{r}_{td}) = \operatorname{Var}(r)
&+ \frac{\rho_{yx}^2}{n}\Big[C_x^2 t_1^2(P) + (\delta_{040} - 1)t_2^2(P) - A t_1(P) - B t_2(P) + 2\delta_{030} C_x t_1(P) t_2(P)\Big] \\
&- \frac{\rho_{yx}^2}{n_1}\Big[C_x^2 t_1^2(P) + (\delta_{040} - 1)t_2^2(P) - C_z^2 t_3^2(P) - (\delta_{004} - 1)t_4^2(P) - A t_1(P) - B t_2(P) \\
&\qquad\quad + D t_3(P) + F t_4(P) + 2\delta_{030} C_x t_1(P) t_2(P) - 2\delta_{003} C_z t_3(P) t_4(P)\Big], \qquad (2.8)
\end{aligned}
$$

where $t_1(P)$, $t_2(P)$, $t_3(P)$ and $t_4(P)$ denote the first partial derivatives of $t(u, v, w, a)$ with respect to $u$, $v$, $w$ and $a$, respectively, at the point $P = (1, 1, 1, 1)$.

It is observed from equation (2.11) that if the optimum values of the parameters given by (2.10) are used, the variance of the estimator $\hat{r}_{td}$ is always less than that of $r$, as the last two terms on the right-hand side of (2.11) are non-negative. Two simple functions $t(u, v, w, a)$ satisfying the required conditions are

$$t(u, v, w, a) = 1 + \alpha_1(u - 1) + \alpha_2(v - 1) + \alpha_3(w - 1) + \alpha_4(a - 1),$$

$$t(u, v, w, a) = u^{\alpha_1} v^{\alpha_2} w^{\alpha_3} a^{\alpha_4},$$

and for both these functions $t_1(P) = \alpha_1$, $t_2(P) = \alpha_2$, $t_3(P) = \alpha_3$ and $t_4(P) = \alpha_4$. Thus, one should use the optimum values of $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\alpha_4$ in $\hat{r}_{td}$ to get the minimum variance.
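The statement that both simple functions equal 1 at $P$ and have first partial derivatives equal to the constants there can be checked numerically. The sketch below uses central differences; the constants in `alpha` are hypothetical values chosen only for illustration, and the helper `gradient_at_P` is introduced for this example.

```python
def t_linear(u, v, w, a, alpha):
    """Linear choice: 1 + sum of alpha_i times the deviation of each ratio from 1."""
    a1, a2, a3, a4 = alpha
    return 1.0 + a1 * (u - 1.0) + a2 * (v - 1.0) + a3 * (w - 1.0) + a4 * (a - 1.0)


def t_power(u, v, w, a, alpha):
    """Power choice: product of the ratios raised to the constants alpha_i."""
    a1, a2, a3, a4 = alpha
    return (u ** a1) * (v ** a2) * (w ** a3) * (a ** a4)


def gradient_at_P(t, alpha, h=1e-6):
    """Central-difference approximation to the first partials of t at P = (1, 1, 1, 1)."""
    grads = []
    for k in range(4):
        hi = [1.0] * 4
        lo = [1.0] * 4
        hi[k] += h
        lo[k] -= h
        grads.append((t(*hi, alpha) - t(*lo, alpha)) / (2.0 * h))
    return grads


alpha = (-0.6, 0.2, 0.4, -0.1)  # hypothetical constants, for illustration only
for t in (t_linear, t_power):
    value_at_P = t(1.0, 1.0, 1.0, 1.0, alpha)
    partials = [round(g, 6) for g in gradient_at_P(t, alpha)]
    print(t.__name__, value_at_P, partials)
```

Because the two functions agree at $P$ in value and in first derivatives, they lead to the same first-order variance for a given set of constants.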

IJSE 31,10
It is to be noted that the estimator $\hat{r}_{td}$ attains the minimum variance only when the optimum values of the constants $\alpha_i$ ($i = 1, 2, 3, 4$), which are functions of unknown population parameters, are known. To use such estimators in practice, one has to use guessed values of the population parameters obtained either through past experience or through a pilot sample survey. It may be further noted that even if the values of the constants used in the estimator are not exactly equal to their optimum values as given by equation (2.10) but are close enough, the resulting estimator will be better than the conventional estimator, as illustrated by Das and Tripathi (1978, Section 3). If no information on the second auxiliary variable $z$ is used, then the estimator $\hat{r}_{td}$ reduces to $\hat{r}_{hd}$ defined in equation (2.2). Taking $z \equiv 1$ in equation (2.8), we get the variance of $\hat{r}_{hd}$ to the first degree of approximation; the difference between the minimum variances of $\hat{r}_{hd}$ and $\hat{r}_{td}$ is always positive. Thus, the proposed estimator $\hat{r}_{td}$ is always better than $\hat{r}_{hd}$.

A wider class of estimators
In this section, we consider a class of estimators of $\rho_{yx}$ wider than equation (2.5), given by

$$\hat{r}_{gd} = g(r, u, v, w, a), \qquad (3.1)$$

where $g(r, u, v, w, a)$ is a function of $r$, $u$, $v$, $w$ and $a$ such that

$$g(r, 1, 1, 1, 1) = r \quad \text{and} \quad \left.\frac{\partial g(\cdot)}{\partial r}\right|_{(r, 1, 1, 1, 1)} = 1.$$

Proceeding as in Section 2, it can easily be shown, to the first order of approximation, that the minimum variance of $\hat{r}_{gd}$ is the same as that of $\hat{r}_{td}$ given in equation (2.11). It is to be noted that the difference-type estimator is a particular case of $\hat{r}_{gd}$, but it is not a member of the class $\hat{r}_{td}$ in equation (2.5).
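The remark about the difference-type estimator can be illustrated numerically. The sketch below assumes the form $r + \alpha_1(u - 1) + \alpha_2(v - 1) + \alpha_3(w - 1) + \alpha_4(a - 1)$ for that estimator, which is an assumption of this example rather than a formula taken from the text; the constants and evaluation points are hypothetical.

```python
def g_difference(r, u, v, w, a, alpha=(-0.6, 0.2, 0.4, -0.1)):
    """Assumed difference-type estimator; alpha holds hypothetical constants."""
    a1, a2, a3, a4 = alpha
    return r + a1 * (u - 1.0) + a2 * (v - 1.0) + a3 * (w - 1.0) + a4 * (a - 1.0)


def dg_dr(g, r, point, h=1e-6):
    """Central-difference approximation to the partial derivative of g in r."""
    return (g(r + h, *point) - g(r - h, *point)) / (2.0 * h)


P = (1.0, 1.0, 1.0, 1.0)

# The two conditions imposed on g at (r, 1, 1, 1, 1).
value_at_P = g_difference(0.7, *P)
slope_at_P = dg_dr(g_difference, 0.7, P)

# If g were of the product form r * t(u, v, w, a), the ratio g / r at a fixed
# point (u, v, w, a) would not depend on r. Here it does.
q = (1.1, 0.9, 1.05, 0.95)
ratio_low = g_difference(0.5, *q) / 0.5
ratio_high = g_difference(0.8, *q) / 0.8

print(value_at_P, slope_at_P, ratio_low, ratio_high)
```

The two ratios differ, which is consistent with the difference-type form lying outside the product class while still satisfying the conditions on $g$.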

Empirical study
To illustrate the performance of various estimators of the population correlation coefficient, we consider the data given in Murthy (1967, p. 226), for which $\rho_{yz} = 0.9413$. The percent relative efficiencies (PREs) of $\hat{r}_{1d}$, $\hat{r}_{hd}$ and $\hat{r}_{td}$ with respect to the conventional estimator $r$ have been computed and compiled in Table I.
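A percent relative efficiency of this kind can be computed as in the sketch below, assuming the usual definition of PRE as 100 times the ratio of the variance of the reference estimator to the variance of the estimator under study. The variances shown are placeholders chosen only to make the example run; they are not the values underlying Table I.

```python
def percent_relative_efficiency(var_reference, var_estimator):
    """PRE of an estimator with respect to a reference estimator, in percent."""
    if var_reference <= 0 or var_estimator <= 0:
        raise ValueError("variances must be positive")
    return 100.0 * var_reference / var_estimator


# Placeholder variances (hypothetical, NOT results from the paper).
placeholder_variances = {"r": 4.0, "estimator_A": 3.2, "estimator_B": 2.0}

reference = placeholder_variances["r"]
for name, variance in placeholder_variances.items():
    pre = percent_relative_efficiency(reference, variance)
    print(f"{name}: PRE = {pre:.2f}")
```

A PRE above 100 indicates that the estimator has a smaller variance than the reference estimator.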