

The randomized response (RR) data-gathering device, which procures trustworthy data on sensitive issues by protecting the privacy of the respondent, was first developed by Warner1. Feeling that the co-operation of the respondent might be further enhanced if one of the two questions referred to a non-sensitive, innocuous attribute, say Y, unrelated to the sensitive attribute A, Horvitz et al.2 proposed an unrelated question randomized response model. Greenberg et al.3 provided a theoretical framework for this modification of Warner’s1 model envisaged by Horvitz et al.2. Numerous randomized response techniques have since been developed for reducing non-sampling errors in sample surveys, protecting a respondent’s privacy and increasing response rates. Key references on the randomized response model are Singh and Mathur4,5, Singh et al.6, Kim and Warde7, Kim and Elam8,9 and Singh and Tarray10,11.
Land et al.12 considered a different problem, in which the number of persons possessing a rare sensitive attribute is very small and a huge sample size is required to estimate this number. They proposed a method to estimate the mean number of persons possessing a rare sensitive attribute by utilizing the Poisson distribution in survey sampling. Land et al.12 discussed two situations: when the proportion of persons possessing a rare unrelated attribute is known, and when it is unknown. Lee et al.13 extended the study of Land et al.12 to stratified sampling.
Singh and Tarray14 further considered the problem of estimating the mean number of persons possessing a rare sensitive attribute using the Poisson distribution in the situation where the proportion of persons possessing a rare unrelated attribute is known. They suggested an alternative randomized response model based on the Singh et al.15 model and studied its properties in the presence of a known population proportion of the rare unrelated attribute. The main problem with the method of Singh and Tarray14 is that the mean value of the rare unrelated attribute is sometimes unknown.
ESTIMATION OF PROPORTION OF A RARE SENSITIVE ATTRIBUTE WHEN PROPORTION OF A RARE UNRELATED ATTRIBUTE IS UNKNOWN
Let π1 be the true proportion of the rare sensitive attribute A1 in the population U. Examples include the proportion of AIDS/HIV patients who continue having affairs with strangers, the proportion of persons who have witnessed a murder, and the proportion of persons who have been told by their doctors that they will not survive long due to a ghastly disease; for more examples, the reader is referred to Land et al.12. Consider selecting a large sample of n persons from the population such that, as n → ∞ and π1 → 0, lim (n π1) = δ1 (finite). Let π2 be the true proportion of the population having the rare unrelated attribute A2 such that, as n → ∞ and π2 → 0, lim (n π2) = δ2 (finite and, in the situation considered here, unknown). For instance, π2 might be the proportion of persons born exactly at 12:00 o’clock or the proportion of babies born blind; see Land et al.12.
In the proposed procedure, each respondent in the sample of n persons, selected by simple random sampling with replacement (SRSWR) from the given population, is requested to use two decks of cards marked Deck-I and Deck-II. Deck-I consists of three types of cards bearing the statements:
| (i) | Do you possess the rare sensitive attribute A1? |
| (ii) | Do you possess the rare unrelated attribute A2? |
| (iii) | Draw one more card |
with probabilities P1, P2 and P3, respectively, such that P1 + P2 + P3 = 1. The respondent is required to draw one card at random from Deck-I and to answer “Yes” or “No” according to his/her actual status and the statement, (i) or (ii), drawn. However, if statement (iii) is drawn, he/she is required to repeat the process without replacing that card. If statement (iii) is drawn again on the second draw, he/she is directed to report “No”. If m is the total number of cards in Deck-I, the probability of a “Yes” answer is given by:
$\theta_1 = (P_1\pi_1 + P_2\pi_2)\left[1 + \dfrac{mP_3}{m-1}\right]$ | (1) |
Note that both attributes A1 and A2 are very rare in the population. Assuming that, as n → ∞ and θ1 → 0, nθ1 = δ*1 (finite), it follows that:
$\delta_1^{*} = (P_1\delta_1 + P_2\delta_2)\left[1 + \dfrac{mP_3}{m-1}\right]$ | (2) |
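The two-draw rule for a deck can be checked numerically by conditioning on the first card. The following is a minimal sketch, not the authors' code: the function name `yes_probability` and its argument names are hypothetical, and it assumes that the deck holds exactly m·P1, m·P2 and m·P3 cards of types (i), (ii) and (iii).

```python
from fractions import Fraction


def yes_probability(p1, p2, p3, m, pi1, pi2):
    """Probability of a "Yes" answer for one deck under the two-draw rule.

    p1, p2, p3 : proportions of cards of types (i), (ii), (iii); they sum to 1
    m          : total number of cards in the deck
    pi1, pi2   : population proportions of attributes A1 and A2
    """
    if abs(p1 + p2 + p3 - 1) > 1e-12:
        raise ValueError("card proportions must sum to 1")
    # First draw: statement (i) or (ii) is answered truthfully.
    first_draw = p1 * pi1 + p2 * pi2
    # Statement (iii) on the first draw: that card is set aside, so m - 1
    # cards remain, of which m*p1 are type (i) and m*p2 are type (ii).
    # A second type-(iii) card leads to "No" and contributes nothing.
    second_draw = p3 * (m * p1 * pi1 + m * p2 * pi2) / (m - 1)
    return first_draw + second_draw


# Exact arithmetic with Fractions avoids rounding noise in the comparison.
theta = yes_probability(Fraction(1, 2), Fraction(3, 10), Fraction(1, 5),
                        100, Fraction(1, 100), Fraction(2, 100))
print(theta, float(theta))
```

Multiplying the returned probability by the sample size n gives the corresponding Poisson mean for that deck; the same function applies to Deck-II with T1, T2 and T3 in place of P1, P2 and P3.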
Next, the respondent is requested to use Deck-II, which consists of three types of cards bearing the statements:
| (i) | Do you possess the rare sensitive attribute A1? |
| (ii) | Do you possess the rare unrelated attribute A2? |
| (iii) | Draw one more card |
with probabilities T1, T2 and T3, respectively, such that T1 + T2 + T3 = 1. The respondent is required to draw one card at random from Deck-II and to answer “Yes” or “No” according to his/her actual status and the statement, (i) or (ii), drawn. However, if statement (iii) is drawn, he/she is required to repeat the process without replacing that card. If statement (iii) is drawn again on the second draw, he/she is instructed to report “No”. If m is the total number of cards in Deck-II, the probability of a “Yes” answer is given by:
$\theta_2 = (T_1\pi_1 + T_2\pi_2)\left[1 + \dfrac{mT_3}{m-1}\right]$ | (3) |
As before, assuming that, as n → ∞ and θ2 → 0, nθ2 = δ*2 (finite), it follows that:
$\delta_2^{*} = (T_1\delta_1 + T_2\delta_2)\left[1 + \dfrac{mT_3}{m-1}\right]$ | (4) |
By following the procedure as adopted by Singh and Tarray14, we have:
![]() | (5) |
and:
![]() | (6) |
where y1i and y2i denote the observed values of the first and the second response from the ith respondent, respectively. Solving Eq. 5 and 6 for δ1 and δ2 establishes the following theorems.
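One plausible reading of this step, offered only as a sketch, is that the two sample means are equated to the Poisson means of the two decks and the resulting pair of linear equations is solved for δ1 and δ2. The symbols $\bar{y}_1$, $\bar{y}_2$, $A_1$ and $A_2$ are introduced here for illustration and are not taken from the source; the Poisson means are those implied by the two-draw rule with m cards in each deck.

```latex
\begin{align*}
\bar{y}_1 &= \frac{1}{n}\sum_{i=1}^{n} y_{1i}, \qquad
\bar{y}_2 = \frac{1}{n}\sum_{i=1}^{n} y_{2i}, \\
A_1 &= 1 + \frac{mP_3}{m-1}, \qquad
A_2 = 1 + \frac{mT_3}{m-1}, \\
\bar{y}_1 &= A_1\,(P_1\delta_1 + P_2\delta_2), \qquad
\bar{y}_2 = A_2\,(T_1\delta_1 + T_2\delta_2), \\
\hat{\delta}_1 &= \frac{T_2\,\bar{y}_1/A_1 - P_2\,\bar{y}_2/A_2}{P_1T_2 - P_2T_1}, \qquad
\hat{\delta}_2 = \frac{P_1\,\bar{y}_2/A_2 - T_1\,\bar{y}_1/A_1}{P_1T_2 - P_2T_1}.
\end{align*}
```

Under this reading both solutions are linear in the sample means, so their unbiasedness follows directly from E(y1i) = δ*1 and E(y2i) = δ*2, and a solution exists only when P1T2 ≠ P2T1.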
Theorem 1: An unbiased estimator of the parameter δ1 for the rare sensitive attribute A1 is given by:
![]() | (7) |
Where:
![]() |
and:
![]() |
Proof: Since y1i ~ iid Poisson (δ*1) and y2i ~ iid Poisson (δ*2), taking expected values on both sides of Eq. 7 gives:
![]() |
which proves the theorem.
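Unbiasedness can also be illustrated by simulation. The sketch below is an assumption-laden illustration rather than the authors' procedure: it solves the two moment equations implied by the two-deck design (total "Yes" counts equated to their expectations), treats A1 and A2 as independent across respondents, and uses hypothetical names (`deck_answer`, `estimate_deltas`, `simulate_mean_estimate`) and parameter values.

```python
import random


def deck_answer(rng, p1, p2, p3, m, has_a1, has_a2):
    """Simulate one answer (1 = "Yes", 0 = "No") from a deck of m cards."""
    n1, n2 = round(m * p1), round(m * p2)   # cards of types (i) and (ii)
    # p3 is implied by the remainder: cards numbered n1 + n2 and above
    card = rng.randrange(m)                 # first draw from the full deck
    if card >= n1 + n2:                     # type (iii): draw once more
        card = rng.randrange(m - 1)         # one type-(iii) card is set aside
        if card >= n1 + n2:                 # type (iii) again: report "No"
            return 0
    return int(has_a1) if card < n1 else int(has_a2)


def estimate_deltas(c1, c2, P, T, m):
    """Solve the two moment equations for (delta1, delta2).

    c1, c2 are the total numbers of "Yes" answers from Deck-I and Deck-II.
    """
    (p1, p2, p3), (t1, t2, t3) = P, T
    a1 = 1 + m * p3 / (m - 1)
    a2 = 1 + m * t3 / (m - 1)
    det = p1 * t2 - p2 * t1
    if det == 0:
        raise ValueError("the decks must satisfy P1*T2 != P2*T1")
    u, v = c1 / a1, c2 / a2
    return (t2 * u - p2 * v) / det, (p1 * v - t1 * u) / det


def simulate_mean_estimate(n, pi1, pi2, P, T, m, reps, seed=0):
    """Average the two estimates over `reps` independent samples of size n."""
    rng = random.Random(seed)
    total1 = total2 = 0.0
    for _ in range(reps):
        c1 = c2 = 0
        for _ in range(n):
            a1 = rng.random() < pi1         # assumed independent of a2
            a2 = rng.random() < pi2
            c1 += deck_answer(rng, *P, m, a1, a2)
            c2 += deck_answer(rng, *T, m, a1, a2)
        d1, d2 = estimate_deltas(c1, c2, P, T, m)
        total1 += d1
        total2 += d2
    return total1 / reps, total2 / reps


# Hypothetical design: n*pi1 = 1 and n*pi2 = 2 are the targets.
print(simulate_mean_estimate(500, 0.002, 0.004,
                             (0.6, 0.2, 0.2), (0.2, 0.6, 0.2), 100, reps=600))
```

The averages should settle near n·π1 and n·π2 as the number of replications grows. Because each simulated respondent uses both decks with the same true status, the two counts are correlated, which is the dependence that the variance derivation has to account for.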
Theorem 2: The variance of the unbiased estimator $\hat{\delta}_1$ of the parameter δ1 is given by:
![]() | (8) |
Proof: Since y1i ~ iid Poisson (δ*1) and y2i ~ iid Poisson (δ*2), V(y1i) = δ*1 and V(y2i) = δ*2. Note that the two responses from a respondent are not independent, so we have Eq. 9, where:
![]() | (9) |
![]() | (10) |
![]() | (11) |
and
![]() | (12) |
Substituting Eq. 10-12 into Eq. 9 establishes the theorem.
Corollary 1: An unbiased estimator of the parameter δ2 for the rare unrelated attribute A2 is given by:
![]() | (13) |
with the variance:
![]() | (14) |
Proof: Analogous to the proofs of Theorems 1 and 2.
Corollary 2: An unbiased estimator of the variance of the estimator $\hat{\delta}_1$ is given by:
![]() | (15) |
and an unbiased estimator of the variance of the estimator $\hat{\delta}_2$ is given by:
![]() | (16) |
where $\hat{\delta}_1$ and $\hat{\delta}_2$ are defined in Eq. 7 and 13, respectively.
RELATIVE EFFICIENCY
The percentage of relative efficiency of the proposed estimator with respect to the Land et al.12 estimator δ1* is given by:
![]() | (17) |
Where:
![]() |
See Land et al.12, Eq. 14, p. 7.
It is observed from Eq. 17 that the percentage of relative efficiency (PRE) of the proposed estimator with respect to the Land et al.12 estimator is free from the sample size n. To see the performance of the proposed estimator relative to the Land et al.12 estimator, we computed the values of the PRE using the formula given in Eq. 17 for fixed m = 100 and different parametric values. The resulting PRE values are shown in Table 1.
Table 1 shows that the PRE values are greater than 100 for all the parametric values considered here. Thus, for these settings, the proposed procedure is more efficient than the Land et al.12 procedure.
Table 1: | Percentage of relative efficiency of the proposed estimator $\hat{\delta}_1$ with respect to the Land et al.12 estimator |
![]() ![]() |
For δ1 = 0.5 and δ2 = 1.50, the percentage of relative efficiency remains considerably larger than in the other two cases. This suggests that it is appropriate to choose a rare unrelated attribute A2 whose mean value is greater than that of the rare sensitive attribute A1, provided this does not affect the cooperation of the respondents in using the suggested randomization device. The values (Pi, Ti), i = 1, 2, should be chosen so that respondents do not feel that their privacy is threatened, while the difference (P1T2 - P2T1) is kept large compared with P1 - T1. Finally, our recommendation is to use the suggested estimator $\hat{\delta}_1$ in practice.
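When comparing candidate designs, the two quantities named in this recommendation can be tabulated directly. The helper below is a hypothetical convenience, not part of the source; it only evaluates P1T2 - P2T1 and P1 - T1 for given card proportions and makes no claim about the resulting efficiency.

```python
def design_summary(P, T):
    """Return (P1*T2 - P2*T1, P1 - T1) for card proportions P and T.

    P and T are (P1, P2, P3) and (T1, T2, T3) for Deck-I and Deck-II.
    """
    p1, p2, _ = P
    t1, t2, _ = T
    return p1 * t2 - p2 * t1, p1 - t1


# Hypothetical candidate designs, listed only to show the comparison.
for P, T in [((0.6, 0.2, 0.2), (0.2, 0.6, 0.2)),
             ((0.5, 0.3, 0.2), (0.4, 0.4, 0.2))]:
    det, gap = design_summary(P, T)
    print(P, T, round(det, 4), round(gap, 4))
```

A design with P1T2 - P2T1 equal to zero has to be ruled out in any case, since the two decks would then carry the same information about δ1 and δ2 and the pair could not be separated.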
This study addresses the problem in which the number of persons possessing a rare sensitive attribute is very small and a huge sample size is required to estimate it. A more practical situation is discussed, in which the proportion of persons possessing a rare unrelated attribute is unknown. Properties of the proposed randomized response model have been studied, along with recommendations. An efficiency comparison has been worked out to investigate the performance of the suggested procedure. For the parametric values considered, the proposed procedure is more efficient than the Land et al.12 procedure.
This study discovers a new stratified randomized response model. Stratified random sampling is generally obtained by dividing the population into non-overlapping groups called strata and selecting a simple random sample from each stratum. An RR technique using stratified random sampling gives the group characteristics related to each stratum estimator. Stratified samples also protect a researcher from the possibility of obtaining a poor sample. This study will help researchers to uncover the critical areas related to the randomized response technique. In future research, a new theory for the randomized response model may be considered.
The authors are grateful to the referees for their valuable suggestions.