INTRODUCTION
Power analysis and sample size estimation are aspects of the design of experiments
and other research studies in which data are collected. Determining the appropriate
sample size for an investigation, whether it is a clinical trial or a field experiment,
is an essential step in the statistical design of the project (Cohen,
1988; Murphy and Myors, 1999). An adequate sample
size helps ensure that the study will yield reliable information, regardless
of whether the ultimate data suggest a clinically important difference between
the treatments being studied, or whether the study is intended to measure the accuracy
of a diagnostic test or the incidence of a disease (Foster,
2001; Nemec, 1991; Di Stefano,
2001). Generally, researchers choose a sample size large enough to enhance the
chances of conclusive results yet small enough to keep the study cost low, constrained
by a limited budget and/or medical considerations. The required sample size
in an experiment (test) is a function of the alternative hypothesis, the probabilities
of type I and type II errors and the variability of the population(s) under
study (Kramer and Rosenthal, 1999).
The probabilities of type I and type II errors are predetermined before the test.
The type I error rate is the level of significance of the test: the probability
of rejecting a true null hypothesis. The type II error rate is the probability
of accepting a false null hypothesis (Chow et al., 2003). The power of a test
is therefore 1−β (where β is the type II error rate). The power of a test is
the probability of rejecting a false null hypothesis and it depends on the effect
size (which is defined by the alternative hypothesis), the type I error rate and
the sample size (Roger, 2000). In fact, considering these three parameters and
the power of a test together, fixing any three determines the fourth. For example,
once we define the effect size, the type I error rate and the desired power, the
required sample size is determined. Similarly, if the effect size, type I error
rate and sample size are defined, then the power of the test is determined.
Any method for deriving a conclusion from experimental data carries with
it some risk of drawing a false conclusion. Two types of false conclusions
can be committed and they are known as type I error and type II error (Huck,
2000). A type I error occurs when one concludes that a difference exists
between the treatment groups when, in fact, it does not; it is a type of false
positive. The risk of a type I error, assuming that there is really no difference
between the groups, is equal to α. A type II error occurs when one concludes
that a difference does not exist between the groups being compared when a difference
does exist; it is a type of false negative. The risk of a type
II error occurring is denoted by β. In a classical test of H_{0}
(null hypothesis) against H_{1} (alternative hypothesis), there
are four possible outcomes, two of which are incorrect:
• Accept H_{0} when H_{0} is true
• Reject H_{0} when H_{0} is false
• Reject H_{0} when H_{0} is true (type I error)
• Accept H_{0} when H_{0} is false (type II error)
To construct a test, the distribution of the test statistic under H_{0}
is used to find the critical region that ensures the probability of
committing a type I error does not exceed some predetermined level, typically
denoted by α. The power of the test is its ability to correctly
reject the null hypothesis and is based on the distribution of the test statistic
under H_{1}. The required sample size is a function of:
• The effect size (alternative hypothesis)
• The size of the type I error
• The desired power to detect H_{1}
Currently available methods for power analysis include the paired and pooled t-tests,
fixed-effect ANOVA and regression models, comparison of binomial proportions, bioequivalence,
correlation, simple survival analysis models and even multivariate analysis
(Oyeyemi, 2007). Numerous mathematical formulae have
been developed to calculate sample size for various scenarios in different research settings,
based on the objectives, design, data analysis methods, power, type I and type
II errors and effect size (Chow et al., 2003).
Contingency Table
A contingency table is a cross-classification of two or more categorical
variables, the simplest being a 2×2 contingency table. The contingency table
test is a common method of analyzing categorical data. One of its applications
is to test whether two or more categorical variables are independent of one
another (Lebart et al., 2000; Clausen,
1998). Suppose X and Y are two categorical variables with r and c categories,
respectively. Let o_{ij} be the observed count/frequency in
cell ij (i = 1, 2, …, r; j = 1, 2, …, c), n_{i•} the
marginal total for the ith row and n_{•j} the marginal total
for the jth column. Then the sample size n is given as:

n = Σ_{i=1}^{r} Σ_{j=1}^{c} o_{ij}

The expected count/frequency e_{ij} of cell ij is then obtained as:

e_{ij} = (n_{i•} × n_{•j})/n

In testing the hypothesis of independence between variables X and Y at the α
level of significance, the test statistic Q is obtained as:

Q = Σ_{i=1}^{r} Σ_{j=1}^{c} (o_{ij} − e_{ij})^{2}/e_{ij}
If the null hypothesis is true, the test statistic follows a chi-square distribution
with (r−1)(c−1) degrees of freedom. The hypothesis of independence is rejected
if:

Q > χ^{2}_{(r−1)(c−1); α}

where, χ^{2}_{(r−1)(c−1); α} is the critical value
of the χ^{2} distribution at a significance level α with (r−1)(c−1)
degrees of freedom. If the null hypothesis is not true, Q has the limiting non-central
χ^{2} distribution, with the non-centrality parameter λ and
(r−1)(c−1) degrees of freedom.
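As a sketch, the quantities above (the expected counts e_{ij}, the statistic Q and the rejection rule) can be computed in Python; the 2×3 table below is hypothetical and NumPy/SciPy are assumed to be available:

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical 2x3 table of observed counts o_ij
obs = np.array([[30.0, 45.0, 25.0],
                [20.0, 35.0, 45.0]])

n = obs.sum()                                   # n = sum of all o_ij
row = obs.sum(axis=1, keepdims=True)            # n_i. (row marginal totals)
col = obs.sum(axis=0, keepdims=True)            # n_.j (column marginal totals)
exp = row @ col / n                             # e_ij = n_i. * n_.j / n
Q = ((obs - exp) ** 2 / exp).sum()              # chi-square statistic
df = (obs.shape[0] - 1) * (obs.shape[1] - 1)    # (r-1)(c-1) degrees of freedom
crit = chi2.ppf(0.95, df)                       # critical value at alpha = 0.05
print(Q, df, Q > crit)                          # reject independence if Q > crit
```

The same Q is returned by `scipy.stats.chi2_contingency(obs, correction=False)`, which can serve as a cross-check.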
In general, the following holds for the non-centrality parameter λ
(Lachin, 1977):

λ = n × f(θ^{0}, θ^{1})    (Eq. 1)

where, n is the sample size and f is a function of the vectors of parameters θ^{0} and θ^{1}, which are involved in the test statistic Q under H_{0} and H_{1}, respectively. From a different perspective, f can be considered as the degree of deviation of the observed result from the condition stated through H_{0} and is therefore a function of the statistical test’s corresponding effect size. From Eq. 1, we can show that:

n = λ/f(θ^{0}, θ^{1})    (Eq. 2)

Therefore, if the parameter λ and its corresponding effect size are estimated, then Eq. 2 can be used to calculate the minimum sample size required, at a significance level α and a given power, for the chi-square test of independence.
Power Analysis
Generally, Power Analysis is Used to Determine
• The minimum sample size n required for a statistical test to detect
an effect as statistically significant at a significance level α and a given
power
• The power of a statistical test, given the sample size, the level of significance
and the observed effect size

The first task is known as the a priori approach to power analysis, while the second is the post-hoc approach. Post-hoc power analysis of a statistical test obtains the power, while a priori power analysis determines the sample size required to detect a true significant effect.
The type II error is estimated as follows:

β = P(Q ≤ χ^{2}_{(r−1)(c−1); α} | H_{1}) = χ^{2}_{nc(r−1)(c−1)}(λ)

where, χ^{2}_{nc(r−1)(c−1)}(λ) is the value of the
non-central χ^{2} distribution function, with parameter λ and (r−1)(c−1)
degrees of freedom, at the critical value. The power of the chi-square test is then:

Power = 1 − β
In order to estimate the power, it is necessary to have an estimate of the
parameter λ. According to Cohen (1988):

λ = n × w^{2}    (Eq. 3)

where, n is the sample size and w is an estimate of the effect size given as:

w = √(Σ_{i}Σ_{j} (p^{1}_{ij} − p^{0}_{ij})^{2}/p^{0}_{ij})

where p^{0}_{ij} and p^{1}_{ij} are the cell proportions under H_{0} and H_{1}, respectively. It can easily be shown that:

w^{2} = Q/n

Therefore, the non-centrality parameter is λ = Q. We can then obtain the power as:

Power = 1 − χ^{2}_{nc(r−1)(c−1)}(Q)
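Putting Eq. 3 and the power expression together, a minimal Python sketch (SciPy assumed) computes the power from n, w and the degrees of freedom; with the values reported later in this paper (n = 535, w = 0.2009, df = 2) it reproduces a power of about 0.99:

```python
from scipy.stats import chi2, ncx2

def chisq_power(n, w, df, alpha=0.05):
    """Power of the chi-square test of independence."""
    lam = n * w ** 2                    # non-centrality parameter, Eq. 3
    crit = chi2.ppf(1 - alpha, df)     # critical value under H0
    return ncx2.sf(crit, df, lam)      # P(Q > critical value | H1)

print(chisq_power(535, 0.2009, 2))     # about 0.99
```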
Effect Size
It is possible to make an a priori calculation of the minimum sample size
required when an estimate of the non-centrality parameter λ and the effect
size are given. From Eq. 3, n = λ/w^{2}. For
the χ^{2} distribution, the values of the non-centrality parameter λ(α,
β, df) that correspond to significance level α and power (1−β) with
df degrees of freedom can be found in tables (Haynam et
al., 1970; Pearson and Hartley, 1972) or can be
calculated using relevant software. The only problem lies in providing a predetermined
estimate of effect size that is significant within the framework of the hypothesis.
The effect size could be determined either through a pilot research project or from previous related studies on the same research subject. Cohen’s convention can also be used in relation to what can be considered a small, medium or large effect size within the framework of the Pearson chi-square test of independence (Table 1).
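In place of non-centrality tables, the minimum sample size can be found by searching for the smallest n whose power reaches the target; this Python sketch (SciPy assumed) mirrors n = λ/w^{2}:

```python
from scipy.stats import chi2, ncx2

def min_sample_size(w, df, power=0.80, alpha=0.05):
    """Smallest n for which the chi-square test attains the target power."""
    crit = chi2.ppf(1 - alpha, df)                 # critical value under H0
    n = 2
    while ncx2.sf(crit, df, n * w ** 2) < power:   # power at lambda = n*w^2
        n += 1
    return n

print(min_sample_size(0.30, 2))   # large effect, df = 2
print(min_sample_size(0.10, 2))   # small effect, df = 2: well over 500
```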
Table 1: Cohen’s convention of classification of effect size

Table 2: Epidemiological data on 535 children
Power and Sample Size Determination
Table 2 shows the epidemiological data on 535 children
as contained in Nelson et al. (2005). The children
were cross-classified according to their race (Black, White and Others) and
risk of becoming obese. Based on Table 2, we want to test
whether the race of the children is independent of being at risk of becoming
obese or not.
The test statistic is:

Q = Σ_{i}Σ_{j} (o_{ij} − e_{ij})^{2}/e_{ij}

where, Q is chi-square distributed with 2 degrees of freedom. The test can
be performed in R using the chisq.test() function. With a p-value of 0.000, we can therefore conclude, at the 0.05
level of significance, that there exists a relationship between the race of a
child and the risk of becoming/developing obesity. The reliability of this conclusion
can be verified by computing the power of the test through power analysis. As discussed
earlier, the power of a chi-square test depends on its non-centrality parameter,
here estimated by Q. The power of the test is the right-tail probability under the alternative hypothesis
characterized by Q (Bergerud and Sit, 1992). The power
can be obtained in R using the non-central chi-square distribution function pchisq() with its ncp argument.
At α = 0.05, the power of the above test is 0.9905. The interpretation
is that, if race and risk of becoming obese were indeed related to the extent
suggested by the data in Table 2, the test would
detect the relationship 99.05% of the time. The high power obtained makes the
conclusion more reliable.
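Since the raw counts of Table 2 are not reproduced here, the reported figures can still be cross-checked in Python (SciPy assumed): from w^{2} = Q/n, the reported w = 0.2009 and n = 535 imply Q ≈ 21.6, and the p-value and power follow:

```python
from scipy.stats import chi2, ncx2

n, w, df, alpha = 535, 0.2009, 2, 0.05
Q = n * w ** 2                      # w^2 = Q/n, so Q = n*w^2 (about 21.6)
p_value = chi2.sf(Q, df)            # right-tail p-value under H0
crit = chi2.ppf(1 - alpha, df)      # critical value at alpha = 0.05
power = ncx2.sf(crit, df, Q)        # power with non-centrality lambda = Q
print(Q, p_value, power)
```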
Table 3: Computed power values for different sample sizes (n) and effect sizes (w) when the degrees of freedom, df = 2

Table 4: Modified epidemiological data on 535 children presented in Table 2
The high power of the test results from the large values of the non-centrality
parameter (Q) and the sample size (n). Using Cohen’s classification of
effect size, which is a function of the non-centrality parameter and the sample size,
the above test gives effect size w = 0.2009, which is classified as medium according
to Cohen (1988). For this effect size, power was calculated
for sample sizes of 100, 150, 200, 250, 300, 350, 400, 450 and 500.
Likewise, for the same set of sample sizes, effect sizes of 0.10 (small
effect) and 0.30 (large effect) were used to obtain the power values; the
results are presented in Table 3.
The degrees of freedom of the chi-square test for the data in Table 2 were reduced by collapsing two categories (White and Others) into Non-black; the modified table is presented in Table 4. The same hypothesis was tested and the test statistic and p-value obtained.
The same conclusion of a relationship between race and risk of becoming obese
is established. At α = 0.05, the power of the test is 0.9893, though with a
smaller effect size of w = 0.1843. Table 5 gives the computed power for
different sample sizes at this effect size and when the effect size is small (0.10)
and large (0.30).
RESULTS AND DISCUSSION
The power of a test increases as the sample size increases, irrespective of
the non-centrality parameter value or effect size, as shown in Tables
3 and 5 for the 2×3 and 2×2 contingency tables, respectively.
From Table 3, with 2 degrees of freedom, a sample size of more than 500 is
required to obtain a power of 0.80 for a small effect size, while sample sizes
of 250 and 150 are required to attain the same power for the same test with
medium and large effect sizes, respectively.
When the degree of freedom is 1, as shown in Table 5, the test attains higher power than when the degrees of freedom are 2. For instance, for a large effect size of 0.30, a sample size of 100 gave a power of 0.8508 when the degrees of freedom is 1, while the same sample size gave a power of 0.7706 when the degrees of freedom is 2. Also, for a small effect size (w = 0.10), a sample size of 500 gave a power of 0.5037 when the degrees of freedom is 2, and 0.6088 when the degrees of freedom is 1.
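The degrees-of-freedom comparison in this paragraph can be reproduced numerically with a short Python check (SciPy assumed):

```python
from scipy.stats import chi2, ncx2

def power_chisq(n, w, df, alpha=0.05):
    """Power of the chi-square test with non-centrality n*w^2."""
    return ncx2.sf(chi2.ppf(1 - alpha, df), df, n * w ** 2)

p_df1 = power_chisq(100, 0.30, 1)   # about 0.85, as in Table 5
p_df2 = power_chisq(100, 0.30, 2)   # about 0.77, as in Table 3
print(p_df1, p_df2)
```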
Table 5: Computed power values for different sample sizes (n) and effect sizes (w) when the degree of freedom, df = 1
CONCLUSION
In the test of independence between two categorical variables, apart from the non-centrality parameter (effect size), which determines the power and required sample size of the test, the number of categories of the variables also affects the sample size and the power of the test.