Research Article
 

Pilot Evaluation of the First Iranian Sampling Software for Health Research (Yasin Sampling Software)



H. Sadeghi-Bazargani, K. Mohammad, S. Arshi, S.R. Majdzadeh and S. Mohammadi
 
ABSTRACT

Several software packages for sample size determination have been developed, but nearly all of them require a user with good knowledge of biostatistics and statistical tests. The aim of this research was to evaluate the first version of a new software package in comparison with some other available sample size software packages. The software was programmed in Visual Basic 6. The algorithms used for sample size calculation were taken mainly from recommendations in various WHO publications. For the preliminary evaluation, 23 academic staff members of Ardabil University of Medical Sciences and the Public Health School of Tehran University of Medical Sciences were selected. The capabilities of the software were presented to them, and they then completed a questionnaire designed to evaluate it.


 
  How to cite this article:

H. Sadeghi-Bazargani, K. Mohammad, S. Arshi, S.R. Majdzadeh and S. Mohammadi, 2007. Pilot Evaluation of the First Iranian Sampling Software for Health Research (Yasin Sampling Software). Information Technology Journal, 6: 135-141.

DOI: 10.3923/itj.2007.135.141

URL: https://scialert.net/abstract/?doi=itj.2007.135.141

INTRODUCTION

Sampling and sample size determination are among the major problems in health research. Sample size determination is defined as the mathematical process of deciding, before a study begins, how many subjects should be studied (Last, 1995). The most important question asked by a health researcher when designing and planning a study or survey is: how large a sample do we need? The answer depends on the objectives, nature and scope of the study and on the expected results. Many health researchers, some of whom are not academic staff, have difficulty determining sample size. Estimating the power of a test is another important problem, particularly when the researcher does not find statistically significant differences. In addition, most researchers who read results presented or published by others are faced with summarized data and, because of the complexity of the statistical formulas, often do not know whether they can rely on what they read. Obtaining a fairly large set of random numbers, especially within a given range, is another time-consuming if not difficult task. There has therefore always been a need for sample size software packages that facilitate these processes. Several packages have been developed, but nearly all of them require a user with good knowledge of biostatistics and statistical tests. Based on some WHO publications, a software package has been designed both for health researchers without a good knowledge of biostatistics and for biostatisticians and epidemiologists. The designer named it Yasin sampling software. The aim of this paper is to evaluate the first version of this software in comparison with some other available sample size software packages.

MATERIALS AND METHODS

The software was programmed in Visual Basic 6. Standard tables such as the Z table were implemented through self-defined functions and modules. SQL was used to extract data from an Access database through ADO technology. ActiveX controls, DLLs and the MSChart, PPT and Mplayer components were used wherever needed. The RND function with For-Next and Do-Until loops was used to generate random numbers.
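The random number facility described above can be illustrated with a minimal Python sketch. It is not the Visual Basic code of the software; the function name `random_numbers` and its arguments (including the optional `unique` and `seed` parameters) are hypothetical and chosen only for illustration.

```python
import random

def random_numbers(count, lower, upper, unique=False, seed=None):
    """Return `count` random integers between lower and upper (inclusive).

    Hypothetical helper: mirrors the idea of drawing a requested count of
    numbers within given limits, not the actual Yasin implementation.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    rng = random.Random(seed)  # a seed makes the draw reproducible
    if unique:
        if count > upper - lower + 1:
            raise ValueError("range too small for that many unique numbers")
        # sampling without replacement, e.g. for selecting subjects from a list
        return rng.sample(range(lower, upper + 1), count)
    # sampling with replacement
    return [rng.randint(lower, upper) for _ in range(count)]

print(random_numbers(10, 1, 500, unique=True, seed=1))
```

Drawing without replacement (`unique=True`) is the variant one would typically want when the numbers identify subjects in a sampling frame.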

The algorithms used for sample size calculation were taken mainly from recommendations in various WHO publications, chiefly Lwanga and Lemeshow's “Sample size determination in health studies”. The algorithms were improved using recommendations from a professor of biostatistics and an epidemiologist, and their applicability was tested against specific examples of sample size calculation found by reviewing 5 peer-reviewed medical journals.

For the preliminary evaluation, 23 academic staff members of Ardabil University of Medical Sciences and the Public Health School of Tehran University of Medical Sciences were selected. The capabilities of the software were presented to them, and they then completed a questionnaire designed to evaluate it.

The data collected were analyzed with SPSS software. The formulas used for sample size estimation and the key descriptions were mostly taken from the Lemeshow textbook.

Capabilities of software:

Calculating sample size for a wide range of health studies such as descriptive studies, cohort studies, case-control studies and clinical trials (Fig. 1).
Producing a complete printable report and sample size estimation graphs for different values of power or significance level (Fig. 2).
Calculating the power of a study.
Performing major statistical tests on summarized data.
Producing the required count of random numbers between given lower and upper limits.
Presenting educational material about different aspects of sampling in PowerPoint format.
Giving some useful examples of sample size calculation in different situations to help young researchers become acquainted with sampling procedures.

All the above-mentioned capabilities are provided in four parts, the first of which is sample size determination. The algorithmic approach to the different sample size estimation situations was adopted from WHO publications in the field of research methodology, mostly from the sample size determination textbook edited by Lemeshow, following the recommendations of two professors of biostatistics and epidemiology. The validity of the calculations was tested by comparing the results with the examples given by Lemeshow.

The grouped situations for sample size calculation and the formulas used are as follows:

One-sample situations: Three components for sample size estimation were provided for the one-sample situations.

The first component is calculation of the sample size for estimating a proportion or the prevalence of a disease in a given population. The formula used was:

[Formula image not available]
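As an illustration of this first component, the sketch below assumes the standard formula for estimating a proportion with specified absolute precision given by Lwanga and Lemeshow (1991), n = z²(1-α/2) P(1-P)/d², where P is the anticipated proportion and d the absolute precision. Whether the software uses exactly this form is an assumption, and the function name and arguments are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_estimate_proportion(p, d, confidence=0.95):
    """Sample size to estimate a proportion p to within +/- d (absolute).

    Assumed formula: n = z^2 * p * (1 - p) / d^2, with z = z_{1 - alpha/2}.
    """
    if not 0 < p < 1 or d <= 0:
        raise ValueError("p must be in (0, 1) and d must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided normal quantile
    return ceil(z ** 2 * p * (1 - p) / d ** 2)          # round up to whole subjects

print(n_estimate_proportion(0.5, 0.05))  # 385
```

Using P = 0.5 gives the largest (most conservative) sample size, which is the usual choice when no prior estimate of the prevalence is available.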

The second component is estimation of the mean of a quantitative measure in a given population.

Fig. 1: Interface view for the sample size calculation section of the software

Fig. 2: Printable report and sample size estimation graphs for different values of power and significance level

The third component is related to hypothesis testing of a population proportion. As stated by Lemeshow, this section applies to studies designed to test the hypothesis that the proportion of individuals in a population possessing a given characteristic is equal to a particular value.

The formulas used for one-sided and two-sided tests were:

[Formula image not available]
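A minimal sketch of this calculation follows, assuming the standard formula from Lwanga and Lemeshow (1991): n = [z(1-α) √(P0(1-P0)) + z(1-β) √(Pa(1-Pa))]² / (Pa-P0)², with z(1-α) replaced by z(1-α/2) for a two-sided test. Here P0 is the hypothesized proportion and Pa the anticipated true proportion; the function name and defaults are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_test_proportion(p0, pa, alpha=0.05, power=0.90, two_sided=False):
    """Sample size for testing H0: P = p0 when the true proportion is pa.

    Assumed formula (normal approximation); not taken from the Yasin source.
    """
    if p0 == pa:
        raise ValueError("p0 and pa must differ")
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    z_beta = norm.inv_cdf(power)  # power = 1 - beta
    numerator = (z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(pa * (1 - pa))) ** 2
    return ceil(numerator / (pa - p0) ** 2)

print(n_test_proportion(0.30, 0.20))  # one-sided test: 161
```

For the same inputs the two-sided option always returns a larger sample size, because z(1-α/2) exceeds z(1-α).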

Two-sample situations: Estimation of the difference between two population proportions with specified absolute precision. The formulas used were:

[Formula image not available]

or

[Formula image not available]

where

[Formula image not available]
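For the two-sample estimation case, the sketch below assumes the standard formula n = z²(1-α/2) [P1(1-P1) + P2(1-P2)] / d² per group, as given by Lwanga and Lemeshow (1991), where d is the absolute precision of the estimated difference. The function name is hypothetical, and the code is an illustration rather than the software's own routine.

```python
from math import ceil
from statistics import NormalDist

def n_estimate_difference(p1, p2, d, confidence=0.95):
    """Per-group sample size to estimate P1 - P2 to within +/- d.

    Assumed formula: n = z^2 * [p1(1 - p1) + p2(1 - p2)] / d^2.
    """
    if d <= 0:
        raise ValueError("d must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)  # sum of the two binomial variances
    return ceil(z ** 2 * variance_sum / d ** 2)

print(n_estimate_difference(0.5, 0.5, 0.05))  # 769
```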

Hypothesis tests for two population proportions: This section is used for studies that test the hypothesis that two population proportions are equal.

The formulas used for one-sided and two-sided tests were:

[Formula image not available]
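The following sketch illustrates such a calculation under the assumption that the usual pooled-variance normal approximation is used: n = [z(1-α) √(2P̄(1-P̄)) + z(1-β) √(P1(1-P1) + P2(1-P2))]² / (P1-P2)² per group, with P̄ = (P1+P2)/2 and z(1-α/2) in place of z(1-α) for a two-sided test. The function name and defaults are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_test_two_proportions(p1, p2, alpha=0.05, power=0.90, two_sided=True):
    """Per-group sample size for testing H0: P1 = P2 (equal group sizes).

    Assumed formula (pooled variance under H0); not taken from the Yasin source.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    z_beta = norm.inv_cdf(power)
    p_bar = (p1 + p2) / 2                       # pooled proportion under H0
    term_h0 = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    term_h1 = z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ceil((term_h0 + term_h1) ** 2 / (p1 - p2) ** 2)

print(n_test_two_proportions(0.60, 0.50))
```

Calling the function over a range of `power` or `alpha` values produces the kind of series that a sample size estimation graph would plot.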

Case-control studies: In these studies, sample size was estimated on the basis of either the exposure proportions in the control and case groups, or the exposure proportion in the control group together with the odds ratio (OR). The other required inputs were the power, the confidence level and the ratio of the number of controls to cases.
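When only the exposure proportion among controls and the odds ratio are entered, the exposure proportion among cases follows from the definition of the odds ratio. A minimal sketch of that conversion is given below; the function name is hypothetical, and how the software performs this step internally is not stated in the paper.

```python
def exposure_in_cases(p_controls, odds_ratio):
    """Exposure proportion among cases implied by the control exposure and OR.

    From OR = [p1 / (1 - p1)] / [p0 / (1 - p0)] it follows that
    p1 = OR * p0 / (1 + p0 * (OR - 1)).
    """
    if not 0 < p_controls < 1 or odds_ratio <= 0:
        raise ValueError("p_controls must be in (0, 1) and odds_ratio positive")
    return odds_ratio * p_controls / (1 + p_controls * (odds_ratio - 1))

p1 = exposure_in_cases(0.30, 2.0)
print(round(p1, 4))  # 0.4615
```

The resulting pair of proportions can then be passed to a two-proportion sample size formula.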

Cohort and clinical trial studies: Sample size is calculated as for case-control studies, except that some specific inputs, such as the loss to follow-up ratio in cohort studies, are also entered.
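The paper does not state how the loss to follow-up ratio enters the calculation. A common convention, assumed here purely for illustration, is to inflate the computed sample size by 1/(1 - loss) so that the expected number of subjects completing follow-up still meets the requirement. The function name is hypothetical.

```python
from math import ceil

def adjust_for_loss(n, loss_ratio):
    """Inflate a sample size n for an anticipated loss to follow-up.

    Assumed convention: n_adjusted = n / (1 - loss_ratio), rounded up.
    """
    if not 0 <= loss_ratio < 1:
        raise ValueError("loss_ratio must be in [0, 1)")
    return ceil(n / (1 - loss_ratio))

print(adjust_for_loss(200, 0.10))  # 223
```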

Lot quality assurance sampling: Accepting a population prevalence as not exceeding a specified value. This section outlines how to determine the minimum sample size that should be selected from a given population so that, if a particular characteristic is found in no more than a specified number of sampled individuals, the prevalence of the characteristic in the population can be accepted as not exceeding a certain value. The formulas used were:

The value of n is obtained by solution of the inequality

[Formula image not available]

where M = NP, for a finite population; or

[Formula image not available]

i.e.,

[Formula image not available]

or

[Formula image not available]

for an infinite population.
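For the infinite-population case, a sketch of the search for n is shown below. It assumes the usual binomial form of the lot quality assurance criterion: the smallest n for which the probability of observing at most d* individuals with the characteristic, when the true prevalence is P, falls below α. The symbols d* and α and the function name are assumptions introduced for illustration.

```python
from math import comb

def lqas_sample_size(p, alpha=0.05, d_star=0, max_n=100_000):
    """Smallest n with P(X <= d_star) < alpha for X ~ Binomial(n, p).

    Assumed criterion for an infinite population; for d_star = 0 it reduces
    to (1 - p)^n < alpha, i.e. n > log(alpha) / log(1 - p).
    """
    if not 0 < p < 1 or not 0 < alpha < 1:
        raise ValueError("p and alpha must be in (0, 1)")
    for n in range(d_star + 1, max_n + 1):
        # cumulative binomial probability of d_star or fewer positives
        tail = sum(comb(n, d) * p ** d * (1 - p) ** (n - d)
                   for d in range(d_star + 1))
        if tail < alpha:
            return n
    raise ValueError("no sample size up to max_n satisfies the criterion")

print(lqas_sample_size(0.20, alpha=0.05, d_star=0))  # 14
```

Allowing more positives in the sample (a larger d*) requires a larger n for the same P and α.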

RESULTS

Of the evaluation participants, 2 were associate professors, 7 were assistant professors and the others were university lecturers and health researchers. The participants' opinions about the quality of the graphical views used are given in Table 1.

Regarding the use of music in the software, 61.9% agreed with it, 23.8% were against it and 14.3% had no opinion.

Regarding the application language, 42.9% of evaluators preferred it to be available in both English and Farsi, while the others were equally divided between English only and Farsi only.

Sample size estimation graphs were rated as necessary or quite necessary by 76% of participants, and only one participant considered them unnecessary. The examples and Help available in the software were rated as excellent by 19%, and no one rated them as weak or poor.

Fig. 3: Distribution of the overall ideas of participants in evaluating Yasin health research software

Table 1: Ideas of evaluators about the quality of graphical views of software

All participants except one stated that presenting the formulas in the printable report is necessary. The power determination section of the application was rated as very useful by 38.1% and as useful by 57% (Table 1). Nearly half of the evaluators rated the random numbers section of the application as excellent and, except for one person who had no opinion, the other half rated it as good.

The distribution of the participants' overall opinions of Yasin health research software as a useful research tool is given in Fig. 3.

DISCUSSION

Several other software applications have been developed for sample size calculation (CDC/EPI Info 2002; software, 2002; Dupont and Plummer, 2003; Iwane and Plante, 1997; Luttke, 1991; Lwanga and Lemeshow, 1991; Arshi and Sadeghi, 2003). In this discussion we compare the capabilities of Yasin software with those of some other sample size software applications.

REPLI: A program written in elementary BASIC that calculates the approximate sample size required to detect a desired difference between any two group means in an experiment with n groups, for a given probability and at three significance levels of the difference between means. The current program consists of three blocks that carry out (1) the reading of data for the t-value matrices, (2) the acceptance of parameters (coefficient of variation of means, difference to be detected, probability of detection, number of groups in the experiment and a pre-selected threshold) and (3) the calculation proper, the output on screen and the option to rerun the program immediately with new parameters.

EPI Info statcalc: Both the DOS and Windows versions provide the same capabilities for sample size calculation. The two main capabilities of this software are (1) estimation of sample size in population surveys, when the p-value, precision, confidence level and target population are specified and (2) comparison of the difference between two ratios. The software does not provide a complete algorithmic approach or help.

PS: This software is organized by type of statistical test and is very complete compared with Epi Info. It uses a thorough algorithmic approach. It is a Windows application written in Visual Basic. However, PS users need to be well acquainted with statistical methods, so it is not easy to use for many health researchers.

Nsurv: Nsurv is specifically designed for computing sample size and power for two-group studies with exponentially distributed time-to-event data. Sample size calculations are based on the method of Lachin and Foulkes. Nsurv is a companion package to N, which computes sample size and power for studies with normally distributed measurement data or binary data. For time-to-event data, Nsurv calculates the quantity of interest, such as sample size, when the user specifies a one-sided or two-sided test, an efficacy or equivalence study design, cumulative event rates for each treatment group at a specified time, the ratio of the sample sizes for the two treatment groups, the cumulative percent lost to follow-up (exponentially distributed) at a specified time common to both groups and the type I and II error rates. The user must choose one of four configurations of accrual and follow-up periods, which then constrains the available options in later menus.

Clinical Trials Design Program (CTDP): CTDP can perform a wide array of calculations for a variety of trial designs, including those with endpoints that are time-to-event, binary or normally distributed means, as well as some epidemiological designs. We address survival designs only. The package provides the user with three methods for time-to-event outcomes based on three different statistical approaches. The input/output screens are similar for the different methods, with the available parameter options determined by the statistical approach.

EGRET SIZ: Egret Siz is an extensive software package with several screens to navigate through when calculating sample size and power based on the Cox proportional hazards model. Power and sample size calculations follow the approach of Self, Mauritsen and Ohara (Self et al., 1992), which is based on a noncentral Chi-square approximation to the distribution of the Likelihood Ratio Test (LRT) statistic. Egret Siz will also calculate sample size and power for models that are prospective logistic, unmatched retrospective logistic, conditional logistic for matched sets and Poisson regression for subject-time data. Unlike the other software packages evaluated, Egret Siz allows more than two treatment groups and specification of covariates.

Power: Power, a simple DOS prompt-driven package, calculates sample size, power or the detectable difference for time-to-event data based on the formula in Schoenfeld and Richter (1982) for exponentially distributed event times. It also performs calculations for matched and unmatched study designs with binary or continuous outcome variables.

Power Analysis and Sample Size (PASS): Pass calculates power, sample size and other design parameters for a broad range of study designs with outcome variables that are continuous, proportions or time-to-event, and for an array of analyses including those based on analysis of variance (ANOVA), linear regression, correlation coefficients, logistic regression, matched and unmatched analyses and log-rank tests. As with Nsurv, Pass uses the method of Lachin and Foulkes for exponentially distributed time-to-event data.

Ex-Sample: Ex-Sample, like Power, calculates sample size using the method of Schoenfeld and Richter. The design parameters are entered into a single survival design screen, which is reached through two other screens.

Yasin software is a Windows application compatible with all Windows operating systems available up to 2006. Except for PS, all the other software applications discussed above are DOS applications. Although Windows applications offer rich graphical capabilities, in our pilot evaluation most of the evaluators wanted the graphical interface of Yasin to be improved, and in face-to-face interviews some of them said that the first screen of the software is somewhat complex, whereas they preferred a simple graphical view for scientific software of this kind.

Both Yasin and PS provide an estimation graph for different powers and significance levels. Egret Siz and Pass, and Power to a more limited extent, can perform calculations over a user-specified range of values for design parameters, e.g., power calculations for various sample sizes or sample size calculations for various type I and II error rates. For cohort, case-control and clinical trial studies comparing proportions, Yasin can estimate the power of the study if the proportions and sample size are available. Future versions of the software should also include power calculation for comparisons of means. The other packages return only a single calculated sample size or power; thus, a user who wants to consider many design scenarios must run each one separately. Yasin offers both one-sided and two-sided hypothesis options in the calculation of sample size. Most of the software applications discussed have the same capability, but Nsurv’s two-sided option gives higher sample sizes than the other software packages because the hazard difference under the two-sided alternative is conceptualized in a different way (Frick, 1991). Nsurv makes conservative assumptions and the sample size is larger since fewer events are predicted.

The help in Yasin is quite extensive and provides examples from WHO publications in a PowerPoint slide show format. The examples and explanations help the user learn not only how to use the software but also, in some cases, basic research methodology. The software designer is advised to provide an extensive printable embedded manual or a complementary handbook as well. Nsurv provides such a handbook with the software package.

Suggestions for improvement of Yasin software:

A separate section on survival studies should be included.
Power determination should be expanded to cover studies comparing means as well as proportions.
To increase the validity of the software, at least 10 well-known biostatisticians and epidemiologists could be invited to discuss the methods used and give recommendations for improving the software.
A web-based application should be developed.
An accompanying handbook should be made available.
Sample size determination for clinical trials with more than two comparison groups should be included.
An online help system for answering users' questions could be developed.
A special section for entomology and toxicology studies could be added.

CONCLUSIONS

After some improvements, Yasin sampling software can be a very useful and easy-to-use scientific tool in the field of health research, especially for those without great expertise in biostatistics.

ACKNOWLEDGMENT

We thank Mrs. Amini and Mr. Amani for their kind help and recommendations. We also thank Ardabil University of Medical Sciences for the financial support provided. All authors declare that the patent rights to the software belong only to the software designer, who is the first author of this paper, or to any other person named by him as having helped in the software design. The designer dedicates the software to Dr. Parvaneh Vosough, Professor of Pediatric Oncology, who has saved the lives of many cancer patients, including Yasin.

REFERENCES

1:  Arshi, S. and H. Sadeghi, 2003. A Brief Manual of EPI Info 2000 Software. Bageh Rezvan publishers, Ardabil

2:  Dupont, W.D. and W.D. Plummer, 2003. Software: Ps-Power and sample size calculations. Main Power Sample Sizer, 2: 1-30.

3:  Frick, H., 1991. On Lachin's formulae for sample sizes of survival tests. Commun. Statist Theory Methodol., 20: 2267-2280.

4:  Iwane, M. and K. Plante, 1997. A user's review of commercial sample size software for design of biomedical studies using survival data. Control. Clinic. Trials, 18: 65-83.

5:  Lachin, J.M. and M.A. Foulkes, 1986. Evaluation of sample size and power for analysis of survival and allowance for nonuniform patient entry, losses to follow-up, noncompliance and stratification. Biometrics, 42: 507-519.

6:  Last, J.M., 1995. A Dictionary of Epidemiology. 3rd Edn., Oxford University Press, Oxford

7:  Luttke, A., 1991. REPLI: A program in basic for determination of approximate sample size. Intl. J. Biomed. Comput., 27: 193-200.

8:  Lwanga, S.K. and S. Lemeshow, 1991. Sample Size Determination in Health Studies: A Practical Manual. World Health Organization, Geneva, Switzerland, ISBN-13: 9789241544054, Pages: 80

9:  Schoenfeld, D. and J. Richter, 1982. Nomograms for calculating the number of patients needed for a clinical trial with survival as an endpoint. Biometrics, 38: 163-170.

10:  Self, S., R. Mauritsen and J. Ohara, 1992. Power calculations of likelihood ratio tests in generalized linear models. Biometrics, 48: 31-39.
