Journal of Applied Sciences

Year: 2016 | Volume: 16 | Issue: 11 | Page No.: 496-503
DOI: 10.3923/jas.2016.496.503
Exploring Age Distribution Pattern of Female Breast Cancer Patients in Assam, India Using Gamma Probability Distribution Model
N. Rajbongshi, D.C. Nath and L.B. Mahanta

Abstract: Background and Objective: Worldwide, breast cancer is by far the most frequent cancer among women and causes numerous deaths every year, so every aspect of the disease deserves study. Monitoring the age distribution pattern of breast cancer occurrence provides guidance to cancer epidemiologists and supports effective planning of breast cancer control. The present study attempts to fit the age distribution pattern to a probability distribution model that gives a probability curve for the number of observations occurring in each age group. Materials and Methods: A descriptive study was also carried out on basic demographic information of the female breast cancer patients: place of residence, marital status, occupation and number of children. Retrospective data available for 2010-2012 at Dr. B. Borooah Cancer Institute, Assam, India were included in the study. Data were extracted from 1153 clinically diagnosed cases of that period. Goodness of fit was checked using the chi-square goodness-of-fit test, and an independent sample t-test was used to compare the mean age of rural and urban patients. Results: The most affected age group was 38-43 years (18.6% of patients); most patients were married (94%) and from rural areas (66.2%). Most patients (63.8%) had fewer than 2 children. The gamma probability distribution model fitted the age distribution data of the patients well (Cal χ2 = 15.8899 and p-value = 0.2271). A difference in mean age between the two groups of residence was observed at the 10% level of significance (p-value = 0.061). Conclusion: Patients from rural areas may ignore the initial symptoms of the disease because of lack of awareness and financial problems, so that by the time they consult a doctor the disease has already developed into a malignant tumour. Lack of awareness and income level may therefore be responsible factors for this outcome. The most affected age group in this study is slightly younger than in other studies. Using the gamma probability model, it is possible to predict the probability of occurrence of breast cancer for a woman of a particular age in North-East India.

How to cite this article
N. Rajbongshi, D.C. Nath and L.B. Mahanta, 2016. Exploring Age Distribution Pattern of Female Breast Cancer Patients in Assam, India Using Gamma Probability Distribution Model. Journal of Applied Sciences, 16: 496-503.

Keywords: independent sample t-test, gamma probability distribution, risk factor, age distribution and Breast cancer

INTRODUCTION

Cancer imposes a heavy burden on public health and has become one of the leading causes of death worldwide. Cancer is a disease caused by uncontrolled division of abnormal cells in a part of the body; when the growing cells form a lump in the breast, it is called breast cancer [1]. Breast cancer is more prevalent among the female population in every country where it has been studied. It is curable if detected at an early stage, and screening is the key element of early detection. Unfortunately, in India the survival rate is very low because of a lack of screening awareness [2]. In 2012, an estimated 1,676,633 new cases of breast cancer were diagnosed worldwide, which is 25.16% of all new cancer cases in women (excluding non-melanoma skin cancer) [3,4]. The healthcare burden of breast cancer in India is also increasing. According to the International Agency for Research on Cancer (WHO) report, the number of new cases of breast cancer increased from 115,251 (22.2%) in 2008 to 144,937 (27.0%) in 2012, and 17,000 more deaths occurred in 2012 than in 2008 [5]. In North-East India, there is wide disparity in both the diagnosis and treatment of cancers, mostly due to socio-economic conditions, lack of awareness and difficulty in accessing facilities for cancer diagnosis and treatment [6]. Breast cancer also ranks highest, with a relative proportion of 17.5%, in the hospital-based cancer registry of the Dr. B. Borooah Cancer Institute [7], which is the Regional Cancer Care Centre for the entire North-East region of India.

Increasing age is a primary risk factor for breast cancer [1,8], and earlier studies reveal that the risk factors may vary from one age group of patients to another. Mayberry [9] investigated the risk factors of breast cancer in different age groups. Wingo et al. [10] also found that the relationship between the risk of breast cancer and oral contraceptive use appeared to vary by age at diagnosis. So, there is a dependency between the age distribution pattern and the risk factors of breast cancer cases [9,10].

The present study was carried out in Assam, in the North-East of India. The North-East region differs from the rest of the country in its customs, its assorted ethnic groups with their typical food habits and life-styles, and its varying types and patterns of tobacco use. Moreover, because of its unique and strategic geographic location and age-old history of migrations, it is considered a genetic pool. The present study aims to obtain an appropriate statistical distribution model for the age pattern of female breast cancer occurrences and to test how well the chosen distribution fits the data. Previous studies reported only frequencies for different age groups. The study also examines the basic demographic information of the patients and compares the mean age at occurrence of breast cancer between two places of residence, viz., rural and urban. This study will help draw the attention of cancer epidemiologists to the most prevalent age group of breast cancer patients in the study region, as probability estimation of breast cancer occurrences for a particular age group can be a useful tool in the development of a risk prediction model.

MATERIALS AND METHODS

For the present study, a retrospective analysis was carried out using secondary data available for 2010-2012 at Dr. B. Borooah Cancer Institute. Case files of the clinically diagnosed breast cancer patients were reviewed and information on age, marital status, occupation, child birth and place of residence was abstracted. A total of 1261 cases were registered during this period. Of these, 94 cases from outside Assam and 14 male patients were excluded, leaving 1153 cases for the study. As this was a retrospective study and all patients were treated under the standard departmental protocol, no ethical approval was required.

Basic information on the breast cancer cases is presented in Table 1. To examine the age distribution pattern, the ages of the breast cancer cases were grouped into 14 subintervals and the corresponding observed frequency of cases was obtained for each. Examination of the frequency curve of the observed data suggested that a good fit in each age group could be obtained with a form of the gamma probability distribution. The parameters of the gamma probability distribution model were estimated using the method of moments, and the goodness of fit was checked using the chi-square goodness-of-fit test. The procedures for parameter estimation and the goodness-of-fit test are briefly described below.

Table 1: Frequency table for demographic information of female breast cancer patients

Fitting the age distribution data using gamma probability distribution model: The two-parameter gamma distribution has one shape and one scale parameter. The random variable X follows a gamma distribution with shape parameter α>0 and scale parameter β>0 if it has the following Probability Density Function (PDF):

$$f(x;\alpha,\beta)=\frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,x^{\alpha-1}e^{-x/\beta},\qquad x>0 \tag{1}$$

It will be denoted by gamma (α,β), where Γ(α) is the gamma function, expressed as:

$$\Gamma(\alpha)=\int_{0}^{\infty}t^{\alpha-1}e^{-t}\,dt$$

It is well known that the PDF of gamma (α,β) can take different shapes but is always unimodal. The mean and variance of the gamma distribution are:

$$E(X)=\alpha\beta$$

and:

$$\mathrm{Var}(X)=\alpha\beta^{2}$$

The cumulative distribution function of the gamma distribution is:

$$F(x)=\int_{0}^{x}f(u;\alpha,\beta)\,du=\frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\int_{0}^{x}u^{\alpha-1}e^{-u/\beta}\,du$$

The Gauss-Laguerre integration method was used to compute values of the gamma function. These values were compared with the tabulated values available in Tracts for Computers [11]. The values obtained by the computer program were accurate up to 4 decimal places. Simpson's 1/3 rule was used to evaluate the cumulative distribution function at different values of the variable x.
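The numerical evaluation of the cumulative distribution function can be sketched in Python. This is a minimal illustration, not the authors' C program. It assumes the shape-scale form of the gamma density with α ≥ 1, and it swaps in the standard-library `math.lgamma` in place of a Gauss-Laguerre evaluation of the gamma function. The function names `gamma_pdf` and `gamma_cdf_simpson` and the default number of subintervals are hypothetical choices.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Density of gamma(alpha, beta), assuming shape alpha >= 1 and scale beta > 0."""
    if x < 0:
        return 0.0
    if x == 0:
        # For alpha > 1 the density vanishes at zero; for alpha == 1 it equals 1/beta.
        return 1.0 / beta if alpha == 1 else 0.0
    # Work on the log scale to avoid overflow for large shape values.
    log_f = ((alpha - 1) * math.log(x) - x / beta
             - alpha * math.log(beta) - math.lgamma(alpha))
    return math.exp(log_f)


def gamma_cdf_simpson(x, alpha, beta, n=1000):
    """Approximate F(x) with Simpson's 1/3 rule on [0, x] using n subintervals."""
    if x <= 0:
        return 0.0
    if n % 2:
        n += 1  # Simpson's 1/3 rule needs an even number of subintervals
    h = x / n
    total = gamma_pdf(0.0, alpha, beta) + gamma_pdf(x, alpha, beta)
    for i in range(1, n):
        weight = 4 if i % 2 else 2  # odd interior nodes get 4, even ones get 2
        total += weight * gamma_pdf(i * h, alpha, beta)
    return total * h / 3
```

With α = 1 the density reduces to an exponential, so `gamma_cdf_simpson(1.0, 1.0, 1.0)` can be checked against the closed form 1 - e^{-1}.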

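Method of moments (MOM): Consider the method of moments for parameter estimation. Let x1, x2,..., xn be an i.i.d. sample from gamma (α,β) with moment generating function:

$$M_X(t)=(1-\beta t)^{-\alpha},\qquad t<\frac{1}{\beta}$$

The moments of the gamma distribution are:

$$E(X)=\alpha\beta \tag{2}$$

$$E(X^{2})=\alpha(\alpha+1)\beta^{2} \tag{3}$$

$$\mathrm{Var}(X)=E(X^{2})-[E(X)]^{2}=\alpha\beta^{2} \tag{4}$$

Using the method of moments, the estimators are:

$$\hat{\alpha}=\frac{\bar{x}^{2}}{s^{2}} \tag{5}$$

$$\hat{\beta}=\frac{s^{2}}{\bar{x}} \tag{6}$$

where x̄ and s² are the sample mean and sample variance calculated from the observed data.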
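As a rough sketch of the method-of-moments step in Python: assuming β is a scale parameter, so that the mean is αβ and the variance is αβ², equating these to the sample mean and variance gives shape = mean²/variance and scale = variance/mean. The function name `gamma_mom`, the use of divisor n for the variance, and the toy sample are all assumptions made for illustration; none of them come from the study.

```python
def gamma_mom(values):
    """Method-of-moments estimates (shape, scale) for a gamma sample."""
    n = len(values)
    mean = sum(values) / n
    # Second central sample moment (divisor n is an assumption here).
    var = sum((v - mean) ** 2 for v in values) / n
    shape = mean ** 2 / var
    scale = var / mean
    return shape, scale


# Hypothetical toy sample, not data from the study.
sample = [3.0, 5.0, 3.0, 5.0]
shape_hat, scale_hat = gamma_mom(sample)
```

For this toy sample the mean is 4 and the variance is 1, so the estimates are shape 16 and scale 0.25, and their product recovers the sample mean.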

Goodness-of-fit test: Goodness-of-fit test procedures are intended to detect a significant difference between the observed (empirical) frequency of occurrence of an item and the theoretical (hypothesized) pattern of occurrence of that item. This study takes as its working hypothesis that the gamma distribution is a good fit to the given dataset.

Chi-square goodness-of-fit: The chi-square test is used to test if a sample of data came from a population with a specific distribution.

In this study, the chi-square test is defined for the hypotheses:

H0: Data follow a gamma distribution
Ha: Data do not follow the gamma distribution

For the chi-square goodness-of-fit computation, the data are divided into k = 14 bins and the test statistic is defined as:

$$\chi^{2}=\sum_{i=1}^{k}\frac{(O_i-E_i)^{2}}{E_i}$$

where Oi is the observed frequency for bin i and Ei is the expected frequency for bin i.

The expected frequency is calculated by:

$$E_i=N\,[F(Y_u)-F(Y_l)]$$

where F is the cumulative distribution function of the gamma distribution, Yu is the upper limit of class i, Yl is the lower limit of class i and N is the sample size.
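The expected-frequency and test-statistic calculations can be sketched generically in Python. The helper names `expected_counts` and `chi_square_stat` are hypothetical, and the example uses a uniform CDF with made-up counts purely so the arithmetic can be followed by hand. It is not the study's data or the authors' program.

```python
def expected_counts(cdf, edges, n_total):
    """E_i = N * (F(upper_i) - F(lower_i)) for consecutive bin edges."""
    return [n_total * (cdf(hi) - cdf(lo)) for lo, hi in zip(edges[:-1], edges[1:])]


def chi_square_stat(observed, expected):
    """Pearson statistic: sum over bins of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def uniform_cdf(y):
    """CDF of the uniform distribution on [0, 1] (illustrative stand-in)."""
    return min(max(y, 0.0), 1.0)


# Hypothetical example: two equal-width bins and ten observations.
observed = [6, 4]
expected = expected_counts(uniform_cdf, [0.0, 0.5, 1.0], sum(observed))
stat = chi_square_stat(observed, expected)  # (1 + 1) / 5 = 0.4
```

In the setting described above, `cdf` would be the fitted gamma CDF and `edges` the boundaries of the 14 age classes; the resulting statistic is then compared with the chi-square distribution on the appropriate degrees of freedom.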

Implementation of the age distribution data: The data for this study are the frequencies of patients occurring in each age group. Programs written in C (Appendices) were used to compute the expected frequencies from the gamma distribution model. The goodness-of-fit test was done in MS Excel.

Independent sample t-test: An independent sample t-test was used to test whether the mean age differs significantly between the two places of residence of the patients. The null hypothesis (H0) and alternative hypothesis (Ha) of the independent samples t-test are:

H0: µ1 = µ2 (Two population means are equal)
Ha: µ1 ≠ µ2 (Two population means are not equal)

When the two independent samples are assumed to be drawn from populations with identical population variances, the test statistic t is computed as:

$$t=\frac{\bar{x}_1-\bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}}$$

With:

$$s_p=\sqrt{\frac{(n_1-1)s_1^{2}+(n_2-1)s_2^{2}}{n_1+n_2-2}}$$

When the two independent samples are assumed to be drawn from populations with unequal variances, the test statistic t is computed as:

$$t=\frac{\bar{x}_1-\bar{x}_2}{\sqrt{\dfrac{s_1^{2}}{n_1}+\dfrac{s_2^{2}}{n_2}}}$$

Where:
x̄1 = Mean age of the first sample
x̄2 = Mean age of the second sample
n1 = Sample size (i.e., No. of observations) of first sample
n2 = Sample size (i.e., No. of observations) of second sample
s1 = Standard deviation of first sample
s2 = Standard deviation of second sample
sp = Pooled standard deviation
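The two forms of the t statistic described above can be sketched in Python from summary statistics alone. The function names `pooled_t` and `welch_t` are hypothetical. The Welch-Satterthwaite degrees of freedom returned by `welch_t` are a standard companion to the unequal-variance statistic but are not stated in the text, so they are included here as an assumption about how that test would usually be completed.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Equal-variance t statistic and its degrees of freedom."""
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(mean1, mean2, s1, s2, n1, n2):
    """Unequal-variance t statistic with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

When both samples have the same size and the same standard deviation, the two statistics coincide; for example, means 10 and 8, standard deviations 2 and 2, and sample sizes 8 and 8 give t = 2 with 14 degrees of freedom under either form.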

RESULTS AND DISCUSSION

A total of 1153 clinically diagnosed cases of female breast cancer were found for the study period 2010-2012. Table 1 reveals that 18.6% of the cases occurred in the 38-43 age group. Although breast cancer is more prevalent in urban areas [12], in the present study more of the reported patients came from rural areas of the study region. Patients from rural areas may ignore the initial symptoms of the disease because of lack of awareness and financial problems, so that by the time they consult a doctor the disease has already developed into a malignant tumour. So, lack of awareness and income level may be responsible factors for this outcome. The majority of the patients are married (94%) and housewives (91.6%). The same scenario has been observed for these factors in other Indian studies [13-15]. Among the child birth groups, in which this study also included patients with no previous pregnancy records, 61.5% of breast cancer patients are in the 0-2 child birth group. A study of the current breast cancer scenario in India by Agarwal and Ramakant [16] also mentioned that nulliparous women had higher risk than parous women.

Previous studies show that age is the most important factor in breast cancer occurrence and that the risk of breast cancer increases with the age of the patient. So, in this study more emphasis has been given to the pattern and characteristics of the age of the patients. Table 2 lists the most prevalent age groups for breast cancer occurrence found in other studies conducted within India.

The researchers in all of the studies listed in Table 2 reported only the peak age group in their specific region. In this study, an attempt has been made to fit the age distribution data to a probability distribution model in order to estimate the expected frequency of female breast cancer cases that might occur in a specific age group in the study region. This study found that the gamma probability distribution model gives a good fit to the age distribution data. The observed probability curve is positively skewed and unimodal, which matches the characteristics of the gamma distribution model [21,22].

The shape parameter (α) and scale parameter (β) estimated by the method of moments are 16.51 and 0.37, respectively. Table 3 presents the observed frequency and gamma fit for each of the 14 age groups. It can be observed from the estimated frequencies that the fit is quite good. The chi-square statistic for the goodness-of-fit test is 15.8899 and the corresponding p-value (14-1-2-1 = 10 degrees of freedom) is 0.2271, which is not significant. Therefore, the null hypothesis is not rejected and it is concluded that the fit is good.

The visual assessment of the fit is shown in Fig. 1 and 2.

Table 2: Age peak of breast cancer patients in other parts of India

From the observed probability curve it can be seen that the curve is unimodal with its highest peak in the 38-43 age group, whereas the estimated curve peaks in the 43-48 age group. From the 43-48 age group the curve slopes downward, and from the 68-73 age group it becomes flat. The peak age of breast cancer occurrence is thus somewhat lower than in the earlier related studies listed in Table 2.

Fig. 1: Observed and fitted count resulting from gamma probability distribution model

Fig. 2: Cumulative probability curve

Table 3: Gamma Probability distribution model fit to age distribution data

Fig. 3: Age distribution pattern in two different places of residence

Table 4: Descriptive measures of the age of the breast cancer patients in different places of residence

Barry and Breen [23] reported that place of residence is important in the diagnosis of breast and cervical cancer. Sither [24] reported that the locality of patients affects the stage at which breast cancer is diagnosed. Considering these findings, the present study applied an independent sample t-test to examine whether the mean age of patients differs between the rural and urban groups of residence.

Table 4 shows the difference in the average age of the patients, and Fig. 3 shows that the age frequency curve for the rural group peaks in the 38-43 age group. As age increases further, the curve slopes down, and from the 68-73 age group it becomes flat. Interestingly, the curve for the urban group is slightly oscillatory: it first rises to a high point in the 33-38 age group, dips in the 38-43 age group and then rises again in the 43-48 age group.

The independent sample t-test shows a significant difference in the mean age of patients between the rural and urban groups of residence at the 10% level of significance (p-value = 0.061).

CONCLUSION

It is well known that there is regional variation in the incidence and mortality of cancer. This study indicates that, in the study region, the age at incidence of breast cancer is slightly lower and the incidence of breast cancer in rural women is slightly higher than in other regions of India. These results are interesting because they differ somewhat from the reports of other researchers. This may be because the trend has shifted slightly since the previous studies, or because the pattern in this region differs from that in other regions of India. Further, this study is important because probability estimation of breast cancer occurrences for a particular age group can be a useful tool in the development of a risk prediction model. Using the outcome of the present study, cancer epidemiologists can predict more accurately the probability of occurrence of cancer for a woman of a particular age in North-East India, and can focus on the most prevalent age group of women and their life-style related factors with reduced study time and greater accuracy. Informed by such epidemiological studies, screening is likely to be more successful in the study region.

ACKNOWLEDGMENT

We would like to express sincere thanks to the Director of Dr. B. Borooah Cancer Research Institute for giving us the permission to collect the data from the institute to continue this study. We gratefully acknowledge all the support extended by Institute of Advanced Study in Science and Technology during this study period.

APPENDICES



REFERENCES

1. ACS., 2014. What are the risk factors for breast cancer? American Cancer Society Inc., Atlanta, GA., USA. http://www.cancer.org/cancer/breastcancer/detailedguide/breast-cancer-risk-factors.
2. Agarwal, G. and P. Ramakant, 2008. Breast cancer care in India: The current scenario and the challenges for the future. Breast Care, 3: 21-27.
3. GLOBOCAN, 2012. All malignant neoplasm, excluding non-melanoma skin cancer incidence, female, 2012. http://globocan.iarc.fr/ia/world/atlas.html.
4. GLOBOCAN, 2012. Breast cancer, cancer incidence, female, 2012. http://globocan.iarc.fr/ia/world/atlas.html.
5. GLOBOCAN, 2012. Statistics of breast cancer in India: Global comparison. http://www.breastcancerindia.net/statistics/stat_global.html.
6. Krishnatreya, M., A.C. Kataki, J.D. Sharma, P. Nandy, A. Talukdar, G. Gogoi and N. Hoque, 2014. Descriptive epidemiology of common female cancers in the North East India-a hospital based study. Asian Pac. J. Cancer Prev., 15: 10735-10738.
7. NCRP., 2012. Hospital based cancer registry report. National Cancer Registry Programme, Indian Council of Medical Research, Bangalore, India.
8. Chen, W.Y., 2015. Patient information: Factors that modify breast cancer risk in women (beyond the basics). http://www.uptodate.com/contents/factors-that-modify-breast-cancer-risk-in-women-beyond-the-basics.
9. Mayberry, R.M., 1994. Age-specific patterns of association between breast cancer and risk factors in black women, ages 20 to 39 and 40 to 54. Ann. Epidemiol., 4: 205-213.
10. Wingo, P.A., N.C. Lee, H.W. Ory, V. Beral, H.B. Peterson and P. Rhode, 1991. Age-specific differences in the relationship between oral contraceptive use and breast cancer. Obstetr. Gynecol., 78: 161-170.
11. Pearson, K., 1923. Tracts for Computers. Cambridge University Press, USA.
12. Nagrani, R.T., A. Budukh, S. Koyande, N.S. Panse, S.S. Mhatre and R. Badwe, 2014. Rural urban differences in breast cancer in India. Indian J. Cancer, 51: 277-281.
13. Kapil, H.A. and S.S. Rajderkar, 2012. Clinico-epidemiological profile of female breast cancers and its important correlates: A hospital based study. Natl. J. Community Med., 3: 316-320.
14. Kulkarni, B.B., S.V. Hiremath, S.S. Kulkarni, U.R. Hallikeri, B.R. Patil and P.B. Gai, 2012. Decade of breast cancer-trends in patients profiles attending tertiary cancer care center in South India. Asian J. Epidemiol., 5: 103-113.
15. Gupta, P., R.G. Sharma and M. Verma, 2002. Review of breast cancer cases in Jaipur region. J. Indian Med. Assoc., 100: 282-283, 286-287.
16. Agarwal, G. and P. Ramakant, 2008. Breast cancer care in India: The current scenario and the challenges for the future. Breast Care, 3: 21-27.
17. Harrison, P.A., K. Srinivasan, V.S. Binu, M.S. Vidyasagar and S. Nair, 2010. Risk factors for breast cancer among women attending a tertiary care hospital in Southern India. Int. J. Collaborat. Res. Internal Med. Public Health, 2: 109-116.
18. Kaur, N., A. Attam, S. Saha and S.K. Bhargava, 2011. Breast cancer risk factor profile in Indian women. J. Int. Med. Sci. Acad., 24: 163-165.
19. Datta, K., M. Choudhuri, S. Guha and J. Biswas, 2012. Breast cancer scenario in a regional cancer centre in Eastern India over eight years-still a major public health problem. Asian Pac. J. Cancer Prev., 13: 809-813.
20. Chopra, B., V. Kaur, K. Singh, M. Verma, S. Singh and A. Singh, 2014. Age shift: Breast cancer is occurring in younger age groups-Is it true? Clin. Cancer Invest. J., 3: 526-529.
21. RocScience Inc., 2014. Gamma distribution. RocScience Inc., Toronto, Ontario, Canada. https://www.rocscience.com/help/swedge/webhelp/swedge/Gamma_Distribution.htm.
22. Das, S. and D. Ghosh, 2003. On a new measure of skewness for unimodal distributions. IIM Bangalore Research Paper No. 216, October 30, 2003.
23. Barry, J. and N. Breen, 2005. The importance of place of residence in predicting late-stage diagnosis of breast or cervical cancer. Health Place, 11: 15-29.
24. Sither, M.J., 2014. Effects of locality and risk of late stage breast cancer diagnosis in Kentucky females, 2001-2011. Master's Thesis, University of Kentucky, Kentucky.
