International Journal of Plant Breeding and Genetics

Year: 2015 | Volume: 9 | Issue: 4 | Page No.: 228-237
DOI: 10.3923/ijpbg.2015.228.237
Assessing Inter-Relationship of Sesame Genotypes and their Traits Using Cluster Analysis and Principal Component Analysis Methods
Fiseha Baraki, Yemane Tsehaye and Fetien Abay

Abstract: Thirteen sesame genotypes were evaluated during the 2011-2013 cropping seasons at three locations in Northern Ethiopia (seven environments in total). The objective of this study was to determine the interrelationship of the genotypes and their genetic divergence. The experiment was laid out in a randomized complete block design with three replications. The thirteen genotypes were grouped into four clusters based on the similarity of their agronomic traits; the dendrogram showed that clusters I, II, III and IV contained 9, 1, 1 and 2 genotypes, respectively, and the highest grain yield (918.1 kg ha⁻¹) as well as the highest oil content (55.1%) was observed in cluster III. The Mahalanobis (D²) distances, a measure of genetic divergence, among the clusters were statistically significant; the highest genetic divergence was observed between clusters II and III (D² = 7425.5), whereas the lowest distance was found between clusters I and III (D² = 179.64). Eight Principal Components (PCs) were extracted from the eight agronomic traits of sesame and the first three PCs, which accounted for 88.49% of the total variance (45.05, 28.25 and 15.20% for PC1, PC2 and PC3, respectively), were considered significant. G1 and G4 were highly associated with grain yield, oil content, length of capsule bearing zone and number of capsules, and G12 and G13 were relatively better yielding genotypes. G2 aligned with days to maturity, which confirms its late-maturing character.

How to cite this article
Fiseha Baraki, Yemane Tsehaye and Fetien Abay, 2015. Assessing Inter-Relationship of Sesame Genotypes and their Traits Using Cluster Analysis and Principal Component Analysis Methods. International Journal of Plant Breeding and Genetics, 9: 228-237.

Keywords: Agronomic traits, association, genetic diversity and sesame

INTRODUCTION

Sesame (Sesamum indicum L.) belongs to the Pedaliaceae family and is an annual plant. Although it is ideally adapted to the tropics, sesame can also be grown in humid and subtropical regions (Gandhi, 2009). Sesame is highly drought tolerant and can adapt and produce seed well under fairly high temperatures. It is an erect herbaceous plant and, depending on the variety and on the rainfall it receives or the moisture status of the soil, it can have an indeterminate or determinate growth habit (Pham et al., 2010). It usually bears numerous flowers and its fruit is a capsule, often called a pod, containing a number of small oleaginous seeds, which are ovate, smooth or reticulate and reach 3×1.5 mm.

The productivity of any crop is mainly determined by edaphic factors, climatic conditions, management and the genetic potential of the crop. From this point of view, the phenotypic expression of a crop is determined both by the environment in which the genotypes are grown and by the genetic constitution of the genotype. The major constraints in sesame production worldwide are lack of widely adapted cultivars, shattering of capsules at maturity, non-synchronous maturity, poor stand establishment, lack of fertilizer response and poor management practices (Ashri, 1994).

Cluster analysis is a convenient method for identifying homogeneous groups of objects called clusters. Objects (or cases, observations) in a specific cluster share many characteristics, but are very dissimilar to objects not belonging to that cluster (Mooi and Sarstedt, 2011). In cluster analysis a dendrogram is established, which summarizes the hierarchical agglomeration process; objects that group together earlier (at a short distance) tend to be more similar in terms of the proximity measure defined. There is no right or wrong answer as to how many clusters are required; it depends on what is going to be done with the clusters. To find a good cluster solution, one must look at the characteristics of the clusters at successive steps and decide when the solution is interpretable or has a reasonable number of fairly homogeneous clusters (Tryfos, 1997).

Principal Component Analysis (PCA) analyzes a data table in which observations are described by several inter-correlated quantitative dependent variables (Abdi and Williams, 2010). The variance explained by each principal component is expressed in terms of its eigenvalue. For this reason, principal components are usually arranged in order of decreasing eigenvalues or declining information content. The number of components extracted is equal to the number of observed variables being analyzed, which for standardized variables also equals the total amount of variance in the PCA (Suhr, 1999). However, different scholars have set different rules for selecting the number of PCs to retain for interpretation: (1) The eigenvalue-one criterion or Kaiser criterion (Kaiser, 1960) (a principal component with an eigenvalue greater than 1 explains at least as much variance as one observed variable), (2) The Scree test (Cattell, 1966) (looking for a break between consecutive PCs) and (3) The proportion of variance for each component (5-10%) and the cumulative proportion of variance explained (70-80%) (Suhr, 1999).
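The retention rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic random numbers (not the sesame data), the PCA is run on the correlation matrix, and the function name pca_retention and the 80% cumulative target are hypothetical choices made for this sketch.

```python
import numpy as np

def pca_retention(data, cum_target=0.80):
    """Eigenvalues of the correlation matrix plus two simple retention rules."""
    # Correlation matrix of the variables (columns), i.e. PCA on standardized data.
    corr = np.corrcoef(data, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigvals = np.linalg.eigvalsh(corr)[::-1]
    prop = eigvals / eigvals.sum()          # proportion of variance per PC
    cum = np.cumsum(prop)                   # cumulative proportion
    n_kaiser = int(np.sum(eigvals > 1.0))   # Kaiser: keep eigenvalues > 1
    # Smallest number of PCs whose cumulative proportion reaches the target.
    n_cum = int(np.searchsorted(cum, cum_target) + 1)
    return eigvals, prop, cum, n_kaiser, n_cum

# Synthetic example: 13 observations described by 8 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(13, 8))
eigvals, prop, cum, n_kaiser, n_cum = pca_retention(data)
print("eigenvalues:", np.round(eigvals, 3))
print("PCs kept by Kaiser criterion:", n_kaiser)
print("PCs needed for 80% cumulative variance:", n_cum)
```

With standardized variables the eigenvalues sum to the number of variables, which is why an eigenvalue of 1 corresponds to the variance of one observed variable.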

Among the different constraints in sesame production, lack of improved seed contributes the lion's share. This shortage is associated with the absence or limited extent of breeding activities in the sesame growing areas of Ethiopia, which in turn is associated with the breeders' knowledge gap regarding the different traits of the sesame crop and their associations. Hence, this study was undertaken to investigate the association between different agronomic traits of sesame, which provides an important input for further sesame breeding.

MATERIALS AND METHODS

Description of the study areas: The experiment was conducted in the Tigray region of Northern Ethiopia (specifically Humera, Dansha and Sheraro). The geographic location of these sites is shown in Fig. 1 and the agro-ecology of the experimental sites is described as hot to warm semiarid plain with climatic and edaphic variations (Table 1).

Experimental design and material: The experiments were conducted under rain-fed conditions during the 2011-2013 cropping seasons in Humera and Dansha and during the 2013 cropping season in Sheraro (seven environments in total), where E1, E2 and E3 are the 2011, 2012 and 2013 growing seasons, respectively, in Humera; E4, E5 and E6 are the 2011, 2012 and 2013 growing seasons, respectively, in Dansha and E7 is the 2013 growing season in Sheraro. A total of thirteen genotypes, further described in Table 2, were evaluated for their agronomic traits. The experiment was laid out in a Randomized Complete Block Design (RCBD) with three replications.

Fig. 1: Geographic location of the study areas

Table 1: Agro-climatic and soil characteristics of the experimental sites

Table 2: Description of the sesame genotypes
WARC: Werer agricultural research center, HuARC: Humera agricultural research center

Each genotype was randomly assigned to and sown in a plot of 2.8 m by 5 m, with 1 m between plots and 1.5 m between blocks, keeping inter- and intra-row spacing of 40 and 10 cm, respectively. Each plot had a total area of 14 m² with seven rows and a net plot area of 10 m² with five harvestable rows.
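The plot dimensions can be checked with a short calculation. The sketch below reproduces the gross and net plot areas from the row spacing and plot length given above; the yield conversion helper and the 0.9 kg plot yield are hypothetical illustrations, not values or procedures reported in this study.

```python
ROW_SPACING_M = 0.40   # inter-row spacing (40 cm)
PLOT_LENGTH_M = 5.0    # plot length
TOTAL_ROWS = 7         # rows per plot
HARVESTED_ROWS = 5     # rows in the net plot

def plot_area_m2(rows, row_spacing_m=ROW_SPACING_M, length_m=PLOT_LENGTH_M):
    """Area covered by a given number of rows (rows x spacing x length)."""
    return rows * row_spacing_m * length_m

def yield_kg_per_ha(plot_yield_kg, net_area_m2):
    """Hypothetical helper: scale a net-plot yield to kg per hectare (10,000 m2)."""
    return plot_yield_kg * 10_000 / net_area_m2

gross_area = plot_area_m2(TOTAL_ROWS)      # about 14 m2 (2.8 m x 5 m)
net_area = plot_area_m2(HARVESTED_ROWS)    # about 10 m2 (2.0 m x 5 m)
example_yield = yield_kg_per_ha(0.9, net_area)  # 0.9 kg is an invented plot yield

print(f"gross plot area: {gross_area:.1f} m2")
print(f"net plot area:   {net_area:.1f} m2")
print(f"example yield:   {example_yield:.0f} kg/ha")
```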

Data collection: From the net plot area, ten plants were selected randomly and tagged to collect agro-morphological data such as plant height, length of capsule bearing zone, number of branches and number of capsules. The averages of the ten plants were used for further analysis. Days to 50% flowering and 75% maturity were recorded on a plot basis. Furthermore, the five harvestable rows were harvested, tied in sheaves and made to stand separately until the capsules opened. After the sheaves had dried out fully and all of the capsules had opened, they were tipped out onto sturdy cloths or canvases and threshing was accomplished by knocking the sheaves. The seeds from each plot were weighed for yield determination. Oil content was determined from the composite of the three plots of each genotype.

Statistical analysis: Statistical estimations and computations were performed using different statistical software. Homogeneity of residual variances was tested using Bartlett's test (Steel and Torrie, 1980) prior to a combined analysis over locations in each year as well as over locations and years. Accordingly, the residual variances were homogeneous and all data showed a normal distribution.

Association of genotypes and their characters: The Best Linear Unbiased Predictor (BLUP) was estimated via restricted maximum likelihood (REML) for each genotype in each environment, with the genotype effect and replication considered as random and fixed, respectively. One major property of BLUP is shrinkage towards the mean, which anticipates the regression of progeny to the mean observed by every breeder. Moreover, under certain fairly general assumptions, BLUP maximizes the correlation between true and predicted genotypic values (Searle et al., 1992). The clustering, ordination (PCA) and Mahalanobis (D²) distance were carried out using the BLUP values of the different agronomic traits of the genotypes.
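The shrinkage property can be illustrated with a deliberately simplified sketch. It assumes a balanced one-way layout (genotypes by replications) with no block effect, uses synthetic data, and estimates the variance components by the method of moments (ANOVA) as a stand-in for REML; it is not the model fitted in this study, and the function name blup_one_way is hypothetical.

```python
import numpy as np

def blup_one_way(y):
    """Shrunken genotype predictions for a balanced (genotypes x replications) array."""
    n_geno, n_rep = y.shape
    geno_means = y.mean(axis=1)
    grand_mean = y.mean()
    # ANOVA mean squares between and within genotypes.
    ms_between = n_rep * np.sum((geno_means - grand_mean) ** 2) / (n_geno - 1)
    ms_within = np.sum((y - geno_means[:, None]) ** 2) / (n_geno * (n_rep - 1))
    # Variance components; the genotypic variance is truncated at zero.
    var_e = ms_within
    var_g = max((ms_between - ms_within) / n_rep, 0.0)
    # Shrinkage factor: genotypic share of the variance of a genotype mean.
    shrink = var_g / (var_g + var_e / n_rep)
    # Predicted genotype value = grand mean + shrunken deviation.
    predicted = grand_mean + shrink * (geno_means - grand_mean)
    return predicted, shrink

# Synthetic yields for 13 genotypes in 3 replications (invented numbers).
rng = np.random.default_rng(1)
true_effects = rng.normal(0.0, 60.0, size=13)
y = 600.0 + true_effects[:, None] + rng.normal(0.0, 40.0, size=(13, 3))

predicted, shrink = blup_one_way(y)
print("shrinkage factor:", round(float(shrink), 3))
print("raw means:       ", np.round(y.mean(axis=1), 1))
print("shrunken values: ", np.round(predicted, 1))
```

Because the shrinkage factor lies between 0 and 1, every predicted value sits at least as close to the grand mean as the corresponding raw genotype mean.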

Cluster analysis based on Ward's method (Ward, 1963), with squared Euclidean distance as the distance metric and standardized variables, was performed using Minitab release 16 (Minitab, 1998) to cluster the genotypes based on their agronomic traits in different environments. Moreover, genetic divergence for different agronomic traits of the 13 genotypes was estimated using the Mahalanobis (D²) statistic (Mahalanobis, 1936). The D² values obtained from pairs of clusters were considered as the calculated values of Chi-square (χ²) and were tested for significance at the 1 and 5% probability levels against the tabulated values of χ² for 'p' degrees of freedom, where p is the number of characters considered (p = 7) (Urdan, 2005).
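A comparable clustering workflow can be sketched with SciPy. This is not the Minitab procedure used in the study: the trait matrix is synthetic, and SciPy's Ward linkage may scale merge heights differently from Minitab, so only the grouping logic is illustrated. Cutting the tree into four clusters mirrors the number of groups reported here.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import zscore

# Synthetic trait matrix: 13 genotypes (rows) by 8 agronomic traits (columns).
rng = np.random.default_rng(42)
traits = rng.normal(size=(13, 8))

# Standardize each trait so that traits on large scales do not dominate.
standardized = zscore(traits, axis=0, ddof=1)

# Ward's minimum-variance agglomeration on the standardized traits.
merge_tree = linkage(standardized, method="ward")

# Cut the dendrogram so that at most four clusters remain.
labels = fcluster(merge_tree, t=4, criterion="maxclust")

for genotype, cluster in enumerate(labels, start=1):
    print(f"G{genotype}: cluster {cluster}")
```

The linkage matrix can also be passed to scipy.cluster.hierarchy.dendrogram to draw the tree.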

The data were subjected to Principal Component Analysis (PCA) using the PAST statistical software package (Hammer et al., 2001). Using this software, the loadings of the genotypes and the traits were determined to clarify the associations among principal components and traits, principal components and genotypes, genotypes and their traits and among the different agronomic traits. A scatter diagram was also plotted to visualize all these associations.

RESULTS AND DISCUSSION

Clustering of sesame genotypes based on their agronomic traits: The BLUP values of the different agronomic traits in seven environments are presented in Table 3 and all of these traits were used as input for clustering the genotypes. Sinebo (2002) also used the BLUP to investigate the relationship of different agronomic traits in barley. The thirteen sesame genotypes were clustered into four groups (Fig. 2) based on the similarity of their agronomic traits. Similarly, Bandila et al. (2011) clustered 60 sesame genotypes into eight clusters and clearly described their similarities.

Fig. 2: Dendrogram of sesame genotypes showing the degree of their relationship for different traits. Acc# 031 (G1), Oro (9-1) (G2), NN-0079-1 (G3), Acc-034 (G4), Abi-doctor (G5), Serkamo (G6), Acc-051-02sel-13 (G7), Tate (G8), Acc-051-020sel-14 (G9), Adi (G10), Hirhir (G11), Setit-1 (G12) and Humera-1 (G13)

Table 3: Combined BLUP value of different agronomic traits in seven environments

Table 4: Mean of the different agronomic traits of each cluster
DF: Days to 50% flowering, DM: Days to 75% maturity, LCBZ: Length of capsule bearing zone (cm), NB: No. of branches, NC: No. of capsules, PH: Plant height (cm), OC: Oil content (%), YLD: Grain yield (kg ha⁻¹)

The mean of each agronomic trait for the respective clusters is given in Table 4 and the characteristics of the genotypes within the clusters are summarized in Table 5. As shown in the dendrogram, clusters I, II, III and IV contained 9, 1, 1 and 2 genotypes, respectively. Such grouping of the genotypes is important for a breeding program since it clearly indicates the significance of a given trait in any group (cluster). Hence, the genotype in cluster III, with the highest grain yield and oil content (Table 4), may be important in further breeding to increase grain yield and oil content, while cluster II could serve as parent material for improving capsule number, as it has a higher capsule number than any other cluster.

Table 5: Summary of the clusters and their unique characteristics based on the mean of the respective agronomic traits (Table 4)
Accession No. 031 (G1), Oro (9-1) (G2), NN-0079-1 (G3), Acc-034 (G4), Abi-doctor (G5), Serkamo (G6), Acc-051-02sel-13 (G7), Tate (G8), Acc-051-020sel-14 (G9), Adi (G10), Hirhir (G11), Setit-1 (G12) and Humera-1 (G13)

Table 6: Mahalanobis (D²) distance among clusters of different agronomic traits
**Significant (p<0.01)

Genetic diversity is analyzed using various methods such as morphological, biochemical and molecular markers. Information on genetic diversity is important when working to improve crop varieties and for selecting parents to be used in plant breeding programs. The Mahalanobis (D²) distances, a measure of genetic divergence, among the clusters are presented in Table 6. The distances among clusters were statistically significant; the highest genetic divergence was observed between clusters II and III (D² = 7425.5), whereas the lowest distance was found between clusters I and III (D² = 179.64). Furat and Uzun (2010) also found high genetic diversity among 43 sesame genotypes for different agronomic traits. Similarly, Begum et al. (2011) studied the genetic diversity of 50 sesame (Sesamum indicum L.) genotypes through the Mahalanobis (D²) distance and found high genetic diversity among the genotypes. The greater the distance between two clusters, the wider the genetic diversity among the parents to be included in a hybridization program. Tsehaye and Kebebew (2002) and Saha et al. (2012) likewise stated that crossing parents with greater inter-cluster distance could produce desirable recombinants, whereas crossing parents with lower inter-cluster distance is unlikely to do so.
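The significance test described in the methods can be reproduced for the two D² values quoted above. The sketch treats each D² as a calculated χ² value with p = 7 degrees of freedom, as stated in the statistical analysis, and compares it with critical values from SciPy rather than from a printed table; the variable names are illustrative.

```python
from scipy.stats import chi2

P_TRAITS = 7  # degrees of freedom stated in the methods (p = 7)

# D2 values quoted in the text for the most and least divergent cluster pairs.
d2_values = {"II vs III": 7425.5, "I vs III": 179.64}

# Critical chi-square values at the 5% and 1% probability levels.
crit_05 = chi2.ppf(0.95, df=P_TRAITS)
crit_01 = chi2.ppf(0.99, df=P_TRAITS)
print(f"critical chi-square (df={P_TRAITS}): 5% = {crit_05:.2f}, 1% = {crit_01:.2f}")

significant_01 = {}
for pair, d2 in d2_values.items():
    # A D2 larger than the critical value is declared significant at that level.
    significant_01[pair] = d2 > crit_01
    print(f"clusters {pair}: D2 = {d2}, significant at 1%: {significant_01[pair]}")
```

Both quoted distances lie far above the 1% critical value, in line with the significance reported in Table 6.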

Ordination of sesame genotypes and their agronomic traits: Eight Principal Components (PCs) were extracted from the eight agronomic traits of sesame and a bi-plot was drawn to visualize the associations among principal components and traits, principal components and genotypes, genotypes and their traits and among the different agronomic traits. The first three PCs accounted for 88.49% of the total variance (45.05, 28.25 and 15.20% for PC1, PC2 and PC3, respectively) (Table 7). Since the eigenvalues of the first three PCs were greater than unity (Kaiser, 1960) and together they accounted for more than 80% of the variance, they were considered significant. In contrast to this finding, Furat and Uzun (2010) extracted about seven PCs with eigenvalues greater than unity, although these accounted for less than 80% of the total variance. The importance of, and relationship between, variables within a component are determined by the magnitude and direction of the factor loadings within a PC (Azeez et al., 2013). The sign of a loading indicates the direction of the relationship between the component and the variable.
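As a rough consistency check, the reported percentages can be converted back into eigenvalues. The sketch assumes the PCA was run on the correlation matrix of the eight traits, so that the total variance equals 8; the article does not state this explicitly, so the derived eigenvalues are approximate illustrations rather than the values in Table 7.

```python
N_TRAITS = 8  # assumed total variance for a correlation-matrix PCA of 8 traits

# Percentages of variance reported for PC1, PC2 and PC3.
pct_variance = [45.05, 28.25, 15.20]

# Eigenvalue = proportion of variance x total variance.
eigenvalues = [pct / 100 * N_TRAITS for pct in pct_variance]
cumulative_pct = sum(pct_variance)

for i, (pct, eig) in enumerate(zip(pct_variance, eigenvalues), start=1):
    print(f"PC{i}: {pct:.2f}% of variance, eigenvalue about {eig:.3f}")
print(f"cumulative: {cumulative_pct:.2f}%")
```

Under this assumption all three implied eigenvalues exceed 1, which agrees with the Kaiser criterion invoked above. The rounded percentages sum to 88.50, so the reported 88.49% presumably reflects rounding of the individual values.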

Fig. 3: Principal component analysis scatter diagram of sesame genotypes and their traits. DF: Days to 50% flowering, DM: Days to 75% maturity, LCBZ: Length of capsule bearing zone, NB: Number of branches, NC: Number of capsules, PH: Plant height, OC: Oil content, YLD: Grain yield; G1, G2, G3, etc. refer to genotype numbers

Table 7: Principal component loadings of different sesame agronomic traits

Principal Component One (PC1), which accounted for 45.05% of the variation, was highly and positively associated with grain yield, oil content, length of capsule bearing zone and number of capsules and highly and negatively associated with days to maturity (Table 7). Principal Component Two (PC2) accounted for 28.25% of the total variation and days to flowering and number of branches had the highest loadings on it (both positive). Principal Component Three (PC3) accounted for a further 15.2% of the total variance and plant height was the only trait with a large positive loading on this axis. The bi-plot (Fig. 3) helps to visualize these associations. The two outlying genotypes, G1 and G4 (Fig. 3), had the largest scores on the first PC (PC1) and were located in its positive direction. This indicates that these genotypes were highly associated with the traits that have the highest positive loadings on the first PC, namely grain yield, oil content, length of capsule bearing zone and number of capsules. Apparently, these genotypes performed well in these traits compared with the others. Furthermore, two genotypes (G12 and G13), both grouped in cluster I in Fig. 2, had relatively large positive scores on the first PC (Fig. 3), which indicates that they were also relatively better yielding genotypes (Table 3). The late maturing genotype (G2) is located in the negative direction of the first axis (PC1), far from the origin and aligned with days to maturity, which supports its late-maturing character.

Genotypes with high scores in the positive direction of the second axis (PC2), such as G1 and G8, were closely associated with days to flowering and number of branches (both with large loadings on the second PC). Moreover, G2 and G4, both located in the positive direction of PC2, were associated with number of branches and days to flowering, respectively (Fig. 3), which indicates their higher values for the respective traits (Table 3). Plant height was the only trait with a high loading on the third PC. Likewise, genotypes G1, G3 and G6 had the highest positive scores on this PC (figure not shown). As observed in the dendrogram, the distance between the latter two genotypes (G3 and G6) was very small, showing their relatedness. The strong similarity between these genotypes (G3 and G6) may be largely attributed to plant height, which is the nearest trait to these genotypes.

As grain yield is a complex trait that is highly influenced by genetic as well as environmental factors, direct selection for yield may not be very effective (Baraki et al., 2015) and investigating the association of different traits is critical to boosting sesame grain yield. Figure 3 indicates that all of the agronomic traits were negatively associated with days to maturity; similarly, length of capsule bearing zone and oil content were negatively associated with days to flowering, since they are located in opposite directions and far apart from each other. On the other hand, grain yield, plant height, oil content and length of capsule bearing zone were highly associated with each other. Baraki et al. (2015) and Chowdhury et al. (2010) also reported a strong positive association between grain yield and oil content using correlation coefficients.

CONCLUSION

The thirteen sesame genotypes were clustered into four groups based on the similarity of their agronomic traits; such clustering is important for a breeding program since it clearly indicates the significance or potential of a given trait in any cluster. The Mahalanobis (D²) distances, a measure of genetic divergence, among the clusters were statistically significant and the clusters with greater inter-cluster divergence can be used as parent material in further breeding to produce desirable recombinants. G1 and G4 were highly associated with grain yield, oil content, length of capsule bearing zone and number of capsules. On the other hand, G12 and G13 were highly associated with grain yield, and G2 aligned with days to maturity, which confirms its late-maturing character.

Information on genetic diversity is important when working to improve crop varieties. Morphological genetic diversity in Ethiopian sesame genotypes has been investigated by different authors. However, genetic diversity based on biochemical and molecular markers has not been studied exhaustively in Ethiopian sesame; this requires due attention and thorough work to support further, more detailed sesame breeding.

ACKNOWLEDGMENTS

The first author sincerely thanks the research staff of the crop department at Humera Agricultural Research Center and the Public and Private Partnership Organization project (PPPO) for financial support.

REFERENCES

  • Abdi, H. and L.J. Williams, 2010. Principal component analysis. Wiley Interdiscipl. Rev.: Comput. Stat., 2: 433-459.


  • Ashri, A., 1994. Genetic Resources of Sesame: Present and Future Perspectives. In: Sesame Biodiversity in Asia Conservation, Evaluation and Improvement, Arora, R.K. and K.W. Riley (Eds.). IPGRI, New Delhi, India, pp: 25-39


  • Azeez, M.A., C.O. Aremu and O.O. Olaniyan, 2013. Assessment of genetic variation in accessions of sesame (Sesamum indicum L.) and its crosses by seed protein electrophoresis. J. Agroaliment. Process. Technol., 19: 383-391.


  • Bandila, S., A. Ghanta, S. Natarajan and S. Subramoniam, 2011. Determination of genetic variation in Indian sesame (Sesamum indicum L.) genotypes for agro-morphological traits. J. Res. Agric. Sci., 7: 88-99.


  • Begum, S., M.A. Islam, A. Husna, T.B. Hafiz and M. Ratna, 2011. Genetic diversity analysis in sesame (S. Indicum L.). SAARC J. Agric., 9: 65-71.


  • Cattell, R.B., 1966. The scree test for the number of factors. Multivariate Behav. Res., 1: 245-276.


  • Chowdhury, S., A.K. Datta, A. Saha, S. Sengupta, R. Paul, S. Maity and A. Das, 2010. Traits influencing yield in sesame (Sesamum indicum L.) and multilocational trials of yield parameters in some desirable plant types. Indian J. Sci. Technol., 3: 163-166.


  • Suhr, D.D., 1999. Principal component analysis vs. exploratory factor analysis. University of Northern Colorado, USA., pp: 1-11 http://www2.sas.com/proceedings/sugi30/203-30.pdf.


  • Baraki, F., Y. Tsehaye and F. Abay, 2015. Grain yield based cluster analysis and correlation of agronomic traits of sesame (Sesamum indicum L.) Genotypes in Ethiopia. J. Nat. Sci. Res., 5: 11-17.


  • Furat, S. and B. Uzun, 2010. The use of agro-morphological characters for the assessment of genetic diversity in sesame (Sesamum indicum L.). Plant Omics J., 3: 85-91.


  • Gandhi, A.P., 2009. Simplified process for the production of Sesame seed (Sesamum indicum L) butter and its nutritional profile. Asian J. Food Agro-Ind., 2: 24-27.


  • Kaiser, H.F., 1960. The application of electronic computers to factor analysis. Educ. Psychol. Measur., 20: 141-151.


  • Mahalanobis, P.C., 1936. On the generalized distance in statistics. Proc. Natl. Inst. Sci., 2: 49-55.


  • Mooi, E. and M. Sarstedt, 2011. Cluster Analysis. In: A Concise Guide to Market Research, Mooi, E. and M. Sarstedt (Eds.). Springer, Berlin, Heidelberg, ISBN: 978-3-642-12540-9, pp: 237-284


  • Pham, T.D., T.D.T. Nguyen, A.S. Carlsson and T.M. Bui, 2010. Morphological evaluation of sesame (Sesamum indicum L.) varieties from different origins. Aust. J. Crop Sci., 4: 498-504.


  • Hammer, O., D.A.T. Harper and P.D. Ryan, 2001. PAST-Palaeontological statistics. http://www.uv.es/pe/2001_1/past/pastprog/past.pdf.


  • Saha, S., T. Begum and T. Dasgupta, 2012. Analysis of genotypic diversity in Sesame based on morphological and agronomic traits. Proceedings of the Conference on International Research on Food Security, September 19-21, 2012, Gottingen, Germany.


  • Searle, S.R., G. Casella and C.E. McCulloch, 1992. Variance Components. Wiley, New York, ISBN: 9780471621621, Pages: 528


  • Steel, R.G.D. and J.H. Torrie, 1980. Principles and Procedures of Statistics: A Biometrical Approach. 2nd Edn., McGraw Hill Book Co., New York, USA., ISBN-13: 9780070609266, pp: 471-472


  • Urdan, T.C., 2005. Statistics in Plain English. 2nd Edn., Lawrence Erlbaum Associates, Mahwah, New Jersey


  • Ward, Jr. J.H., 1963. Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc., 58: 236-244.


  • Sinebo, W., 2002. Yield relationships of barleys grown in a tropical highland environment. Crop Sci., 42: 428-437.


  • Tsehaye, T. and F. Kebebew, 2002. Morphological diversity and geographic distribution of adaptive traits in finger millet [Eleusine coracana (L.) Gaertn.(Poaceae)] populations from Ethiopia. Ethiopian J. Biol. Sci., 1: 37-62.


  • Tryfos, P., 1997. Methods for Business Analysis and Forecasting: Text and Cases. Chapter 16: Cluster Analysis. John Wiley and Sons, New York, pp: 364


  • Minitab, 1998. MINITAB users guide, released 16.0. MINITAB Inc., USA.
