
Journal of Applied Sciences

Year: 2006 | Volume: 6 | Issue: 8 | Page No.: 1845-1853
DOI: 10.3923/jas.2006.1845.1853
Discrimination and Classification Levels of Students as Per their Major Specialization
Ali Mahmud Ateiwi, Sadoon Abdullah Al-Obaidy and Iryna Volodymyrivna Komashynska

Abstract: This study applies discriminant analysis and classification to estimate a discriminant function that reveals the reasons for the actual differences between two groups of graduates of the Mathematics and Statistics Department at Al-Hussein Bin Talal University, according to their major specialization. Fisher's linear discriminant function was used as the tool for statistical analysis. It was estimated from a sample of 28 male and female graduates who were classified into two groups based on their accumulative averages, which reflect their general specialization. The results showed highly significant differences between the two groups based on Hotelling's T2 test. This confirms that real differences exist in the students' academic levels and in their mental and innovative abilities, so that distinguished students can be identified and given good care. Finally, the adopted classification rule assigned about 90% of the students to the group to which they actually belong and about 10% to a group to which they do not belong. These results support the efficiency of the estimated discriminant function and the possibility of using it to classify students of unknown affiliation into the correct group in future.


How to cite this article
Ali Mahmud Ateiwi, Sadoon Abdullah Al-Obaidy and Iryna Volodymyrivna Komashynska, 2006. Discrimination and Classification Levels of Students as Per their Major Specialization. Journal of Applied Sciences, 6: 1845-1853.

Keywords: Multivariate analysis, Hotelling's T2 test and Fisher's linear discriminant function

INTRODUCTION

Discriminant analysis and classification are multivariate techniques concerned with separating distinct groups of observations and allocating new observations to previously defined groups. Such methods are used when the causal relations behind the differences among the groups are not sufficiently understood.

This study aims to use discriminant analysis and classification to estimate a discriminant function that separates two groups of graduates of the Mathematics and Statistics Department at Al-Hussein Bin Talal University. The students were already classified on the basis of their accumulative averages, which reflect their general specialization; the function is used to reclassify them according to their major specialization, as measured by five basic study subjects. The study also aims to rank these five subjects by relative importance and to obtain a rule that can be used in future to classify a new student into one of the two specified groups.

Study methodology: This study uses Fisher’s linear discriminant function as the tool for statistical analysis.

Study sample: The researchers selected a random sample of 28 male and female graduates of the Mathematics and Statistics Department, who were classified into two groups based on their accumulative averages, which reflect their general specialization. The first group included 10 students whose averages were Very good or Excellent; the second group included 18 students whose averages were Good. Five basic study subjects reflecting the major specialization were selected and the students' scores in them were recorded. These subjects were: Topology (1), Real Analysis (1), Partial Differential Equations, Mathematical Statistics (1) and Abstract Algebra (1).

Study hypothesis

The study assumes that a significant difference exists between the two groups of students, so that the discriminant function to be estimated will be useful for discrimination and classification. This hypothesis can be verified by using Hotelling's T2 test or the Mahalanobis D2 test.
The study assumes that the major specialization subjects differ in their relative importance. This can be verified by computing the relative importance scales of these subjects.
The study assumes that the adopted classification rule leads to classification errors, which can be measured by the Actual Error Rate (AER) or the Apparent Error Rate (APER).

Study plan: This study is organized into three main parts: a theoretical section, an application section and the conclusions.

THEORETICAL SECTION

In this section we discuss the theoretical basis of discriminant functions for the case of two populations. We focus on Fisher’s linear discriminant function: its theoretical formulation, its estimation, the test of significance of the separation between the two populations and the methods of evaluating the performance of classification functions.

Discrimination and classification of two populations: Fisher’s Method: Suppose π1 and π2 are the two groups to be discriminated and classified on the basis of measurements on p random variables:

$$X' = [X_1, X_2, \ldots, X_p] \tag{1}$$

Suppose that X1 represents the population of X values of the group π1, with mean vector μ1 and covariance matrix ∑1, and that X2 represents the population of X values of the group π2, with mean vector μ2 and covariance matrix ∑2. We now assume that f1(x) and f2(x) are multivariate normal densities, the first with mean vector μ1 and covariance matrix ∑1 and the second with mean vector μ2 and covariance matrix ∑2. We also assume that ∑1 = ∑2 = ∑.

That is:

As

Fisher’s idea was to transform the multivariate observations X to univariate observations Y, such that the Y derived from populations π1 and π2 were separated as much as possible.

We suppose that μ1Y refers to the average of values Y which we get from X values belonging to π1 and μ2Y refers to the average of values Y which we get from X values belonging to π2. That is:

Fisher proposed taking linear combinations of X to obtain the Y values for both populations π1 and π2. Krzanowski (1977) selected the following linear combination, with coefficient vector α:

$$Y = \alpha' X \tag{2}$$

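which maximizes the squared distance between μ1Y and μ2Y relative to the variance of Y:

$$\frac{(\mu_{1Y} - \mu_{2Y})^2}{\sigma_Y^2} \tag{3}$$

where, for the coefficient vector α of the linear combination:

$$\mu_{1Y} = \alpha'\mu_1 \tag{4}$$

$$\mu_{2Y} = \alpha'\mu_2 \tag{5}$$

$$\sigma_Y^2 = \alpha'\Sigma\alpha \tag{6}$$

Substituting (4-6) into (3), we get

$$\frac{(\alpha'\mu_1 - \alpha'\mu_2)^2}{\alpha'\Sigma\alpha} = \frac{(\alpha'\delta)^2}{\alpha'\Sigma\alpha} \tag{7}$$

where δ = μ1 − μ2 is the difference between the two mean vectors.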

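The ratio in Eq. 7 is maximized by selecting the coefficient vector:

$$\alpha = c\,\Sigma^{-1}\delta = c\,\Sigma^{-1}(\mu_1 - \mu_2) \tag{8}$$

for any value c ≠ 0 (Johnson and Wichern, 1998). Selecting c = 1, we get the following linear combination:

$$Y = \alpha' X = (\mu_1 - \mu_2)'\Sigma^{-1} X \tag{9}$$

Equation 9 is known as Fisher’s Linear Discriminant Function.

The maximum of the ratio in Eq. 7 is:

$$\delta'\Sigma^{-1}\delta = (\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) \tag{10}$$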

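From Hand (1981) it follows that Eq. 9 can be used to classify new observations.

If

$$Y_0 = (\mu_1 - \mu_2)'\Sigma^{-1} X_0$$

is the value of the discriminant function at a new observation X0 and

$$m = \tfrac{1}{2}(\mu_{1Y} + \mu_{2Y}) = \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2) \tag{11}$$

is the midpoint between the two univariate population means, then it can be proved that:

$$E(Y_0 \mid \pi_1) - m \ge 0 \quad \text{and} \quad E(Y_0 \mid \pi_2) - m < 0$$

Accordingly, the classification rule is as follows:

Allocate X0 to π1 if:

$$Y_0 \ge m \tag{12}$$

And allocate X0 to π2 if:

$$Y_0 < m \tag{13}$$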

Estimation of Fisher’s linear discriminant function: Suppose that we have n1 observations of the multivariate random variable X’ = [X1, X2, …, Xp] from π1 and n2 measurements of this quantity from π2, with n1+n2-2≥p. Then the respective data matrices are:

(14)

From these data matrices, the sample mean vectors and covariance matrices are determined by:

(15)

Under the assumption ∑1 = ∑2 = ∑, the sample covariance matrices S1 and S2 are combined (pooled) to derive a single unbiased estimator of ∑. This estimator is:

(16)

And

(17)

where Spooled is an unbiased estimation of ∑,

and

(18)

Is the estimator of .

Substituting the sample estimators $\bar{x}_1$, $\bar{x}_2$ and Spooled for μ1, μ2 and ∑ in Eq. 9 gives the sample estimator of Fisher’s linear discriminant function (Morrison, 1976):

$$\hat{y} = \hat{\alpha}' x = (\bar{x}_1 - \bar{x}_2)' S_{pooled}^{-1}\, x \tag{19}$$

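The linear combination in (19) maximizes the ratio:

$$\frac{(\bar{y}_1 - \bar{y}_2)^2}{s_y^2} = \frac{(\hat{\alpha}' d)^2}{\hat{\alpha}' S_{pooled}\, \hat{\alpha}} \tag{20}$$

where

$$d = \bar{x}_1 - \bar{x}_2$$

The maximum of the ratio (20) is:

$$D^2 = (\bar{x}_1 - \bar{x}_2)' S_{pooled}^{-1} (\bar{x}_1 - \bar{x}_2) \tag{21}$$

where D2 is the sample squared distance between the two means.

Note that $s_y^2$ in (20) may be calculated as:

$$s_y^2 = \frac{\sum_{j=1}^{n_1} (y_{1j} - \bar{y}_1)^2 + \sum_{j=1}^{n_2} (y_{2j} - \bar{y}_2)^2}{n_1 + n_2 - 2}$$

with $y_{1j} = \hat{\alpha}' x_{1j}$ and $y_{2j} = \hat{\alpha}' x_{2j}$.

D2 can be used, in certain situations, to test whether the population means μ1 and μ2 differ significantly. Consequently, a test for differences in mean vectors can be viewed as a test for the significance of the separation that can be achieved.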

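The midpoint MP between the two univariate sample means $\bar{y}_1 = \hat{\alpha}'\bar{x}_1$ and $\bar{y}_2 = \hat{\alpha}'\bar{x}_2$ can be obtained from the following formula:

$$\hat{m} = \tfrac{1}{2}(\bar{y}_1 + \bar{y}_2) = \tfrac{1}{2}(\bar{x}_1 - \bar{x}_2)' S_{pooled}^{-1} (\bar{x}_1 + \bar{x}_2) \tag{22}$$

So, the classification rule will be as follows:

Allocate X0 to π1 if:

$$\hat{y}_0 = (\bar{x}_1 - \bar{x}_2)' S_{pooled}^{-1} X_0 \ge \hat{m} \tag{23}$$

And allocate X0 to π2 if:

$$\hat{y}_0 < \hat{m} \tag{24}$$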
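The estimation and allocation steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' computation. It assumes the standard sample forms: coefficient vector equal to the inverse pooled covariance matrix times the difference of the sample mean vectors, and midpoint equal to half the discriminant score of the sum of the mean vectors. The function names `fit_fisher_ldf` and `classify`, and the synthetic two-subject scores, are hypothetical.

```python
import numpy as np

def fit_fisher_ldf(x1, x2):
    """Estimate Fisher's sample linear discriminant function for two groups.

    x1, x2: arrays of shape (n1, p) and (n2, p), one row per observation.
    Returns the coefficient vector a_hat and the midpoint m_hat.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = len(x1), len(x2)
    mean1, mean2 = x1.mean(axis=0), x2.mean(axis=0)
    # Unbiased sample covariance matrices (divisor n - 1).
    s1 = np.atleast_2d(np.cov(x1, rowvar=False))
    s2 = np.atleast_2d(np.cov(x2, rowvar=False))
    # Pooled covariance under the equal-covariance assumption.
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    # a_hat = S_pooled^{-1} (mean1 - mean2); solve avoids an explicit inverse.
    a_hat = np.linalg.solve(s_pooled, mean1 - mean2)
    # Midpoint between the two univariate group means of the scores.
    m_hat = 0.5 * a_hat @ (mean1 + mean2)
    return a_hat, m_hat

def classify(x0, a_hat, m_hat):
    """Return 1 if x0 is allocated to group 1, otherwise 2."""
    return 1 if a_hat @ np.asarray(x0, dtype=float) >= m_hat else 2

# Synthetic (hypothetical) scores in two subjects for groups of 10 and 18.
rng = np.random.default_rng(0)
group1 = rng.normal(loc=[80.0, 85.0], scale=5.0, size=(10, 2))
group2 = rng.normal(loc=[65.0, 70.0], scale=5.0, size=(18, 2))
a_hat, m_hat = fit_fisher_ldf(group1, group2)
print(classify([82.0, 88.0], a_hat, m_hat))
```

Using `np.linalg.solve` instead of inverting the pooled matrix is a numerical convenience only; it yields the same coefficient vector whenever the pooled matrix is nonsingular.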

Testing the significance of the separation between two populations: Suppose the populations π1 and π2 are multivariate normal with a common covariance matrix ∑. Following Morrison (1976), to test the statistical hypothesis:

H0: μ1 = μ2 vs. H1: μ1 ≠ μ2

we can use Hotelling's T2-statistic:

$$T^2 = \frac{n_1 n_2}{n_1 + n_2} (\bar{x}_1 - \bar{x}_2)' S_{pooled}^{-1} (\bar{x}_1 - \bar{x}_2) \tag{25}$$

Under H0:

$$T^2 \sim \frac{(n_1 + n_2 - 2)\,p}{n_1 + n_2 - p - 1}\, F_{p,\, n_1 + n_2 - p - 1} \tag{26}$$

Or:

$$F = \frac{n_1 + n_2 - p - 1}{(n_1 + n_2 - 2)\,p}\, T^2 \sim F_{p,\, n_1 + n_2 - p - 1} \tag{27}$$

Let $F_{p,\, n_1+n_2-p-1}(\alpha)$ denote the value of F with p numerator degrees of freedom and n1+n2-p-1 denominator degrees of freedom such that the area to its right equals α (Anderson, 1984).

So, we reject the null hypothesis H0: μ1 = μ2 at the level of significance α when:

$$F > F_{p,\, n_1+n_2-p-1}(\alpha)$$

Then we can conclude that the separation between the two populations π1 and π2 is significant.

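Furthermore, D2 can be used for testing the above hypothesis as follows:

Using (21), Eq. 25 becomes:

$$T^2 = \frac{n_1 n_2}{n_1 + n_2}\, D^2 \tag{28}$$

By substituting (28) in (27), we get:

$$F = \frac{n_1 + n_2 - p - 1}{(n_1 + n_2 - 2)\,p} \cdot \frac{n_1 n_2}{n_1 + n_2}\, D^2 \tag{29}$$

Or:

$$F = \frac{n_1 n_2\,(n_1 + n_2 - p - 1)}{(n_1 + n_2)(n_1 + n_2 - 2)\,p}\, D^2 \tag{30}$$

Therefore, we reject H0: μ1 = μ2 at the level of significance α when:

$$F > F_{p,\, n_1+n_2-p-1}(\alpha)$$

Equations 27 and 29 are equivalent. Either of them can be used for testing the statistical hypothesis above.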
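As a rough sketch of how the test statistics might be computed from raw data, the following Python function returns the sample squared distance, the Hotelling statistic and the corresponding F value. It assumes the standard two-sample forms: T2 equal to n1n2/(n1+n2) times D2, and F equal to (n1+n2-p-1)/((n1+n2-2)p) times T2. The function name and the tiny data set are hypothetical, and the critical value would still be read from an F table.

```python
import numpy as np

def hotelling_two_sample(x1, x2):
    """Return (D2, T2, F) for two samples with a common covariance matrix.

    x1, x2: arrays of shape (n1, p) and (n2, p).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, p = x1.shape
    n2 = x2.shape[0]
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    s1 = np.atleast_2d(np.cov(x1, rowvar=False))
    s2 = np.atleast_2d(np.cov(x2, rowvar=False))
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    # Sample squared distance between the two mean vectors.
    d2 = float(diff @ np.linalg.solve(s_pooled, diff))
    # Hotelling's statistic and its F transformation.
    t2 = n1 * n2 / (n1 + n2) * d2
    f_stat = (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p) * t2
    return d2, t2, f_stat

# Small hypothetical data set: group 2 is group 1 shifted by one unit.
demo1 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
demo2 = demo1 + 1.0
d2, t2, f_stat = hotelling_two_sample(demo1, demo2)
print(d2, t2, f_stat)
```

The F value is compared with the upper α point of the F distribution with p and n1+n2-p-1 degrees of freedom.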

Evaluation of classification functions: When the midpoint (MP) between the means of the populations π1 and π2 is used as the cut-off, misclassification may result, i.e., an observation that comes from π2 is classified as π1 or vice versa. In both cases, we get an error in classification.

An important criterion for judging the performance of any classification method is its error rate, i.e., its misclassification probabilities, whether the probability distributions of π1 and π2 are known or unknown.

In this respect, we discuss two evaluation methods: the optimum error rate, with its sample version the Actual Error Rate (AER), and the Apparent Error Rate (APER).

Let f1(x) and f2(x) be the probability density functions associated with the (p×1) random vector X for the populations π1 and π2, respectively. Let Ω be the sample space, that is, the collection of all possible observations X. Let R1 be the set of X values for which we classify objects as π1 and let R2 = Ω − R1 be the remaining X values, for which we classify objects as π2.

Since every object must be assigned to one and only one of the two populations, the sets R1 and R2 are mutually exclusive and exhaustive.

The conditional probability P(2|1) of classifying an object as π2 when, in fact, it is from π1 is:

$$P(2 \mid 1) = P(X \in R_2 \mid \pi_1) = \int_{R_2} f_1(x)\, dx$$

Similarly, the conditional probability P(1|2) of classifying an object as π1 when it is really from π2 is:

$$P(1 \mid 2) = P(X \in R_1 \mid \pi_2) = \int_{R_1} f_2(x)\, dx$$

Let P1 be the prior probability of π1 and P2 be the prior probability of π2, where P1+P2 = 1.

According to Huberty (1984), the optimum error rate (OER) is:

$$\mathrm{OER} = P_1 \int_{R_2} f_1(x)\, dx + P_2 \int_{R_1} f_2(x)\, dx$$

with R1 and R2 chosen so that this quantity is as small as possible.

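Let us derive an expression for the OER when P1 = P2 = 1/2 and f1(x) and f2(x) are multivariate normal densities.

Based on Eq. 12 and 13, we can define the regions R1 and R2 as follows:

$$R_1: (\mu_1 - \mu_2)'\Sigma^{-1} x - \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2) \ge 0$$

$$R_2: (\mu_1 - \mu_2)'\Sigma^{-1} x - \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2) < 0$$

We can express these two regions by using:

$$Y = (\mu_1 - \mu_2)'\Sigma^{-1} X \quad \text{and} \quad m = \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2)$$

as follows:

$$R_1(y): y \ge m, \qquad R_2(y): y < m$$

But Y is a linear combination of normal random variables, so the probability densities of Y, f1(y) and f2(y), are univariate normal with means and variance given by:

$$\mu_{1Y} = (\mu_1 - \mu_2)'\Sigma^{-1}\mu_1, \quad \mu_{2Y} = (\mu_1 - \mu_2)'\Sigma^{-1}\mu_2, \quad \sigma_Y^2 = (\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 - \mu_2) = \Delta^2$$

Now,

$$\mathrm{OER} = \tfrac{1}{2} P(2 \mid 1) + \tfrac{1}{2} P(1 \mid 2) = \Phi\!\left(-\frac{\Delta}{2}\right) \tag{31}$$

where Φ is the cumulative distribution function of a standard normal random variable.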

In most practical situations, the population quantities μ1, μ2 and ∑ are unknown, so Eq. 31 must be modified by replacing the population parameters with their sample counterparts. The performance of the sample classification function is therefore evaluated through the Actual Error Rate (AER) as follows:

$$\mathrm{AER} = \Phi\!\left(-\frac{D}{2}\right) \tag{32}$$

where D is the square root of the sample squared distance D2 between the two mean vectors.
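A short numerical sketch may help here. It assumes that the sample version of the error rate is the standard normal distribution function evaluated at minus half the sample distance D, and that D2 follows from the Hotelling statistic through T2 = n1n2/(n1+n2) D2. The inputs n1 = 10, n2 = 18 and the statistic 43.0132 are the values reported in the application section; the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def estimated_error_rate(d2):
    """Error rate estimate Phi(-D/2) from the sample squared distance D2."""
    return normal_cdf(-0.5 * math.sqrt(d2))

# Values reported in the application section of the paper.
n1, n2 = 10, 18
t2 = 43.0132
# Assumed relation between the Hotelling statistic and D2.
d2 = t2 * (n1 + n2) / (n1 * n2)
aer = estimated_error_rate(d2)
print(round(d2, 4), round(aer, 3))
```

Under these assumptions the estimate comes out just under 10%, which agrees with the order of magnitude the study reports.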

Apparent Error Rate (APER): This rate is a measure of performance that does not depend on the form of the parent populations and that can be calculated for any classification procedure.

The (APER) is defined as the fraction of observations in the training sample that are misclassified by the sample classification function.

The apparent error rate can be calculated from the confusion matrix, which shows actual versus predicted group membership.

According to Johnson and Wichern (1998), the confusion matrix has the form:

                        Predicted membership
                        π1                  π2                  Total
Actual       π1         n1c                 n1m = n1 - n1c      n1
membership   π2         n2m = n2 - n2c      n2c                 n2

where:

n1 represents the number of observations from π1 and n2 the number of observations from π2,

n1c = No. of π1 items correctly classified as π1 items,
n1m = No. of π1 items misclassified as π2 items,
n2c = No. of π2 items correctly classified as π2 items,
n2m = No. of π2 items misclassified as π1 items.

The apparent error rate then is:

$$\mathrm{APER} = \frac{n_{1m} + n_{2m}}{n_1 + n_2} \tag{33}$$

The APER is intuitively appealing and easy to calculate.

Unfortunately, it tends to underestimate the AER and the problem does not disappear unless the sample sizes n1 and n2 are very large.
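The apparent error rate is simple enough to compute directly from the four confusion-matrix counts, as the sketch below shows. The function name is hypothetical. The example counts reflect the study's 28 students with two misclassified; placing both misclassifications in the first group is an assumption made only for illustration, and the rate is the same whichever group they fall in.

```python
def apparent_error_rate(n1c, n1m, n2c, n2m):
    """Fraction of training observations misclassified by the rule."""
    n1 = n1c + n1m  # observations that actually come from group 1
    n2 = n2c + n2m  # observations that actually come from group 2
    return (n1m + n2m) / (n1 + n2)

# 10 + 18 students, two misclassified (split between groups is assumed).
aper = apparent_error_rate(n1c=8, n1m=2, n2c=18, n2m=0)
print(round(aper, 4))
```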

Identifying the relative importance of the variables of discriminant function: The coefficient vector α and its sample estimate $\hat{\alpha}$ are not unique, since any vector cα or c$\hat{\alpha}$ with c ≠ 0 also maximizes the ratios in (7) and (20), respectively. The vector is therefore usually rescaled to a standard form, or multiplied by a constant, to simplify the interpretation of its components. Breiman et al. (1984) give the method commonly used for this purpose:

$$\hat{\alpha}^{*} = \frac{\hat{\alpha}}{\sqrt{\hat{\alpha}'\hat{\alpha}}} \tag{34}$$

so that $\hat{\alpha}^{*}$ has unit length.

The components of $\hat{\alpha}^{*}$ in (34) all lie in the interval [-1, 1], which helps in comparing the coefficients and in assessing the relative importance of the variables of the discriminant function.
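The rescaling to unit length and the resulting ranking can be sketched as follows. The coefficient values are hypothetical and are not the study's estimates; ranking the variables by the absolute size of the standardized coefficients is one common reading of "relative importance" and is an assumption here.

```python
import numpy as np

def standardize_coefficients(a_hat):
    """Rescale a coefficient vector to unit length."""
    a_hat = np.asarray(a_hat, dtype=float)
    return a_hat / np.sqrt(a_hat @ a_hat)

# Hypothetical coefficient vector for three variables.
a_star = standardize_coefficients([3.0, -4.0, 0.0])
# Indices of the variables, from largest to smallest absolute coefficient.
ranking = np.argsort(-np.abs(a_star))
print(a_star, ranking)
```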

APPLICATION SECTION

The organization of data: A random sample of 28 graduates of the Mathematics and Statistics Department at Al-Hussein Bin Talal University was selected from three academic semesters: the second semester (2003/2004), the summer semester (2003/2004) and the first semester (2004/2005). Their accumulative averages were Good, Very good or Excellent.

Table 1: The achieved scores of major specialization subjects for the sample of students in both groups
X1 = Topology (1), X2 = Real Analysis (1), X3 = Partial Differential Equations, X4 = Mathematical Statistics (1), X5 = Abstract Algebra (1)

On the basis of this sample, the data of this study were prepared in two stages. In the first stage, the sample was divided into two groups on the basis of the students' accumulative averages, which reflect their general specialization. The first group included 10 students whose accumulative averages were Very good or Excellent. The second group included 18 students whose accumulative averages were Good.

In the second stage, five basic study subjects were selected to represent the major specialization of the graduates. These subjects are: X1 = Topology (1), X2 = Real Analysis (1), X3 = Partial Differential Equations, X4 = Mathematical Statistics (1) and X5 = Abstract Algebra (1).

Table 1 shows the scores achieved by each student in the research sample in the five subjects, which represent the variables of the discriminant function to be estimated.

Statistical analysis: By using the data given in Table 1 and the formulas given in (15) and (16), a computer calculation yields the summary statistics.

The above estimates are the prerequisites for all subsequent calculations.

Testing the significance of the separation between the two groups: To test whether the population means μ1 and μ2 differ significantly, Hotelling's T2-statistic can be used to test the following statistical hypothesis:

H0: μ1 = μ2 vs. H1: μ1 ≠ μ2

Now, from (25), T2 = 43.0132 and from (27), F = 7.2792.

We see that F > F5,22(0.01); therefore we reject H0 at the 1% level of significance and conclude that the separation between the two groups under study is significant. This gives a logical justification for using the discriminant function to classify a student into one of the two identified groups.
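The reported F value can be checked by hand. The worked arithmetic below assumes the standard relation between Hotelling's statistic and F for two samples, with n1 = 10, n2 = 18 and p = 5; the squared distance on the last line follows from the assumed relation T2 = n1n2/(n1+n2) D2 and is not a figure quoted in the paper.

```latex
\begin{align*}
F &= \frac{n_1 + n_2 - p - 1}{(n_1 + n_2 - 2)\,p}\; T^2
   = \frac{10 + 18 - 5 - 1}{(10 + 18 - 2)\cdot 5} \times 43.0132
   = \frac{22}{130} \times 43.0132 \approx 7.2792 \\
D^2 &= \frac{n_1 + n_2}{n_1 n_2}\; T^2
   = \frac{28}{180} \times 43.0132 \approx 6.6909
\end{align*}
```

The degrees of freedom are p = 5 and n1+n2-p-1 = 22, which matches the critical value F5,22 used in the text; standard F tables give roughly 3.99 for the upper 1% point, well below 7.2792.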

Calculation of the discriminant function: Fisher's (sample) linear discriminant function, Eq. 19:

was estimated and the estimated equation is:

The analysis of variance table for the discriminant analysis can be used to test the following hypothesis:

H0: α1 = α2 = α3 = α4 = α5 = 0 vs. H1: at least one of the five α's is nonzero, as follows:


Where

Since F > F5,22(0.01), we reject H0 at the 1% level of significance. This means that the discriminant function is highly useful for discriminating or classifying the students under study.

Identifying the relative importance of the variables of discriminant function: The importance of each variable of the discriminant function can be calculated in accordance with Eq. 34.

By applying this formula, we find that:

On the basis of the above calculations, we observe that Topology (1) ranks highest in importance among the major specialization subjects, followed by Mathematical Statistics (1), then Abstract Algebra (1), then Real Analysis (1) and finally Partial Differential Equations.

Estimating the midpoint between both groups: The midpoint between both groups can be estimated by using the Eq. 22 and accordingly, we can get the following:

Therefore, the classification rule will be as follows:


Table 2: The actual classification and the predicted classification, in accordance with Fisher’s function, of the students sample in both groups

Table 2 displays the results of the predicted classification in accordance with the above classification rule.

Evaluation of the classification function: On the basis of the previous classification rule and the values in Table 2, we observe misclassification involving the first group: two students belonging to the second group were wrongly classified into the first group.

To evaluate the performance of the adopted classification method, we calculate both error rates, APER and AER.

Apparent Error Rate (APER): To calculate this rate, we first construct the confusion matrix as follows:

By using Eq. 33 we find that:

Actual Error Rate (AER): This rate is calculated in accordance with formula (32), from which we find that:

The two error rates are close and neither exceeds 10%. This means that the classification rule in use placed about 10% of the students in a population to which they do not belong.

CONCLUSIONS

This study is an attempt to use discriminant analysis and classification to obtain a discriminant function through which we can discover the reasons for the actual differences between two groups of graduates of the Mathematics and Statistics Department at Al-Hussein Bin Talal University, who are then reclassified on the basis of their major specialization.

In this study, Fisher’s linear discriminant function was used as the tool for statistical analysis. It was estimated from a sample of 28 male and female students who were classified into two groups according to their accumulative averages, which reflect their general specialization. The first group included 10 students whose averages were Very good or Excellent, whereas the second group included 18 students whose averages were Good. Five basic academic subjects representing the students' major specialization were selected and the scores achieved in them were recorded. These subjects are: Topology (1), Real Analysis (1), Partial Differential Equations, Mathematical Statistics (1) and Abstract Algebra (1).

The most important results and conclusions of this study are:

The study demonstrated highly significant differences between the two groups of students on the basis of the T2 test. These differences confirm that real differences exist in the students' academic levels and in their mental and innovative abilities, which calls for identifying the distinguished students among them.
The study showed, on the basis of the estimated linear discriminant function, that two students were reclassified into the second group according to their major specialization, although they had been placed in the first group on the basis of their general specialization.
The study showed that Topology (1) ranks first among the major specialization subjects in relative importance, followed by Mathematical Statistics (1), Abstract Algebra (1), Real Analysis (1) and finally Partial Differential Equations.
The classification rule used in this study performed well: it classified about 90% of the students into the group to which they actually belong and about 10% into a group other than their own. This indicates the power of the estimated discriminant function and the possibility of using it in future to classify students of unknown affiliation into the correct group.

REFERENCES

  • Anderson, T.W., 1984. An Introduction to Multivariate Statistical Analysis. 2nd Edn., John Wiley, New York

  • Breiman, L., J.H. Friedman, R.A. Olshen and C.J. Stone, 1984. Classification and Regression Trees. 1st Edn., Wadsworth International Group, Belmont, CA, ISBN: 978-0412048418, pp: 102-116

  • Huberty, C.J., 1984. Applied Discriminant Analysis. John Wiley, New York

  • Johnson, R.A. and D.A. Wichern, 1998. Applied Multivariate Statistical Analysis. Prentice Hall Inc., New York

  • Krzanowski, W.J., 1977. The performance of Fisher's linear discriminant function under non-optimal conditions. Technometrics, 19: 191-200.

  • Morrison, D.F., 1976. Multivariate Statistical Methods. McGraw-Hill Book Co. Inc., New York

  • Hand, D.J., 1981. Discrimination and Classification. John Wiley, New York, USA., ISBN: 978-0471280484
