INTRODUCTION
Document classification is defined as the task of learning methods for categorizing
collections of electronic documents into their annotated classes, based on its
contents. For several decades now, document classification in the form of text
classification systems have been widely implemented in numerous applications
such as spam emails filtering (Cunningham et al.,
2003), (O’Brien and Vogel, 2002; Provost,
1999; Sahami et al., 1998), knowledge repositories
(Hartley et al., 2006) and ontology mapping (Su,
2002), contributed by the extensive and active research works. An increasing
number of statistical approaches have been developed for document classification,
such as knearestneighbor classification (Han et al.,
1999), Bayesian classification (McCallum and Nigam,
2003), support vector machines (SVMs) (Burges, 1998;
Joachims, 1998; Mohammadi and Gharehpetian,
2008; Sani et al., 2009) maximum entropy
(Nigam et al., 1999), decision rule (Apte
et al., 1994) and artificial neural networks (Abd
Alla, 2006; Soltanizadeh and Shokouhi, 2008).
Among the classification approaches as mentioned above, Bayesian classification
has been widely used due to its simplicity in both the training and classifying
stage. Some research works have proven that Bayesian classification approach
provides intuitively simple text generation models and performs surprisingly
well in many other domains, under some specific conditions (Domingos
and Pazzani, 1997; Kim et al., 2002; McCallum
and Nigam, 2003; Rish et al., 2001). Depending
on the precise nature of the probability model, Bayesian classifiers can be
trained very efficiently by requiring a small amount of training data to estimate
the parameters necessary for classification. As the tradeoff of its simplicity,
Bayesian classification approach has been reported to be less accurate than
the discriminative methods such as SVMs (Chakrabarti et
al., 2003; Joachims, 1998).
An optimum text classification system which is widely applicable in many real word applications should be equipped with the ability in handling different datasets of varying characteristics. This issue is addressed through the application of domain specific text classification algorithms. In this study, we propose the tournament structure ranking techniques, such as Round Robin tournament structure, SingleElimination tournament Structure and Swiss System tournament structure, in order to enhance the conventional Bayesian classification. Our aim for introducing the tournament structure based ranking techniques to Bayesian classification is to deal with the datasets of multiple highly similar categories more effectively than the conventional version, which implements the flat ranking technique. The reason of introducing the tournament structure ranking techniques for conventional Bayesian classification is that the tournament structure ranking techniques perform the iterative binary classification for the computation of the posterior probability for the input document to be annotated to each category, Pr(CategoryDocument), in multiple categories classification tasks. Based on the our investigation, the binary classification results to a pure computation of Pr(CategoryDocument) between two categories without any distortion by the information from other available categories in the classification tasks. This will lead to an effective calculation with minimum noise, especially in the sensitive classification tasks with highly similar categories, where the posterior probability values for each category are very similar to each other. On the other hand, the conventional Bayesian classification performs the computation of Pr(CategoryDocument) in a single round by taking all the available categories into account. We found that in the classification tasks with multiple highly similar categories, the flat ranking technique is less effective than the tournament structure ranking techniques due to the allin computation for the posterior probability for each category.
BAYESIAN CLASSIFICATION ALGORITHM
The conventional Bayesian classification approach performs its classification tasks starting with the initial step of analyzing text documents by extracting words which are contained in the document to generate a list of words. The list of words is constructed with the assumption that input document contains words w_{1}, w_{2}, w_{3}, ……Y, w_{n1}, w_{n}, where the length of the document (in terms of number of words) is n.
Based on the list of words, the trained Bayesian classifier calculates the posterior probability of the particular word from the document being annotated to a particular category by using Bayes formula which is shown in Eq. 1, since each word in the input document contributes to the document's categorical probability.
The derived equation above shows that by observing the value of a particular word, w_{j}, the prior probability of a particular category, C_{i}, Pr(C_{i}) can be converted to the posterior probability, Pr(C_{i}  w_{j}), which represents the probability of w_{j} being annotated to C_{i}. The prior probability, Pr(C_{i}) can be compute from Eq. 2, 3:
Meanwhile, the evidence, which we call the normalizing constant of a particular word, w_{j}, Pr(w_{j}) is calculated by using Eq. 4:
The total occurrence of a particular word in every category can be calculated by searching the training data base, which is composed from the lists of word occurrences for every category. As previously mentioned, the list of word occurrence for a category is generated from the analysis of all the training documents in that particular category during the initial training stage. The same method can be used to retrieve the sum of occurrence of all words in every category in the training data base.
To calculate the likelihood of a particular category, C_{i} with respect to a particular word, w_{j}, the lists of word occurrence from the training data base is searched to retrieve the occurrence of w_{j} in C_{i} and the sum of all words in C_{i}. These information will contribute to the value of Pr(w_{j}C_{i}) given in Eq. 5:
Table 1: 
Table of words occurrence and probabilities 

Based on the derived Bayes formula for text classification and the value of the prior probability Pr(Category), the likelihood Pr(WordCategory) and the evidence Pr(Word), along with the posterior probability, Pr(CategoryWord) of each word in the input document being annotated to each category can be measured.
The posterior probability value of each word being annotated to a category is then filled in at the appointed cells in a table respectively as shown in Table 1. After all the Probability cells have been filled, the overall probability for an input document to be annotated to a particular category, C_{i} is calculated by dividing the sum of each of the Probability column with the length of the document (total number of words), n, which is shown in Eq. 6:
where, w_{1}, w_{2}, w_{3}, ……Y, w_{n1}, w_{n}, are words that extracted from the input document.
The conventional Bayesian classifier is able to determine the right category of an input document by referring to the highest probability value calculated by the trained classifier based on Bayes formula. The right category is represented by the category which has the highest posterior probability value, Pr(CategoryDocument), as stated in bayes decision rule.
IMPLEMENTATION OF TOURNAMENT STRUCTURED BAYESIAN CLASSIFIER
The basic structure of the proposed tournament structured Bayesian classification
approach is shown in Fig. 1. The important features of the
classification system are the input analysis facility, the Bayesian classifier
equipped with tournament structure ranking techniques, which work together to
provide a method of guaranteeing an optimum classification for multiple highly
similar classification tasks. By introducing the tournament structure classification
ranking techniques, the enhanced Bayesian classifier is able to handle the classification
tasks effectively and efficiently for datasets with highly similar categories,
beyond the performance of the conventional Bayesian classifier.
As shown in Fig. 1, the task of our proposed classifier is divided into two stages, the training phase and the classifying phase. Hence, the dataset which has been acquired for the classification task have to be separated into two sets, one set contains the training documents which have been welllabeled by domain experts and another set contains the documents for testing and evaluation purposes. During the training phase, the training documents are entered into the classifier and the classifier will organize all the documents into different groups by the learning facility according to their label/category. The grouped documents are then stored into a domain database to build the training set. Before the classifier is used to perform its classification task, it needs to be trained with the training set. The training data generation facility will generate the training data, in the form of list of words and the frequency of their occurrence for each category. The lists of words and their frequency of occurrence for each category are generated by using a simple data structure algorithm and they are used to train the Bayesian classifier to equip the classifier ready for the classifying stage.
After the training phase, the classifier is ready to perform its task of classifying
unlabeled documents. However, for the ease of evaluation of the performance
of the classifier, all the testing documents have also been labeled by the domain
experts but this information is reserved to be compared with the results generated
by the classifier after processing the testing documents. At the first step
of the classifying phase, the testing document is entered into the classifier
and the input analysis facility will analyze the document and generate the list
of words associated with their frequency of occurrence for the document.

Fig. 1: 
Block diagram of proposed Bayesian classifier with tournament
structure ranking techniques 
The classifier will process the list of words occurrence in order to compute
the posterior probability values of each testing document to be annotated to
each of the categories. After that, the probability distribution of the testing
document in the vector space is generated and the classifier will generate the
classification result as the right category of the input document by referring
to the category which has the highest posterior probability value among the
other available categories. The iterative classifying process will continue
until all the testing documents have been categorized.
As most of the Artificial Intelligence (AI) learning machines need to be trained before they can perform their tasks, our proposed tournament structured Bayesian classifier requires a proper initial training to perform its classification tasks effectively and efficiently. For initial training purposes, training documents have been acquired from the wellorganized knowledge domain, or human experts to construct a repository of training data. In order to build the initial training set, human experts must manually label the training documents and ensure they are wellcategorized. For each category, a reasonable number of training documents need to be acquired. Since, we have implemented Bayesian classification approach as our probabilistic classifier, a training set with large number of entries is not required due to the individual attribute values are assumed as statistically independent. Bayesian classifier can be very robust to violations of its independent assumption. In other words, the classifier is robust enough to ignore serious deficiencies in its underlying naïve probability model, for example, the assumption of the independence of probable events. This robustness is encapsulated in the relative magnitudes of the probabilities (relatively constant for the same category).
The training documents are in the form of raw text documents to fit our system’s requirements. This is an advantage of Bayesian classification that able to handle raw text data, without requiring any transformation step to transform text data into representation suitable format, typically in numerical format, as requiring by most of the classification approaches such as the support vector machines and the knearest neighbor. At the basic level, Bayesian classifier examines a set of training documents that have been wellorganized and categorized and then compares the content of all categories in order to build a database of words and their occurrences by using a simple data structure algorithm. The database is used to identify or predict the membership of future documents to their right category, according to the probability of certain words occurring more frequently for certain categories. It overcomes the obstacles faced by many static technologies, such as blacklist checking and wordtoword comparisons using static databases filled with predefined keywords. The training set can be updated by human experts to upgrade the quality of training data, which will directly contribute to the performance of the probabilistic classifier. As the future work, the updates of training set of our classification system will be enhanced with an automatic intelligent domain database update facility which will greatly improve the ability and the performance of the system.
The input to the classification system is in the form of text document. The system analyses the input text data before proceeding to the probabilistic classifier for data identification and classification of the input document. The input analysis facility uses a simple word extraction algorithm to extract each individual word from the input text document prior to generating a list of words.
The tournament structured Bayesian classifier plays an important role when the classification process advances to the following stage. For instance, a particular word has its probability occurring in a particular category, then, the tournament structured Bayesian classifier computes the posterior probability of words being annotated to each category based on their frequency of occurrence in the particular category, using Bayes formula. The algorithms for the computation of posterior probability by each of the tournament structure ranking techniques are discussed in the subsections below. The overall probability of an input document being annotated in a particular category is computed based on the sum of the posterior probability values of every word in the document, divided by the length of the document (in terms of total number of words). The probability distribution of the particular input document in the vector space can be obtained by integrating all the posterior probability of an input document being annotated in each category, individually and sequentially, according to the order of categories in the vector space.
The output from the Bayesian classifier is the probability distribution of input document in the vector space. The probability distribution of input document is not only carrying vectorized data of the document in the format of numerical values, but also information about the right category of the input document. As we are implementing Bayesian classification as the core of our classifier, the right category of the input document can be determine by referring to the category which has the highest posterior probability value, Pr(CategoryDocument) among the other available categories in the probability distribution, as stated in Bayes Decision Rule.
Flat ranking technique: The flat classification ranking technique is widelyimplemented in the conventional Bayesian probability calculation in order to support multidimensional classification task. The probability value for a document D to be annotated to a category C is computed as Pr(CD). With the assumption that we have a category list as C_{1}, C_{2}, C_{3}, C_{4}, C_{5}, ……., C_{n}, thus, each document has nassociated probability values, where document D will have Pr(C_{1}D), Pr(C_{2}D), Pr(C_{3}D), Pr(C_{4}D), Pr(C_{5}D), ……, Pr(C_{n}D). The nassociated probability values represent the probability distribution for document D in the vector space. By implementing the flat ranking technique, the probability value of an input document being annotated to a particular category is calculated by considering all the available categories together in a single round. During the calculation of the prior probability, Pr(Category), the normalization constant, Pr(Word) and also the likelihood of a particular category with respect to that particular word, Pr(WordCategory), all the available categories in the classification task are involved to produce a complete calculation of the probability distribution in the vector space. The flat classification ranking technique selects the category with the highest posterior probability value, Pr(CD), as the right category being annotated for the input document, based on Bayes Decision Rule.
Based on our investigation, in the multidimensional classification tasks which involve highly similar categories, the flat ranking technique suffers from the distorted calculation of the posterior probability for the input document, D, to be annotated to each category, Pr(C_{i}D). Since all the categories are highly similar to each other, the information (list of keywords and their occurrence) from each category is also highly similar. This will lead to a distortion in computation while calculating the posterior probability for a particular input document to be annotated to a particular category by taking all the available categories together in a single round. Thus, in this study, we introduce the tournament structure ranking techniques, which perform the calculation of posterior probabilities based on the iterative binary classification, to overcome the problem faced by the flat ranking technique.
Round robin tournament ranking technique: Tournament structures are possible to be implemented in classification ranking algorithm in order to handle multicategories classification. In round robin tournament ranking technique, the calculation of the posterior probability values of an input document being annotated to each available category is performed only between two categories. It is an iterative binary classification algorithm and each competitor plays against all the others an equal number of times, typically once. The round robin tournament ranking technique contributes to a relatively complete and pure competition among all the categories, as compared to the flat ranking technique. The process iterates until every category has competed against all the others once.
The structure of the round robin tournament ranking technique that implemented
in our proposed system is based on the Host and Guest concept. Firstly, all
the categories are randomly sorted. The first category will act as the host
of the initial round and plays against all the others which are ranked below
it as guests, individually and sequentially, in the matches conducted in the
first round. In the second round, the second category will become the host and
those categories which are ranked below it will compete against the host.
This process iterates until all categories have competed against the others
an equal number of time, typically once. The illustration of structure of the
round robin tournament ranking technique that implemented in our proposed classification
system is as shown in Table 2.
There are several methods available to determine the final winner after the iterative calculation processes have completed. One of the methods is awarding the winning category of each match with a score, typically 1 and the loser is not awarded with any score, or in other words, score 0 is awarded. The scores gained from each match by each category are then added together after the competition until the calculation has completed. The category with the highest final score will be the overall winner, which represents the right category of the input document.
There is a situation where dilemma occurs in determining the right category of an input document when two or more categories have the same highest final score. This situation can be avoided by awarding the two competing categories of each match with the score which is equal to their posterior probability value calculated based on Bayes formula from the binary classification. With this method, the final highest scores for each category are rarely to be the same.
As a result, the round robin tournament ranking technique has an improved ability in distinguishing highly similar categories, since it performs an iterative binary classification algorithm. The binary classification algorithm has a greater ability in differentiating highly similar categories as compared to the flat ranking technique, which involves all the available categories together in a single round of classification. The binary classification algorithm is smart enough to isolate two categories temporarily and perform the posterior probability calculation without considering the other parties. However, the iterative binary classification process consumes a relatively long time as compared to other algorithms, such as the flat ranking technique. Therefore, the round robin classification technique is not efficient for the implementation in the classification tasks on datasets with a high dimensionality of categories.
Single elimination tournament ranking technique: By comparing with other classification ranking techniques, the single elimination tournament ranking technique has some restrictions. Firstly, most often the number of categories is fixed as a power of two. Somehow, in a given situation where the number of categories is not a power of two, typically the highestrated categories from the previous accomplishment will be advanced to the second round without joining any match in the first round. Besides, seeding is recommended as a preprocess to prevent the highestrated categories from being scheduled to face each other in the early stages of competition. The seeds ranking process can be executed by the classifier equipped with the classification ranking technique in either the flat ranking technique or the round robin tournament ranking technique. Therefore, the single elimination algorithm is more suitable to be implemented at the backend of a hybrid ranking technique system.
As similar to the round robin tournament ranking technique, the single elimination ranking technique also performs the probability values calculation in the form of binary competition for every match. In the first round, the best category is played against the worst and the second best against the second worst and so on for the rest. Brackets are set up, so that the top two seeds could not possibly meet until the final round, none of the top four can meet before the semifinals and so on. The winner of each match will proceed to the next round while the looser of each match will be eliminated from the tournament. This concept is applicable in the following rounds until the overall winner is represented by the winner of the final round. The structure of the single elimination tournament ranking technique that implemented in our proposed system is as shown in Fig. 2. In Fig. 2, a simple assumption has been made that the classification task consists of 16 categories.
In contrast to the round robin tournament ranking technique, as rounds progress, the successive rounds of the single elimination tournament ranking technique halve the number of remaining categories. This is due to the single elimination tournament ranking technique progresses the winners from the previous round to the next round and eliminates the losers. The single elimination tournament ranking technique is suitable for implementation in the domains which have a large number of categories. Since this ranking technique is also a binary classificationbased algorithm, it has a great ability in handling the classification tasks which involve categories with a high degree of similarities.

Fig. 2: 
Structure of the single elimination tournament ranking technique 
Table 3: 
Structure of the swiss system tournament ranking technique 

Swiss system tournament ranking technique: The Swiss system tournament
ranking technique can be implemented independently or at the backend of a hybrid
ranking technique system. The initial seeding of a Swiss system tournament ranking
technique is not compulsory, but is recommended. The competing categories are
then divided into two parts, the top half is paired with the bottom half. For
instance, if there are eight categories in the classifier, the first category
is paired with the fifth category; the second is paired with the sixth, the
third is paired with the seventh and so on. The structure of the Swiss system
tournament ranking technique that is implemented in our proposed system is shown
in Table 3.
After the first round of the competition, the winners from the first round will play against the winners and the losers will play against the losers. Similar to the round robin tournament ranking technique, the winning category of each match is awarded with a score, typically 1 and the loser is not awarded with any score. In further rounds, each competitor will be pitted against another competitor who has the same score. Modifications are then made to prevent competitors from meeting each other twice.
In contrast to the round robin tournament ranking technique, the Swiss system tournament ranking technique can determine the top rank and bottom rank competitors with fewer rounds, although the middle rankings are rather unreliable. On the other hand, in our classification tasks, we are only interested in the final overall winner, which represents the right category of input document. Therefore, the Swiss system tournament ranking technique is applicable in our classification system with large number of categories. Besides, similar to other binary classification based ranking techniques, it is suitable to be implemented in classification tasks where the domain contains categories of high degree of similarities.
However, the number of competing categories has become the biggest obstruction for the Swiss system tournament ranking technique. Likened to the round robin tournament ranking technique, the Swiss system tournament ranking technique has the potential in facing a dilemma in determining the right category of input documents, or the final winner. There is a possibility that two or more categories have the same highest and perfect score, winning all the matches but have never faced each other. Therefore, the conventional ranking technique needs to be slightly modified in terms of the number of rounds played. To determine a clear overall winner, we have applied the same concept with the single elimination tournament ranking technique in terms of number of rounds that is the base 2 logarithm and the number of competitors is rounded up by the power of two.
EXPERIMENTS, EVALUATIONS AND DISCUSSION
In order to evaluate the performance of the tournament structure ranking classification techniques to be implemented to classification tasks on datasets with high degree of similarity between their available categories, we have acquired four datasets of varying characteristics. Each of these datasets has its own unique characteristics in terms of the degree of similarity between categories and the dimensionality of categories, as shown in Table 4.
The degree of similarity can be defined by the designers or organizers of the dataset. There is a way to measure the degree of similarity between categories for a particular dataset and that is by comparing the words in the vocabulary of every document from each category in a dataset. If most of the categories from a dataset share a majority of the keywords, say more than 80% of the vocabulary size in each category, the dataset is considered to have a high degree of similarity between categories. A dataset with a low degree of similarity between categories is built from categories which are easily differentiated from each other by a set of specific keywords where the size of the specific keywords set is typically 80% of the vocabulary size of each category.
There exists another method to measure the degree of similarity between categories for a particular dataset. This method firstly computes the probability distribution in the vector space for the documents in each of the available categories in the dataset, which represented by the nassociated posterior probability values for the documents to be annotated to each of the ncategories. After that, the probability distributions of the documents in the vector space are analyzed to measure the similarity between the categories. This could be done by comparing the values of each of the posterior probability values for the documents to be annotated to each of the available categories among the others and a threshold is set for the difference between each of the nassociated posterior probability values for the documents. If the nassociated posterior probability values for the documents to be annotated to each of the available categories, Pr(C_{1}D), Pr(C_{2}D), Pr(C_{3}D), Pr(C_{4}D), Pr(C_{5}D), ……, Pr(C_{n}D), are close to each other, in other words, the difference between each of the nassociated posterior probability values for the documents are very small, or less than the threshold, the dataset is considered as a dataset with highly similar categories.
As for the number of categories of a dataset, it represents the total number of available categories in a dataset which has been predefined and can be easily measured.
Table 4: 
Four datasets and their characteristics which have been used
for experiments and 

Table 5: 
List of categories of the featured articles dataset 

Experiments on the featured articles dataset: The Featured articles
dataset was designed and organized by our research group by extracting different
types of articles from the wikipedia website. These articles were acquired from
23 randomly selected categories and were converted into plain text documents
for the ease of accessing by our prototype text classifier. The list of selected
categories for the Featured Articles dataset is shown in Table
5.
All the 23 categories from the featured articles dataset are generally unrelated to each other though some of the categories such as history, religion and beliefs, War are slightly related. Other categories such as engineering and technology and transport can also be considered as similar categories. However, the degree of similarity between categories of this dataset is considered as normal because each category still has its own set of specific keywords that differentiates itself from other categories.
Table 6 shows the classification accuracies of the classifiers built from different classification ranking techniques on the featured articles dataset.
The experimental results shown in Table 6 show that the flat
ranking classifier performs the best among the other classifiers while handling
the classification task on the Featured Articles dataset. This classifier has
achieved the highest accuracy rate of 71.15%.
Table 6: 
Experimental results of the classifiers with different classification
ranking techniques on the featured articles dataset 

Dataset: Featured articles, Training set: 230 documents, Testing
Set: 929 documents 
Table 7: 
List of categories in the vehicles dataset 

Table 8: 
Experimental results of the classifiers with different classification
ranking techniques on the vehicles dataset 

Dataset: Vehicles, Training set: 200 documents, Testing set:
440 documents 
The baseline performance is 70.51%, achieved by both the round robin tournament
ranking classifier and the Swiss system tournament ranking classifier. As we
can observe from Table 6, the classifiers equipped with the
flat ranking technique have outperformed the classifiers equipped with the tournament
structure ranking techniques.
Experiments on the vehicles dataset: The Vehicles dataset was built by extracting vehicle related articles from the Wikipedia website. This dataset was acquired by extracting articles from four sub categories in the Vehicles category, which are Aircrafts, Boats, Cars and Trains. All four categories are easily differentiated and each category has a set of unique keywords. The list of categories of the Vehicles dataset is shown in Table 7.
The experimental results of the classification task on the Vehicles dataset using different classifiers, built from different classification ranking techniques, are shown in Table 8.
The experimental results shown in Table 8 show that the flat ranking classifier has achieved the highest classification accuracy rate of 88.64% as compared to other classifiers. The baseline performance is 77.27%, produced by the single elimination tournament ranking classifier.
Experiments on the mathematics dataset: A dataset containing articles
about mathematical topics has been acquired from arxiv.org and it is one of
the datasets used to evaluate the performance of our prototype text classifier.
This dataset contains eight mathematical subcategories.
Table 9: 
List of categories of the Mathematics dataset 

Table 10: 
Experimental results of the classifiers with different classification
ranking techniques on the mathematics dataset 

Dataset: Mathematics, Training set: 80 documents, Testing
Set: 240 documents 
The list of categories of the Mathematics dataset is shown in Table
9.
All eight categories in the mathematics dataset have high degree of similarity. Due to the fact that the eight categories of this dataset are the subtopics in mathematics, the documents in all the categories share a lot of common mathematical keywords. There are only a few specific and unique keywords that differentiate each topic. Therefore, the frequency in which those specific keywords occur in that certain category is the main key in differentiating these topics.
Table 10 shows the experimental results of the classification task on the Mathematics dataset by classifiers with different classification ranking techniques.
The pattern of the experimental results in Table 10 is different from the previous experiments as discussed earlier. The flat ranking classifier scores lower classification accuracy than the classifiers equipped with tournament structure classification ranking techniques. The flat ranking classifier scores an accuracy of 80.42%, which is the baseline performance in this experiment. All the three classifiers equipped with the tournament structure classification ranking techniques achieved the best performance in this experiment with the highest classification accuracy of 81.25%.
Experiments on the automobiles dataset: The Automobiles dataset is a dataset which was designed and organized by collecting articles about automobiles from the Wikipedia website. This dataset contains nine categories of automobiles, differentiated in terms of geographical regions and types. Table 11 shows the list of categories in the Automobiles dataset.
All nine categories in the Automobiles dataset are considered highly similar
to each other since the documents from these categories describe the characteristics
of automobiles.
Table 11: 
List of categories of the mathematics dataset 

Table 12: 
Experimental results of the classifiers with different classification
ranking techniques on the automobiles dataset 

Dataset: Automobiles, Training set: 180 documents, Testing
set: 90 documents 
However, each category has a minor set of unique keywords that describes the
types of the automobiles and their geographical regions to differentiate their
characteristics from other categories.
The experimental results of the classification task on the automobiles dataset using classifiers of different classification ranking techniques are shown in Table 12.
The results shown in Table 12 show that all the three classifiers equipped with tournament structure ranking techniques achieve the greatest classification accuracy with a rate of 86.67%. The baseline performance is 77.78%, achieved by the classifier equipped with the flat ranking technique. The pattern of the experimental results as shown in Table 12 is similar with the pattern of the experimental results shown in Table 10, where all the three classifiers with tournament structure ranking techniques outperform the classifier equipped with flat ranking technique. The similar pattern of the experimental results of the mathematics dataset and the experimental results of the automobiles dataset is due to the fact that both of these datasets have high degree of similarity between their available categories.
Discussion of experimental results: Based on the experimental results that we have obtained from our experiments on datasets of varying characteristics, we can conclude that for datasets with low and normal degree of similarity between the available categories, the classifier with flat ranking technique has achieved the best classification performance among all the classifiers which equipped with the tournament structure ranking techniques. However, for the datasets with highly similar categories in terms of their contents, the classifiers equipped with the tournament structure ranking techniques outperform the flat ranking classifier.
Bayesian classification has been reported by many research works as one of
the poor performing classification approaches in general. As the results, many
active research works have been carried out to clarify the reasons that Bayesian
classifier fails in classification tasks and improve the performance of conventional
approach by implementing some enhancing techniques (Domingos
and Pazzani, 1997; Eyheramendy et al., 2003;
Kim et al., 2002; McCallum
and Nigam, 2003; Rish et al., 2001). From
the experiments that we conducted in our previous works, we found that conventional
Bayesian classifier fails to achieve high classification accuracy in some of
the classification domains, as reported by the research works mentioned above.
Based on our experiences, the poor performance of the conventional Bayesian
classification is distinct when the classifier deals with the datasets with
highly similar categories. One of the reasons that we found from our investigation
for the poor performance of conventional Bayesian classification in such a situation
is that it suffers from the confusion in differentiating highly similar categories
in the classification task. This is due to the conventional Bayesian classification
takes all the available categories into the computation of posterior probability
values for a document to be annotated to each available category in a single
round. In the datasets with highly similar categories, all the categories share
a lot of the same keywords. By considering this kind of categories in a single
round of computation, it would be difficult to obtain good classification results.
As the solution for handling domains with highly similar categories in a classification
task, we have proposed a series of tournament structure classification ranking
techniques, which perform the iterative binary classification to overcome this
problem. The iterative binary classification algorithm is able to solve this
problem by isolating two categories temporarily for the computation of the posterior
probability values for a document to be annotated to the competing categories,
without considering the other categories.
For both of the classification tasks on the featured articles dataset and the
vehicles dataset, the flat ranking technique outperforms all three tournament
structure based ranking techniques. This is probably caused by the appropriately
divided categories of the input documents which enables the flat ranking technique
to differentiate all the other categories in a single round. However, in a tournament
structure based ranking technique, the right category of an input document may
not be able to win most of competitions in the tournament system. This situation
is commonly found in tournament systems where the best competitor may not be
able to win a majority of the competitions or is accidentally eliminated before
the final round. Therefore, the flat ranking classification technique is more
suitable for implementation rather than the tournament structure based ranking
techniques in classification tasks where datasets with low and normal degree
of similarity between categories are involved. The tournament structure based
ranking techniques are more applicable in classification tasks which involve
highly similar categories wherein the flat ranking technique is not as effective
as the tournament structure ranking techniques in determining the right category
of the input documents or queries.
As we expected, the flat classification ranking technique performed poorer than the tournament structure based classification ranking techniques in the classification task on the Mathematics dataset and the Automobiles dataset which have high degree of similarity among their available categories. This is due to the fact that flat ranking technique calculates the probability values of each category of the input data in a single round by taking into account all categories. This will lead to great confusion in the classification task since all categories share a lot of the same keywords, in other words, the contents of all the categories are very similar to each other. In such a situation, a binary classification technique is able to overcome this problem. Therefore, the tournament structure based ranking techniques that perform the iterative binary classification are more suitable for implementation in classification tasks of datasets that have high degree of similarity among categories, as compared to the flat ranking technique.
Therefore, based on the experimental results that have been obtained in this study, we can conclude that classifiers equipped with the tournament structure ranking techniques are more suitable for application in classification tasks with datasets with highly similar categories, whereas the conventional flat ranking technique is more suitable to be implemented in classification tasks with datasets with low and normal degree of similarities between their available categories.
CONCLUSIONS
In conclusion, the experimental results presented in this study show that Bayesian classifiers that handle classification tasks on datasets with a low or normal degree of similarity among categories perform better using the flat classification ranking technique. The tournament structure ranking techniques, such as the round robin tournament ranking technique, the single elimination tournament ranking technique and the Swiss system tournament ranking technique perform better than the flat ranking technique in classification tasks on datasets consisting of highly similar categories. This is due to the tournament structure ranking techniques perform the iterative binary classification algorithm for the computation of the posterior probability for the input document to be annotated to each category, Pr(CategoryDocument), in multidimensional classification tasks. On the other hand, the flat ranking classification technique performs the computation of Pr(CategoryDocument) in one round by taking all the available categories into account.
Based on our investigation, the binary classification algorithm from the tournament structure based raking techniques contributes to a pure computation of Pr(CategoryDocument) between two categories without any distortion by the information from other available categories in the classification tasks. This results to an effective calculation without any noise and distortion, especially in the sensitive classification with highly similar categories, where the posterior probability values for each category are very similar to each other. However, the major drawback of the tournament structure classification ranking techniques is that they consumes a longer processing time and requires a convoluted computing process to perform the classification task. In the future, our research emphasizes in enhancing the ability and performance of our existing prototype by introducing more facilities to develop simple, low consumptions and widely applicable Bayesian classifiers.