Abstract: The World Wide Web is a vast resource of information
and services that continues to grow rapidly. Developing an automatic classifier
that can assign documents to the appropriate predefined categories
of a topic structure, based on document content, is a crucial
task. Traditional document classification methods require feature
extraction and classifier training. Collecting trainable
text terms is laborious and time-consuming. To address this problem,
this study proposes an ontology-based approach to improve the efficiency
and effectiveness of Chinese web document classification and retrieval.
First, the approach establishes an ontology model based on a knowledge base.
Second, it creates an ontology for each subclass of the classification system,
using RDFS to convert knowledge into ontologies and to define the relations
among them. Finally, web document classification is performed automatically
using an ontology relevance calculation algorithm. Experiments
show that the accuracy of the ontology-based approach is very close to that of
most classical methods, including Support Vector Machines, K-Nearest Neighbor,
and Latent Semantic Analysis. Additionally, the ontology-based algorithm is
more stable and robust and achieves a better recall rate than the other
three methods.
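
The abstract does not give the relevance formula, so the following is only a minimal sketch of how a relevance-based classifier of this kind might work, not the authors' algorithm. It assumes each category ontology has been flattened into a hypothetical mapping from concept terms to weights, and that the document has already been split into tokens (Chinese text would first need word segmentation). The names `ONTOLOGIES`, `relevance`, and `classify`, as well as the terms and weights, are illustrative assumptions.

```python
from collections import Counter

# Hypothetical toy ontologies: one per category, mapping concept terms
# to illustrative weights. Real ontologies would be derived from RDFS.
ONTOLOGIES = {
    "sports": {"football": 1.0, "league": 0.8, "match": 0.6},
    "finance": {"stock": 1.0, "bank": 0.8, "market": 0.7},
}


def relevance(tokens, ontology):
    """Return the weighted share of tokens that match ontology concepts.

    This scoring rule is an assumption made for illustration only.
    """
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    matched = sum(weight * counts[term] for term, weight in ontology.items())
    return matched / len(tokens)


def classify(tokens, ontologies=ONTOLOGIES):
    """Score the document against every ontology and pick the best one."""
    scores = {name: relevance(tokens, onto) for name, onto in ontologies.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Tokens are assumed to be pre-segmented words from a web document.
    document = ["stock", "market", "rises", "bank"]
    category, scores = classify(document)
    print(category, scores)
```

Because no classifier is trained in this sketch, adding a category only requires supplying another ontology, which matches the motivation stated in the abstract of avoiding laborious collection of training terms.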