Information Technology Journal

Year: 2005 | Volume: 4 | Issue: 1 | Page No.: 21-31
DOI: 10.3923/itj.2005.21.31
Design and Implementation of a Semantic Document Management System
Guoren Wang, Bin Wang, Donghong Han and Baiyou Qiao

Abstract: Easily accessible information on the World Wide Web (WWW) and affordable large-capacity secondary storage make it easy to build very large document collections, even on personal computers. However, the way files are organized on computers has changed little for decades. Searching for a particular document or file in a gigabyte-scale collection using traditional tree-structured file directories is never an easy task. This study presents a system in which documents are no longer identified by their file names. Instead, a document is represented by its semantics, in terms of a descriptor and a content vector. The descriptor of a document consists of a set of attributes, such as its date of creation, type, size and annotations. The content vector of a document consists of a set of terms extracted from the document. Such semantic information provides the user with an associative searching capability; that is, documents can be retrieved by specifying the required properties. The representation of document semantics, document organization and keyword-based indexing techniques are discussed. Furthermore, for XML data, which is widely used for representing and exchanging data on the Web, some structure-based querying techniques are proposed, namely structural indexes and path expression optimization principles. A prototype visual explorer that makes use of document semantics is also described.
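The descriptor and content vector mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Document`, `extract_terms`, `make_document` and `search`, the attribute names `type` and `year`, and the naive tokenizer are all assumptions made for illustration. It only shows how associative search might select documents by required properties rather than by file name.

```python
from collections import Counter
from dataclasses import dataclass, field
import re


@dataclass
class Document:
    # Descriptor: attribute name -> value (attribute names here are hypothetical).
    descriptor: dict
    # Content vector: term -> frequency, extracted from the document text.
    content_vector: Counter = field(default_factory=Counter)


def extract_terms(text):
    """Lowercase the text and count alphabetic tokens (a naive assumption)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def make_document(text, **attributes):
    """Build a document from its text and descriptor attributes."""
    return Document(descriptor=dict(attributes), content_vector=extract_terms(text))


def search(documents, terms=(), **required):
    """Return documents whose descriptor matches every required attribute
    and whose content vector contains every given term."""
    results = []
    for doc in documents:
        attrs_ok = all(doc.descriptor.get(k) == v for k, v in required.items())
        terms_ok = all(doc.content_vector[t.lower()] > 0 for t in terms)
        if attrs_ok and terms_ok:
            results.append(doc)
    return results


# Illustrative data only; no file names are involved in the lookup.
docs = [
    make_document("XML path index notes", type="txt", year=2004),
    make_document("Holiday photo captions", type="txt", year=2003),
    make_document("Structural index for XML queries", type="pdf", year=2004),
]
hits = search(docs, terms=["xml"], year=2004)
```

Here `hits` contains the two documents from 2004 that mention "xml". A real system would presumably replace the linear scan with the keyword-based index the abstract refers to.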

How to cite this article
Guoren Wang, Bin Wang, Donghong Han and Baiyou Qiao, 2005. Design and Implementation of a Semantic Document Management System. Information Technology Journal, 4: 21-31.

© Science Alert. All Rights Reserved