Information Technology Journal
Year: 2014  |  Volume: 13  |  Issue: 1  |  Page No.: 69 - 77

Research on Extraction Methods of Web Page’s Document Logical Structure

Wei Wang, Wei Wei, Qinghua Zheng, Jie Hu, Yingying Chen and Bin Zhou    

Abstract: Based on an analysis of the characteristics of web page data sets and the difficulties of the document logical structure extraction task, this study proposes a method for extracting the document logical structure of web pages, together with four key technologies that support the extraction. The study then downloads and processes a number of web pages from Baidu baike and from general sites related to two computer science courses, operating systems and computer networks. Evaluation on Baidu baike pages shows average error rates of 12.8% and 6.6% for the operating system and computer network courses, respectively; on general web pages, the corresponding average error rates are 30% and 22.6%. The experimental results validate the effectiveness of the proposed method.
