Journal of Applied Sciences
  Year: 2009 | Volume: 9 | Issue: 18 | Page No.: 3317-3325
DOI: 10.3923/jas.2009.3317.3325

Publisher Identifier Scheme for Printed Documents using Neural Networks

W.A.J. Rasheed and H.A. Ali

This study investigates ways to extract the printing-layout specifications embedded in a document when it is handled as an image, rather than recognizing its characters and words. These specifications are manifested by the most significant attributes frequently found in page printouts such as advertisements, conference proceedings and magazines. The tools most commonly used to produce printed documents are word processors, provided by different software packages for PC operating systems such as MS Windows and Macintosh. Most of these packages, WinWord among them, describe document formatting through two groups of attributes: font design and paragraph design. Font design covers type, size and style characteristics. Line spacing and inter-character gaps are further important attributes of the paragraph specification. These attributes were extracted, analyzed and exploited in this study with the aim of constructing the proposed publisher identification system. A three-stage software scheme is proposed that consists of paragraph layout and font layout detection algorithms based on statistical measures, followed by a feed-forward neural network identifier. This technique is implemented as a tool for the overall analysis and investigation. Several experiments were conducted to validate the designed system. The system identified the publisher correctly in 95% of cases. The remaining misidentifications can be attributed to poor print quality and to the effect of noise. Hence, this can be considered acceptable performance for detection and identification purposes once poor-quality printed samples are excluded.
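As a rough illustration of what a statistical paragraph-layout measure could look like, the sketch below estimates text-line height and line spacing from a binarized page image using a horizontal projection profile. The abstract does not say which measures the authors used, so the projection-profile approach, the function names (`row_profile`, `text_line_runs`, `paragraph_features`) and the `min_ink` threshold are all assumptions made for this example. Features of this kind could then be passed to a classifier such as a feed-forward network.

```python
from statistics import mean


def row_profile(image):
    """Count ink pixels (value 1) in each row of a binary page image."""
    return [sum(row) for row in image]


def text_line_runs(profile, min_ink=1):
    """Return (start, end) row ranges, end exclusive, of rows containing ink.

    min_ink is a hypothetical noise threshold: rows with fewer ink pixels
    are treated as blank.
    """
    runs = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            runs.append((start, y))        # the text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        runs.append((start, len(profile)))
    return runs


def paragraph_features(image):
    """Summarize line height and inter-line gap, in pixels (assumed features)."""
    runs = text_line_runs(row_profile(image))
    heights = [end - start for start, end in runs]
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(runs, runs[1:])]
    return {
        "line_count": len(runs),
        "mean_line_height": mean(heights) if heights else 0.0,
        "mean_line_gap": mean(gaps) if gaps else 0.0,
    }


# Toy 8x4 "page": two text lines of height 2 separated by 2 blank rows.
toy_page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
features = paragraph_features(toy_page)
```

The same idea applied column-wise within a single text line would give an estimate of inter-character gaps, again only as an assumed analogue of the font and paragraph measures the abstract mentions.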
How to cite this article:

W.A.J. Rasheed and H.A. Ali, 2009. Publisher Identifier Scheme for Printed Documents using Neural Networks. Journal of Applied Sciences, 9: 3317-3325.

DOI: 10.3923/jas.2009.3317.3325





