31 July 2002 Script determination of mixed Chinese/English document images using Kolmogorov complexity measure
Zheru Chi, Qing Wang
Proceedings Volume 4875, Second International Conference on Image and Graphics; (2002) https://doi.org/10.1117/12.477053
Event: Second International Conference on Image and Graphics, 2002, Hefei, China
Abstract
In this paper, we propose an approach based on a Kolmogorov complexity (KC) measure for determining script classes in mixed Chinese (complex characters)/English document images. The approach consists of two main steps, document image preprocessing and KC measurement, and can successfully separate Chinese text lines from English ones. It is robust and reliable across document images of different appearances and densities, and across various character fonts, sizes, and styles. Experimental results on a set of 40 text line images (20 English text lines and 20 complex Chinese text lines) taken from various document images show that a 100% correct classification rate can be achieved.
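The two-step idea in the abstract (preprocess a text line, then score it with a complexity measure) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say how KC is estimated, so the sketch substitutes the zlib compression ratio as a common computable proxy. The function names `kc_measure` and `classify_line`, the threshold value, and the rule that a higher score indicates Chinese text are all hypothetical.

```python
import zlib


def kc_measure(binary_line):
    """Approximate the complexity of a binarized text line image.

    `binary_line` is a list of pixel rows, each pixel 0 or 1.
    ASSUMPTION: the zlib compression ratio stands in for Kolmogorov
    complexity, which is not computable exactly. The paper's own
    estimator is not given in the abstract.
    """
    raw = bytes(pixel for row in binary_line for pixel in row)
    if not raw:
        raise ValueError("text line image is empty")
    # Smaller ratio = more regular (more compressible) image.
    return len(zlib.compress(raw, 9)) / len(raw)


def classify_line(binary_line, threshold):
    """Label a text line by comparing its score with a threshold.

    ASSUMPTION: lines with denser, more intricate strokes score
    higher, so scores above `threshold` are labelled "Chinese".
    Both the direction and the threshold are hypothetical and would
    have to be calibrated on labelled text lines.
    """
    return "Chinese" if kc_measure(binary_line) > threshold else "English"
```

In use, a threshold would be chosen from the scores of known English and Chinese lines, and each new line would then be passed to `classify_line` after binarization. Preprocessing steps such as size normalization are omitted here because the abstract does not describe them.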
© (2002) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zheru Chi and Qing Wang "Script determination of mixed Chinese/English document images using Kolmogorov complexity measure", Proc. SPIE 4875, Second International Conference on Image and Graphics, (31 July 2002); https://doi.org/10.1117/12.477053
KEYWORDS
Binary data

Neural networks

Optical character recognition

Computer science

Image processing

Computer engineering

Image classification
