16 January 2006 Graphic design principles for automated document segmentation and understanding
Proceedings Volume 6067, Document Recognition and Retrieval XIII; 60670F (2006)
Event: Electronic Imaging 2006, 2006, San Jose, California, United States
When designers develop a document layout, their objective is to convey a specific message and provoke a specific response from the audience. Design principles provide the foundation for identifying document components and the relations among them, so that implicit knowledge can be extracted from the layout. Variable Data Printing enables the production of personalized print jobs for which traditional proofing of every job instance may be infeasible. This paper describes a rule-based system that uses design principles to segment and understand document context. The system uses the design principles of repetition, proximity, alignment, similarity, and contrast as the foundation for its segmentation and understanding strategy, which is closely related to recognizing the artifacts produced when the constraints articulated in the document layout are violated. The tool has two main modules: the geometric analysis module and the design rule engine. The geometric analysis module extracts explicit knowledge from the data provided in the document. The design rule engine uses the information provided by the geometric analysis to establish logical units inside the document. We used a subset of XSL-FO sufficient for designing documents of adequate complexity. The system identifies components such as headers, paragraphs, lists, and images, and determines the relations between them, such as header-paragraph and header-list. The system provides accurate information about the geometric properties of the components, detects the elements of the documents, and identifies corresponding components between a proofed instance and the remaining instances in a Variable Data Printing job.
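To illustrate how geometric facts might feed a design rule, the following is a minimal Python sketch, not the authors' implementation. It assumes each component has already been reduced to a bounding box with a font size, and it combines three of the named principles (proximity, alignment, contrast) to propose header-paragraph relations. The `Block` type, the function names, and the thresholds `max_gap`, `tol`, and `min_contrast` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical text block; y grows downward, units are points."""
    name: str
    x: float
    y: float
    width: float
    height: float
    font_size: float


def vertical_gap(upper: Block, lower: Block) -> float:
    """Geometric fact: whitespace between the bottom of upper and the top of lower."""
    return lower.y - (upper.y + upper.height)


def left_aligned(a: Block, b: Block, tol: float = 2.0) -> bool:
    """Alignment: left edges coincide within an assumed tolerance."""
    return abs(a.x - b.x) <= tol


def header_paragraph_pairs(blocks, max_gap=12.0, min_contrast=1.2):
    """Propose (header, paragraph) pairs among vertically adjacent blocks.

    A pair is proposed only when all three assumed rules hold:
    proximity (small gap), alignment (shared left edge), and
    contrast (the upper block uses a noticeably larger font).
    """
    ordered = sorted(blocks, key=lambda b: b.y)
    pairs = []
    for upper, lower in zip(ordered, ordered[1:]):
        close = 0 <= vertical_gap(upper, lower) <= max_gap
        aligned = left_aligned(upper, lower)
        contrast = upper.font_size >= min_contrast * lower.font_size
        if close and aligned and contrast:
            pairs.append((upper.name, lower.name))
    return pairs


# Made-up layout used only to exercise the rules.
blocks = [
    Block("title", 72, 72, 300, 20, 18),
    Block("intro", 72, 98, 450, 60, 11),
    Block("offer", 72, 200, 300, 16, 14),
    Block("details", 72, 222, 450, 40, 11),
    Block("footer", 300, 700, 200, 10, 9),
]
print(header_paragraph_pairs(blocks))
```

On this made-up layout the sketch pairs "title" with "intro" and "offer" with "details", and leaves "footer" unrelated because it is neither close to nor aligned with the block above it. A real rule engine would presumably weigh more principles (repetition, similarity) and operate on XSL-FO input rather than hand-built boxes.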
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
J. Fernando Vega-Riveros and Hector J. Santos Villalobos "Graphic design principles for automated document segmentation and understanding", Proc. SPIE 6067, Document Recognition and Retrieval XIII, 60670F (16 January 2006);

Graphic design

Rule based systems

Distance measurement

Error analysis


System identification

