An efficient method for document image geometric layout analysis

Suyoung Chi, Yunkoo Chung, Dae Geun Jang, Weongeun Oh, Jaeyeon Lee, Kim Changhun

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Document image analysis is necessary for optical character recognition (OCR) and is also very useful for many other document image manipulations. In this paper, we propose a document image geometric layout analysis system that produces fewer region segmentation and classification errors than commercial software and previous work. The proposed method segments the document image into small, character-sized regions using a fast connected components generation method, which prevents connected components of different types from being merged. We also propose a new criterion for clustering the connected components, along with techniques for handling noise and reducing computation time. Experiments show that the classification error rate for text and picture regions is decreased.
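    The abstract only sketches the approach at a high level. As a rough illustration of the general idea of connected-component segmentation followed by proximity clustering (not the authors' specific generation method or clustering criterion), the Python sketch below labels components in a binarized page, keeps their bounding boxes, discards speck-sized noise, and greedily merges nearby boxes into regions. The scipy calls are standard, but the helper names, thresholds, and toy example are assumptions made purely for illustration.

    import numpy as np
    from scipy import ndimage

    def segment_components(binary_page, min_area=4):
        # Label foreground pixels (4-connectivity by default) and return one
        # bounding box (r0, c0, r1, c1) per component, dropping speck-sized noise.
        labels, _ = ndimage.label(binary_page)
        boxes = []
        for sl in ndimage.find_objects(labels):
            h = sl[0].stop - sl[0].start
            w = sl[1].stop - sl[1].start
            if h * w >= min_area:
                boxes.append((sl[0].start, sl[1].start, sl[0].stop, sl[1].stop))
        return boxes

    def group_boxes(boxes, max_gap=10):
        # Greedy clustering: a box joins the first existing group whose bounding
        # box it comes within max_gap pixels of; otherwise it starts a new group.
        groups = []
        for r0, c0, r1, c1 in boxes:
            for g in groups:
                if r0 < g[2] + max_gap and g[0] < r1 + max_gap and \
                   c0 < g[3] + max_gap and g[1] < c1 + max_gap:
                    g[0], g[1] = min(g[0], r0), min(g[1], c0)
                    g[2], g[3] = max(g[2], r1), max(g[3], c1)
                    break
            else:
                groups.append([r0, c0, r1, c1])
        return groups

    # Toy example: two character-sized blobs that end up grouped into one region.
    page = np.zeros((20, 40), dtype=np.uint8)
    page[5:10, 3:8] = 1
    page[5:10, 12:17] = 1
    print(group_boxes(segment_components(page)))   # -> [[5, 3, 10, 17]]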

    Original language: English
    Title of host publication: IASTED International Conference on Computer Graphics and Imaging
    Editors: M.H. Hamza
    Pages: 238-243
    Number of pages: 6
    Publication status: Published - 2003
    Event: Sixth IASTED International Conference on Computer Graphics and Imaging - Honolulu, HI, United States
    Duration: 2003 Aug 13 - 2003 Aug 15

    Publication series

    Name: IASTED International Conference on Computer Graphics and Imaging

    Other

    Other: Sixth IASTED International Conference on Computer Graphics and Imaging
    Country/Territory: United States
    City: Honolulu, HI
    Period: 03/8/13 - 03/8/15

    Keywords

    • Connected Component Analysis
    • Document Image Analysis
    • Optical Character Recognition (OCR)

    ASJC Scopus subject areas

    • Computer Graphics and Computer-Aided Design
