A robust proposal generation method for text lines in natural scene images

Kun Fan, Seung Jun Baek

    Research output: Contribution to journal › Article › peer-review

    8 Citations (Scopus)

    Abstract

    Motivated by the success of object proposal generation methods for object detection, we propose a novel method for generating text line proposals from natural scene images. Our strategy is to detect text regions, which we define as parts of text lines containing a whole character or a transition between two adjacent characters. We observe that, if we scale text regions to a small, fixed size, their image gradients exhibit certain patterns irrespective of text shape and language. Based on this observation, we propose simple features, consisting of the means and standard deviations of image gradients, to train a Random Forest that detects text regions over multiple image scales and color channels. Text regions are then merged into text line candidates, which are ranked by the Random Forest responses combined with the shapes of the candidates; e.g., horizontally elongated candidates are given higher scores because they are more likely to contain text. Even though our method is trained on English, our experiments demonstrate that it achieves high recall with a few thousand good-quality proposals on four standard benchmarks, including multi-language datasets. Following the One-to-One and Many-to-One detection criteria, our method achieves 91.6%, 87.4%, 92.1% and 97.9% recall on the ICDAR 2013 Robust Reading Dataset, Street View Text Dataset, Pan's multilingual Dataset and Sampled KAIST Scene Text Dataset, respectively, with fewer than 1250 proposals on average.
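The feature and ranking ideas described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the gradient operator, the patch size, or the scoring formula, so the forward differences, the four-value feature vector, and the names `gradient_features`, `candidate_score`, `alpha` and `cap` are all assumptions made for illustration. In the paper, features of this kind are fed to a Random Forest; here the classifier response is simply taken as an input number.

```python
from statistics import mean, pstdev


def gradient_features(patch):
    """Mean and standard deviation of gradient magnitudes along x and y.

    `patch` is a 2D list of grayscale intensities, assumed to have already
    been resized to a small, fixed size. Forward differences are an
    assumption; the abstract does not name a gradient operator.
    """
    h, w = len(patch), len(patch[0])
    if h < 2 or w < 2:
        raise ValueError("patch must be at least 2x2")
    # Absolute horizontal differences between neighbouring columns.
    gx = [abs(patch[r][c + 1] - patch[r][c]) for r in range(h) for c in range(w - 1)]
    # Absolute vertical differences between neighbouring rows.
    gy = [abs(patch[r + 1][c] - patch[r][c]) for r in range(h - 1) for c in range(w)]
    return [mean(gx), pstdev(gx), mean(gy), pstdev(gy)]


def candidate_score(response, width, height, alpha=0.1, cap=10.0):
    """Hypothetical ranking score for a text line candidate.

    Combines a classifier response with a bonus for horizontally elongated
    boxes. The formula, `alpha` and `cap` are illustrative assumptions.
    """
    if height <= 0:
        raise ValueError("height must be positive")
    elongation = min(width / height, cap)
    return response * (1.0 + alpha * elongation)


if __name__ == "__main__":
    # Vertical stripes: strong horizontal gradients, no vertical gradients.
    stripes = [[0, 10, 0, 10] for _ in range(4)]
    print(gradient_features(stripes))
    # With the same response, a wide box ranks above a square one.
    print(candidate_score(0.8, 200, 20), candidate_score(0.8, 20, 20))
```

Capping the elongation term keeps extremely wide boxes from dominating the ranking on shape alone; any such cap would need tuning on real data.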

    Original language: English
    Pages (from-to): 47-63
    Number of pages: 17
    Journal: Neurocomputing
    Volume: 304
    Publication status: Published - 2018 Aug 23

    Bibliographical note

    Funding Information:
    This work was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (No. 2018R1A2B6007130 and No. 2016R1A2B1014934), and in part by the Korean MSIT (Ministry of Science and ICT), under the National Program for Excellence in SW (2015-0-00936) supervised by the IITP (Institute for Information & communications Technology Promotion).

    Kun Fan received his B.S. degree in Communication Engineering from Harbin Institute of Technology, China, in 2010. He is currently working toward the Ph.D. degree in Electrical and Computer Engineering at Korea University. His research interests include machine learning and computer vision.

    Seung Jun Baek received his B.S. degree from Seoul National University in 1998, and M.S. and Ph.D. degrees from the University of Texas at Austin in 2002 and 2007, respectively, in electrical and computer engineering. From 2007 to 2009, he was a Member of Technical Staff with DSP Systems R&D Center, Texas Instruments. In 2009, he joined the College of Information and Communications, Korea University, Korea, where he is currently an associate professor. His research interests include information systems and communication networks, machine learning, compressive sensing, and game theory.

    Publisher Copyright:
    © 2018

    Keywords

    • Feature extraction
    • Random Forest
    • Scene text detection
    • Text line proposals

    ASJC Scopus subject areas

    • Computer Science Applications
    • Cognitive Neuroscience
    • Artificial Intelligence
