Unified detection and recognition for reading text in scene images


2008

Abstract (summary)

Although an automated reader for the blind first appeared nearly two hundred years ago, computers can currently "read" document text about as well as a seven-year-old. Scene text recognition brings many new challenges. A central limitation of current approaches is a feed-forward, bottom-up, pipelined architecture that isolates the many tasks and information involved in reading. The result is a system that commits errors from which it cannot recover and has components that lack access to relevant information.

We propose a system for scene text reading that is more integrated in its design, training, and operation. First, we present a simple contextual model for text detection that is ignorant of any recognition. Through the use of special features and data context, this model performs well on the detection task, but limitations remain due to the lack of interpretation. We then introduce a recognition model that integrates several information sources, including font consistency and a lexicon, and compare it to approaches using pipelined architectures with similar information. Next we examine a more unified detection and recognition framework where features are selected based on the joint task of detection and recognition, rather than each task individually. This approach yields better results with fewer features. Finally, we demonstrate a model that incorporates segmentation and recognition at both the character and word levels. Text with difficult layouts and low resolution is more accurately recognized by this integrated approach. By more tightly coupling several aspects of detection and recognition, we hope to establish a new unified way of approaching the problem that will lead to improved performance. We would like computers to become accomplished grammar-school level readers.

Indexing (details)


Subject
Artificial intelligence; Computer science
Classification
0800: Artificial intelligence
0984: Computer science
Identifier / keyword
Applied sciences, Automated readers, Scene text recognition, Text reading
Title
Unified detection and recognition for reading text in scene images
Author
Weinman, Jerod J.
Number of pages
218
Publication year
2008
Degree date
2008
School code
0118
Source
DAI-B 69/09, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
ISBN
9780549786115
Advisor
Hanson, Allen R.; Learned-Miller, Erik G.
Committee member
McCallum, Andrew; Rayner, Keith
University/institution
University of Massachusetts Amherst
Department
Computer Science
University location
United States -- Massachusetts
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
3325128
ProQuest document ID
304565844
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/304565844