A statistical model for computer recognition of sequences of handwritten digits, with applications to ZIP codes


1998

Abstract (summary)

This thesis introduces a statistical model for computer recognition of sequences of unconstrained handwritten digits, specifically ZIP codes. The model integrates two major tasks in handwriting recognition: the segmentation of a sequence of characters into its individual components, and the recognition of these individual components.

We express the joint distribution of the segmentation, recognition, and image as a product of three terms: a term expressing our prior belief in the plausibility of the segmentation; a term expressing the plausibility of the segmentation and the image; and a term expressing the plausibility of the recognition, segmentation, and image. To calculate the third term, we incorporate a digit recognition algorithm developed by Amit, Geman, and Wilder to recognize the characters determined by a candidate segmentation. The strength of this recognition provides information about the plausibility of the segmentation.
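
One way to write this factorization down, as a sketch rather than the thesis's own notation: let S be the segmentation, R the recognition (the digit string), and I the image. These symbols, and the reading of the second and third terms as conditional probabilities, are assumptions; the abstract itself only describes the terms as "plausibilities".

```latex
% Assumed notation: S = segmentation, R = recognition, I = image.
% Under the conditional reading, the three terms in the order listed above
% form the chain-rule factorization of the joint distribution:
\[
  P(S, I, R) \;=\;
  \underbrace{P(S)}_{\text{prior on segmentation}}
  \;\cdot\;
  \underbrace{P(I \mid S)}_{\text{segmentation and image}}
  \;\cdot\;
  \underbrace{P(R \mid S, I)}_{\text{recognition, segmentation, and image}}
\]
```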

Combining these sources of information, we obtain a posterior distribution that simultaneously optimizes both segmentation and recognition. Summing this posterior distribution over all segmentations gives us a posterior distribution on the recognition alone, and we take its mode as the predicted ZIP code. To make this optimization feasible, we implement a generalized dynamic programming algorithm.
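
The sum over segmentations can be illustrated with a small Python sketch. It is not the generalized dynamic programming algorithm of the thesis, which the abstract does not spell out. It assumes a simplified setting in which the image is a strip of `n_cols` columns, a segmentation is a set of cut points that splits the strip into one contiguous piece per digit, and the weight of a segmentation factorizes over its pieces. The functions `toy_seg_prior` and `toy_digit_score` are hypothetical stand-ins for the segmentation prior and the digit recognizer.

```python
from itertools import combinations, product


def marginal_score(labels, n_cols, seg_prior, digit_score):
    """Sum, over every split of n_cols columns into len(labels) contiguous
    pieces, of the product of per-piece weights (forward-style DP)."""
    k = len(labels)
    # forward[j][c]: total weight of splitting columns [0, c) into the first j pieces
    forward = [[0.0] * (n_cols + 1) for _ in range(k + 1)]
    forward[0][0] = 1.0
    for j in range(1, k + 1):
        for end in range(j, n_cols + 1):
            total = 0.0
            for start in range(j - 1, end):
                weight = seg_prior(start, end) * digit_score(start, end, labels[j - 1])
                total += forward[j - 1][start] * weight
            forward[j][end] = total
    return forward[k][n_cols]


def brute_force_score(labels, n_cols, seg_prior, digit_score):
    """Same quantity computed by listing every segmentation explicitly."""
    k = len(labels)
    total = 0.0
    for cuts in combinations(range(1, n_cols), k - 1):
        bounds = (0, *cuts, n_cols)
        weight = 1.0
        for j in range(k):
            start, end = bounds[j], bounds[j + 1]
            weight *= seg_prior(start, end) * digit_score(start, end, labels[j])
        total += weight
    return total


def predict(n_symbols, alphabet, n_cols, seg_prior, digit_score):
    """Mode of the (unnormalized) marginal over label strings.
    Exhaustive over all strings, so only suitable for toy sizes."""
    return max(
        product(alphabet, repeat=n_symbols),
        key=lambda labels: marginal_score(labels, n_cols, seg_prior, digit_score),
    )


def toy_seg_prior(start, end):
    # Hypothetical prior: favors pieces about two columns wide.
    return 1.0 / (1.0 + abs((end - start) - 2))


def toy_digit_score(start, end, label):
    # Hypothetical recognizer output: an arbitrary deterministic positive score.
    return ((7 * start + 3 * end + 5 * label) % 11 + 1) / 12.0


if __name__ == "__main__":
    labels = (1, 0, 1)
    print(marginal_score(labels, 6, toy_seg_prior, toy_digit_score))
    print(brute_force_score(labels, 6, toy_seg_prior, toy_digit_score))
    print(predict(2, (0, 1), 5, toy_seg_prior, toy_digit_score))
```

The dynamic program needs on the order of k · n² weight evaluations for k symbols and n columns, whereas the explicit enumeration grows with the number of ways to choose k − 1 cut points. The exhaustive search in `predict` is a placeholder for whatever search over digit strings the full system uses.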

We describe how the model is implemented as a computer software system and present results from a test dataset of ZIP code images taken from US mail. The system uses little preprocessing, instead adapting to the image, and special adjustments are not required for slanting, touching, or overlapping characters. The system also relies little on rule-based heuristics, making extensive training or tuning unnecessary, and as a result is generalizable to any problem involving the recognition of sequences of a fixed number of visual or aural symbols.

Indexing (details)


Subject
Statistics; Computer science
Classification
0463: Statistics
0984: Computer science
Identifier / keyword
Applied sciences; Pure sciences; Computer recognition; Handwritten; Zip codes
Title
A statistical model for computer recognition of sequences of handwritten digits, with applications to ZIP codes
Author
Wang, Steve C.
Number of pages
147
Publication year
1998
Degree date
1998
School code
0330
Source
DAI-B 59/07, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
ISBN
0591957604, 9780591957600
Advisor
Amit, Yali
University/institution
The University of Chicago
University location
United States -- Illinois
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
9841583
ProQuest document ID
304466770
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/304466770