Department of Electrical Engineering
University of Washington

From Technical Diagrams to Electronic Documents


Example of a Technical Diagram. Warning: this image is BIG!!

Problem Statement and Objectives

InfoAccess is a small Washington State company whose main product line is Guide, a collection of software modules that allow the semi-automatic conversion of technical manuals to interactive electronic documents. The manuals are typically installation, operations, and maintenance manuals, which contain large numbers of complex diagrams. Diagrams are converted to images with "hot spots," regions where the user can click the mouse to receive additional information or help. The hot spots are placed where the original diagram has "callouts": numbers or text labels that identify a part of the diagram and usually sit next to a straight line or arrow pointing to that part. Currently the callouts must be located and identified by hand, which is slow and tedious. InfoAccess would like an image analysis system that can automatically find the callouts, read the numbers or text, and send an ASCII character string plus the image coordinates of the callout to the appropriate Guide package.
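To make the desired output concrete, here is a minimal sketch of the record such a system might hand off for each callout: the recognized ASCII string plus its image coordinates. The names `Callout`, `format_callout`, and `parse_callout`, and the tab-separated line layout, are assumptions made for illustration; the actual interface expected by Guide is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Callout:
    """One recognized callout (hypothetical record, not Guide's format)."""
    text: str  # recognized number or label, e.g. "12"
    x: int     # column of the callout in image pixel coordinates
    y: int     # row of the callout in image pixel coordinates


def format_callout(callout: Callout) -> str:
    """Serialize a callout as one line: x, y, then the text.

    The tab-separated layout is an assumption for illustration only.
    """
    # Raises UnicodeEncodeError if recognition produced non-ASCII text.
    callout.text.encode("ascii")
    return f"{callout.x}\t{callout.y}\t{callout.text}"


def parse_callout(line: str) -> Callout:
    """Inverse of format_callout, for whatever consumes the lines."""
    # Split at most twice so any tabs inside the text stay in the text.
    x, y, text = line.split("\t", 2)
    return Callout(text=text, x=int(x), y=int(y))
```

For example, a callout reading "12" found at pixel (340, 95) would be written as `format_callout(Callout("12", 340, 95))`, and `parse_callout` recovers the same record on the receiving side.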

The problem addressed in this work is the development of automatic methods for locating and recognizing patterns in complex technical document images. InfoAccess is currently most interested in the callouts, which are usually numbers or text, sometimes surrounded by circles or boxes, because automatic callout detection software would be of immediate use in their current product. Developing a general approach, however, will allow them to produce more powerful future products. Our objective is to develop an approach to document pattern matching that applies directly to the automatic callout recognition problem, that can be readily extended to recognize more advanced patterns such as parts and subassemblies, that can be trained to recognize new patterns, and that is efficient and easy to use.