Warning:
This page is provided for historical and archival purposes only. While the seminar dates are correct, we offer no guarantee of informational accuracy or link validity. Contact information for the speakers, hosts and seminar committee is certainly out of date.


RI SEMINAR -- Dragutin Petkovic


ABSTRACT

Today's hardware technology enables us to acquire, store, manipulate and transmit large numbers of images. In various applications we are amassing vast image databases, and they are expected to grow in importance and volume in the near future. Yet we lack a natural means of querying and accessing these databases. Currently, we access image databases through textual search over previously entered keywords. Although useful, this approach has several problems: the original keywords often do not allow for unanticipated searches in subsequent applications and, more importantly, uniform textual descriptions are inadequate for categories such as color, texture, shape and layout. In addition, many applications need to be able to select all images "like" some other image.

In other words, in addition to the simple text-based queries that can be handled today, we wish to allow users to search through databases consisting of millions of images using sketches, layout or structural descriptions, texture, color, images, and other iconic and graphical information to specify the images desired. An example query might be: find all images with a pattern similar to the one at which the user is pointing. Patterns might include a part of another image (e.g., a piece of an X-ray image or a particular texture patch), a user-drawn sketch (e.g., of a dress, building, fabric, ancient vase, etc.), a desired color distribution, etc. The image database to be searched can be very large (in the range of hundreds of thousands of images) in the fields of art, medicine, photo agencies, multimedia in general, etc.

We call the above search techniques query by image content (QBIC), and they differ in important ways from traditional searches. They are approximate: there is no exact match. In other words, QBIC techniques serve as "information filters" that simply narrow the search for the user, who will ultimately discard false calls. Interactivity is key to QBIC technology: it allows the user to pose visual queries, evaluate and refine results visually, and decide what to discard and what to keep.

We also contrast QBIC technology with typical machine vision applications. There are several important differences. In QBIC applications, through interaction with the system, the user is offered a virtually unlimited set of unanticipated queries, rather than having a system automatically classify and recognize samples into a small number of predefined classes (part is good/bad, this is a chair, etc.). It is also important to note that in QBIC applications the main output is a set of images with desired properties that the user will use in a subsequent application (inclusion in a multimedia story, etc.), rather than a symbolic decision as in typical pattern recognition applications, where the system produces a limited set of symbolic outputs that is predefined and hardcoded.

Success in QBIC technology requires integration and synergy between image analysis, manual and semi-automated human control, visual user interfaces, databases and multimedia. The key technical problems and challenges are: how to represent the images so they can be easily queried; which similarity measures best match human or application requirements; the cost of digital data entry (scanning, outlining, etc.); and how to do all of this economically and efficiently (database, system and user interface issues). In real applications, we of course envision combining QBIC with text and keyword search techniques. Possible applications of QBIC technology include "edutainment", journalism, museum cataloging, document processing, medicine, intelligence and military, organizing the database of multimedia mail objects, etc. We believe QBIC technology should be a part of future multimedia databases that will contain text, sound, image and video.

The QBIC project at IBM Almaden Research Center in San Jose, CA, is conducting a theoretical, experimental, and prototyping study of the problem of querying large still-image databases efficiently based on image content. Since the problem is difficult, we aim to discover general principles, but at the same time we aim to identify target application(s) for which we will prototype concrete pilot systems. We have developed a number of algorithms that allow the user to search based on color, texture, and shape. The search can be focused either on image objects (areas previously outlined by the user) or on whole images. The search argument can be an object from a particular image, a user-selected pattern of color, texture or shape chosen from suitable "pickers", or any weighted combination of these patterns. In the case of color search, the system also allows searching for images with a desired color distribution (e.g., approximately 50% red, 20% blue, 10% white, where the colors are selected from a color picker). The querying is done through a graphical user interface. The output is given as a set of images sorted by similarity to the desired pattern, where the user selects the maximum number of images to be shown (retrieved), e.g., "give me the 20 best matches". Our system runs in an AIX environment on an RS/6000 with an X/Motif-based graphical user interface, and we are experimenting with a database of approximately 1000 natural images containing a variety of motifs and approximately 1000 user-outlined objects. We measured the retrieval performance using variations of normalized precision and recall. The results of initial experimentation are very encouraging.
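As an illustration only, the color-distribution query and the "best matches" output described above might be sketched in Python as follows. This is not the QBIC implementation: the four-color palette, the nearest-color quantization, the L1 distance, and all function names are assumptions chosen to keep the example small, and the evaluation helper computes plain precision and recall rather than the normalized variants mentioned in the abstract.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]

# Hypothetical coarse palette standing in for a "color picker".
PALETTE: Dict[str, Pixel] = {
    "red": (255, 0, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def nearest_color(pixel: Pixel) -> str:
    """Name of the palette color closest to the pixel (squared RGB distance)."""
    return min(
        PALETTE,
        key=lambda name: sum((p - c) ** 2 for p, c in zip(pixel, PALETTE[name])),
    )


def color_distribution(pixels: List[Pixel]) -> Dict[str, float]:
    """Fraction of pixels assigned to each palette color."""
    if not pixels:
        raise ValueError("an image must contain at least one pixel")
    counts = {name: 0 for name in PALETTE}
    for pixel in pixels:
        counts[nearest_color(pixel)] += 1
    return {name: counts[name] / len(pixels) for name in PALETTE}


def distribution_distance(query: Dict[str, float], candidate: Dict[str, float]) -> float:
    """L1 distance over the colors named in the query (assumed similarity measure)."""
    return sum(abs(frac - candidate.get(name, 0.0)) for name, frac in query.items())


def best_matches(query: Dict[str, float], database: Dict[str, List[Pixel]], k: int) -> List[str]:
    """Names of the k images whose color distribution is closest to the query."""
    scored = [
        (distribution_distance(query, color_distribution(pixels)), name)
        for name, pixels in database.items()
    ]
    scored.sort()  # smallest distance first; ties broken by image name
    return [name for _, name in scored[:k]]


def precision_recall(retrieved: List[str], relevant: List[str]) -> Tuple[float, float]:
    """Plain (unnormalized) precision and recall of a retrieved list."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy database: each "image" is just a flat list of RGB pixels.
database = {
    "sunset": [(250, 10, 10)] * 5 + [(10, 10, 250)] * 2 + [(250, 250, 250)] + [(5, 5, 5)] * 2,
    "ocean": [(10, 10, 250)] * 8 + [(250, 250, 250)] * 2,
    "snow": [(250, 250, 250)] * 9 + [(10, 10, 250)],
}
query = {"red": 0.5, "blue": 0.2, "white": 0.1}

print(best_matches(query, database, k=2))  # ['sunset', 'ocean']
```

The query names only the colors the user cares about, so the remaining 20% of the image is left unconstrained, mirroring the "approximately 50% red, 20% blue, 10% white" example. A weighted combination with texture and shape could be sketched the same way by summing per-feature distances with user-chosen weights.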

Host:           Yangsheng Xu (xu@cs.cmu.edu)
Appointment:    Lalit Katragadda (lalit@cs.cmu.edu)

Biography

Dragutin Petkovic received his B.S.E.E. and M.S.Sc. degrees from the University of Belgrade in 1976 and 1979, respectively, and his Ph.D. degree in Electrical Engineering from the University of California, Irvine, in 1983. His thesis work was in automated detection of lung nodules from chest radiographs. From 1978 to 1981 he worked as a research engineer at the Institute "Boris Kidric", Vinca, Yugoslavia. In 1983 he joined IBM Almaden Research Center, San Jose, CA, where he currently manages the Advanced Algorithms, Architectures and Applications Department. His research interests include image analysis and pattern recognition applied to industrial, commercial and biomedical problems, content-based search, large image and multimedia databases, image processing VLSI architectures, and advanced user interfaces. He is also an associate editor of the International Journal of Machine Vision and Applications (Springer-Verlag) and the upcoming IEEE Multimedia Magazine.
Christopher Lee | chrislee@ri.cmu.edu
Last modified: Thu Oct 13 18:50:21 1994