Journal of Artificial Intelligence Research 15 (2001), pp. 31-90. Submitted 8/00; published 8/01.
© 2001 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
Grounding the Lexical Semantics of Verbs in Visual Perception using Force Dynamics and Event Logic

Jeffrey Mark Siskind
NEC Research Institute, Inc.
4 Independence Way
Princeton, NJ 08540 USA


This paper presents an implemented system for recognizing the occurrence of events described by simple spatial-motion verbs in short image sequences. The semantics of these verbs is specified with event-logic expressions that describe changes in the state of force-dynamic relations between the participants of the event. An efficient finite representation is introduced for the infinite sets of intervals that occur when describing liquid and semi-liquid events. Additionally, an efficient procedure using this representation is presented for inferring occurrences of compound events, described with event-logic expressions, from occurrences of primitive events. Using force dynamics and event logic to specify the lexical semantics of events allows the system to be more robust than prior systems based on motion profile.
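As a rough illustration of how a finite structure can stand for an infinite set of intervals, the following minimal sketch assumes the usual reading of "liquid": an event that holds over an interval also holds over every subinterval of it. The class `IntervalSet`, its fields, and the helper `subintervals_of` are hypothetical names chosen for this sketch. It handles only closed intervals and is not the representation developed in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalSet:
    """Hypothetical finite description of the (possibly infinite) set of all
    closed intervals [p, q] with lo_start <= p <= hi_start,
    lo_end <= q <= hi_end, and p <= q."""
    lo_start: float
    hi_start: float
    lo_end: float
    hi_end: float

    def is_empty(self) -> bool:
        # A member exists iff both endpoint ranges are non-empty and the
        # earliest allowed start does not exceed the latest allowed end.
        return (self.lo_start > self.hi_start
                or self.lo_end > self.hi_end
                or self.lo_start > self.hi_end)

    def contains(self, p: float, q: float) -> bool:
        # Membership test for one concrete interval [p, q].
        return (p <= q
                and self.lo_start <= p <= self.hi_start
                and self.lo_end <= q <= self.hi_end)

    def intersect(self, other: "IntervalSet") -> "IntervalSet":
        # Intersecting two such sets only tightens the four bounds,
        # so the result is again described by four numbers.
        return IntervalSet(max(self.lo_start, other.lo_start),
                           min(self.hi_start, other.hi_start),
                           max(self.lo_end, other.lo_end),
                           min(self.hi_end, other.hi_end))


def subintervals_of(start: float, end: float) -> IntervalSet:
    # Assumption: a liquid event observed over [start, end] also occurs
    # over every subinterval, so both endpoints range over [start, end].
    return IntervalSet(start, end, start, end)


holding = subintervals_of(0.0, 10.0)
touching = subintervals_of(5.0, 15.0)
both = holding.intersect(touching)
```

Here `both` describes every interval during which the two hypothetical liquid events hold simultaneously, without enumerating any of them; a conjunction of two primitive events costs a constant number of comparisons in this simplified setting.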

