Presentation suggestions and background info for Machine Perception & Modeling of Human Behavior

Similar Courses

It seems most courses in this area come from the vision community. We will emphasize more than just vision.

Trevor Darrell, MIT, 6.892: Computer Vision for Interface and Surveillance: Algorithms and Implications

Michael Black, Brown, CS0296: Topics in Computer Vision or How to Build a 3D Person Tracker

Ahmed Elgammal, Rutgers, 198:676: Looking at People

Irfan Essa, Georgia Tech, CS7322

webcourse, Technion, 236875: Visual Recognition

Greg Mori, Simon Fraser, CMPT882: Recognition Problems in Computer Vision

Jason Hong, CMU, HCI-899: Research Topics in Ubiquitous Computing



Face recognition

Biometrics: Fingerprint, ...

RFID tags

R. Want. RFID: A Key to Automating Everything.

K. Fishkin et al., I Sense a Disturbance in the Force: Unobtrusive Detection of Interactions with RFID-tagged Objects.

R. Want et al., Bridging Physical and Virtual Worlds with Electronic Tags.

Inferring Multiple ADLs from Interactions with Objects. Article to appear in IEEE Pervasive Computing Magazine about using RFIDs to infer ADLs. Matthai Philipose, Kenneth P. Fishkin, et al.

The Probabilistic Activity Toolkit: Towards Enabling Activity-Aware Computer Interfaces, Matthai Philipose, Kenneth P. Fishkin, Mike Perkowitz, Donald Patterson, Dirk Hahnel, Intel Research Seattle Tech Memo IRS-TR-03-013, November 2003.


D. Fox, J. Hightower, H. Kautz, L. Liao, and D. Patterson. Bayesian techniques for location estimation. In Proceedings of The 2003 Workshop on Location-Aware Computing, October 2003. Part of the 2003 Ubiquitous Computing Conference.

M. Bennewitz, W. Burgard, and S. Thrun. Learning motion patterns of persons for mobile service robots. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2002.

Posture and Movement

What do we want to measure?


Goniometer: worn device that measures joint angle. Google to see what is out there.

A super goniometer


Vision bibliography: see 16.7.1 Surveillance, Human Motion, Surveys, Reviews, Overviews, Surveillance of Vehicles and Occupants, Driver Monitoring, Eyes, Gaze, 16.7.3 Understanding Motion and Events, 16.7.4 Action Models, Motion Detection for Events, Backgrounds, 16.7.5 Human Motion Understanding and Analysis, Human Detection, Pedestrians, Counting, Locating, Human Motion, General Analysis, Tracking People, Human Tracking, Tracking People with Multiple Cameras or Depth, Tracking People with 3D Models, Walking, Gait Recognition, Human ID Using Gait, Recognition of People Through Gait, Human Activities, Human Activities, Interactions, Groups, Human Activities, Sports, Planned Activities,

A nice list: Elgammal at Rutgers

A sampling of papers.

Ismail Haritaoglu, David Harwood, and Larry S. Davis. W4: Real-time surveillance of people and their activities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):809-830, 2000.

Namrata Vaswani, A. RoyChowdhury, and Rama Chellappa. Activity recognition using the dynamics of the configuration of interacting objects. Computer Vision and Pattern Recognition (CVPR), 2003.

C Bregler and J Malik. Tracking people with twists and exponential maps. Computer Vision and Pattern Recognition (CVPR), pages 8-15, 1998.

G Cheung, S Baker, and T Kanade. Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture. Computer Vision and Pattern Recognition (CVPR), pages 77-84, June 2003.

S. Ju, M. Black, and Y. Yacoob. Cardboard people: A parameterized model of articulated motion. Int. Conf. on Automatic Face and Gesture Recognition, pages 38-44, 1996.

I Kakadiaris and D Metaxas. Model-based estimation of 3d human motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12):1453-1459, December 2000.

D. Ramanan and D. A. Forsyth. Finding and tracking people from the bottom up. Computer Vision and Pattern Recognition (CVPR), June 2003.

H Sidenbladh and M J. Black. Learning the statistics of people in images and video. International Journal of Computer Vision, 2003.

H. Sidenbladh, M. Black, and D. Fleet. Stochastic tracking of 3d human figures using 2d image motion. ECCV, 2:702-718, 2000.

L Sigal, S Bhatia, S Roth, M Black, and M Isard. Tracking loose-limbed people. Computer Vision and Pattern Recognition (CVPR), 2004.

J. Zhang, R. Collins, and Y. Liu. Representation and matching of articulated shapes. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 342-349, 2004.

D Ramanan and D Forsyth. Automatic annotation of everyday movements. Neural Info. Proc. Systems (NIPS), Dec 2003.

C. Stauffer and W. Grimson. Learning patterns of activity using realtime tracking. Pattern Analysis and Machine Intelligence (PAMI), 22(8):747-757, 2000.

C Wren, A Azarbayejani, T Darrell, and A Pentland. Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):780-785, July 1997.

C Wren and A Pentland. Dynamic modeling of human motion. Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, 1998.

Wren, Azarbayejani and Pentland. Real-time self-calibrating stereo person tracking using 3-d shape estimation from blob features. Perceptual Computing Technical Report TR-363, MIT Media Laboratory, January 1996.

1. J. Yang, R. Stiefelhagen, U. Meier, and A. Waibel, "Visual tracking for multimodal human computer interaction," Proceedings of CHI 98, pp. 140-147.

2. J. Yang, X. Zhu, R. Gross, J. Kominek, Y. Pan, A. Waibel, "Multimodal People ID for a Multimedia Meeting Browser," Proceedings of ACM Multimedia 99.


J. Mantyla, J. Himberg, T. Seppanen, "Recognizing Human Motion with Multiple Acceleration Sensors", In IEEE International Conference on Systems, Man and Cybernetics, Tucson, USA, 2001.

An Efficient Real-Time Human Posture Tracking Algorithm Using Low-Cost Inertial and Magnetic Sensors. Anthony Gallagher, Yoky Matsuoka, and Wei-Tech Ang. Proceedings of 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems.


Perception of Human Manipulation Based on Contact State Transition. Masahiro Kondo, Jun Ueda, Yoshio Matsumoto, Tsukasa Ogasawara. Proceedings of 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems.


Huge literature. Here are some textbooks:

Statistical Methods for Speech Recognition (Language, Speech, and Communication) by Frederick Jelinek

Fundamentals of Speech Recognition by Lawrence Rabiner, Biing-Hwang Juang


Bodymedia device. ICML 2004 workshop: Physiological Data Modeling Contest

Picard: Affective Computing

in various contexts


computer use

use of communication device


motor psychophysics: Minimum jerk, torque change, minimum error

Flash, T. and Hogan, N. (1985) The coordination of arm movements: an experimentally confirmed mathematical model. J. Neurosci. 5(7): 1688-1703.

Nakano E, Imamizu H, Osu R, Uno Y, Gomi H, Yoshioka T, Kawato M: Quantitative examinations of internal representations for arm trajectory planning: Minimum commanded torque change model. Journal of Neurophysiology 81, 2140-2155 (1999).

Dornay M, Uno Y, Kawato M, Suzuki R: Minimum muscle-tension change trajectories predicted by using a 17-muscle model of the monkey's arm. Journal of Motor Behavior 28, 83-100 (1996).

Uno Y, Kawato M, Suzuki R: Formation and control of optimal trajectory in human multijoint arm movement - minimum torque-change model-. Biological Cybernetics 61, 89-101 (1989).

Miyamoto H, Nakano E, Wolpert DM, Kawato M: TOPS (Task Optimization in the Presence of Signal-dependent noise) model. Systems and Computers in Japan, 35, 48-58 (2004).

J. R. Flanagan and D. J. Ostry. Trajectories of human multi-joint arm movements: evidence of joint level planning. In O. Khatib and V. Hayward, editors, Experimental Robotics 1, pages 594-613. Springer-Verlag, 1990.

Optimality principles in sensorimotor control, Todorov E (2004) Nature Neuroscience 7(9): 907-915
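The minimum-jerk model cited above (Flash and Hogan) predicts a smooth, bell-shaped velocity profile for point-to-point reaching. For a movement that starts and ends at rest, the predicted position is a fifth-order polynomial in normalized time. The sketch below evaluates that polynomial for a single coordinate; the function name and arguments are chosen here for illustration only.

```python
def minimum_jerk_position(x0, xf, duration, t):
    """Position at time t of a rest-to-rest minimum-jerk movement from x0 to xf.

    Uses the standard quintic 10*tau^3 - 15*tau^4 + 6*tau^5 in normalized
    time tau = t / duration, clamped to [0, 1].
    """
    tau = min(max(t / duration, 0.0), 1.0)
    blend = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return x0 + (xf - x0) * blend


if __name__ == "__main__":
    # Sample a 1-second reach from 0 to 0.3 (units are arbitrary here).
    for k in range(6):
        t = k / 5
        print(t, minimum_jerk_position(0.0, 0.3, 1.0, t))
```

Comparing a measured hand path against this profile is one simple way to quantify how "smooth" an observed movement is.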

neuroscience: motor tape, central pattern generator (CPG), synergies

The motor tape idea is old, and references for it are hard to find. It is having a revival in graphics (see Motion Graphs).

Google "central pattern generator"

A d'Avella, P Saltiel, and E Bizzi. Combinations of muscle synergies in the construction of a natural motor behavior. Nature Neuroscience, 6(3):300-308, Mar 2003.

graphics: Motion Graph, dimensionality reduction

With the availability of relatively cheap motion capture, people in graphics have been asking how to generate complex motion from libraries of stored motion examples.

Kovar, Gleicher, and Pighin

CMU Motion Graphs

Dimensionality Reduction

Motion Synthesis

1. Michael Gleicher, Hyun Joon Shin, Lucas Kovar, and Andrew Jepsen. Snap Together Motion: Assembling Run-Time Animation. 2003 Symposium on Interactive 3D Graphics. April 2003.

2. Lucas Kovar, Michael Gleicher, and Frederic Pighin. Motion Graphs. ACM Transactions on Graphics 21(3) (Proceedings of SIGGRAPH 2002). July 2002.

3. J. Lee, J. Chai, P. S. A. Reitsma, J. K. Hodgins, and N. S. Pollard. Interactive control of avatars animated with human motion data. Proc. SIGGRAPH 2002.

4. O. Arikan and D. Forsyth. Interactive motion generation from examples. In Proceedings of ACM SIGGRAPH 02, 2002.

5. Arikan, O., Forsyth, D. A., O'Brien, J. F., "Motion Synthesis from Annotations." To appear in ACM SIGGRAPH 2003.

Can we learn anything from how video game characters are controlled?


state machine

Very common idea, simple discrete way to encode temporal structure.
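As a minimal sketch of the state-machine idea, the temporal structure of a behavior can be written as a table mapping (state, event) pairs to next states. The state and event names below are hypothetical, chosen only for illustration; they do not come from any of the papers listed here.

```python
# Hypothetical states and sensor events for a simple reach-and-grasp behavior.
TRANSITIONS = {
    ("idle", "reach"): "reaching",
    ("reaching", "grasp"): "holding",
    ("holding", "release"): "idle",
}


def run_state_machine(events, start="idle"):
    """Return the list of states visited while consuming the event sequence.

    Events with no matching transition leave the state unchanged.
    """
    state = start
    visited = [state]
    for event in events:
        state = TRANSITIONS.get((state, event), state)
        visited.append(state)
    return visited


if __name__ == "__main__":
    print(run_state_machine(["reach", "grasp", "release"]))
```

The same table can be used for recognition: an observed event stream that drives the machine back to its start state has completed one instance of the behavior.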

Dynamic system

Continuous way to encode temporal structure (differential equations encode system dynamics).
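To make the contrast with a state machine concrete, here is a minimal sketch of a dynamic system: a damped spring that pulls a position toward a goal, integrated with a simple Euler scheme. This is only a generic point attractor under assumed gains and step size, not the specific formulation of any reference in this section.

```python
def simulate_point_attractor(x0, goal, stiffness=25.0, damping=10.0,
                             dt=0.01, steps=500):
    """Integrate x'' = stiffness * (goal - x) - damping * x' from rest.

    The default gains are an assumption; damping = 2 * sqrt(stiffness)
    makes the continuous system critically damped.
    """
    x, v = x0, 0.0
    trajectory = [x]
    for _ in range(steps):
        a = stiffness * (goal - x) - damping * v
        v += a * dt          # update velocity first (semi-implicit Euler)
        x += v * dt          # then position, using the new velocity
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    path = simulate_point_attractor(0.0, 1.0)
    print(path[0], path[100], path[-1])
```

Changing the goal while the system runs changes the whole future trajectory smoothly, which is the main appeal of encoding behavior in the dynamics rather than in discrete states.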

Schaal, S. (2003). Dynamic movement primitives - A framework for motor control in humans and humanoid robots, The International Symposium on Adaptive Motion of Animals and Machines. There are several dynamic system publications on this page

Saltzman, E., & Kelso, J. A. S. (1987). Skilled actions: A task dynamic approach. Psychological Review, 94, 84-106.

robotics style planner


A compromise: Synthesizing Animations of Human Manipulation Tasks. Katsu Yamane, James Kuffner, and Jessica K. Hodgins.

dynamic programming style planner

Stilman paper should be available soon.

Stolle paper should be available soon.

Morimoto paper should be available soon.

Atkeson [NIPS 1993]

Atkeson [NIPS 2002]

Morimoto and Atkeson have developed robust versions of the local trajectory planner. [IROS 2003]

Behavior based, classifier selecting primitives

@article{Bentivengna/etal2005b, author = {D.C. Bentivegna and C.G. Atkeson and G. Cheng}, title = {Learning From Observation and Practice Using Behavioral Primitives: Marble Maze}, journal = {International Journal of Robotics Research}, year = {2005} }

Dynamic Time Warping

Googling Dynamic Time Warping turns up some useful web pages.
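For orientation, the basic dynamic time warping recurrence fits in a few lines. The sketch below is the textbook dynamic-programming version for two 1-D sequences with absolute difference as the local cost; it has no windowing or slope constraints, which practical implementations often add.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


if __name__ == "__main__":
    # The second sequence is the first with one sample held longer.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because one sample may align with several samples of the other sequence, two performances of the same movement at different speeds can score as close, which is why the method is popular for comparing motion and speech signals.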

C. S. Myers and L. R. Rabiner. A comparative study of several dynamic time-warping algorithms for connected word recognition. The Bell System Technical Journal, 60(7):1389-1409, September 1981.

Eamonn Keogh and Michael Pazzani. Dynamic time warping with higher order features. In SIAM International Conference on Data Mining, SDM 2001. SIAM, 2001.

Tim Oates, Matthew D. Schmill, and Paul R. Cohen. A method for clustering the experiences of a mobile robot that accords with human judgments. In Proceedings 17th National Conference on Artificial Intelligence, pages 846-851. AAAI Press, 2000.

Various HMMs.

M.Brand, N.Oliver, and A.Pentland. Coupled hidden markov models for complex action recognition. Computer Vision and Pattern Recognition (CVPR), 1996.

A. Galata, N. Johnson, and D. Hogg. Learning variable length markov models of behaviour. Computer Vision and Image Understanding (CVIU), (3):398-413, 2001.

A. Galata, N. Johnson, and D. Hogg. Learning behaviour models of human activities. British Machine Vision Conference (BMVC), 1999.

Thad Starner and Alex Pentland. Visual recognition of american sign language using hidden markov models. Technical Report TR-306, MIT Media Laboratory, 1995.

A. Wilson and A. Bobick. Learning visual behavior for gesture analysis. Proceedings of the IEEE Symposium on Computer Vision, pages 229-234, 1995.

B. Hannaford and P. Lee. Multi-dimensional hidden markov model of telemanipulation tasks with varying outcomes. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pages 127-133, 1990.

C Hundtofte, G Hager, and A Okamura. Building a task language for segmentation and recognition of user input to cooperative manipulation systems. IEEE Virtual Reality Conference, pages 225-230, 2002.

C. Sean Hundtofte, Gregory D. Hager, and Allison M. Okamura. Building a task language for segmentation and recognition of user input to cooperative manipulation systems. 10th Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, 2002.

Pook, P. Teleassistance: Using Deictic Gestures to Control Robot Action. PhD thesis, University of Rochester, 1995.


Wilson, A. and A. Bobick, "Parametric Hidden Markov Models for Gesture Recognition," IEEE Transactions on Pattern Analysis & Machine Intelligence, 21(9), September 1999, pp. 884-900.


Bayesian networks

S Intille.

Nuria M. Oliver, Barbara Rosario, and Alex P. Pentland. A bayesian computer vision system for modeling human interactions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):831-843, 2000.

Yifan Shi, Yan Huang, David Minnen, Aaron Bobick, and Irfan Essa. Propagation networks for recognition of partially ordered sequential action. CVPR, 2004.

Dynamic Bayesian networks

Interactive behavior, dialog management

(Stochastic) context free grammars

Darnell Moore and Irfan Essa. Recognizing multitasked activities using stochastic context-free grammar. Eighteenth National Conference on Artificial Intelligence, pages 770-776, 2002.

D Minnen, I Essa, and T Starner. Expectation grammars: Leveraging high-level expectations for activity recognition. Computer Vision and Pattern Recognition (CVPR), 2003.

Yuri A. Ivanov and Aaron F. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):852-872, 2000.


Segmentation and Recognition of Continuous Human Activity

Segmenting Motion Capture Data into Distinct Behaviors. J. Barbic, A. Safonova, J.Y. Pan, C. Faloutsos, J. K. Hodgins, and N. S. Pollard (Graphics Interface, May 2004).


Formal knowledge representation

Leora Morgenstern. Mid-sized axiomatizations of commonsense problems: A case study in egg cracking. Studia Logica, 67, 2001.

Activity/Behavior Recognition

N. Oliver, E. Horvitz, and A. Garg. Layered representations for human activity recognition. In Fourth IEEE Int. Conf. on Multimodal Interfaces, pages 3-8, 2002.

S. Lee and K. Mase. Activity and location recognition using wearable sensors. In First IEEE International Conference on Pervasive Computing and Communications, pages 24-32, 2002.

@article{oliver00bayesian, author = "Nuria M. Oliver and Barbara Rosario and Alex Pentland", title = {{A Bayesian Computer Vision System for Modeling Human Interactions}}, journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence", volume = "22", number = "8", pages = "831-843", year = "2000" }

@Inproceedings{perkowitz04, author = {Mike Perkowitz and Matthai Philipose and Donald J. Patterson and Kenneth P. Fishkin}, title = {{Mining Models of Human Activities from the Web}}, booktitle = {{Proceedings of WWW2004: The Thirteenth International World Wide Web Conference}}, year = "2004" }

K. P. Fishkin, H. Kautz, D. Patterson, M. Perkowitz, and M. Philipose. Guide: Towards understanding daily life via auto-identification and statistical analysis. In UBIHEALTH 2003 The 2nd International Workshop on Ubiquitous Computing for Pervasive Healthcare Applications, 2003.

Sort by type of activity?


Computer and Vision Research Center, UT Austin.

Case Studies

Context Aware/Ubiquitous/Pervasive Computing

M. Weiser. The Computer for the Twenty-First Century

R. Want et al., Disappearing Hardware.

Intelligent Environments

M. Mozer. The neural network house: An environment that adapts to its inhabitants. In Proceedings of the American Association for Artificial Intelligence Spring Symposium on Intelligent Environments, pages 110-114, 1998.

Health Monitoring

T. Barger, D. Brown, and M. Alwan. Health status monitoring through analysis of behavioral patterns. In AI*IA 2003 - 8th National Congress of Italian Association for Artificial Intelligence: Workshop on Ambient Intelligence, 2003.

Aging in Place

A. Mihailidis, B. Carmichael, and J. Boger. The use of computer vision in an intelligent environment to support aging-in-place, safety, and independence in the home. IEEE Transactions on Information Technology in Biomedicine (Special Issue on Pervasive Healthcare), 8(3):111, 2004.

S. Helal, B. Winkler, C. Lee, Y. Kaddoura, L. Ran, C. Giraldo, S. Kuchibhotla, and W. Mann. Enabling location-aware pervasive computing applications for the elderly. In First IEEE International Conference on Pervasive Computing and Communications, page 531, 2003.

S. Intille and K. Larson. Designing and evaluating technology for independent aging in the home. In Proceedings of the International Conference on Aging, Disability and Independence, 2003.

@Article{lawton90, author = "M. Powell Lawton", title = {{Aging and Performance of Home Tasks}}, journal = {Human Factors}, pages = {527-536}, volume = "32", number = "5", month = "October", year = "1990", note = "NLM:0374660;PMID:2074107" }

@Article{lawton83, author = "M. Powell Lawton", title = {{Environment and Other Determinants of Well-Being in Older People}}, journal = {The Gerontologist}, pages = {349-357}, volume = "23", number = "4", month = "August", year = "1983", note = "NLM:0375327;PMID:6352420" }

@Inproceedings{pollack02, author = "M. E. Pollack and C. E. McCarthy and S. Ramakrishnan and I. Tsamardinos and L. Brown and S. Carrion and D. Colbry and C. Orosz and B. Peintner", title = {{Autominder: A Planning, Monitoring, and Reminding Assistive Agent}}, booktitle = {7th International Conference on Intelligent Autonomous Systems}, month = "March", year = "2002" }

@article{consolvo04, author = "S. Consolvo and P. Roessler and B.E. Shelton and A. LaMarca and B. Schilit and S. Bly", title = {{Technology for Care Networks of Elders}}, journal = {IEEE Pervasive Computing Mobile and Ubiquitous Systems: Successful Aging}, volume = "3", number = "2", month = "Apr-Jun", year = "2004", pages = "22-29", note = {{ 022320041335\_230.pdf}} }

A Jobbagy, E Furnee, P. Harcos, and M. Tarczy. Early detection of Parkinson's disease through automatic movement evaluation. IEEE Engineering in Medicine and Biology Magazine, 17(2):81-88, March/April 1998.

Lum PS, Taub E, Schwandt D, Postman M, Hardin P, and Uswatte G. Automated constraint-induced therapy extension (autocite) for movement deficits after stroke. J Rehabil Res Dev., 41(3A):249-258, May 2004.

Martha Pollack, Laura Brown, Dirk Colbry, Colleen E. McCarthy, Cheryl Orosz, Bart Peintner, Sailesh Ramakrishnan, and Ioannis Tsamardinos. Autominder: An intelligent cognitive orthotic system for people with memory impairment. Robotics and Autonomous Systems, 2003.

Assistive Technology


List of relevant articles in vision: see Surveillance of Vehicles and Occupants, Driver Monitoring, Eyes, Gaze.

Real-Time Non-Rigid Driver Head Tracking for Driver Mental State Estimation. S. Baker, I. Matthews, J. Xiao, R. Gross, T. Kanade, and T. Ishikawa. 11th World Congress on Intelligent Transportation Systems, October, 2004.

User modeling (HCI)


Overview of proposed 'stages' of stroke recovery:

S Brunnstrom. Movement therapy in hemiplegia: a neurophysiological approach. Harper and Row, New York, New York, 1st edition, 1970.

Stroke Assessment tools:

M Levin, J Desrosiers, D Beauchemin, N Bergeron, and A Rochette. Development and validation of a scale for rating motor compensations used for reaching in patients with hemiparesis: the reaching performance scale. Phys Ther., 84(1):8-22, Jan 2004.

AR Fugl-Meyer, L Jaasko, I Leyman, S Olsson, and S Steglind. The poststroke hemiplegic patient. 1. a method for evaluation of physical performance. Scand J Rehabil Med., 7(1):13-31, 1975.

C Collin, D Wade, S Davies, and V Horne. The Barthel ADL index: a reliability study. Int Disabil Stud., 10(2):61-63, 1988.

R. Jebsen, N Taylor, Trieschmann, M Trotter, and L Howard. An objective and standardized test of hand function. Archives of Physical Medicine and Rehabilitation, pages 311-319, June 1969.

B Kopp, A Kunkel, H Flor, T Platz, U Rose, K Mauritz, K Gresser, K McCulloch, and E Taub. The arm motor ability test (amat): Reliability, validity and sensitivity to change. Arch. Phys. Med. Rehab., 78:615-620, 1997.

Ethics/privacy issues

Discuss limits to 1st party video capture.

Discuss 3rd parties in public video capture.

Discuss wiretap laws with respect to audio, computer activity capture.

Discuss risk of keystroke capture, screen capture in computer use monitoring.

Underlying techniques

Bayesian Filtering, Kalman Filtering, Particle Filtering


L. Rabiner. A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257-286, 1989.

Jelinek. Statistical Methods for Speech Recognition. MIT Press, 1998.


Classification (SVM)


Assistive Cognition, U. Wash.

Monitoring Human Activity, U. Minnesota

Looking @ People, Brown

Microsoft SEER

Workshop on Human Motion

Stuff for projects



Bodymedia device. ICML 2004 workshop: Physiological Data Modeling Contest

BioPac (EEG, EMG, Accelerometer)

Binary sensors: Motion detectors, contact switches, ...

Keycapture: open source Windows application that runs in the background and records all keystroke and mouse events.

Morae: more sophisticated (commercial) software for running HCI studies.

Pitt Eye Tracker: VisionTrak ETL 400 desktop-mounted eye tracking system.

CMU HCII usability lab has an eye tracker as well.

The Language Activity Monitor (LAM) - The LAM *used* to be a widget that attached to the serial port of AAC (alternative or augmentative communications) devices and recorded keystrokes. Now the LAM is called U-LAM and is software that runs on a PC (which is connected by serial port to the AAC device). Other AAC devices have the same functionality as U-LAM built directly into the systems. Some of this recording/analysis software is written by Enkidu. Enkidu was recently purchased by Dynavox, which may explain why their website is "in transition."

Pitt HERL wheelchair activity monitors

Vibration sensors and accelerometers.