Video provides a rich and widespread medium for documentation, education, and entertainment. However, the difficulty of browsing, skimming, and navigating long recorded videos limits their utility for reference and reuse. Recent advances in computer vision and natural language processing can surface objects, actions, scenes, and speech as navigable text. However, the higher-level semantic structure of the video (e.g., its outline, summary, or scenes) remains hidden from the user. I propose combining domain-specific human annotations with automatic methods to enable navigating videos using structured text documents. This talk explores this idea through systems spanning three domains: educational lecture videos, films, and critique sessions.
Amy Pavel is a PhD candidate (in her final year!) working on Human-Computer Interaction in the Department of Electrical Engineering and Computer Sciences at UC Berkeley. She is advised by professors Björn Hartmann at UC Berkeley and Maneesh Agrawala at Stanford, and her work has been supported by a SanDisk Gold fellowship and an NDSEG fellowship.
Faculty Host: Jeff Bigham