Upper body gestures in lecture videos: indexing and correlating to pedagogical significance
The growth of digitally recorded educational lectures has led to a problem of information overload. Semantic video browsers present one solution, whereby content-based features are used to highlight points of interest. We focus on the domain of single-instructor lecture videos. We hypothesize that arm and upper body gestures made by the instructor can yield significant pedagogical information about the content being discussed, such as its importance and difficulty. Furthermore, these gestures can be classified, automatically detected, and correlated to pedagogical significance (e.g., highlighting a subtopic that may be a focal point of a lecture). This information can then serve as cues for a semantic video browser. We propose a fully automatic system that, given a lecture video as input, will segment the video into gestures and identify each gesture according to a refined taxonomy. These gestures will then be correlated to a vocabulary of significance. We also plan to extract other gesture features, such as speed and size, and examine their correlation to pedagogical significance. We propose to develop body part recognition and temporal segmentation techniques to aid natural gesture recognition. Finally, we plan to validate this hypothesis and evaluate the system on a corpus of lecture videos by integrating the points of pedagogical significance indicated by the gestural information into a semantic video browser and performing user studies. The user studies will measure both the accuracy of the correlation and the usefulness of the integrated browser.