Rate-invariant analysis of trajectories on Riemannian manifolds with application in visual speech recognition

Jingyong Su, Anuj Srivastava, Fillipe D.M. De Souza, Sudeep Sarkar

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

30 Scopus citations

Abstract

In statistical analysis of video sequences for speech recognition, and more generally activity recognition, it is natural to treat temporal evolutions of features as trajectories on Riemannian manifolds. However, different evolution patterns result in arbitrary parameterizations of these trajectories. We investigate a recent framework from the statistics literature that handles this nuisance variability using a cost function/distance for temporal registration and for statistical summarization and modeling of trajectories. It is based on a mathematical representation of trajectories, termed the transported square-root vector field (TSRVF), and the L2 norm on the space of TSRVFs. We apply this framework to the problem of speech recognition using both audio and visual components. In each case, we extract features, form trajectories on the corresponding manifolds, and compute parameterization-invariant distances using TSRVFs for speech classification. On the OuluVS database, the classification performance under this metric increases significantly, by nearly 100% under both modalities and for all choices of features. We obtained speaker-dependent classification rates of 70% and 96% for the visual and audio components, respectively.

Original language: English
Title of host publication: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publisher: IEEE Computer Society
Pages: 620-627
Number of pages: 8
ISBN (Electronic): 9781479951178
DOI: https://doi.org/10.1109/CVPR.2014.86
State: Published - Sep 24 2014
Event: 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014 - Columbus, United States
Duration: Jun 23 2014 to Jun 28 2014

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Conference

Conference: 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014
Country: United States
City: Columbus
Period: 06/23/14 to 06/28/14

Cite this

Su, J., Srivastava, A., De Souza, F. D. M., & Sarkar, S. (2014). Rate-invariant analysis of trajectories on Riemannian manifolds with application in visual speech recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 620-627). [6909480] (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). IEEE Computer Society. https://doi.org/10.1109/CVPR.2014.86