Virtual character performance from speech
Stacy Marsella, Yuyu Xu, Margaux Lhommet, Andrew Feng, Stefan Scherer, Ari Shapiro
Symposium on Computer Animation, July 2013, pp. 25--35.

Abstract: We demonstrate a method for generating a 3D virtual character performance from the audio signal by inferring the acoustic and semantic properties of the utterance. Through a prosodic analysis of the acoustic signal, we perform an analysis for stress and pitch, relate it to the spoken words and identify the agitation state. Our rule-based system performs a shallow analysis of the utterance text to determine its semantic, pragmatic and rhetorical content. Based on these analyses, the system generates facial expressions and behaviors including head movements, eye saccades, gestures, blinks and gazes. Our technique is able to synthesize the performance and generate novel gesture animations based on coarticulation with other closely scheduled animations. Because our method utilizes semantics in addition to prosody, we are able to generate virtual character performances that are more appropriate than methods that use only prosody. We perform a study that shows that our technique outperforms methods that use prosody alone.

Article URL: http://dx.doi.org/10.1145/2485895.2485900

BibTeX format:

@inproceedings{Marsella:2013:VCP,
  author = {Stacy Marsella and Yuyu Xu and Margaux Lhommet and Andrew Feng and Stefan Scherer and Ari Shapiro},
  title = {Virtual character performance from speech},
  booktitle = {Symposium on Computer Animation},
  pages = {25--35},
  month = jul,
  year = {2013},
  doi = {10.1145/2485895.2485900},
}
