Expressive Modulation of Neutral Visual Speech

thesis
posted on 2017-10-24, 10:23 authored by Felix Shaw
Animated graphical models of the human face are in demand across the film, video game and television industries, appearing in everything from low-budget advertisements and free mobile apps to Hollywood blockbusters costing hundreds of millions of dollars. Generative statistical models of animation attempt to address some of the drawbacks of industry-standard practices, such as labour intensity and creative inflexibility.
This work describes one such method, which transforms speech animation curves between different expressive styles. Beginning with the assumption that expressive speech animation is a mix of two components, a high-frequency speech component (the content) and a much lower-frequency expressive component (the style), we use Independent Component Analysis (ICA) to identify and manipulate these components independently of one another. Next, we learn how the energy for different speaking styles is distributed across the components of the low-dimensional ICA model. Transforming the speaking style involves projecting new animation curves into the low-dimensional ICA space, redistributing the energy in the independent components, and finally reconstructing the animation curves by inverting the projection.
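The project, redistribute and reconstruct steps described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. It assumes that animation curves are stored as a frames-by-parameters array, that "energy" means the mean squared activation of each independent component, and that redistribution is a per-component gain. The ICA fit is a plain symmetric FastICA, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ica(X, n_components, n_iter=200, seed=0):
    """Fit a low-dimensional ICA model with symmetric FastICA (tanh nonlinearity)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA whitening: keep the top n_components directions, scaled to unit variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    K = (Vt[:n_components].T / s[:n_components]) * np.sqrt(n)
    Z = Xc @ K
    # Start from a random orthogonal rotation of the whitened data.
    W = np.linalg.qr(rng.standard_normal((n_components, n_components)))[0]
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)
        W_new = (G.T @ Z) / n - np.diag((1.0 - G ** 2).mean(axis=0)) @ W
        # Symmetric decorrelation keeps the rows of W orthonormal.
        u, _, vt = np.linalg.svd(W_new)
        W = u @ vt
    unmix = K @ W.T               # components = (X - mean) @ unmix
    mix = np.linalg.pinv(unmix)   # curves ~= components @ mix + mean
    return mean, unmix, mix

def component_energy(S):
    """Assumed energy measure: mean squared activation per component."""
    return (S ** 2).mean(axis=0)

def transform_style(X, model, target_energy, eps=1e-12):
    """Project into ICA space, rescale component energy, invert the projection."""
    mean, unmix, mix = model
    S = (X - mean) @ unmix
    gain = np.sqrt(target_energy / (component_energy(S) + eps))
    return (S * gain) @ mix + mean

# Synthetic stand-in data (hypothetical): a fast "speech" signal plus a slow
# "expressive" signal, mixed into six animation parameters.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 2000)
speech = np.sin(2 * np.pi * 6.0 * t)
expression = np.sin(2 * np.pi * 0.3 * t)
A = rng.standard_normal((2, 6))
neutral = np.column_stack([speech, 0.2 * expression]) @ A
expressive = np.column_stack([speech, 1.0 * expression]) @ A

model = fit_ica(np.vstack([neutral, expressive]), n_components=2)
mean, unmix, mix = model
target_energy = component_energy((expressive - mean) @ unmix)
converted = transform_style(neutral, model, target_energy)
converted_energy = component_energy((converted - mean) @ unmix)
```

In this sketch a single ICA model is fitted to both styles at once, and only the per-component energies differ between styles. A per-component gain is the simplest possible redistribution rule; a learned mapping of the kind the abstract alludes to may well be richer.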
We show that a single ICA model can be used for separating multiple expressive styles into their component parts. Subjective evaluations show that viewers can reliably identify the expressive style generated using our approach, and that they have difficulty distinguishing transformed expressive speech animation from the equivalent ground truth.

Funding

EPSRC
