Facial Expression Transfer with Input-Output Temporal Restricted Boltzmann Machines

Matthew D. Zeiler, Graham W. Taylor, Leonid Sigal, Iain Matthews, and Rob Fergus

Neural Information Processing Systems (December 12-17, 2011)
Supplemental Material: Supplementary, Code, Videos

Abstract
We present a type of Temporal Restricted Boltzmann Machine that defines a probability distribution over an output sequence conditional on an input sequence. It shares the desirable properties of RBMs: efficient exact inference, an exponentially more expressive latent state than HMMs, and the ability to model nonlinear structure and dynamics. We apply our model to a challenging real-world graphics problem: facial expression transfer. Our results demonstrate improved performance over several baselines modeling high-dimensional 2D and 3D data.

Supplemental Materials

Videos

2D Comparison Retarget – This video shows the input sequence (source) on the left as blue dots and the ground-truth output sequence (target) on the right, with our IOTRBM model’s prediction superimposed on the output in white.

2D Retarget Under Noise – This video demonstrates the effect of initializing an autoregressive model (left) with Gaussian noise of 1 standard deviation (added to the initialization of the output sequence), compared to our IOTRBM (right).

3D Comparison Retarget – This video shows the input sequence (source) on the left as blue dots and the ground-truth output sequence (target) on the right, with our FIOTRBM model’s prediction superimposed on the output in white.

3D Retarget Compared to Autoregressive Model – This is a direct comparison of the autoregressive model's retargeting performance on a sequence from the 3D dataset against our FIOTRBM model. Notice the accumulation of errors in the autoregressive model as the sequence progresses.
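
The noise and error-accumulation videos both depend on one property of autoregressive rollouts: each prediction is fed back as history for the next frame, so an error in the initialization is carried into every later frame. The sketch below illustrates that mechanism with a toy scalar AR(1) model. It is not the autoregressive baseline from the paper; the function name `ar_rollout`, the coefficients `a` and `b`, and the random input sequence are all assumptions chosen for illustration.

```python
import random


def ar_rollout(y0, inputs, a=0.95, b=0.1):
    """Roll out a toy scalar AR(1) model: y_t = a * y_{t-1} + b * x_t.

    Hypothetical illustration only. Each prediction becomes the history
    for the next step, so an error in y0 propagates through the sequence.
    """
    ys = []
    prev = y0
    for x in inputs:
        prev = a * prev + b * x
        ys.append(prev)
    return ys


random.seed(0)
inputs = [random.uniform(-1.0, 1.0) for _ in range(50)]  # stand-in input sequence
noise = random.gauss(0.0, 1.0)  # Gaussian perturbation with 1 standard deviation

clean = ar_rollout(0.0, inputs)          # rollout from the true initialization
noisy = ar_rollout(0.0 + noise, inputs)  # rollout from the perturbed initialization

# Per-frame absolute deviation caused purely by the noisy initialization.
errors = [abs(n - c) for n, c in zip(noisy, clean)]
```

In this linear toy model the deviation at frame t is exactly `a ** (t + 1) * abs(noise)`, so with `a` close to 1 the initialization error persists for many frames, and with `abs(a) > 1` it would grow. Models with learned, nonlinear dynamics can behave differently, so this is only a sketch of the feedback mechanism, not a prediction of what the videos show.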