PhD Proposal: Affective Human Motion Detection and Synthesis
Uttaran Bhattacharya
Wednesday, December 1, 2021, 5:15-7:15 pm
Human emotion perception is an integral component of intelligent systems currently being designed for a wide range of socio-cultural applications, including behavior prediction, social robotics, medical therapy and rehabilitation, surveillance, and animation of digital characters in multimedia. Human observers perceive emotions from a number of cues or modalities, including faces, speech, and body expressions. Studies in affective computing indicate that emotions perceived from body expressions are extremely consistent across observers because humans tend to have less conscious control over their body expressions. Our research focuses on this aspect of emotion perception, as we attempt to build predictive methods for automated emotion recognition from body expressions, as well as build generative methods for synthesizing digital characters with appropriate affective body expressions. This talk elaborates on both these components of our research in two parts.

In the first part, we will go through two approaches for designing and training partially supervised methods for emotion recognition from body expressions, specifically gaits. In one approach, we leverage existing gait datasets annotated with emotions to generate large-scale synthetic gaits corresponding to the emotion labels. In the other approach, we leverage large-scale unlabeled gait datasets together with smaller annotated gait datasets to learn meaningful latent representations for emotion recognition. We design an autoencoder coupled with a classifier to learn latent representations for simultaneously reconstructing all input gaits and classifying the labeled gaits into emotion classes.
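The coupled autoencoder-classifier idea above can be sketched as a joint objective: every gait, labeled or not, contributes a reconstruction loss, while only the labeled subset adds a classification loss. The following is a minimal, hypothetical sketch of that training setup; the layer sizes, the `GaitEmotionAE` name, and the MLP architecture are illustrative assumptions, not the talk's actual model.

```python
import torch
import torch.nn as nn

class GaitEmotionAE(nn.Module):
    """Autoencoder with a classifier head on the latent code (sketch)."""
    def __init__(self, in_dim=48, latent_dim=16, n_emotions=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, in_dim))
        self.classifier = nn.Linear(latent_dim, n_emotions)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.classifier(z)

def joint_loss(model, x_all, x_labeled, y_labeled, alpha=1.0):
    # Reconstruction over ALL gaits (labeled and unlabeled)...
    recon_all, _ = model(x_all)
    loss = nn.functional.mse_loss(recon_all, x_all)
    # ...plus classification over the labeled subset only.
    _, logits = model(x_labeled)
    return loss + alpha * nn.functional.cross_entropy(logits, y_labeled)
```

A single optimizer step would backpropagate this combined loss, so the latent space is shaped by both objectives at once.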

In the second part, we will go through two variations of generative methods to synthesize emotionally expressive body expressions, specifically gaits and gestures. The first variation is asynchronous generation, where we synthesize only one modality of the digital characters — in our case, body expressions — with affective expressions. Our approach is to design an autoregression network that takes in a history of the characters’ pose sequences and the intended future emotions, and generates their future pose sequences with the desired affective expressions. The second variation is the more challenging synchronous generation, where the affective contents of two or more modalities, such as body gestures and speech, need to be synchronized with each other. Our approach utilizes machine translation to translate from speech to body gestures, and adversarial discrimination to differentiate between original and synthesized gestures in terms of affective expressions, to produce state-of-the-art affective body gestures synchronized with speech.
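The asynchronous-generation idea can be illustrated with a small emotion-conditioned autoregressive model: a recurrent network consumes the pose history, the intended emotion enters as a learned embedding, and future poses are rolled out one step at a time. This is a hypothetical sketch under assumed dimensions (48-D poses, 4 emotion classes, a single GRU); the `AffectivePoseGenerator` name and architecture are illustrative, not the actual network from the talk.

```python
import torch
import torch.nn as nn

class AffectivePoseGenerator(nn.Module):
    """Emotion-conditioned autoregressive pose rollout (sketch)."""
    def __init__(self, pose_dim=48, emo_classes=4, hidden=64):
        super().__init__()
        self.emo_embed = nn.Embedding(emo_classes, hidden)
        self.gru = nn.GRU(pose_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, pose_dim)

    def forward(self, pose_history, emotion, future_steps):
        # Seed the hidden state with the intended emotion, then
        # encode the observed pose history.
        h0 = self.emo_embed(emotion).unsqueeze(0)   # (1, B, hidden)
        _, h = self.gru(pose_history, h0)
        last = pose_history[:, -1:, :]              # last observed pose
        preds = []
        for _ in range(future_steps):
            out, h = self.gru(last, h)
            last = self.head(out)                   # predicted next pose
            preds.append(last)
        return torch.cat(preds, dim=1)              # (B, T_future, pose_dim)
```

The synchronous variant would replace the emotion embedding with an encoded speech signal and add an adversarial discriminator on the generated gestures, but the autoregressive rollout pattern is the same.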

Examining Committee:
Department Representative:
Dr. Dinesh Manocha
Dr. Ming Lin
Dr. Aniket Bera
Dr. Huaishu Peng
Dr. Jae Shim
Dr. Viswanathan Swaminathan

Uttaran Bhattacharya joined the Ph.D. program in Computer Science at the University of Maryland, College Park, in 2018. He is advised by Dr. Dinesh Manocha in the GAMMA Lab, with a research focus on affective computing and human motion recognition and synthesis. He is currently developing automated techniques to generate 3D animations of human body expressions, such as gaits and gestures, corresponding to different emotions in a variety of social contexts.

This talk is organized by Tom Hurst