Towards expressive musical robots: a cross-modal framework for emotional gesture, voice and music
Graduate School of Informatics, Kyoto University, Kyoto, Japan
EURASIP Journal on Audio, Speech, and Music Processing 2012, 2012:3 doi:10.1186/1687-4722-2012-3
Published: 17 January 2012
It has long been speculated that expressions of emotion in different modalities share the same underlying 'code', whether in a dance step, a musical phrase, or a tone of voice. This work is the first attempt to implement this theory across three modalities, inspired by the polyvalence and repeatability of robotics. We propose a unifying framework to generate emotions across voice, gesture, and music by representing emotional states as a 4-parameter tuple of speed, intensity, regularity, and extent (SIRE). Our results show that a simple 4-tuple can capture four emotions recognizable at rates above chance across gesture and voice, and at least two emotions across all three modalities. An application to multi-modal, expressive music robots is discussed.
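As an illustration of the representation only, the SIRE tuple could be sketched as a small data structure that is handed unchanged to each modality. This is a minimal sketch under stated assumptions: the normalization of each parameter to [0, 1], the names `SIRE`, `make_sire`, `MODALITIES`, and `share_across_modalities`, and the example values are all hypothetical and are not taken from the framework itself.

```python
from typing import Dict, NamedTuple


class SIRE(NamedTuple):
    """One emotional state as a 4-tuple.

    Assumption: each parameter is normalized to [0, 1].
    """
    speed: float
    intensity: float
    regularity: float
    extent: float


def make_sire(speed: float, intensity: float,
              regularity: float, extent: float) -> SIRE:
    """Build a SIRE tuple, rejecting values outside the assumed [0, 1] range."""
    values = (speed, intensity, regularity, extent)
    for name, value in zip(SIRE._fields, values):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return SIRE(*values)


# The three modalities named in the abstract.
MODALITIES = ("voice", "gesture", "music")


def share_across_modalities(state: SIRE) -> Dict[str, SIRE]:
    """Hand the same tuple to every modality.

    Each modality-specific renderer (not shown) would interpret the
    four parameters in its own terms.
    """
    return {modality: state for modality in MODALITIES}


# Placeholder values for illustration; not results from the paper.
example = make_sire(speed=0.8, intensity=0.6, regularity=0.3, extent=0.9)
targets = share_across_modalities(example)
```

The point of the sketch is the design choice described above: one emotional state is encoded once and reused across voice, gesture, and music, rather than being specified separately per modality.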