Date of Original Version

3-2010

Type

Conference Proceeding

Journal Title

Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

First Page

5230

Last Page

5233

Rights Management

© 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract or Description

The fusion of multiple recognition engines is known to be able to outperform individual ones, given sufficient independence of methods, models, and knowledge sources. We therefore investigate late fusion of different speech-based recognizers of emotion. Two generally different streams of information are considered: acoustics and linguistics fed by state-of-the-art automatic speech recognition. A total of five emotion recognition engines from different sites that provide heterogeneous output information are integrated by either simple democratic vote or learning `which predictor to trust when'. By fusion we are able to significantly outperform the best individual engine, as well as the best result reported so far on the recently introduced Emotion Challenge task.
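To illustrate the simpler of the two fusion schemes named in the abstract, the sketch below shows a plain democratic (majority) vote over per-engine emotion labels. It is not the authors' implementation; the engine outputs, label names, and tie-breaking rule are assumptions made for illustration only.

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse per-engine emotion labels by simple democratic vote.

    predictions: one label per recognition engine (labels here are
    hypothetical examples, not the Emotion Challenge class set).
    Returns the most frequent label; ties fall back to the label of
    the earliest-listed engine (an assumed tie-breaking rule).
    """
    counts = Counter(predictions)
    top_count = max(counts.values())
    # Iterate in input order so a tie is resolved by engine order.
    for label in predictions:
        if counts[label] == top_count:
            return label

# Example: five engines vote on one utterance.
engine_outputs = ["anger", "neutral", "anger", "anger", "emphatic"]
print(majority_vote(engine_outputs))  # -> "anger"
```

The learned alternative ("which predictor to trust when") would replace this fixed vote with a trained combiner over the engines' outputs, which is beyond this minimal sketch.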

DOI

10.1109/ICASSP.2010.5494986

Published In

Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 5230-5233.