Welcome to Journal of Beijing Institute of Technology
Volume 22, Issue 1
Lü Kun, JIA Yun-de, ZHANG Xin. Audio-visual emotion recognition with multilayer boosted HMM[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2013, 22(1): 89-93.

Audio-visual emotion recognition with multilayer boosted HMM

  • Received Date: 2012-04-12
  • Emotion recognition has become an important task in modern human-computer interaction. This paper presents a multilayer boosted HMM (MBHMM) classifier for automatic audio-visual emotion recognition. A modified Baum-Welch algorithm is proposed for learning the component HMMs, and adaptive boosting (AdaBoost) is used to train an ensemble classifier for each layer (cue). Except in the first layer, the initial weights of the training samples in the current layer are determined by the recognition results of the ensemble classifier in the preceding layer, so that training with the current cue focuses more on the samples that the previous cue found difficult. The MBHMM classifier combines these ensemble classifiers and exploits the complementary information from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that the approach is effective and promising.
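The layer-to-layer weight handoff described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: decision stumps stand in for the component HMMs (the modified Baum-Welch step is not reproduced), labels are binary (+1/-1), and the `boost` factor, the rule for up-weighting misclassified samples, and the score-sum fusion in `predict` are all assumptions made for illustration. Every function name and the toy data are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Weighted decision stump; returns (weighted_error, feature, threshold, polarity)."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi[f] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, x):
    _, f, thr, pol = stump
    return pol if x[f] >= thr else -pol

def adaboost(X, y, w0, rounds=5):
    """Standard discrete AdaBoost, started from the given initial weights w0."""
    w = list(w0)
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Shrink weights of correctly classified samples, grow the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def ensemble_score(ensemble, x):
    return sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)

def next_layer_weights(ensemble, X, y, boost=2.0):
    """Assumed handoff rule: samples the previous layer got wrong start heavier."""
    raw = [boost if ensemble_score(ensemble, x) * yi <= 0 else 1.0
           for x, yi in zip(X, y)]
    total = sum(raw)
    return [r / total for r in raw]

def train_multilayer(cues, y, rounds=5):
    """One boosted ensemble per cue; the first layer starts from uniform weights."""
    w = [1.0 / len(y)] * len(y)
    layers = []
    for X in cues:
        ens = adaboost(X, y, w, rounds)
        layers.append(ens)
        w = next_layer_weights(ens, X, y)
    return layers

def predict(layers, sample_cues):
    """Assumed fusion: sum the per-layer ensemble scores and take the sign."""
    total = sum(ensemble_score(ens, x) for ens, x in zip(layers, sample_cues))
    return 1 if total >= 0 else -1

# Toy data: one feature per cue; the audio cue confuses samples 2 and 3.
y = [1, 1, 1, -1, -1, -1]
audio = [[0.9], [0.8], [0.4], [0.5], [0.2], [0.1]]
visual = [[0.7], [0.6], [0.9], [0.1], [0.3], [0.2]]

layers = train_multilayer([audio, visual], y)
preds = [predict(layers, (a, v)) for a, v in zip(audio, visual)]
```

In this sketch the second layer never sees the first layer's features. It only inherits a reweighted view of the same training samples, which is the mechanism the abstract describes for making a later cue concentrate on what an earlier cue found difficult.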
