Journal of Beijing Institute of Technology
Volume 19, Issue 1
WANG Jing, ZHANG Ying, ZHAO Sheng-hui, KUANG Jing-ming. Non-Intrusive Objective Speech Quality Measurement Based on Fuzzy GMM and SVR for Narrowband Speech[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2010, 19(1): 76-81.

Non-Intrusive Objective Speech Quality Measurement Based on Fuzzy GMM and SVR for Narrowband Speech

  • Received Date: 2009-01-16
  • Based on a fuzzy Gaussian mixture model (FGMM) and support vector regression (SVR), an improved non-intrusive objective measurement is proposed for narrowband speech; it assesses the quality of output speech without requiring the clean input speech. Perceptual linear predictive (PLP) features extracted from clean speech and clustered by FGMM serve as an artificial reference model. Input speech is separated into three classes. For each class, a consistency parameter between each feature pair from the test speech signals and its counterpart in the pre-trained FGMM reference model is calculated, and these parameters are mapped to an objective speech quality score using SVR. The correlation between the subjective mean opinion score (MOS) and the objective MOS is analyzed. Experimental results show that the proposed method is effective and outperforms the ITU-T P.563 method under most test conditions for narrowband speech.
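The evaluation step described in the abstract, comparing subjective MOS with objective MOS, can be illustrated with a small sketch. The abstract does not say which correlation measure the authors used; the Pearson correlation coefficient is assumed here because it is a common choice for this kind of comparison. The function name `pearson_correlation` and the MOS values below are hypothetical and purely illustrative; they are not results from the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-condition scores (illustrative only, not from the paper).
subjective_mos = [1.8, 2.4, 3.1, 3.6, 4.2]  # listener ratings
objective_mos = [2.0, 2.3, 3.3, 3.4, 4.0]   # scores predicted by a model

r = pearson_correlation(subjective_mos, objective_mos)
print(f"correlation between subjective and objective MOS: {r:.3f}")
```

A value close to 1 would indicate that the objective scores track the listener ratings closely; comparing this value for two objective methods on the same test conditions is one way to rank them.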
