Volume 29, Issue 4
Dec. 2020
Limin Pan, Xiaonan Qin, Senlin Luo. DSP-TMM: A Robust Cluster Analysis Method Based on Diversity Self-Paced T-Mixture Model[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2020, 29(4): 531-543. doi: 10.15918/j.jbit1004-0579.20070

DSP-TMM: A Robust Cluster Analysis Method Based on Diversity Self-Paced T-Mixture Model

doi:10.15918/j.jbit1004-0579.20070
Funds: the 13th Five-Year National Science and Technology Supporting Project (2018YFC2000302)
  • Corresponding author: professor, Ph.D. E-mail: luosenlin@bit.edu.cn
  • Received Date: 2020-08-31
  • Publish Date: 2020-12-30
  • Outliers in the data seriously disturb the estimation of probability density parameters and thereby reduce clustering accuracy. To achieve robust cluster analysis in the presence of such outliers, a method based on the diversity self-paced t-mixture model (DSP-TMM) is proposed. The model first adopts the t-distribution, whose tail is easily controllable, as the sub-model. On this basis, it uses the entropy penalty expectation conditional maximization (ECM) algorithm as a pre-clustering step to estimate the initial parameters. The model then introduces the $l_{2,1}$-norm as a self-paced regularization term and develops a new ECM optimization algorithm in order to select high-confidence samples from each component during training. Finally, experimental results on several real-world datasets in different noise environments show that the diversity self-paced t-mixture model outperforms the state-of-the-art clustering methods. The method provides guidance for the construction of robust mixture distribution models.
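The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' DSP-TMM algorithm, whose update equations are not given on this page. It assumes the standard latent-scale weight from the EM algorithm for t-mixtures (Peel and McLachlan) and the closed-form selection rule from self-paced learning with diversity (Jiang et al.), with mixture components playing the role of groups. The function names `t_weights` and `spld_select`, and the parameters `lam` and `gamma`, are hypothetical names chosen for this illustration.

```python
import numpy as np

def t_weights(X, mu, Sigma, nu):
    """Latent scale weights for one t component (standard t-mixture E-step).

    u_i = (nu + d) / (nu + delta_i), where delta_i is the squared
    Mahalanobis distance of x_i from the component. Points far from the
    component get small weights, which is what damps the effect of outliers.
    """
    d = X.shape[1]
    diff = X - mu
    delta = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(Sigma), diff)
    return (nu + d) / (nu + delta)

def spld_select(losses, groups, lam, gamma):
    """Self-paced selection with diversity (assumed rule, after Jiang et al.).

    Within each group, samples are ranked by ascending loss, and the sample
    of rank i is kept if loss < lam + gamma / (sqrt(i) + sqrt(i - 1)).
    The bonus shrinks with rank, so picking a few samples from every group
    is favoured over picking many samples from a single group.
    """
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    selected = np.zeros(len(losses), dtype=bool)
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        order = idx[np.argsort(losses[idx])]      # indices sorted by loss
        ranks = np.arange(1, len(order) + 1)
        thresh = lam + gamma / (np.sqrt(ranks) + np.sqrt(ranks - 1))
        selected[order] = losses[order] < thresh
    return selected

# Toy data: one inlier at the component centre and one distant outlier.
X = np.array([[0.0, 0.0], [10.0, 10.0]])
u = t_weights(X, mu=np.zeros(2), Sigma=np.eye(2), nu=3.0)

# Toy losses (for example, negative log-likelihoods) in two components.
losses = np.array([0.1, 0.2, 0.9, 0.3, 5.0])
groups = np.array([0, 0, 0, 1, 1])
mask = spld_select(losses, groups, lam=0.25, gamma=0.2)
```

In the toy example, the sample with loss 0.3 is kept even though it exceeds `lam`, because it is the lowest-loss sample of its own component; a selection rule without the diversity term would skip it.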