Welcome to Journal of Beijing Institute of Technology
Volume 24, Issue 4
HAN Lei, LUO Sen-lin, CHEN Qian-rou, PAN Li-min. Fast Chinese syntactic parsing method based on conditional random fields[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2015, 24(4): 519-525. doi: 10.15918/j.jbit1004-0579.201524.0414

Fast Chinese syntactic parsing method based on conditional random fields

doi:10.15918/j.jbit1004-0579.201524.0414
  • Received Date: 2014-05-03
  • A fast method for phrase structure parsing based on conditional random fields (CRF) is proposed. The method trains several CRF classifiers to recognize phrase nodes at different levels of the syntactic tree, then connects the recognized phrase nodes bottom-up to construct the tree. Using the Beijing Forest Studio Chinese Tagged Corpus (BFS-CTC), two experiments are designed to select the training parameters and verify the validity of the method. The results show that, for a Chinese sentence averaging 17.9 words, the method takes 78.98 ms to train and 4.63 ms to test. The method offers a new way to parse Chinese phrase structure grammar, with good generalization ability and high speed.
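The layered, bottom-up construction described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the tag scheme, the features, or the number of levels, so everything named here is an assumption made for illustration: the `Node` class, the BIO-style tags (`B-NP`, `I-NP`, `O`), the helper `merge_level`, and `table_labeler`, a fixed lookup table that stands in for a trained CRF classifier at each level. It is not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    label: str                      # POS tag for leaves, phrase label otherwise
    word: str = ""                  # surface form (leaves only)
    children: List["Node"] = field(default_factory=list)

    def bracket(self) -> str:
        """Render the subtree in bracketed treebank notation."""
        if not self.children:
            return f"({self.label} {self.word})"
        return f"({self.label} " + " ".join(c.bracket() for c in self.children) + ")"


# One labeler per level: maps the current node sequence to BIO-style tags.
# In the paper this role is played by a trained CRF classifier (assumption:
# the BIO encoding is ours, the abstract does not specify one).
Labeler = Callable[[List[Node]], List[str]]


def table_labeler(table: Dict[Tuple[str, ...], List[str]]) -> Labeler:
    """Hypothetical stand-in for a CRF: look up tags by the label sequence."""
    def label(nodes: List[Node]) -> List[str]:
        key = tuple(n.label for n in nodes)
        return table.get(key, ["O"] * len(nodes))
    return label


def merge_level(nodes: List[Node], tags: List[str]) -> List[Node]:
    """Group nodes tagged B-X / I-X into a new phrase node X; pass O nodes through."""
    out: List[Node] = []
    open_phrase = None
    for node, tag in zip(nodes, tags):
        if tag.startswith("B-"):
            open_phrase = Node(label=tag[2:], children=[node])
            out.append(open_phrase)
        elif tag.startswith("I-") and open_phrase is not None and open_phrase.label == tag[2:]:
            open_phrase.children.append(node)
        else:
            open_phrase = None
            out.append(node)
    return out


def parse(leaves: List[Node], labelers: List[Labeler]) -> List[Node]:
    """Apply the level labelers in order, building the tree bottom-up."""
    nodes = list(leaves)
    for labeler in labelers:
        nodes = merge_level(nodes, labeler(nodes))
    return nodes


# Toy sentence "我 喜欢 红 苹果" ("I like red apples") with hand-written tags.
leaves = [Node("PN", "我"), Node("VV", "喜欢"), Node("JJ", "红"), Node("NN", "苹果")]
labelers = [
    table_labeler({("PN", "VV", "JJ", "NN"): ["B-NP", "O", "B-NP", "I-NP"]}),
    table_labeler({("NP", "VV", "NP"): ["O", "B-VP", "I-VP"]}),
    table_labeler({("NP", "VP"): ["B-IP", "I-IP"]}),
]
tree = parse(leaves, labelers)
print(tree[0].bracket())
```

Each pass costs time linear in the number of remaining nodes, so the total work is bounded by the sentence length times the number of levels. A scheme of this shape avoids the chart search used by many grammar-based parsers.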