Jize Yin, Senlin Luo, Zhouting Wu, Limin Pan. Chinese Named Entity Recognition with Character-Level BLSTM and Soft Attention Model[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2020, 29(1): 60-71. doi: 10.15918/j.jbit1004-0579.18161

Chinese Named Entity Recognition with Character-Level BLSTM and Soft Attention Model

doi: 10.15918/j.jbit1004-0579.18161
  • Received Date: 2018-10-15
  • Abstract: Unlike named entity recognition (NER) for English, Chinese NER suffers from the absence of explicit word boundaries, which reduces final accuracy. To avoid the accumulated errors introduced by word segmentation, this paper proposes a new Chinese NER method built on a deep model that extracts character-level features. The method converts the raw text into a sequence of character vectors, extracts global text features with a bidirectional long short-term memory (BLSTM) network, and extracts local text features with a soft attention model. A linear-chain conditional random field then labels every character using both the global and local features. Experiments on the Microsoft Research Asia (MSRA) dataset show that the proposed method performs well compared with other methods, indicating that the extracted global and local text features benefit Chinese NER. To cover a wider variety of test domains, a resume dataset from Sina Finance is also used to demonstrate the method's effectiveness.
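
To make the pipeline concrete, the sketch below renders one plausible reading of the architecture in PyTorch. The embedding size, hidden size, tag count, and the dot-product form of the soft attention are illustrative assumptions rather than the paper's reported configuration, and the linear-chain CRF layer is only indicated in a comment.

import torch
import torch.nn as nn

class CharBLSTMAttention(nn.Module):
    """Hypothetical sketch: character BLSTM + soft attention + emission scores."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_tags=7):
        super().__init__()
        # Character vector sequence: one embedding per Chinese character.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # BLSTM extracts global text features over the whole sentence.
        self.blstm = nn.LSTM(embed_dim, hidden_dim,
                             batch_first=True, bidirectional=True)
        # Per-character emission scores; a linear-chain CRF (omitted here)
        # would decode the best tag path from these scores.
        self.emit = nn.Linear(4 * hidden_dim, num_tags)

    def forward(self, char_ids):                      # (batch, seq_len)
        x = self.embed(char_ids)                      # (batch, seq, embed_dim)
        h, _ = self.blstm(x)                          # (batch, seq, 2*hidden)
        # Soft attention (dot-product form, an assumption): each character
        # attends over the sequence to build local context features.
        scores = h @ h.transpose(1, 2)                # (batch, seq, seq)
        weights = torch.softmax(scores, dim=-1)
        local = weights @ h                           # (batch, seq, 2*hidden)
        # Concatenate global (BLSTM) and local (attention) features.
        feats = torch.cat([h, local], dim=-1)         # (batch, seq, 4*hidden)
        return self.emit(feats)                       # (batch, seq, num_tags)

model = CharBLSTMAttention(vocab_size=5000)           # vocabulary size is made up
emissions = model(torch.randint(1, 5000, (2, 20)))    # 2 sentences, 20 characters
print(emissions.shape)                                # torch.Size([2, 20, 7])

In this reading, concatenating the BLSTM outputs (global features) with the attention-weighted context (local features) produces the per-character scores that the conditional random field would label.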
