Citation: Jize Yin, Senlin Luo, Zhouting Wu, Limin Pan. Chinese Named Entity Recognition with Character-Level BLSTM and Soft Attention Model[J]. Journal of Beijing Institute of Technology, 2020, 29(1): 60-71. doi: 10.15918/j.jbit1004-0579.18161
[1] Grishman R, Sundheim B M. Message Understanding Conference-6: A brief history[C]//Proceedings of the 16th International Conference on Computational Linguistics - Volume 1, San Mateo, CA, USA, 1996.
[2] Bunescu R C, Mooney R J. A shortest path dependency kernel for relation extraction[C]//Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA, 2005.
[3] Babych B, Hartley A. Improving machine translation quality with automatic named entity recognition[C]//Proceedings of the 7th International EAMT Workshop on MT and Other Language Technology Tools, Improving MT through Other Language Technology Tools: Resources and Tools for Building MT, Stroudsburg, PA, USA, 2003.
[4] Han A L F, Zeng X D, Wong D F, et al. Chinese named entity recognition with graph-based semi-supervised learning model[C]//Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing (SIGHAN-8), Stroudsburg, PA, USA, 2015.
[5] Dong C H, Zhang J J, Zong C Q, et al. Character-based LSTM-CRF with radical-level features for Chinese named entity recognition[C]//International Conference on Computer Processing of Oriental Languages, New York City, NY, USA, 2016.
[6] Lu Y N, Zhang Y, Ji D H. Multi-prototype Chinese character embedding[C]//Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, 2016.
[7] Zheng X Q, Feng J T, Lin M X, et al. Context-specific and multi-prototype character representations[C]//Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), Palo Alto, CA, USA, 2016.
[8] Zheng X Q, Feng J T, Chen Y, et al. Learning context-specific word/character embeddings[C]//Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), San Francisco, CA, USA, 2017.
[9] Zhang Y, Yang J. Chinese NER using Lattice LSTM[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Long Papers), Stroudsburg, PA, USA, 2018.
[10] Gao J F, Li M, Wu A, et al. Chinese word segmentation and named entity recognition: a pragmatic approach[J]. Computational Linguistics, 2005, 31(4): 531-574.
[11] Zhang S X, Qin Y, Wen J, et al. Word segmentation and named entity recognition for SIGHAN Bakeoff3[C]//Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing, Stroudsburg, PA, USA, 2006.
[12] Mao X N, Dong Y, He S, et al. Chinese word segmentation and named entity recognition based on conditional random fields[C]//Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing, Stroudsburg, PA, USA, 2008.
[13] Luo W C, Yang F. An empirical study of automatic Chinese word segmentation for spoken language understanding and named entity recognition[C]//Proceedings of NAACL-HLT 2016, Stroudsburg, PA, USA, 2016.
[14] Peng N Y, Dredze M. Named entity recognition for Chinese social media with jointly trained embeddings[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA, 2015.
[15] Peng N Y, Dredze M. Improving named entity recognition for Chinese social media with word segmentation representation learning[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA, 2016.
[16] He H F, Sun X. F-score driven max margin neural network for named entity recognition in Chinese social media[C]//Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Stroudsburg, PA, USA, 2017.
[17] Kuru O, Can O A, Yuret D. CharNER: Character-level named entity recognition[C]//Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Stroudsburg, PA, USA, 2016.
[18] Collobert R, Weston J. A unified architecture for natural language processing: Deep neural networks with multitask learning[C]//Proceedings of the 25th International Conference on Machine Learning, Stroudsburg, PA, USA, 2008.
[19] Chiu J P C, Nichols E. Named entity recognition with bidirectional LSTM-CNNs[J]. Transactions of the Association for Computational Linguistics, 2016, 4: 357-370.
[20] Liu X H, Wei F R, Zhang S D, et al. Named entity recognition for tweets[J]. ACM Transactions on Intelligent Systems and Technology (TIST) - Special Section on Twitter and Microblogging Services, Social Recommender Systems, and CAMRa2010: Movie Recommendation in Context, 2013, 4(1): 3.
[21] Chou C L, Chang C H, Huang Y Y. Boosted web named entity recognition via tri-training[J]. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) - TALLIP Notes and Regular Papers, 2016, 16(2): 10.
[22] Bikel D M, Miller S, Schwartz R, et al. Nymble: a high-performance learning name-finder[C]//Proceedings of the Fifth Conference on Applied Natural Language Processing, Stroudsburg, PA, USA, 1997.
[23] Sekine S. NYU: Description of the Japanese NE system used for MET-2[C]//Proceedings of the Seventh Message Understanding Conference (MUC-7), Stroudsburg, PA, USA, 1998.
[24] Borthwick A, Sterling J, Agichtein E, et al. NYU: Description of the MENE named entity system as used in MUC-7[C]//Proceedings of the Seventh Message Understanding Conference (MUC-7), Stroudsburg, PA, USA, 1998.
[25] Isozaki H, Kazawa H. Efficient support vector classifiers for named entity recognition[C]//Proceedings of the 19th International Conference on Computational Linguistics - Volume 1, Stroudsburg, PA, USA, 2002.
[26] Takeuchi K, Collier N. Use of support vector machines in extended named entity recognition[C]//Proceedings of the 6th Conference on Natural Language Learning - Volume 20, Stroudsburg, PA, USA, 2002.
[27] Kazama J, Makino T, Ohta Y, et al. Tuning support vector machines for biomedical named entity recognition[C]//Proceedings of the ACL-02 Workshop on Natural Language Processing in the Biomedical Domain - Volume 3, Stroudsburg, PA, USA, 2002.
[28] Asahara M, Matsumoto Y. Japanese named entity extraction with redundant morphological analysis[C]//Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1, Stroudsburg, PA, USA, 2003.
[29] McCallum A, Li W. Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons[C]//Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4, Stroudsburg, PA, USA, 2003.
[30] Hao Z F, Wang H F, Cai R C, et al. Product named entity recognition for Chinese query questions based on a skip-chain CRF model[J]. Neural Computing and Applications, 2013, 23(2): 371-379.
[31] Patra R, Saha S K. A kernel-based approach for biomedical named entity recognition[J]. The Scientific World Journal, 2013, 2013: 950796.
[32] Chen A T, Peng F C, Shan R, et al. Chinese named entity recognition with conditional probabilistic models[C]//Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing, Stroudsburg, PA, USA, 2006.
[33] Zhou J S, Qu W G, Zhang F. Chinese named entity recognition via joint identification and categorization[J]. Chinese Journal of Electronics, 2013, 22(2): 225-230.
[34] Chorowski J, Bahdanau D, Cho K, et al. End-to-end continuous speech recognition using attention-based recurrent NN: First results[EB/OL]. [2014-12-04]. https://arxiv.org/abs/1412.1602.
[35] Hermann K M, Kočiský T, Grefenstette E, et al. Teaching machines to read and comprehend[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, Cambridge, MA, USA, 2015.
[36] Vinyals O, Kaiser L, Koo T, et al. Grammar as a foreign language[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2, Cambridge, MA, USA, 2015.
[37] Bapna A, Chen M X, Firat O, et al. Training deeper neural machine translation models with transparent attention[EB/OL]. [2018-09-04]. https://arxiv.org/abs/1808.07561.
[38] Bahdanau D, Cho K H, Bengio Y. Neural machine translation by jointly learning to align and translate[EB/OL]. [2016-05-19]. https://arxiv.org/abs/1409.0473.
[39] Xu K, Ba J L, Kiros R, et al. Show, attend and tell: Neural image caption generation with visual attention[C]//Proceedings of the 32nd International Conference on Machine Learning (ICML'15), Lille, France, 2015.
[40] Shang L F, Lu Z D, Li H. Neural responding machine for short-text conversation[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Stroudsburg, PA, USA, 2015.
[41] Luong M T, Pham H, Manning C D. Effective approaches to attention-based neural machine translation[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA, 2015.
[42] Graves A. Generating sequences with recurrent neural networks[EB/OL]. [2014-06-05]. https://arxiv.org/abs/1308.0850.
[43] Graves A, Wayne G, Danihelka I. Neural Turing machines[EB/OL]. [2014-12-10]. https://arxiv.org/abs/1410.5401.
[44] Graves A, Wayne G, Reynolds M, et al. Hybrid computing using a neural network with dynamic external memory[J]. Nature, 2016, 538: 471-476.
[45] Gu J T, Lu Z D, Li H, et al. Incorporating copying mechanism in sequence-to-sequence learning[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA, 2016.
[46] Gulcehre C, Ahn S, Nallapati R, et al. Pointing the unknown words[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA, 2016.
[47] Cao Z Q, Luo C W, Li W J, et al. Joint copying and restricted generation for paraphrase[C]//Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), San Francisco, CA, USA, 2017.
[48] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 2017.
[49] Choi H, Cho K, Bengio Y. Fine-grained attention mechanism for neural machine translation[J]. Neurocomputing, 2018, 284: 171-176.
[50] Elbayad M, Besacier L, Verbeek J. Pervasive attention: 2D convolutional neural networks for sequence-to-sequence prediction[EB/OL]. [2018-08-11]. https://arxiv.org/abs/1808.03867.
[51] Hu M H, Peng Y X, Huang Z, et al. Reinforced mnemonic reader for machine reading comprehension[C]//Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), Stockholm, Sweden, 2018.
[52] Shen T, Zhou T Y, Long G D, et al. DiSAN: Directional self-attention network for RNN/CNN-free language understanding[EB/OL]. [2017-11-20]. https://arxiv.org/abs/1709.04696.
[53] Shen T, Zhou T Y, Long G D, et al. Bi-directional block self-attention for fast and memory-efficient sequence modeling[EB/OL]. [2018-04-03]. https://arxiv.org/abs/1804.00857.
[54] Strubell E, Verga P, Andor D, et al. Linguistically-informed self-attention for semantic role labeling[EB/OL]. [2018-08-28]. https://arxiv.org/abs/1804.08199.
[55] Tang G B, Müller M, Rios A, et al. Why self-attention? A targeted evaluation of neural machine translation architectures[EB/OL]. [2018-08-28]. https://arxiv.org/abs/1808.08946.
[56] Yu A W, Dohan D, Luong M T, et al. QANet: Combining local convolution with global self-attention for reading comprehension[EB/OL]. [2018-04-23]. https://arxiv.org/abs/1804.09541.
[57] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[58] Graves A, Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures[J]. Neural Networks, 2005, 18(5-6): 602-610.
[59] Lafferty J D, McCallum A, Pereira F C N. Conditional random fields: Probabilistic models for segmenting and labeling sequence data[C]//Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), San Mateo, CA, USA, 2001.
[60] Levow G A. The third international Chinese language processing bakeoff: Word segmentation and named entity recognition[C]//Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing, Stroudsburg, PA, USA, 2006.