CWSXLNET: A SENTIMENT ANALYSIS MODEL BASED ON CHINESE WORD SEGMENTATION INFORMATION ENHANCEMENT

This paper proposes a method for improving the XLNet model to address the shortcomings of its segmentation algorithm when processing Chinese, such as long sub-word lengths, long word lists and incomplete word list coverage. To address these issues, we propose the CWSXLNet (Chinese Word Segmentation XLNet) model, based on Chinese word segmentation information enhancement. The model first pre-processes the Chinese pre-training text with a Chinese word segmentation tool, and then introduces a Chinese word segmentation attention mask mechanism that combines the PLM (Permuted Language Model) objective with the two-stream self-attention mechanism of XLNet.
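The text above does not spell out how the attention mask is built, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes the sentence has already been split into words by some segmentation tool, that a mask value of 1.0 means "fully masked", and that masking between two different characters of the same word is softened to a hypothetical value of 0.5. The function names `same_word_matrix` and `relax_mask` and the `relaxed` parameter are made up for illustration.

```python
from typing import List


def same_word_matrix(words: List[str]) -> List[List[bool]]:
    """M[i][j] is True when characters i and j fall inside the same segmented word."""
    # Map every character position to the index of the word that contains it.
    word_ids = [idx for idx, word in enumerate(words) for _ in word]
    n = len(word_ids)
    return [[word_ids[i] == word_ids[j] for j in range(n)] for i in range(n)]


def relax_mask(base_mask: List[List[float]],
               same_word: List[List[bool]],
               relaxed: float = 0.5) -> List[List[float]]:
    """Soften fully masked entries between different characters of one word.

    Assumption: 1.0 means position i may not attend to position j.
    The value of `relaxed` is hypothetical, chosen only for illustration.
    """
    n = len(base_mask)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            value = base_mask[i][j]
            # Leave the diagonal alone so a position never sees its own content.
            if i != j and same_word[i][j] and value == 1.0:
                value = relaxed
            row.append(value)
        out.append(row)
    return out


# Hypothetical pre-segmented sentence: 我 / 喜欢 / 猫 (four characters, three words).
words = ["我", "喜欢", "猫"]
same_word = same_word_matrix(words)

# Toy base mask in which every position is fully masked from every other.
base_mask = [[1.0] * 4 for _ in range(4)]
soft_mask = relax_mask(base_mask, same_word)
```

In this toy example only the two characters of 喜欢 have their mutual masking softened; every other entry keeps the value from the base mask. In a real model the base mask would come from the permutation order sampled for PLM rather than being all ones.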

While performing natural language processing at word granularity, this mechanism reduces the degree of masking between masked and non-masked characters that belong to the same word. For the Chinese sentiment analysis task, we propose the CWSXLNet-BiGRU-Attention model, which introduces a bi-directional GRU and a self-attention mechanism in the downstream task. Experiments show that, on the ChnSentiCorp dataset, CWSXLNet achieves 89.91% precision, 91.53% recall and 90.71% F1-score, and CWSXLNet-BiGRU-Attention achieves 92.61% precision, 93.19% recall and 92.90% F1-score, which indicates that CWSXLNet performs better than other models in Chinese sentiment analysis.
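As a quick consistency check on the reported figures, the snippet below recomputes F1 from precision and recall, assuming the standard harmonic-mean definition F1 = 2PR / (P + R). Whether the authors computed F1 exactly this way is an assumption; the helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as stated in the abstract, in percent.
reported = {
    "CWSXLNet": (89.91, 91.53, 90.71),
    "CWSXLNet-BiGRU-Attention": (92.61, 93.19, 92.90),
}

for name, (p, r, f1) in reported.items():
    # Print the recomputed value next to the reported one for comparison.
    print(f"{name}: recomputed {f1_score(p, r):.2f}, reported {f1:.2f}")
```

Under that definition, both recomputed values agree with the reported F1-scores to two decimal places.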
