
Abstract

Sentiment classification aims to judge sentiment polarity automatically. When classifying the sentiment of text data such as online reviews, traditional deep learning models concentrate on algorithm optimization but ignore two characteristics of the data: the number of samples per class is imbalanced, and the data carry weak tagging information such as ratings and tags. Building on a traditional deep learning model, the proposed method uses random oversampling and cost-sensitive learning to increase the contribution of minority-class samples to the loss function and to keep the model from becoming biased toward the majority class. Model training is divided into two stages. In the first stage, a large amount of weakly tagged data is used to train a model that captures the sentiment semantics of the data. In the second stage, the parameters learned in the first stage serve as the initial parameters, and training continues on only a small amount of manually tagged data, which reduces the impact of noise in the weak tags and the number of manually tagged samples required. Experimental results on hotel review data show that the method is considerably better than traditional deep learning models at sentiment classification.
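The two imbalance-handling ideas named in the abstract, random oversampling and cost sensitivity, can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme or how the two techniques are combined, so the inverse-frequency weights, the function names, and the toy data below are assumptions made for illustration only, not the authors' implementation.

```python
import math
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def inverse_frequency_weights(labels):
    """Assumed cost-sensitive scheme: n_samples / (n_classes * class_count),
    so rarer classes receive larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {y: n / (k * c) for y, c in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Mean cross-entropy in which each sample's loss is scaled by the
    weight of its true class. probs[i][y] is the predicted probability
    of class y for sample i."""
    total = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    return total / len(labels)


# Toy imbalanced data: 8 positive and 2 negative reviews (hypothetical).
reviews = [f"review {i}" for i in range(10)]
labels = ["pos"] * 8 + ["neg"] * 2

balanced_reviews, balanced_labels = random_oversample(reviews, labels)
weights = inverse_frequency_weights(labels)

# One misjudged negative sample costs more than it would unweighted.
loss = weighted_cross_entropy([{"pos": 0.5, "neg": 0.5}], ["neg"], weights)
```

Oversampling changes the data the model sees, while class weights change how much each error costs; in this toy setting the minority class ends up with as many samples as the majority class and a weight four times larger.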

Keywords

Bi-directional long short-term memory neural network; deep learning; imbalanced classification; sentiment classification; weak tagging information

