This article introduces a novel approach to sentiment analysis: clustering-based sentiment analysis. By applying a TF-IDF weighting method, a voting mechanism and imported term scores, the approach obtains acceptable and stable clustering results. It offers competitive advantages over the two existing types of approaches, symbolic techniques and supervised learning methods: it performs well, is efficient, and requires no human participation.
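To make the combination of TF-IDF weighting, clustering and voting concrete, the following is a minimal sketch rather than the exact procedure evaluated in this article. It makes several assumptions that are not taken from the text: documents arrive already tokenised, the clusterer is a two-cluster k-means using cosine similarity with a farthest-first choice of the second starting centroid, and the vote is a simple majority over runs with different seeds after aligning cluster labels to the first run. The function names (`tfidf_vectors`, `kmeans`, `vote`) are hypothetical, and the term-score step is omitted.

```python
import math
import random
from collections import Counter


def normalise(vec):
    """Scale a sparse {term: weight} vector to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector per tokenised document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        vectors.append(normalise(vec))
    return vectors


def cosine(a, b):
    """Dot product of two sparse vectors (cosine when both are unit length)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, k, seed, iterations=20):
    """k-means with cosine similarity; returns one cluster label per vector."""
    rng = random.Random(seed)
    # Assumed initialisation: random first centroid, then farthest-first.
    centroids = [rng.choice(vectors)]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            total = Counter()
            for v in members:
                for t, w in v.items():
                    total[t] += w
            centroids[c] = normalise(dict(total))
    return labels


def vote(vectors, runs=5):
    """Majority vote over several two-cluster runs with different seeds."""
    reference = kmeans(vectors, 2, seed=0)
    tallies = [Counter() for _ in vectors]
    for seed in range(runs):
        labels = kmeans(vectors, 2, seed=seed)
        agreement = sum(a == b for a, b in zip(labels, reference))
        if agreement < len(vectors) / 2:
            labels = [1 - label for label in labels]  # undo label switching
        for tally, label in zip(tallies, labels):
            tally[label] += 1
    return [tally.most_common(1)[0][0] for tally in tallies]


# Toy, made-up reviews used only to exercise the functions above.
reviews = [
    ["good", "great", "fun"],
    ["great", "fun", "love"],
    ["bad", "boring", "dull"],
    ["boring", "dull", "awful"],
]
print(vote(tfidf_vectors(reviews)))
```

The output is a cluster index per document, not a polarity: deciding which cluster is the positive one needs extra information, which this sketch does not attempt to supply.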