Abstract
Supervised Word Sense Disambiguation (WSD) systems learn from sense-annotated datasets using features of the target word and its context. Recently, word embeddings have emerged as a powerful feature in many NLP tasks; in supervised WSD, they can serve as a high-quality representation of an ambiguous word's context. In this paper, we propose four improvements to existing state-of-the-art WSD methods. First, we introduce a new model for assigning coefficients to context-word vectors, yielding a more precise context representation. Second, we apply PCA-based dimensionality reduction to find a better transformation of the feature matrices and train a more informative model. Third, we suggest a new weighting scheme to tackle the class imbalance found in standard WSD datasets. Finally, we present a method for combining word embedding features extracted from independent corpora, using a voting aggregator over the individually trained models. Each of these proposals improves disambiguation performance on standard English lexical-sample tasks, and combining all of them yields a significant improvement in accuracy.
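As a rough illustration of the pipeline the abstract outlines, the sketch below shows one plausible way the four pieces could fit together: a distance-weighted average of context-word embeddings, PCA reduction of the feature matrix, a classifier with class weights that counteract the skewed sense distribution, and majority voting across models trained on embeddings from independent corpora. This is a minimal sketch, not the authors' implementation; the 1/|offset| weighting, the PCA dimensionality, the choice of logistic regression, and all function names are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def context_vector(embeddings, offsets):
    # Distance-weighted average of context-word embeddings: words nearer
    # the target receive larger coefficients. The 1/|offset| scheme is an
    # assumption; offsets are nonzero positions relative to the target word.
    weights = np.array([1.0 / abs(o) for o in offsets])
    weights /= weights.sum()
    return weights @ embeddings  # (n_words, dim) -> (dim,)

def train_one(X, y, n_components=50):
    # Reduce the feature matrix with PCA, then fit a classifier whose
    # class weights counteract the unbalanced sense distribution.
    pca = PCA(n_components=n_components).fit(X)
    clf = LogisticRegression(class_weight="balanced", max_iter=1000)
    clf.fit(pca.transform(X), y)
    return pca, clf

def vote(models, feature_sets):
    # Majority vote over models trained on embedding features extracted
    # from independent corpora (one feature matrix per corpus).
    preds = np.stack([clf.predict(pca.transform(X))
                      for (pca, clf), X in zip(models, feature_sets)])
    return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 3, 200)                              # sense labels
    feats = [rng.normal(size=(200, 100)) for _ in range(2)]  # two "corpora"
    models = [train_one(X, y) for X in feats]
    print(vote(models, feats)[:10])
```

Training one model per embedding source and aggregating their predictions by vote, rather than concatenating features, keeps the corpus-specific models independent, which matches the abstract's description of combining separately trained models.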