A multidimensional approach for detecting irony in Twitter
Antonio Reyes · Paolo Rosso · Tony Veale
Lang Resources & Evaluation (2013) 47:239–268 DOI 10.1007/s10579-012-9196-x
Published online: 24 July 2012 © Springer Science+Business Media B.V. 2012
A. Reyes (✉) · P. Rosso
Natural Language Engineering Lab, ELiRF, Universidad Politécnica de Valencia, Valencia, Spain e-mail: [email protected]
P. Rosso e-mail: [email protected]
T. Veale
School of Computer Science and Informatics, University College Dublin, Dublin, Ireland e-mail: [email protected]
Abstract Irony is a pervasive aspect of many online texts, one made all the more difficult by the absence of face-to-face contact and vocal intonation. As our media increasingly become more social, the problem of irony detection will become even more pressing. We describe here a set of textual features for recognizing irony at a linguistic level, especially in short texts created via social media such as Twitter postings or tweets. Our experiments concern four freely available data sets that were retrieved from Twitter using content words (e.g. Toyota) and user-generated tags (e.g. #irony). We construct a new model of irony detection that is assessed along two dimensions: representativeness and relevance. Initial results are largely positive, and provide valuable insights into the figurative issues facing tasks such as sentiment analysis, assessment of online reputations, or decision making.
Keywords Irony detection · Figurative language processing · Negation · Web text analysis
1 Introduction
Web-based technologies have become a significant source of data in a variety of scientific and humanistic disciplines, and provide a rich vein of information that is easily mined. User-generated Web 2.0 content (such as text, audio and images) provides knowledge that is topical, task-specific, and dynamically updated to broadly reflect changing trends, behavior patterns and social preferences. Consider, for instance, the work described in Pang et al. (2002), which shows the role of implicit knowledge in automatically determining the subjectivity and polarity of movie reviews, or the findings reported in Balog et al. (2006) regarding the role of user-generated tags for analyzing mood patterns among bloggers.
This paper deals with a specific aspect of human communication that relies precisely on this kind of information: irony. This linguistic phenomenon, which is widespread in web content, has important implications for tasks such as sentiment analysis (cf. Reyes et al. 2009 about the importance of...