Abstract

The rise of social networks has allowed misogynistic, xenophobic, and homophobic people to spread hate-speech to intimidate individuals or groups because of their gender, ethnicity, or sexual orientation. The consequences of hate-speech are devastating: it can cause severe depression and even lead people to commit suicide. Hate-speech identification is challenging because the large volume of daily publications makes it impossible to review every comment by hand. Moreover, hate-speech is also spread by hoaxes that require language and context understanding. With the aim of reducing the number of comments that must be reviewed by experts, or even of developing autonomous systems, the automatic identification of hate-speech has gained academic relevance. However, the reliability of automatic approaches is still limited, especially in languages other than English, for which some state-of-the-art techniques have not been analyzed in detail. In this work, we examine which features are most effective in identifying hate-speech in Spanish and how these features can be combined to develop more accurate systems. In addition, we characterize the language present in each type of hate-speech by means of explainable linguistic features and compare our results with state-of-the-art approaches. Our research indicates that combining linguistic features and transformers by means of knowledge integration outperforms current solutions for hate-speech identification in Spanish.

Details

Title

Evaluating feature combination strategies for hate-speech detection in Spanish using linguistic features and transformers

Author

García-Díaz, José Antonio¹; Jiménez-Zafra, Salud María²; García-Cumbreras, Miguel Angel²; Valencia-García, Rafael¹

¹ Universidad de Murcia, Facultad de Informática, Murcia, Spain (GRID:grid.10586.3a) (ISNI:0000 0001 2287 8496)
² Universidad de Jaén, Computer Science Department, SINAI, CEATIC, Jaén, Spain (GRID:grid.21507.31) (ISNI:0000 0001 2096 9837)

Pages

2893-2914

Publication year

2023

Publication date

Jun 2023

Publisher

Springer Nature B.V.

ISSN

2199-4536

e-ISSN

2198-6053

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1007/s40747-022-00693-x

ProQuest document ID

2825544335

© The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
