WorldCist'23 - 11th World Conference on Information Systems and Technologies


Detection of Racism on Multilingual Social Media: An NLP Approach

This paper compares several text vectorization methods and machine learning algorithms for detecting racism on multilingual social media. We train classification models on Facebook comments and tweets in three languages: English, French and Arabic. For English-language comments, k-nearest neighbors (KNN) combined with TF-IDF performs best, with an accuracy of 78.34%. For French, a support vector machine (SVM) classifier with bag-of-words (BOW) features reaches an accuracy of 82.56%. For Arabic, KNN coupled with BOW reaches an accuracy of 91.13%. Overall, our results suggest that the combination of SVM and TF-IDF is the best choice for detecting racism on social media that contains English, French and Arabic content at the same time. As part of this work, we also present a new annotated dataset of social media comments in the three languages.
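
To make the compared pipeline concrete, the following is a minimal pure-Python sketch of the two vectorizers named in the abstract (BOW and TF-IDF) feeding a KNN classifier with cosine similarity. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which libraries, tokenizer, distance measure or value of k were used, the four training comments and their labels are invented toy data rather than samples from the annotated dataset, and all function names are hypothetical. The SVM classifier is left out to keep the sketch short.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()

def bow_vector(tokens):
    # Bag-of-words: raw term counts stored as a sparse dict.
    return dict(Counter(tokens))

def fit_idf(docs_tokens):
    # Smoothed inverse document frequency computed from the training docs.
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(tokens, idf):
    # Term count times IDF; terms unseen during training are dropped.
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(query, train_vectors, train_labels, k=3):
    # Majority vote among the k most similar training vectors.
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Invented toy data, for illustration only (not from the paper's dataset).
TRAIN_TEXTS = [
    "people from that country are not welcome here",
    "that group does not belong in our city",
    "lovely weather for the match today",
    "the new phone has a great camera",
]
TRAIN_LABELS = ["racist", "racist", "not_racist", "not_racist"]

TRAIN_TOKENS = [tokenize(t) for t in TRAIN_TEXTS]
IDF = fit_idf(TRAIN_TOKENS)

def classify(text, use_tfidf=True, k=3):
    # Vectorize the query and the training set the same way, then run KNN.
    if use_tfidf:
        train = [tfidf_vector(toks, IDF) for toks in TRAIN_TOKENS]
        query = tfidf_vector(tokenize(text), IDF)
    else:
        train = [bow_vector(toks) for toks in TRAIN_TOKENS]
        query = bow_vector(tokenize(text))
    return knn_predict(query, train, TRAIN_LABELS, k=k)

if __name__ == "__main__":
    print(classify("that group is not welcome in our country", use_tfidf=True))
    print(classify("great weather today", use_tfidf=False))
```

Switching `use_tfidf` changes only the feature weighting while the classifier stays fixed, which mirrors the kind of vectorizer-by-classifier comparison the abstract reports per language. On real data the accuracy of each combination would be measured on a held-out test split.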

Ikram El Miqdadi
Faculty of Sciences and Techniques of Fez

Jamal Kharroubi
Faculty of Sciences and Techniques of Fez

Nikola S. Nikolov
University of Limerick

