Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT

In the first quarter of 2020, the World Health Organization (WHO) declared COVID-19 a global public health emergency. Users around the world shared their opinions about COVID-19 on social media platforms such as Twitter and Facebook. From the beginning of the pandemic, it became relevant to assess public opinion regarding COVID-19 using data available on social media. We used a recently proposed hierarchy-based measure for tweet analysis (H-TFIDF) to extract features for sentiment classification of tweets. We assessed how H-TFIDF, and the concatenation of H-TFIDF with bidirectional encoder representations from transformers (BH-TFIDF), perform compared with state-of-the-art bag-of-words (BOW) and term frequency-inverse document frequency (TF-IDF) features for sentiment classification of COVID-19 tweets. The classifier was trained under a uniform experimental setup with a 90%/10% training-test split. In addition, evaluation was performed on a gold-standard, expert-labeled dataset to measure precision for each class of the binary classification.
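As a rough illustration of the baseline features and the evaluation measure named in the abstract, the following minimal sketch computes BOW counts, a textbook TF-IDF weighting, a 90%/10% split, and per-class precision using only the Python standard library. It does not implement H-TFIDF or BERT, whose details are not given here. The function names, the particular TF-IDF variant (tf = count / document length, idf = log(N / df)), the unshuffled split, and the toy tokens are all assumptions made for illustration.

```python
import math
from collections import Counter


def bow(tokens):
    """Bag-of-words: raw term counts for one tokenized document."""
    return Counter(tokens)


def tfidf(docs):
    """TF-IDF weights per document (assumed variant, not the paper's H-TFIDF).

    tf = term count / document length, idf = log(N / document frequency).
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]


def split_90_10(items):
    """First 90% for training, last 10% for testing (no shuffling in this sketch)."""
    cut = len(items) * 9 // 10
    return items[:cut], items[cut:]


def precision_per_class(y_true, y_pred):
    """Precision for each class: TP / (TP + FP); None if a class is never predicted."""
    result = {}
    for label in set(y_true) | set(y_pred):
        truths = [t for t, p in zip(y_true, y_pred) if p == label]
        result[label] = sum(t == label for t in truths) / len(truths) if truths else None
    return result


# Toy, hypothetical tokenized tweets (not data from the paper).
toy_docs = [
    ["covid", "is", "scary"],
    ["vaccine", "is", "hope"],
    ["covid", "vaccine", "works"],
]
toy_weights = tfidf(toy_docs)
```

In this sketch a term that occurs in fewer documents (such as "scary") receives a higher weight than one shared across documents (such as "covid"), which is the property TF-IDF adds over plain BOW counts.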

Keywords

text mining, data mining, data analysis, Twitter

Domains

Environmental engineering

Contributor: Sylvie Blin-Sarah

https://hal.inrae.fr/hal-03662199

Submitted on: Monday, May 9, 2022, 10:39:49

Last modified on: Tuesday, March 26, 2024, 11:51:23

Dates and versions

hal-03662199 , version 1 (09-05-2022)

Identifiers

HAL Id : hal-03662199 , version 1
DOI : 10.5220/0010887800003123
WOS : 000778794900073

Cite

Mehtab Syed, Elena Arsevska, Mathieu Roche, Maguelonne Teisseire. Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT. 15th International Conference on Health Informatics, Feb 2022, Online Streaming, Belgium. pp.648-656, ⟨10.5220/0010887800003123⟩. ⟨hal-03662199⟩


Collections

CIRAD AGROPARISTECH CNRS TETIS INRAE INRAEOCCITANIEMONTPELLIER MATHNUM
