INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Conference Papers Year : 2022

Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT

Abstract

In the first quarter of 2020, the World Health Organization (WHO) declared COVID-19 a global public health emergency. Users around the world shared their opinions about COVID-19 on social media platforms such as Twitter and Facebook, and from the beginning of the pandemic it became relevant to assess public opinion on COVID-19 using these data. We used H-TFIDF, a recently proposed hierarchy-based measure for tweet analysis, to extract features for sentiment classification of tweets. We assessed how H-TFIDF, and the concatenation of H-TFIDF with bidirectional encoder representations from transformers (BH-TFIDF), perform compared with state-of-the-art bag-of-words (BOW) and term frequency-inverse document frequency (TF-IDF) features for sentiment classification of COVID-19 tweets. A uniform experimental setup with a 90%/10% training-test split was used to train the classifier. Evaluation was additionally performed on a gold-standard, expert-labeled dataset to measure precision for each class of the binary classification.
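
The evaluation protocol described in the abstract (feature concatenation, a 90%/10% split, and per-class precision) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: plain TF-IDF stands in for H-TFIDF, whose hierarchy-based weighting is not specified here, the dense vectors are a made-up placeholder for BERT embeddings, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Plain TF-IDF (a stand-in, NOT H-TFIDF): one dense vector per document."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n_docs / df[t]) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        length = max(len(toks), 1)  # guard against empty documents
        vectors.append([tf[t] / length * idf[t] for t in vocab])
    return vectors, vocab


def concat_features(term_vectors, dense_vectors):
    """Concatenate two feature vectors per document (the idea behind BH-TFIDF)."""
    return [a + b for a, b in zip(term_vectors, dense_vectors)]


def train_test_split_90_10(items, seed=0):
    """Shuffle with a fixed seed and split into 90% training and 10% test."""
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.9 * len(idx)))
    return [items[i] for i in idx[:cut]], [items[i] for i in idx[cut:]]


def precision_per_class(y_true, y_pred, classes=(0, 1)):
    """Precision for each class: correct predictions of c / all predictions of c."""
    result = {}
    for c in classes:
        truths = [t for t, p in zip(y_true, y_pred) if p == c]
        result[c] = sum(t == c for t in truths) / len(truths) if truths else 0.0
    return result


# Toy usage with invented tweets and placeholder "embeddings".
docs = ["vaccine rollout is great news", "lockdown again so tired of this"]
tfidf, vocab = tfidf_vectors(docs)
dense = [[0.1, 0.3], [0.7, 0.2]]  # hypothetical stand-in for BERT vectors
features = concat_features(tfidf, dense)
```

In practice the concatenated vectors would feed a classifier trained on the 90% portion, with `precision_per_class` applied to its predictions on the held-out tweets or on a separately labeled gold-standard set.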

Dates and versions

hal-03662199 , version 1 (09-05-2022)

Cite

Mehtab Syed, Elena Arsevska, Mathieu Roche, Maguelonne Teisseire. Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT. 15th International Conference on Health Informatics, Feb 2022, Online Streaming, Belgium. pp.648-656, ⟨10.5220/0010887800003123⟩. ⟨hal-03662199⟩