Conference paper. Year: 2023

A metadata approach to classify domain-specific documents for Event-based Surveillance Systems

Abstract

Digital news sources are the primary source of information for health officials and stakeholders to stay informed about potential health risks. However, with the abundance of news sources available, it can be challenging to distinguish relevant news articles from irrelevant ones. To address this issue, we propose a metadata-based approach for classifying news articles containing information on health events. The first step involves extracting metadata from each news article in the dataset. We then use a machine learning model to classify news articles as relevant or irrelevant. The proposed approach was validated using two different datasets with varying combinations of relevant and irrelevant news articles. The experiments were conducted using a 70%-30% train-test split. The results of the experiments show that the proposed approach is highly effective in classifying relevant news articles for Event-based Surveillance (EBS) systems. Additionally, several metadata features were identified as being important for the classification task.
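
The pipeline described in the abstract (metadata extraction, a 70%-30% train-test split, then binary relevance classification) can be illustrated with a minimal sketch. The abstract does not name the metadata features or the machine learning model, so everything specific here is an assumption: the article schema (`title`, `source`, `language`), the three derived features, the synthetic toy articles, and the choice of a small categorical naive Bayes classifier as a stand-in model. It is not the authors' implementation.

```python
import math
import random
from collections import Counter, defaultdict


def extract_metadata(article):
    """Map an article dict to categorical metadata features (hypothetical schema)."""
    return {
        "source": article["source"],
        "language": article["language"],
        "long_title": len(article["title"].split()) > 6,
    }


def split_70_30(items, seed=0):
    """Shuffle a copy of the items and split it into 70% train and 30% test."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


class CategoricalNB:
    """Naive Bayes over categorical features with add-one smoothing (stand-in model)."""

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.feature_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.values = defaultdict(set)              # feature -> values seen in training
        for features, label in zip(X, y):
            for name, value in features.items():
                self.feature_counts[(label, name)][value] += 1
                self.values[name].add(value)
        return self

    def predict(self, features):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in features.items():
                seen = self.feature_counts[(label, name)][value]
                n_values = len(self.values[name]) + 1  # one extra slot for unseen values
                score += math.log((seen + 1) / (count + n_values))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Synthetic toy articles, invented purely for illustration.
articles = [
    {"title": f"Outbreak of avian influenza reported in region number {i}",
     "source": "health-agency.example", "language": "en", "label": "relevant"}
    for i in range(5)
] + [
    {"title": f"Match report {i}",
     "source": "sports-daily.example", "language": "en", "label": "irrelevant"}
    for i in range(5)
]

train, test = split_70_30(articles)
model = CategoricalNB().fit(
    [extract_metadata(a) for a in train], [a["label"] for a in train]
)
predictions = [model.predict(extract_metadata(a)) for a in test]
accuracy = sum(p == a["label"] for p, a in zip(predictions, test)) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
```

The toy data is trivially separable, so the accuracy printed here says nothing about the results reported in the paper; the sketch only shows how the three steps fit together.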

Dates and versions

hal-04178896, version 1 (08-08-2023)

Cite

Mehtab Alam Syed, Elena Arsevska, Mathieu Roche, Maguelonne Teisseire. A metadata approach to classify domain-specific documents for Event-based Surveillance Systems. 2023 International Conference on Communication, Computing and Digital Systems (C-CODE), May 2023, Islamabad, Pakistan. pp.1-5, ⟨10.1109/C-CODE58145.2023.10139883⟩. ⟨hal-04178896⟩