The Role of Unstructured Data in Real-Time Disaster-related Social Media Monitoring

doi:10.5281/zenodo.1149056

Published December 11, 2017 | Version v1

Conference paper Open

1. CELI Language Technology

Social media can be an important, constantly updated source of information concerning natural disasters. User-generated, free-text messages contain useful elements for the three main phases of disaster management: awareness/early warning, response, and post-disaster assessment. However, most previous research focuses on content collected in relation to specific events. More work can be done in extending Information Extraction tasks to continuous streams of (potentially) hazard-related documents, regardless of time or location. We describe a Natural Language Processing architecture, employed in our study, to collect and monitor keyword-based streams associated with different languages and event types. Starting from existing work, we review the definitions of disaster-related Information Types and Informativeness to better capture relevant and interesting items in the newly defined streams. To act as both a guideline in this procedure and a gold standard in automatic classification, we created and annotated a multi-language, multi-hazard corpus of more than 10,000 tweets, sampled from our collected data streams. We conclude by discussing the methodology behind, and the results achieved by, rule-based classifiers that we developed using domain and linguistic knowledge. Our approach is found to be viable in performing Information Extraction on generic, hazard-related (but noisy) social media data streams.

Notes

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Files

Files (263.5 kB)

Name	Size
15_Tarasconi_et_al_2017_DSEM.pdf (md5:caf05557fcb83a0354df45864d25ae45)	263.5 kB

Additional details

Funding: European Commission, grant 700256, I-REACT – Improving Resilience to Emergencies through Advanced Cyber Technologies
