Conference paper Open Access

The Role of Unstructured Data in Real-Time Disaster-related Social Media Monitoring

Tarasconi, Francesco; Farina, Michela; Mazzei, Antonio; Bosca, Alessio

Social media can be an important, constantly updated source of information concerning natural disasters. User-generated, free-text messages contain useful elements for the three main phases of disaster management: awareness/early warning, response, and post-disaster assessment. However, most previous research focuses on content collected in relation to specific events. More work can be done in extending Information Extraction tasks to continuous streams of (potentially) hazard-related documents, regardless of time or location. We describe a Natural Language Processing architecture, employed in our study, to collect and monitor keyword-based streams associated with different languages and event types. Starting from existing work, we review the definitions of disaster-related Information Types and Informativeness to better capture relevant and interesting items in the newly defined streams. To act as both a guideline in this procedure and a gold standard in automatic classification, we created and annotated a multi-language, multi-hazard corpus of more than 10,000 tweets, sampled from our collected data streams. We conclude by discussing the methodology behind, and the results achieved by, rule-based classifiers that we developed using domain and linguistic knowledge. Our approach is found to be viable for performing Information Extraction on generic, hazard-related (but noisy) social media data streams.
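The abstract does not spell out the rules themselves, so the following is only a minimal sketch of what a keyword-driven, rule-based classifier over a multi-language, multi-hazard stream might look like. The keyword lists, the cue list, and the names `HAZARD_KEYWORDS`, `INFORMATIVE_CUES`, `Label` and `classify` are all hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists per hazard and language (not the paper's).
HAZARD_KEYWORDS = {
    "flood": {"en": ["flood"], "it": ["alluvione", "allagamento"]},
    "wildfire": {"en": ["wildfire", "forest fire"], "it": ["incendio"]},
}

# Hypothetical English-only cues hinting that a message carries usable
# information (damage, closures, evacuations) rather than a passing mention.
INFORMATIVE_CUES = ["evacuat", "injured", "damage", "closed", "rescue", "warning"]


@dataclass
class Label:
    hazard: Optional[str]    # matched event type, or None
    language: Optional[str]  # language of the matching keyword, or None
    informative: bool        # True if any cue substring is present


def classify(text: str) -> Label:
    """Assign a hazard type and a crude informativeness flag to one message."""
    lowered = text.lower()
    for hazard, by_language in HAZARD_KEYWORDS.items():
        for language, keywords in by_language.items():
            for keyword in keywords:
                # Match the keyword at a word start, so "flood" also
                # covers "flooding" and "flooded".
                if re.search(r"\b" + re.escape(keyword), lowered):
                    informative = any(cue in lowered for cue in INFORMATIVE_CUES)
                    return Label(hazard, language, informative)
    return Label(None, None, False)


if __name__ == "__main__":
    for message in [
        "Flooding in Genoa: two roads closed, residents evacuated",
        "My inbox is flooded with spam",
        "Incendio vicino a Palermo",
    ]:
        print(message, "->", classify(message))
```

The second sample message shows why such streams are noisy: a keyword match alone pulls in figurative uses, so a separate informativeness decision is needed on top of keyword collection.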

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Files (263.5 kB)
Name: 15_Tarasconi_et_al_2017_DSEM.pdf
Size: 263.5 kB
md5: caf05557fcb83a0354df45864d25ae45
