Improving Named Entity Linking Corpora Quality

Weichselbraun, Albert; Brasoveanu, Adrian; Kuntschik, Philipp; Nixon, Lyndon

doi:10.5281/zenodo.3404911

Published September 11, 2019 | Version v1

Conference paper Open

1. HTW Chur
2. MODUL Technology

Gold standard corpora and competitive evaluations play a key role in benchmarking named entity linking (NEL) performance and driving the development of more sophisticated NEL systems.
The quality of the corpora and of the evaluation metrics used is crucial in this process. We therefore assess the quality of three popular evaluation corpora and identify four major issues that affect these gold standards: (i) the use of different annotation styles, (ii) incorrect and missing annotations, (iii) Knowledge Base evolution, and (iv) differences in annotating co-occurrences.
This paper addresses these issues by formalizing NEL annotations and corpus versioning, which standardizes corpus creation, supports corpus evolution, and paves the way for the use of lenses to automatically transform between different corpus configurations. In addition, clearly defined scoring rules and evaluation metrics make evaluation results more comparable.
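To make the ideas of a formalized annotation, a lens, and an explicit scoring rule more concrete, the following is a minimal Python sketch. It is an illustration only and not the formalization used in the paper: the `Annotation` fields, the `type_lens` helper, the exact-match scoring rule, and the example entities are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class Annotation:
    """Hypothetical formalized NEL annotation (field names are assumptions)."""
    start: int          # character offset where the mention begins
    end: int            # character offset where the mention ends (exclusive)
    surface_form: str   # mention text as it appears in the document
    entity_uri: str     # Knowledge Base identifier the mention is linked to
    entity_type: str    # coarse entity type, e.g. "Place"


Corpus = FrozenSet[Annotation]
Lens = Callable[[Corpus], Corpus]


def type_lens(allowed_types: Iterable[str]) -> Lens:
    """Build a lens that keeps only annotations of the given entity types."""
    allowed = frozenset(allowed_types)

    def apply(corpus: Corpus) -> Corpus:
        return frozenset(a for a in corpus if a.entity_type in allowed)

    return apply


def score(gold: Corpus, predicted: Corpus) -> Tuple[float, float, float]:
    """Exact-match scoring: span, URI, and type must all agree."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only; offsets and URIs are made up for this sketch.
gold = frozenset({
    Annotation(0, 6, "Vienna", "http://dbpedia.org/resource/Vienna", "Place"),
    Annotation(10, 13, "IBM", "http://dbpedia.org/resource/IBM", "Organization"),
})
predicted = frozenset({
    Annotation(0, 6, "Vienna", "http://dbpedia.org/resource/Vienna", "Place"),
    Annotation(10, 13, "IBM", "http://dbpedia.org/resource/IBM_Research", "Organization"),
})

places_only = type_lens({"Place"})
full_scores = score(gold, predicted)
place_scores = score(places_only(gold), places_only(predicted))
```

Applying the same lens to both the gold standard and the system output before scoring is one way to compare results under a shared corpus configuration. Other scoring rules, such as partial span overlap, would need a different `score` function.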

Files

Name	Checksum	Size
ranlp2019_poster (2).pdf	md5:22c72ed321971362f8c971a285c511c9	617.0 kB

Additional details

References: https://github.com/orbis-eval/corpus_quality_paper (URL)

European Commission
ReTV - Enhancing and Re-Purposing TV Content for Trans-Vector Engagement 780656

	All versions	This version
Views	1,132	1,128
Downloads	206	206
Data volume	129.6 MB	129.6 MB
