Enhancing georeferenced biodiversity inventories: automated information extraction from literature records reveal the gaps
- 1. LMU Munich
- 2. Chicago Field Museum
- 3. University of Oslo
Description
Data and code supplement to the revised submission of our article to PeerJ.
The file is compressed using standard zip. The uncompressed size is about 50 GB. There is a readme.md in the archive, which explains the structure of the contents.
Abstract:
We use natural language processing (NLP) to retrieve location data for cheilostome bryozoan species (text-mined occurrences [TMO]) in an automated procedure. We compare these results with data combined from two major public databases (DB): the Ocean Biogeographic Information System (OBIS), and the Global Biodiversity Information Facility (GBIF). Using DB and TMO data separately and in combination, we present latitudinal species richness curves using standard estimators (Chao2 and the Jackknife) and range-through approaches. Our combined DB and TMO species richness curves quantitatively document a bimodal global latitudinal diversity gradient for extant cheilostomes for the first time, with peaks in the temperate zones. 79% of the georeferenced species we retrieved from TMO (N = 1408) and DB (N = 4549) are non-overlapping. Despite clear indications that global location data compiled for cheilostomes should be improved with concerted effort, our study supports the view that many marine latitudinal species richness patterns deviate from the canonical latitudinal diversity gradient (LDG). Moreover, combining online biodiversity databases with automated information retrieval from the published literature is a promising avenue for expanding taxon-location datasets.
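For readers unfamiliar with the estimators named in the abstract, the following is a minimal sketch of incidence-based richness estimation. It is not the authors' code (that is in the archive). It assumes a presence/absence matrix with one row per species and one column per sampling unit, and it uses the bias-corrected form of Chao2 and the first-order jackknife; which variants the article actually uses is not stated here. All function names and the example matrix are hypothetical.

```python
from typing import List, Sequence, Tuple


def incidence_counts(matrix: Sequence[Sequence[int]]) -> Tuple[int, int, int, int]:
    """Summarise a species-by-sample presence/absence matrix.

    Returns (s_obs, q1, q2, m): observed species, species found in exactly
    one sample (uniques), species found in exactly two samples (duplicates),
    and the number of samples.
    """
    m = len(matrix[0]) if matrix else 0
    totals: List[int] = [sum(1 for cell in row if cell) for row in matrix]
    s_obs = sum(1 for t in totals if t > 0)
    q1 = sum(1 for t in totals if t == 1)
    q2 = sum(1 for t in totals if t == 2)
    return s_obs, q1, q2, m


def chao2(matrix: Sequence[Sequence[int]]) -> float:
    """Bias-corrected Chao2 (assumed variant); defined even when q2 == 0."""
    s_obs, q1, q2, m = incidence_counts(matrix)
    if m == 0:
        return 0.0
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


def jackknife1(matrix: Sequence[Sequence[int]]) -> float:
    """First-order jackknife for incidence data (assumed order)."""
    s_obs, q1, _, m = incidence_counts(matrix)
    if m == 0:
        return 0.0
    return s_obs + q1 * (m - 1) / m


# Hypothetical toy data: 4 species (rows) recorded in 4 samples (columns).
example = [
    [1, 1, 1, 0],  # found in three samples
    [1, 0, 0, 0],  # unique
    [0, 1, 0, 0],  # unique
    [0, 1, 1, 0],  # duplicate
]

print(chao2(example))       # 4.375
print(jackknife1(example))  # 5.5
```

Both estimators add a correction to the observed species count that grows with the number of species seen only once, which is why sparsely sampled latitudinal bands can yield estimates well above the raw count.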
Files
locations_supplement_28jun2022.zip
Total size: 6.6 GB
| MD5 checksum | Size |
|---|---|
| b0d99897cc5cba4eb45c3f8495b5df6e | 4.1 GB |
| e1df642c9278c2399dde7c9110d615c6 | 2.5 GB |
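Because the files are several gigabytes each, it can be worth checking a download against the MD5 checksum listed beside it before unpacking. The sketch below uses only the Python standard library; the function names are hypothetical, and the path and checksum passed in are whatever the reader downloaded and copied from the listing.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file's MD5 with a listed value, with or without an 'md5:' prefix."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]
    return md5_of_file(path) == expected
```

Call `matches_checksum(path_to_download, listed_checksum)` with the path of the downloaded file and the checksum shown for that file; a result of `False` suggests an incomplete or corrupted download.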