Published October 16, 2018 | Version 2.1
Dataset Open

SemFi - Finnish Semantic Database with Syntactic Relations

  • 1. University of Helsinki

Description

SemFi is a semantic database for Finnish in which words are linked to each other by syntactic relations, together with the frequency of each relation in a large corpus.

SemFi is based on the syntactic bigrams of The Finnish Internet Parsebank provided by Turku University.

The semfi.db file is an SQLite database and is the file most users should use. The results_json.zip file is intended mainly for those interested in working with SemUr, a translated version of SemFi.
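Since this description does not document the schema of semfi.db, a reasonable first step is to ask SQLite itself which tables the file contains. The following is a minimal sketch using Python's standard sqlite3 module; the function name list_tables is hypothetical, and the only assumption about the data is that semfi.db is a regular SQLite file, as stated above.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    # Open read-only through a file: URI so that a mistyped path raises
    # an error instead of silently creating a new, empty database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()
```

Calling list_tables("semfi.db") in the directory holding the downloaded file returns the table names, which can then be inspected further with ordinary SELECT queries.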

The previous version of this dataset has successfully been used in the hard AI task of creating Finnish poetry automatically. That data still powers the computationally creative system, Poem Machine.

More information and an online UI for browsing the data are available at https://mikakalevi.com/semfi/.

Cite as

Hämäläinen, Mika. (2018). Extracting a Semantic Database with Syntactic Relations for Finnish to Boost Resources for Endangered Uralic Languages. In The Proceedings of Logic and Engineering of Natural Language Semantics 15 (LENLS15).

Files

results.zip

Files (4.1 GB)

Checksum                              Size
md5:612cec6dc3ff172f8a742fd59f89c953  310.0 MB
md5:a0f0da3b2ebea99fd3192ba8c551449c  3.8 GB
md5:3a63551e47b27075f38901c12a1a1e5f  8.0 kB
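The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below uses Python's standard hashlib module and reads the file in chunks, so the multi-gigabyte file does not have to fit in memory. The function names md5_of_file and matches_checksum are hypothetical. The listing does not show which file name belongs to which checksum, so the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 against a listed value such as 'md5:612c...'."""
    expected = expected.strip().lower()
    # Accept the value with or without the 'md5:' prefix used in the listing.
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, matches_checksum(path, "md5:a0f0da3b2ebea99fd3192ba8c551449c") returns True only if the file at path has exactly that digest.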

Additional details

Related works

Is referenced by
10.5281/zenodo.1454650 (DOI)