LingPy

lingpy.data.derive.compile_model

lingpy.data.derive.compile_model(model)

Compile a customized sound-class model.

Parameters :

model : str

A string indicating the name of the model which shall be created.

See also

lingpy.data.model.Model, lingpy.data.derive.compile_diacritics_and_vowels

Notes

A model is defined by a folder placed in the data/models directory of the LingPy package. The name of the folder is the name of the model. The folder contains up to three files: the converter file, the INFO file, and the optional scorer file. The format requirements for these files are as follows:

INFO

The INFO file serves as a reference for a given sound-class model. It can contain arbitrary information, and it can also be empty. Specific characteristics of a model, such as the source, the compiler, the date, or a description, can be defined with a key-value structure: the key is preceded by an @ and followed by a colon, and the value is written after the colon on the same line, e.g.:

@source: Dolgopolsky (1986)

This information is then read from the INFO file and rendered when the model is printed to the screen with the print() function.
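The key-value convention described above can be illustrated with a small parser. This is only a sketch of the format, not LingPy's own reader: the function name parse_info and the @description entry in the sample are hypothetical, and the sketch assumes that lines without a leading @ are free-form text to be ignored.

```python
def parse_info(text):
    """Collect '@key: value' lines from INFO-style text into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Only lines that start with '@' and contain a colon are treated as
        # key-value pairs; everything else is free-form text and is skipped.
        if line.startswith("@") and ":" in line:
            key, value = line[1:].split(":", 1)
            info[key.strip()] = value.strip()
    return info


# Hypothetical INFO content: one free-form line and two key-value lines.
sample = """Free-form notes may appear anywhere.
@source: Dolgopolsky (1986)
@description: hypothetical example entry
"""
info = parse_info(sample)
print(info["source"])
```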

converter

The converter file lists all sound classes together with the sound values assigned to them. Each line is reserved for one class and begins with the key (preferably an ASCII letter) representing the class, followed by a colon and the sound values:

B : ɸ, β, f, p͡f, p͜f, ƀ
E : ɛ, æ, ɜ, ɐ, ʌ, e, ᴇ, ə, ɘ, ɤ, è, é, ē, ě, ê, ɚ
D : θ, ð, ŧ, þ, đ
G : x, ɣ, χ
...

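To make the converter layout concrete, the following sketch turns lines of this shape into a lookup from sound value to class key. It is an illustration of the file format only. The name parse_converter is hypothetical, and the sketch assumes that the first colon on a line separates the key from a comma-separated list of sounds.

```python
def parse_converter(text):
    """Map each sound value to the key of the class it is listed under."""
    sound_to_class = {}
    for line in text.splitlines():
        # Skip blank lines and anything without a 'key : values' structure.
        if ":" not in line:
            continue
        key, values = line.split(":", 1)
        for sound in values.split(","):
            sound = sound.strip()
            if sound:
                sound_to_class[sound] = key.strip()
    return sound_to_class


# Three lines taken from the example above.
sample = """B : ɸ, β, f, p͡f, p͜f, ƀ
D : θ, ð, ŧ, þ, đ
G : x, ɣ, χ
"""
converter = parse_converter(sample)
print([converter[s] for s in ["f", "θ", "x"]])
```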
scorer

The scorer file (which is optional) contains the graph of class transitions that is used to calculate the scoring dictionary. Each class is listed on a separate line, followed by one of the symbols v, c, or t (indicating whether the class represents vowels, consonants, or tones) and by the classes it is directly connected to. The strength of each connection is indicated by a digit (the smaller the value, the shorter the path between the classes):

A : v, E:1, O:1
C : c, S:2
B : c, W:2
E : v, A:1, I:1
D : c, S:2
...

The information in such a file is automatically converted into a scoring dictionary (see List2012b for details).
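The structure of a scorer file can be sketched as a weighted graph. The code below parses lines of the shape shown above and computes a shortest path length with Dijkstra's algorithm, to illustrate what "shorter path between the classes" means. It is not LingPy's scoring function (see List2012b for that). The names parse_scorer and path_length are hypothetical, and the sketch assumes that edges only run in the direction in which they are listed.

```python
import heapq


def parse_scorer(text):
    """Return (kinds, graph): the class type and weighted neighbours per class."""
    kinds, graph = {}, {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        fields = [f.strip() for f in rest.split(",") if f.strip()]
        if not fields:
            continue
        key = key.strip()
        kinds[key] = fields[0]  # 'v', 'c', or 't'
        graph[key] = {}
        for field in fields[1:]:
            neighbour, weight = field.split(":")
            graph[key][neighbour.strip()] = int(weight)
    return kinds, graph


def path_length(graph, start, goal):
    """Dijkstra's algorithm over the listed edges; None if goal is unreachable."""
    queue, seen = [(0, start)], set()
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if node in seen:
            continue
        seen.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            heapq.heappush(queue, (dist + weight, neighbour))
    return None


# Three lines taken from the example above.
sample = """A : v, E:1, O:1
E : v, A:1, I:1
C : c, S:2
"""
kinds, graph = parse_scorer(sample)
print(kinds["A"], graph["A"], path_length(graph, "A", "I"))
```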

Based on the information provided by these files, a dictionary for the conversion of IPA characters to sound classes and a scoring dictionary are created and stored as a binary file. The model can then be loaded with the Model class and used in the various classes and functions provided by the library.
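The overall workflow might look like the sketch below. The helper write_model_folder is hypothetical and only lays out the three files in a directory of your choice (a temporary one here). The folder would still have to be placed in the data/models directory of the LingPy package before compiling. The helper compile_and_load is likewise hypothetical: it assumes that LingPy is installed and that Model accepts the model name as its argument, and it is not called in this example.

```python
import tempfile
from pathlib import Path


def write_model_folder(root, name, converter, info="", scorer=None):
    """Create <root>/<name>/ with the converter, INFO and optional scorer files."""
    folder = Path(root) / name
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "converter").write_text(converter, encoding="utf-8")
    (folder / "INFO").write_text(info, encoding="utf-8")
    if scorer is not None:
        (folder / "scorer").write_text(scorer, encoding="utf-8")
    return folder


def compile_and_load(name):
    """Compile and load a model; needs LingPy and the folder in data/models."""
    from lingpy.data.derive import compile_model
    from lingpy.data.model import Model

    compile_model(name)
    return Model(name)  # assumption: Model takes the model name


# Lay out a hypothetical model called "my_model" without a scorer file.
with tempfile.TemporaryDirectory() as tmp:
    folder = write_model_folder(
        tmp, "my_model", "G : x, ɣ, χ\n", info="@source: hypothetical\n"
    )
    created = sorted(p.name for p in folder.iterdir())
    print(created)
```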
