Basic class for handling lexicostatistical datasets.
Parameters : | infile : file |
Notes
The LexStat class serves as the base class for handling lexicostatistical datasets (see Swadesh1955 for a detailed description of the method of lexicostatistics). It provides methods for data conversion for cases where cognacy has already been assessed qualitatively, and it can also carry out cognate judgments automatically, based on the different methods described in List2012a.
The input data for LexStat is a simple tab-delimited text file with the language names in the first row, an ID in the first column, and the data in the columns corresponding to the language names. Additionally, the file can contain headwords corresponding to the IDs, as well as cognate IDs that specify which words in the data are thought to be cognate. This structure is almost the same as the one employed in the Starling database program (see http://starling.rinet.ru). Synonyms are specified in the same way, by adding further rows with the same ID. The following is an example of a possible input file structure:
ID | Word | German | COG | English | COG | ... |
1 | hand | hantʰ | 1 | hæːnd | 1 | ... |
2 | fist | faustʰ | 2 | fist | 2 | ... |
... | ... | ... | ... | ... | ... | ... |
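The layout described above can be read with the Python standard library alone. The following is a minimal sketch, not the parser that LexStat itself uses: the function name `read_wordlist` and the record keys are hypothetical, and it assumes that both the optional headword column and a COG column after every language column are present.

```python
import csv
import io

# Sample input mirroring the example above (tab-delimited).
SAMPLE = (
    "ID\tWord\tGerman\tCOG\tEnglish\tCOG\n"
    "1\thand\thantʰ\t1\thæːnd\t1\n"
    "2\tfist\tfaustʰ\t2\tfist\t2\n"
)


def read_wordlist(handle):
    """Parse a tab-delimited wordlist into a list of flat records.

    Hypothetical helper: assumes the columns are ID, headword, and then
    alternating language / cognate-ID columns.
    """
    # QUOTE_NONE keeps quote characters inside transcriptions untouched.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    # Every second column from index 2 onwards names a language.
    languages = header[2::2]
    entries = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        concept_id, headword = row[0], row[1]
        for i, language in enumerate(languages):
            entries.append(
                {
                    "id": concept_id,
                    "headword": headword,
                    "language": language,
                    "form": row[2 + 2 * i],
                    "cognate_id": row[3 + 2 * i],
                }
            )
    return languages, entries


languages, entries = read_wordlist(io.StringIO(SAMPLE))
```

Because synonyms are encoded as extra rows with the same ID, they simply show up as additional records sharing one `id` value; no special handling is needed at this stage.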
Methods
analyze(threshold[, score_mode, model, ...]) | Conduct automatic cognate judgments following the method of List2012b. |
output([fileformat, filename]) | Write the data to file. |
pairwise_distances() | Calculate the lexicostatistical distance between all taxa. |
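To illustrate what a lexicostatistical distance between taxa can look like, the sketch below uses one common definition: one minus the proportion of shared concepts for which two taxa have at least one cognate ID in common. Whether pairwise_distances() computes exactly this is an assumption, since the formula is not given here. The taxa "A", "B", "C", the cognate IDs, and the function names are invented for illustration only.

```python
from itertools import combinations

# Hypothetical toy data: taxon -> concept ID -> set of cognate IDs.
# A set allows several cognate IDs per concept, which covers synonyms.
cognates = {
    "A": {"1": {"1"}, "2": {"2"}, "3": {"3"}},
    "B": {"1": {"1"}, "2": {"2"}, "3": {"4"}},
    "C": {"1": {"5"}, "2": {"6"}, "3": {"4"}},
}


def distance(taxon_a, taxon_b):
    """Return 1 minus the share of common concepts judged cognate."""
    shared = taxon_a.keys() & taxon_b.keys()
    if not shared:
        raise ValueError("the two taxa have no concepts in common")
    # A concept counts as a match if any cognate ID occurs in both taxa.
    matches = sum(1 for concept in shared if taxon_a[concept] & taxon_b[concept])
    return 1 - matches / len(shared)


def pairwise(data):
    """Compute the distance for every unordered pair of taxa."""
    return {
        (a, b): distance(data[a], data[b])
        for a, b in combinations(sorted(data), 2)
    }


distances = pairwise(cognates)
```

In the toy data, "A" and "B" match on two of three concepts, so their distance is one third, while "A" and "C" share no cognates and end up at the maximum distance of 1.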