threshold : float
The threshold which is used for the flat cluster analysis.
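As an illustration of how a threshold can turn pairwise distances into flat clusters, here is a minimal sketch. It assumes single-linkage grouping and a strict "less than" comparison, which is not necessarily the clustering method used by the library. The function name `flat_clusters` and the toy distances are hypothetical.

```python
def flat_clusters(labels, distances, threshold):
    """Group labels whose pairwise distance is below the threshold.

    Assumption: single-linkage grouping via union-find; the library's
    own flat cluster analysis may use a different linkage criterion.
    """
    parent = {label: label for label in labels}

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in distances.items():
        if dist < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for label in labels:
        groups.setdefault(find(label), []).append(label)
    return sorted(groups.values())


# Hypothetical toy data: four words and their pairwise distances.
words = ["a", "b", "c", "d"]
toy_distances = {
    ("a", "b"): 0.2, ("a", "c"): 0.8, ("a", "d"): 0.9,
    ("b", "c"): 0.7, ("b", "d"): 0.9, ("c", "d"): 0.3,
}
clusters = flat_clusters(words, toy_distances, threshold=0.5)
```

With a lower threshold fewer pairs are merged, so the analysis yields more and smaller clusters.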
score_mode : { ‘library’, ‘sca’, ‘turchin’, ‘edit-dist’, ‘edit-tokens’ }
Define the score_mode on which the calculation of pairwise
distances is based. Select between:
- ‘library’ – the distance scores are based on the
language-specific scoring schemes as described in
List2012b (this is the default),
- ‘sca’ – the distance scores are based on the
language-independent SCA distance (see List2012b),
- ‘turchin’ – the distance scores are based on the approach
described in Turchin2010,
- ‘edit-dist’ – the distance scores are based on the normalized
edit distance (Levenshtein1966), and
- ‘edit-tokens’ – the distance scores are based on the normalized
edit distance, yet the scores are derived from the tokenized
representation of the sequences and not from their raw,
untokenized form.
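The difference between the two edit-distance modes can be sketched as follows. This is a minimal illustration, not the library's implementation: the function names are hypothetical, and normalizing by the length of the longer sequence is an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(a, b):
    """Edit distance divided by the length of the longer sequence
    (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


# Raw strings are compared character by character ...
raw_score = normalized_edit_distance("tʃax", "tʃak")
# ... while tokenized sequences treat "tʃ" as a single segment.
token_score = normalized_edit_distance(["tʃ", "a", "x"], ["tʃ", "a", "k"])
```

Because the affricate counts as one segment after tokenization, the same single substitution weighs more heavily in the tokenized comparison than in the raw one.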
model : string (default=“sca”)
A string indicating the name of the
Model object that shall be used
for the analysis.
Currently, three models are supported:
- “dolgo” – a sound-class model based on Dolgopolsky1986,
- “sca” – an extension of the “dolgo” sound-class model based on
List2012a, and
- “asjp” – an independent sound-class model which is based on the
sound-class model of Brown2008 and the empirical data of
Brown2011.
merge_vowels : bool (default=True)
Indicate whether neighboring vowels should be merged into
diphthongs or kept separate during the
analysis.
gop : int (default=-5)
The gap opening penalty (gop) on which the analysis shall be based.
gep_scale : float (default=0.6)
The factor by which the penalty for the extension of gaps (gap
extension penalty, GEP) shall be decreased. This approach is
essentially inspired by the extension of the basic alignment
algorithm for affine gap penalties by Gotoh1982.
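A minimal sketch of how gop and gep_scale might interact, assuming that the gap extension penalty is the product of the two and that a gap of length n costs one opening plus n - 1 extensions. The function name `affine_gap_penalty` is hypothetical.

```python
def affine_gap_penalty(length, gop=-5, gep_scale=0.6):
    """Total penalty for a gap of the given length.

    Assumption: the first gap position costs `gop`, and every further
    position costs the reduced penalty `gop * gep_scale`.
    """
    if length <= 0:
        return 0.0
    gep = gop * gep_scale
    return gop + (length - 1) * gep


single_gap = affine_gap_penalty(1)  # opening only
long_gap = affine_gap_penalty(3)    # one opening, two extensions
```

Under these assumptions, one long gap is penalized less than several short gaps of the same total length, which is the usual motivation for affine gap penalties.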
scale : tuple or list (default=(3,1,2))
The scaling factors for the modification of gap weights. The first
value corresponds to sites of ascending sonority, the second value
to sites of maximum sonority, and the third value corresponds to
sites of decreasing sonority.
factor : float (default=0.3)
The factor by which the initial and the descending position shall
be modified.
restricted_chars : string (default=“T”)
Define which characters of the prosodic string of a sequence
reflect its secondary structure (cf. List2012a) and should
therefore be aligned specifically. This defaults to “T”, since this
is the character that represents tones in the prosodic strings of
sequences.
pairwise_threshold : float (default=0.7)
Only those sequence pairs whose distance is beyond this threshold
will be considered when determining the distribution of attested
segment pairs.
runs : int (default=100)
Define how many times the perturbation method shall be carried out
in order to retrieve the expected distribution of segment pairs.
modes : tuple or list (default=(“global”, “local”))
Define the alignment modes of the pairwise analyses which are
carried out in order to create the language-specific scoring scheme.
ratio : tuple (default=(1,1))
Define the ratio by which the traditional scoring scheme and the
correspondence-based scoring scheme contribute to the actual
library-based scoring scheme.
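One way to read the ratio is as the weights of a weighted average over the two scoring schemes. The sketch below assumes exactly that; the function name `combine_scores` and the example values are hypothetical, and the library may combine the schemes differently.

```python
def combine_scores(traditional, correspondence, ratio=(1, 1)):
    """Weighted average of two scores for the same segment pair.

    Assumption: ratio[0] weights the traditional score and ratio[1]
    weights the correspondence-based score.
    """
    w_trad, w_corr = ratio
    total = w_trad + w_corr
    if total == 0:
        raise ValueError("ratio must not sum to zero")
    return (w_trad * traditional + w_corr * correspondence) / total


equal_mix = combine_scores(2.0, 6.0)                     # default ratio (1, 1)
traditional_heavy = combine_scores(2.0, 6.0, ratio=(3, 1))
```

With the default ratio both schemes contribute equally; a ratio such as (3, 1) pulls the combined score toward the traditional scheme.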
mode : string (default = “overlap”)
Define the alignment mode which is used in order to calculate
pairwise distance scores from the language-specific scoring
schemes.