LingPy

This documentation is for version 2.0.dev, which is not released yet.

lingpy.align.multiple.Multiple.lib_align

Multiple.lib_align(model=None, mode='global', modes=[('global', -2, 0.5), ('local', -1, 0.5)], scale=0.5, factor=0.3, tree_calc='neighbor', gap_weight=0.5, restricted_chars='T_', classes=True, sonar=True, scorer={})

Carry out a library-based progressive alignment analysis of the sequences.

In contrast to traditional progressive multiple sequence alignment approaches (Feng1981, Thompson1994), library-based progressive alignment (Notredame2000) pre-processes the data: the information in global and local pairwise alignments of the input sequences is used to derive a refined scoring function (the library), which is then used in the progressive phase.
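The library idea can be illustrated with a minimal, self-contained sketch. It does not reproduce LingPy's internals: the function name `build_library`, the input format, and the equal weighting of every pairwise alignment are assumptions made for illustration only. Each pairwise alignment is given as two gapped strings of equal length, and the library records how often two residue positions are matched.

```python
from collections import defaultdict


def build_library(pairwise_alignments):
    """Count how often residue positions are matched across pairwise alignments.

    `pairwise_alignments` is a list of (name_a, name_b, gapped_a, gapped_b)
    tuples, where '-' marks a gap. This input format is hypothetical.
    """
    library = defaultdict(float)
    for name_a, name_b, gapped_a, gapped_b in pairwise_alignments:
        if len(gapped_a) != len(gapped_b):
            raise ValueError("aligned strings must have equal length")
        pos_a = pos_b = 0  # positions in the ungapped sequences
        for char_a, char_b in zip(gapped_a, gapped_b):
            if char_a != '-' and char_b != '-':
                # Both sequences have a residue here: reinforce this match.
                library[(name_a, pos_a), (name_b, pos_b)] += 1.0
            if char_a != '-':
                pos_a += 1
            if char_b != '-':
                pos_b += 1
    return dict(library)


# Two alternative alignments of the same sequence pair, e.g. produced by
# two pairwise analyses with different settings (toy data).
alignments = [
    ("A", "B", "wold", "w-ld"),
    ("A", "B", "wold", "wl-d"),
]
library = build_library(alignments)
```

Position pairs that are matched in both analyses receive a higher weight than pairs matched in only one, which is the kind of refined evidence a progressive phase can then draw on.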

Parameters :

model : { ‘dolgo’, ‘sca’, ‘asjp’ }

A string indicating the name of the Model object that shall be used for the analysis. Currently, three models are supported:

  • “dolgo” – a sound-class model based on Dolgopolsky1986,
  • “sca” – an extension of the “dolgo” sound-class model based on List2012b, and
  • “asjp” – an independent sound-class model which is based on the sound-class model of Brown2008 and the empirical data of Brown2011 (see the description in List2012).

mode : { ‘global’, ‘dialign’ }

A string indicating which kind of alignment analysis should be carried out during the progressive phase. Select between:

  • “global” – traditional global alignment analysis based on the Needleman-Wunsch algorithm (Needleman1970),
  • “dialign” – global alignment analysis which seeks to maximize local similarities (Morgenstern1996).
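For orientation, the “global” mode can be sketched as a plain Needleman-Wunsch alignment. This is a minimal illustration, not LingPy's implementation: the function name `needleman_wunsch`, the identity-based match and mismatch scores, and the linear gap penalty are assumptions (a sound-class model would supply its own scoring function).

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing but gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the matrix with the best score for every pair of prefixes.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align two residues
                score[i - 1][j] + gap,      # gap in seq_b
                score[i][j - 1] + gap,      # gap in seq_a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(seq_a[i - 1])
                out_b.append(seq_b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append('-')
            i -= 1
        else:
            out_a.append('-')
            out_b.append(seq_b[j - 1])
            j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b)), score[-1][-1]


aligned_a, aligned_b, total = needleman_wunsch("wold", "wld")
```

With these toy scores, the two strings align as `wold` and `w-ld` with a total score of 2 (three matches and one gap).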

modes : list (default=[(‘global’,-2,0.5),(‘local’,-1,0.5)])

A list of (mode, GOP, GEP scale) tuples specifying the mode, the gap opening penalty (GOP), and the gap extension scale (GEP scale) of each pairwise alignment analysis used to create the library.

gop : int (default=-5)

The gap opening penalty (GOP) used in the analysis.

gep_scale : float (default=0.6)

The factor by which the penalty for the extension of gaps (gap extension penalty, GEP) shall be decreased. This approach is inspired by the extension of the basic alignment algorithm to affine gap penalties (Gotoh1982).
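The interplay of GOP and GEP scale can be made concrete with a small sketch. It assumes that the extension penalty is simply the opening penalty multiplied by the scale; this reading of the description, and the helper name `gap_cost`, are assumptions and not taken from LingPy's code.

```python
def gap_cost(length, gop=-5, gep_scale=0.6):
    """Return the total penalty for one gap of `length` positions.

    Assumption: the first gap position costs `gop`, and every further
    position costs the reduced penalty `gop * gep_scale`.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0
    gep = gop * gep_scale
    return gop + (length - 1) * gep


single = gap_cost(1)  # only the opening penalty applies
triple = gap_cost(3)  # one opening plus two reduced extensions
```

Under this assumption a scale below 1 makes one long gap cheaper than several short ones of the same total length, which is the effect affine gap penalties are meant to achieve.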

factor : float (default=0.3)

The factor by which the initial and the descending position shall be modified.

tree_calc : { ‘neighbor’, ‘upgma’ }

The cluster algorithm which shall be used for the calculation of the guide tree. Select between neighbor, the Neighbor-Joining algorithm (Saitou1987), and upgma, the UPGMA algorithm (Sokal1958).
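As an illustration of how a guide tree can be derived from pairwise distances, the following sketch implements a basic UPGMA clustering. The function name `upgma`, the nested-tuple tree representation, and the toy distance matrix are assumptions for this example; LingPy's own tree calculation is not shown here.

```python
from itertools import combinations


def upgma(names, matrix):
    """Cluster `names` by average linkage and return a nested-tuple tree.

    `matrix` is a symmetric list-of-lists of pairwise distances.
    """
    trees = list(names)              # current clusters as (sub)trees
    sizes = [1] * len(names)         # number of leaves in each cluster
    dist = [row[:] for row in matrix]
    while len(trees) > 1:
        # Pick the closest pair of clusters.
        i, j = min(
            combinations(range(len(trees)), 2),
            key=lambda pair: dist[pair[0]][pair[1]],
        )
        keep = [k for k in range(len(trees)) if k not in (i, j)]
        merged_size = sizes[i] + sizes[j]
        # Distance of the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        new_row = [
            (dist[i][k] * sizes[i] + dist[j][k] * sizes[j]) / merged_size
            for k in keep
        ]
        # Rebuild the matrix with the merged cluster in the last position.
        dist = [
            [dist[a][b] for b in keep] + [new_row[n]]
            for n, a in enumerate(keep)
        ]
        dist.append(new_row + [0.0])
        trees = [trees[k] for k in keep] + [(trees[i], trees[j])]
        sizes = [sizes[k] for k in keep] + [merged_size]
    return trees[0]


# Toy distances: A and B are close, C is far from both.
tree = upgma(["A", "B", "C"], [[0, 1, 4], [1, 0, 4], [4, 4, 0]])
```

In a progressive alignment, the sequences or profiles joined first in such a tree are the ones aligned first.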

gap_weight : float (default=0.5)

The factor by which gaps in aligned columns contribute to the calculation of the column score. When set to 0, gaps will be ignored in the calculation. When set to 0.5, gaps will count half as much as other characters.
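The effect of this parameter can be sketched with a weighted average over one alignment column. The scoring scheme below (identity-based match and mismatch values, a fixed score for gaps, and the helper name `column_score`) is an assumption chosen to mirror the description; it is not LingPy's actual column score.

```python
def column_score(target, column, gap_weight=0.5,
                 match=1.0, mismatch=-1.0, gap_score=-1.0):
    """Score `target` against an aligned column, down-weighting gaps.

    Non-gap characters have weight 1; gaps ('-') have weight `gap_weight`.
    """
    total = 0.0
    weight_sum = 0.0
    for char in column:
        if char == '-':
            weight, value = gap_weight, gap_score
        else:
            weight = 1.0
            value = match if char == target else mismatch
        total += weight * value
        weight_sum += weight
    # An empty column, or one consisting only of ignored gaps, scores zero.
    return total / weight_sum if weight_sum else 0.0


ignored = column_score('t', ['t', 't', '-'], gap_weight=0)    # gaps ignored
halved = column_score('t', ['t', 't', '-'], gap_weight=0.5)   # gaps count half
```

With a weight of 0 the gap drops out of the average entirely; with 0.5 it lowers the score, but only half as strongly as a mismatching character would.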

restricted_chars : string (default=“T_”)

Define which characters of the prosodic string of a sequence reflect its secondary structure (cf. List2012b) and should therefore be aligned specifically. The default includes “T”, the character that represents tones in the prosodic strings of sequences.
