LingPy

This documentation is for version 2.0.dev, which is not released yet.

lingpy.align.multiple.Multiple.iterate_similar_gap_sites

Multiple.iterate_similar_gap_sites(check='final', mode='global', gop=-3, scale=0.5, factor=0, gap_weight=1, restricted_chars='T_')

Iterative refinement based on the Similar Gap Sites heuristic.

This heuristic is fairly simple. The idea is to try to split a given MSA into partitions with identical gap sites.
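The partitioning step can be illustrated with a minimal sketch. It assumes an alignment stored as a list of equal-length rows with "-" as the gap symbol; the helper name `partition_by_gap_sites` is hypothetical and is not part of LingPy's API.

```python
from collections import defaultdict


def partition_by_gap_sites(msa, gap="-"):
    """Group row indices of an alignment by the positions of their gaps.

    Hypothetical helper for illustration only; not LingPy's implementation.
    """
    groups = defaultdict(list)
    for idx, row in enumerate(msa):
        # The "gap sites" of a row are the column indices that hold a gap.
        gap_sites = tuple(i for i, seg in enumerate(row) if seg == gap)
        groups[gap_sites].append(idx)
    # Rows sharing identical gap sites end up in the same partition.
    return list(groups.values())


msa = [
    list("wol-demort"),
    list("wal-demar-"),
    list("vladimir--"),
    list("vol-demor-"),
]
print(partition_by_gap_sites(msa))  # [[0], [1, 3], [2]]
```

Rows 1 and 3 share the gap sites (3, 9) and therefore form one partition, while rows 0 and 2 each stand alone.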

Parameters :

check : { ‘final’, ‘immediate’ }

Specify when to check whether the sum-of-pairs score has improved: after each iteration (“immediate”) or only once, after all iterations have been carried out (“final”).
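The difference between the two settings can be sketched with a generic refinement loop. This is one reading of the description above, not LingPy's code: the function `refine`, the `steps` callables, and the `score` function are all hypothetical stand-ins, and the "alignment" is reduced to a plain number so the control flow stays visible.

```python
def refine(msa, steps, score, check="final"):
    """Apply refinement steps and keep improvements according to `check`.

    Hypothetical sketch: `steps` is a list of functions that each return a
    modified alignment, and `score` returns a number (higher is better).
    """
    if check not in ("final", "immediate"):
        raise ValueError("check must be 'final' or 'immediate'")
    current = msa
    for step in steps:
        candidate = step(current)
        if check == "immediate":
            # Accept a step only if it improves the score right away.
            if score(candidate) > score(current):
                current = candidate
        else:
            # Accept every step for now; judge the result at the end.
            current = candidate
    if check == "final":
        # Fall back to the input if the overall result is not better.
        return current if score(current) > score(msa) else msa
    return current


# Toy run: the "alignment" is just an integer and the score is its value.
steps = [lambda x: x - 2, lambda x: x + 3]
print(refine(10, steps, score=lambda x: x, check="immediate"))  # 13
print(refine(10, steps, score=lambda x: x, check="final"))      # 11
```

With "immediate", the worsening first step is rejected before the second is applied; with "final", both steps are applied and only the end result is compared with the input.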

mode : { ‘global’, ‘overlap’, ‘dialign’ }

A string indicating which kind of alignment analysis should be carried out during the progressive phase. Select between:

  • ‘global’ – traditional global alignment analysis based on the Needleman-Wunsch algorithm Needleman1970,
  • ‘overlap’ – semi-global alignment, where gaps introduced at the beginning and the end of a sequence do not score,
  • ‘dialign’ – global alignment analysis which seeks to maximize local similarities Morgenstern1996.
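To make the contrast between ‘global’ and ‘overlap’ concrete, here is a minimal pairwise sketch in plain Python. It is a textbook dynamic-programming scorer, not LingPy's implementation: the function name `align_score`, the match/mismatch scores, and the linear (non-affine) gap penalty are simplifying assumptions, and the ‘dialign’ mode is not covered.

```python
def align_score(a, b, mode="global", gop=-3, match=1, mismatch=-1):
    """Return the best alignment score of sequences a and b.

    Illustrative sketch with a linear gap penalty; names and scores
    are assumptions made for this example.
    """
    if mode not in ("global", "overlap"):
        raise ValueError("this sketch only covers 'global' and 'overlap'")
    m, n = len(a), len(b)
    free_ends = mode == "overlap"
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    if not free_ends:
        # Global: gaps at the beginning are penalized like any other gap.
        for i in range(1, m + 1):
            dp[i][0] = i * gop
        for j in range(1, n + 1):
            dp[0][j] = j * gop
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                dp[i - 1][j] + gop,      # gap in b
                dp[i][j - 1] + gop,      # gap in a
            )
    if free_ends:
        # Overlap: gaps at the end are free, so take the best cell of
        # the last row or the last column.
        return max(max(dp[m]), max(row[n] for row in dp))
    return dp[m][n]


print(align_score("xxabc", "abc", mode="global"))   # -3
print(align_score("xxabc", "abc", mode="overlap"))  # 3
```

In global mode the two unmatched leading segments cost two gap penalties; in overlap mode they are free, so only the three matches count.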

gop : int (default=-3)

The gap opening penalty (GOP) used in the analysis.

scale : float (default=0.5)

The factor by which the penalty for extending a gap (gap extension penalty, GEP) is decreased. This approach is inspired by the extension of the basic alignment algorithm to affine gap penalties Gotoh1982.
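A small worked example may help with the arithmetic. It assumes that opening a gap costs the GOP and that each further gap position costs the GOP multiplied by the scaling factor; this is a common affine scheme, and whether LingPy computes the penalty in exactly this way is an assumption. The function `gap_cost` is hypothetical.

```python
def gap_cost(length, gop=-3, scale=0.5):
    """Cost of one contiguous gap of the given length.

    Assumed scheme for illustration: the first gap position costs `gop`,
    every further position costs `gop * scale`.
    """
    if length <= 0:
        return 0.0
    return gop + (length - 1) * gop * scale


print(gap_cost(1))             # -3.0
print(gap_cost(3))             # -6.0
print(gap_cost(3, scale=1.0))  # -9.0, i.e. no reduction for extensions
```

Under this assumption, a gap of length 3 costs -6.0 instead of -9.0, so one long gap is penalized less than three separate gaps.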

factor : float (default=0)

The factor by which the initial and the descending position shall be modified.

gap_weight : float (default=1)

The factor by which gaps in aligned columns contribute to the calculation of the column score. When, e.g., set to 0, gaps will be ignored in the calculation. When set to 0.5, gaps will count half as much as other characters.
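The effect of the weight can be illustrated with a simplified column profile. This sketch does not reproduce LingPy's column score; it only shows, under the assumption that each gap is counted `gap_weight` times when the characters of a column are tallied, how the values 1, 0.5, and 0 change the outcome. The function `weighted_column_profile` is hypothetical.

```python
from collections import Counter


def weighted_column_profile(column, gap_weight=1.0, gap="-"):
    """Relative frequencies of the characters in a column.

    Hypothetical illustration: gaps are counted with weight `gap_weight`,
    all other characters with weight 1.
    """
    counts = Counter(column)
    weights = {
        seg: cnt * gap_weight if seg == gap else float(cnt)
        for seg, cnt in counts.items()
    }
    total = sum(weights.values())
    if total == 0:
        return {}
    # Characters whose weight is zero (ignored gaps) are dropped.
    return {seg: w / total for seg, w in weights.items() if w > 0}


column = ["a", "a", "-", "-"]
print(weighted_column_profile(column, gap_weight=1))    # gaps count fully
print(weighted_column_profile(column, gap_weight=0.5))  # gaps count half
print(weighted_column_profile(column, gap_weight=0))    # gaps are ignored
```

With a weight of 0.5, the two gaps together weigh as much as one regular character, so "a" accounts for two thirds of the column; with a weight of 0, the column consists of "a" only.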
