LingPy

This documentation is for version 2.0.dev, which is not released yet.

lingpy.sequence.sound_classes.pid

lingpy.sequence.sound_classes.pid(almA, almB, mode=2)

Calculate the Percentage Identity (PID) score for aligned sequence pairs.

Parameters :

almA, almB : string or list

The aligned sequences, each given as either a string or a list.

mode : { 1, 2, 3, 4, 5 }

Indicate which PID score should be calculated. Modes 1 to 4 are the four variants described in Raghava2006; mode 5 is added for linguistic purposes:

  1. identical positions / (aligned positions + internal gap positions),
  2. identical positions / aligned positions,
  3. identical positions / shortest sequence,
  4. identical positions / shortest sequence (including internal gap positions), or
  5. identical positions / (aligned positions + 2 * number of gaps).

Returns :

score : float

The PID score of the given alignment as a floating point number between 0 and 1.

See also

lingpy.compare.Multiple.get_pid

Notes

The PID score is a common measure for the diversity of a given alignment. The implementation employed by LingPy follows the description of Raghava2006, where four different variants of PID scores are distinguished. Essentially, the PID score is the ratio of identical residue pairs to the total number of residue pairs in a given alignment; the variants differ in how that total is counted.
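
To make the difference between the five denominators concrete, here is a minimal sketch written for Python 3. It is not LingPy's implementation: the function name pid_sketch, the gap symbol "-", and the toy alignment are assumptions made for illustration. The sketch also assumes that every column with a gap in exactly one sequence counts as a gap position (modes 1 and 5), and that "including internal gap positions" (mode 4) means the length of a sequence after trimming leading and trailing gaps.

```python
def pid_sketch(alm_a, alm_b, mode=2, gap="-"):
    """Illustrative PID calculation; hypothetical helper, not part of LingPy."""
    if len(alm_a) != len(alm_b):
        raise ValueError("aligned sequences must have the same length")

    identical = aligned = gaps = 0
    for a, b in zip(alm_a, alm_b):
        if a == gap and b == gap:
            continue  # column holds no residue at all, so skip it
        if a == gap or b == gap:
            gaps += 1  # assumption: every such column counts as a gap position
        else:
            aligned += 1  # both sequences have a residue in this column
            if a == b:
                identical += 1

    def residues(seq):
        # Number of non-gap segments in a sequence.
        return sum(1 for s in seq if s != gap)

    def span(seq):
        # Length after trimming leading and trailing gaps (assumed reading
        # of "including internal gap positions").
        idx = [i for i, s in enumerate(seq) if s != gap]
        return idx[-1] - idx[0] + 1 if idx else 0

    denominators = {
        1: aligned + gaps,
        2: aligned,
        3: min(residues(alm_a), residues(alm_b)),
        4: min(span(alm_a), span(alm_b)),
        5: aligned + 2 * gaps,
    }
    if mode not in denominators:
        raise ValueError("mode must be one of 1, 2, 3, 4, 5")

    denom = denominators[mode]
    return identical / denom if denom else 0.0


# Made-up toy alignment: 3 identical columns, 4 aligned columns, 1 gap column.
alm_a = ["w", "o", "l", "-", "f"]
alm_b = ["w", "u", "l", "f", "f"]
scores = {m: pid_sketch(alm_a, alm_b, mode=m) for m in range(1, 6)}
```

For this toy alignment the modes that penalize gaps (1, 4 and 5) yield lower scores than modes 2 and 3, which only look at columns or sequences without gaps.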

Examples

Load an alignment from the test suite.

>>> from lingpy import *
>>> pairs = PSA(get_file('test.psa'))

Extract the alignments of the first aligned sequence pair.

>>> almA, almB, score = pairs.alignments[0]

Calculate the PID score of the alignment.

>>> pid(almA, almB)
0.44444444444444442
