LingPy

This documentation is for version 2.0.dev, which is not released yet.

lingpy.basic.spreadsheet.Spreadsheet

class lingpy.basic.spreadsheet.Spreadsheet(filename, fileformat=None, dtype=None, comment='#', sep='\t', header=0, concepts=0, languages=[], blacklist='', conf='')

Basic class for reading spreadsheet data that has been exported to a delimited format, e.g. tab-separated text.

# workflow

  1. dump to delimited format
  2. csv2list(fileformat)

  2. pass this module arguments (header:linenumber, data:rownumber, default 0:ids, 1:concepts, 2-n:languages)
  2. col and row names (range or integer) - can tell the number of languages and concepts, etc.
  2. irrelevant order (loop over the dictionary and use alias dictionary; doculect / language / )
  2. header
  2. spreadsheet.rc **keywords, use aliases
  x. define for the output - keyword for output, e.g. full rows, or rows >= n; also have to define what we have; want specific languages, specific cognate IDs
  x. black list parsing (no empty cells, etc.)
  x. separator for multientries for keywords in the output, with "," as default, etc. (list of separators)
  x. parse as a list and do a type check
  x. then flip into harry potter format
  x. then flip into wordlist format
  x. then add tokenization / orthographic parsing
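The first workflow steps (read a delimited dump, locate the header, the concept column and the language columns, split multi-entries on ",") can be sketched in plain Python. This is only an illustration under stated assumptions, not the Spreadsheet implementation: the function name read_delimited is hypothetical, its parameters merely mirror the constructor keywords above, the header index is assumed to be counted after comment lines are dropped, and every non-concept column is assumed to be a language when no languages are given.

```python
import csv


def read_delimited(text, sep="\t", comment="#", header=0, concepts=0, languages=None):
    """Hypothetical helper: parse delimited text into (concept, entries) pairs."""
    # Drop empty lines and comment lines before any row indexing happens.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith(comment)]
    rows = list(csv.reader(lines, delimiter=sep))
    head = rows[header]

    # Assumption: without an explicit language list, every column except
    # the concept column holds one language (doculect).
    if languages:
        lang_idx = [i for i, name in enumerate(head) if name in languages]
    else:
        lang_idx = [i for i in range(len(head)) if i != concepts]

    parsed = []
    for row in rows[header + 1:]:
        # Pad short rows so that missing trailing cells count as empty.
        row = row + [""] * (len(head) - len(row))
        entries = {
            head[i]: [e.strip() for e in row[i].split(",") if e.strip()]
            for i in lang_idx
        }
        parsed.append((row[concepts], entries))
    return head, parsed


sample = (
    "# exported from a spreadsheet\n"
    "concept\tGerman\tEnglish\n"
    "hand\tHand\thand\n"
    "dog\tHund, Koeter\t\n"
)
head, parsed = read_delimited(sample)
for concept, entries in parsed:
    print(concept, entries)
```

Splitting each cell into a list, even when it holds a single form, keeps the later steps (type check, conversion to a wordlist) uniform.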

# add stats to harry potter output

Methods

get_matrix_full_rows() Create a 2D matrix from only the full rows in the spreadsheet.
output(fileformat, **keywords) Method to output the spreadsheet data into other formats.
pprint()
print_doculect_character_counts([doculects])
print_matrix(matrix) Print a matrix in tab-delimited format.
print_matrix_stats() Convenience function to get some stats data about the spreadsheet.
print_qlc_format() Print “simple” QLC format.
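
The idea behind a "full rows" matrix and tab-delimited printing can be illustrated as follows. This is a standalone sketch, not the code of get_matrix_full_rows() or print_matrix(): the helper names full_rows and format_matrix are hypothetical, and a "full" row is assumed to mean a row without any empty cell.

```python
def full_rows(matrix):
    """Keep only the rows in which every cell is non-empty after stripping."""
    return [row for row in matrix if all(str(cell).strip() for cell in row)]


def format_matrix(matrix, sep="\t"):
    """Join cells with the separator and rows with newlines."""
    return "\n".join(sep.join(str(cell) for cell in row) for row in matrix)


matrix = [
    ["hand", "Hand", "hand"],
    ["dog", "Hund", ""],       # incomplete: the last language is missing
    ["tree", "Baum", "tree"],
]
kept = full_rows(matrix)
print(format_matrix(kept))
```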
