Overview
Warning
For all the functions except loadcsv and dumpcsv, the header must be removed for best result.
Info
A table is a list of rows having the same number of column values.
Warning
For tools related to specific tasks, please go to the respective documentation:
Table Transformation¶
Tools for table transformations.
transpose¶
transpose(data) returns the transpose of table data, i.e., switch rows and columns.
Tip
Useful to switch table data from row-based to column-based and backwards.
list(transpose([ [1, 2, 3], [4, 5, 6], [7, 8, 9] ])) # [(1, 4, 7), # (2, 5, 8), # (3, 6, 9)]
trim¶
trim(table, blank=None) removes any empty column or row.
blank=Noneis a list of characters denoting empty value. Default to whitespace characters.
Tip
Check the builtin function str.strip for details of the behavior with blank.
list(trim([ ['', 'a', 'b'], ['', '-', '-'], ['', 'c', 'd'] ], blank='-')) # [['a', 'b'], # ['c', 'd']]
mergecols¶
mergecols(cols, default=None, blank=None) merges the columns in cols into a single column. Returns None if there is conflict in any row.
-
A row has conflict if there are more than one valid values, where each valid value is not
Noneor empty.blank=Noneis a list of characters denoting empty value. Default to whitespace characters.
-
defaultis a placeholder when there are no valid value in one row.
Tip
Check the builtin function str.strip for details of the behavior with blank.
cols = list(transpose([ [ 0, None, 2, None], [ 0, 1, None, None], [None, 1, None, None], [ 0, None, None, None] ])) # Merge the last two columns. mergecols(cols[1:]) # [2, # 1, # 1, # None] # Merge all three columns. mergecols(cols) # None
sortedbycol¶
sortedbycol(data, key=None) sorts the columns of table data according to key function key over each column.
list(sortedbycol([ ["c1", "b1", "a2", "d1"], ["c2", "b1", "a1", "d1"], ], key=lambda col: col[0])) # [('a2', 'b1', 'c1', 'd1'), # ('a1', 'b1', 'c2', 'd1')]
filterbycol¶
filterbycol(data, key=None) filters the columns of table data according to key function key over each column.
list(filterbycol([ ["c1", "b1", "a2", "d1"], ["c2", "b1", "a1", "d1"], ], key=lambda col: len(set(col)) > 1)) # [('c1', 'a2'), # ('c2', 'a1')]
Table Join¶
Tools for join tables.
join¶
join(lefttable, righttable, leftkey, rightkey, leftjoin=False, rightjoin=False) joins two tables lefttable and righttable, according to their respective keys leftkey and rightkey.
-
Each key is a tuple of column IDs.
-
leftjoinandrightjoincontrol whether to perform inner join, left outer join, right outer join, or full outer join.
Danger
Both tables must be non-empty.
Info
This function is a thin wrapper of seqtools.join. Please refer to seqtools.join for other join scenarios.
Tip
Please refer to candidatekeys and foreignkeys on how to find primary/foreign-key automatically to join tables.
pt = [ ["a1", "b1", "c1", "d1"], ["a1", "b1", "c2", "d1"], ["a2", "b1", "c1", "d1"], ] pk = (0, 2) ft = [ ["c1", "b1", "a2", "d1"], ["c2", "b1", "a1", "d1"], ] fk = (2, 0) list(join(pt, ft, pk, fk)) # [(['a2', 'b1', 'c1', 'd1'], ['c1', 'b1', 'a2', 'd1']), # (['a1', 'b1', 'c2', 'd1'], ['c2', 'b1', 'a1', 'd1'])] list(join(pt, ft, pk, fk, leftjoin=True)) # [(['a2', 'b1', 'c1', 'd1'], ['c1', 'b1', 'a2', 'd1']), # (['a1', 'b1', 'c2', 'd1'], ['c2', 'b1', 'a1', 'd1']), # (['a1', 'b1', 'c1', 'd1'], [None, None, None, None])] list(join(pt, ft, pk, fk, rightjoin=True)) # [(['a2', 'b1', 'c1', 'd1'], ['c1', 'b1', 'a2', 'd1']), # (['a1', 'b1', 'c2', 'd1'], ['c2', 'b1', 'a1', 'd1'])] list(join(pt, ft, pk, fk, leftjoin=True, rightjoin=True)) # [(['a2', 'b1', 'c1', 'd1'], ['c1', 'b1', 'a2', 'd1']), # (['a1', 'b1', 'c2', 'd1'], ['c2', 'b1', 'a1', 'd1']), # (['a1', 'b1', 'c1', 'd1'], [None, None, None, None])]
CSV¶
Tools for processing CSV.
loadcsv¶
loadcsv(path, delimiter=',') loads a CSV file, from either a file path, a file object, or an iterable of strings.
s = """Los Angeles,34°03′,118°15′ New York City,40°42′46″,74°00′21″ Paris,48°51′24″,2°21′03″""" list(loadcsv(s.split('\n'))) # [['Los Angeles' , '34°03′' , '118°15′' ], # ['New York City', '40°42′46″', '74°00′21″'], # ['Paris' , '48°51′24″', '2°21′03″' ]]
dumpcsv¶
dumpcsv(path, data, delimiter=',') dumps a table data in CSV, to either a file path or a file object.