sortedtools
Danger
For most tools here except issorted, each sequence must already be sorted.
Info
Tools in seqtools can also be applied here. sortedtools only contains tools that are either unique to sorted sequences or have more efficient implementations for them.
Sequence Check
issorted
issorted(seq, key=None) returns whether the sequence seq is already sorted, optionally according to the key function key.
issorted([1, 2, 2, 3]) # True
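The check described above can be sketched in a few lines. This is a minimal illustration, not the library's own code; the name issorted_sketch is hypothetical, and it assumes non-decreasing order, which matches the example where repeated 2s still count as sorted.

```python
def issorted_sketch(seq, key=None):
    """Hypothetical stand-in: True if seq is in non-decreasing order."""
    # Apply the optional key function once per element.
    keys = list(seq) if key is None else [key(x) for x in seq]
    # Every adjacent pair must be in order.
    return all(x <= y for x, y in zip(keys, keys[1:]))


print(issorted_sketch([1, 2, 2, 3]))          # True
print(issorted_sketch([3, 1, 2]))             # False
print(issorted_sketch([-1, 2, -3], key=abs))  # True
```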
Sequence Matching
Tools for matching sorted sequences.
issubsorted
issubsorted(a, b, key=None) checks if a is a sorted sub-sequence of b, optionally according to the key function key.
- When both a and b are sorted sets with no duplicate elements, this is equal to set(a) <= set(b) but more efficient.
issubsorted( [1, 2, 2, 3], [1, 2, 2, 3, 4, 4] ) # True
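One way such a check can run in a single pass is to walk b once while consuming a. The sketch below is an assumption about how this might be done, not the library's implementation; issubsorted_sketch is a hypothetical name, and its handling of duplicates (each element of a needs its own match in b) is a choice made for this sketch.

```python
def issubsorted_sketch(a, b, key=None):
    """Hypothetical stand-in: is sorted a a sub-sequence of sorted b?"""
    if key is None:
        key = lambda x: x
    it = iter(b)
    for x in a:
        kx = key(x)
        for y in it:
            ky = key(y)
            if ky == kx:
                break          # found a partner for x; move to next x
            if ky > kx:
                return False   # b has already passed where x would be
        else:
            return False       # b ran out before x was found
    return True


print(issubsorted_sketch([1, 2, 2, 3], [1, 2, 2, 3, 4, 4]))  # True
print(issubsorted_sketch([1, 2, 2, 3], [1, 2, 3, 4]))        # False
```

Because b is sorted, the scan can stop as soon as it sees a larger element, so no element of b is visited more than once.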
sortedall
sortedall(a, b, key=None) returns the elements in either a or b, optionally according to the key function key.
Success
When both a and b are sorted multisets, equal to the union of a and b but more efficient.
list(sortedall( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [1, 2, 2, 3, 4, 4]
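The multiset union above can be illustrated with a two-pointer merge. This is a sketch under the assumption that both inputs are sorted lists; sortedall_sketch is a hypothetical name and the key argument is omitted for brevity.

```python
def sortedall_sketch(a, b):
    """Hypothetical stand-in: multiset union of two sorted lists."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            yield a[i]
            i += 1
        elif b[j] < a[i]:
            yield b[j]
            j += 1
        else:
            # Equal items are paired up and emitted once.
            yield a[i]
            i += 1
            j += 1
    # Whatever remains on either side has no partner left.
    yield from a[i:]
    yield from b[j:]


print(list(sortedall_sketch([1, 2, 2, 3], [2, 3, 4, 4])))
# [1, 2, 2, 3, 4, 4]
```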
sortedcommon
sortedcommon(a, b, key=None) returns the common elements between a and b, optionally according to the key function key.
Success
When both a and b are sorted multisets, equal to the intersection of a and b but more efficient.
list(sortedcommon( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [2, 3]
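The intersection can be sketched with the same two-pointer walk, this time emitting only paired items. Again this is an illustrative assumption rather than the library's code; sortedcommon_sketch is a hypothetical name and key is omitted.

```python
def sortedcommon_sketch(a, b):
    """Hypothetical stand-in: multiset intersection of two sorted lists."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1             # a[i] has no partner in b
        elif b[j] < a[i]:
            j += 1             # b[j] has no partner in a
        else:
            yield a[i]         # matched pair: emit once, advance both
            i += 1
            j += 1


print(list(sortedcommon_sketch([1, 2, 2, 3], [2, 3, 4, 4])))
# [2, 3]
```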
sorteddiff
sorteddiff(a, b, key=None) returns the elements only in a and not in b, optionally according to the key function key.
Success
When both a and b are sorted multisets, equal to the difference between a and b but more efficient.
list(sorteddiff( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [1, 2]
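As a cross-check of the multiset semantics, the same result can be reproduced with collections.Counter from the standard library. This is only a reference computation; it does not exploit sortedness, so it does not show the efficiency gain.

```python
from collections import Counter

a = [1, 2, 2, 3]
b = [2, 3, 4, 4]

# Counter subtraction keeps only positive counts: multiset difference.
difference = sorted((Counter(a) - Counter(b)).elements())
print(difference)  # [1, 2]
```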
sortedalone
sortedalone(a, b, key=None) returns the elements that are in a or b but not in both, optionally according to the key function key.
Success
When both a and b are sorted multisets, equal to the difference between the union of a and b and the intersection of a and b but more efficient.
list(sortedalone( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [1, 2, 4, 4]
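The "union minus intersection" description can be checked with collections.Counter. The snippet below is a reference computation only, written to confirm the multiset arithmetic on the example inputs.

```python
from collections import Counter

a = [1, 2, 2, 3]
b = [2, 3, 4, 4]
ca, cb = Counter(a), Counter(b)

union = ca | cb          # per-item maximum of the two counts
intersection = ca & cb   # per-item minimum of the two counts
alone = sorted((union - intersection).elements())
print(alone)  # [1, 2, 4, 4]

# Equivalent view: what is only in a, plus what is only in b.
both_sides = sorted(((ca - cb) + (cb - ca)).elements())
print(both_sides == alone)  # True
```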
Sequence Alignment and Join
Tools for aligning and joining sorted sequences.
matchingfrequencies
matchingfrequencies(*seqs, key=None) returns each item together with the number of sequences in seqs that contain it.
- An optional key function key can be specified.
Success
This implementation is space efficient. If there are n sequences, only O(n) space is used.
sortedtools.matchingfrequencies is more efficient than seqtools.matchingfrequencies.
Tip
For the frequency of each item within a single sequence, use toolz.itertoolz.frequencies.
list(matchingfrequencies( [1, 2, 2, 3], [ 2, 3, 4, 5], [1, 3, 3, 4] )) # [(1, 2), (2, 2), (3, 3), (4, 2), (5, 1)]
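A space-efficient approach of this kind can be sketched with heapq.merge and itertools.groupby from the standard library. This is an assumption about one possible technique, not the library's actual code; the names unique_sorted and matchingfrequencies_sketch are hypothetical, and key is omitted.

```python
from heapq import merge
from itertools import groupby


def unique_sorted(seq):
    # Collapse runs of equal items; valid only because seq is sorted.
    return (item for item, _ in groupby(seq))


def matchingfrequencies_sketch(*seqs):
    """Hypothetical stand-in: (item, number of sequences containing it)."""
    # merge() keeps one pending item per input, so memory grows with
    # the number of sequences, not with their lengths.
    merged = merge(*(unique_sorted(s) for s in seqs))
    for item, run in groupby(merged):
        yield item, sum(1 for _ in run)


print(list(matchingfrequencies_sketch(
    [1, 2, 2, 3],
    [2, 3, 4, 5],
    [1, 3, 3, 4],
)))
# [(1, 2), (2, 2), (3, 3), (4, 2), (5, 1)]
```

Deduplicating each input first is what makes the count mean "number of sequences" rather than "number of occurrences".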
sortedmatch
sortedmatch(a, b, default=None) matches two sorted sequences a and b in pairs, such that the total number of matching pairs is maximized.
- If multiple alignments have the same number of matching pairs, the leftmost one is returned.
Success
sortedmatch is more efficient than seqtools.match.
list(sortedmatch( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [(1, None), (2, 2), (2, None), (3, 3), (None, 4), (None, 4)]
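For sorted inputs, an alignment like the one above can be produced greedily in one pass. The following is a minimal sketch assuming list inputs and plain equality matching; sortedmatch_sketch is a hypothetical name and is not taken from the library.

```python
def sortedmatch_sketch(a, b, default=None):
    """Hypothetical stand-in: align two sorted lists pairwise."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            yield a[i], b[j]       # matching pair
            i += 1
            j += 1
        elif a[i] < b[j]:
            yield a[i], default    # a[i] cannot match anything later in b
            i += 1
        else:
            yield default, b[j]    # b[j] cannot match anything later in a
            j += 1
    for x in a[i:]:
        yield x, default
    for y in b[j:]:
        yield default, y


print(list(sortedmatch_sketch([1, 2, 2, 3], [2, 3, 4, 4])))
# [(1, None), (2, 2), (2, None), (3, 3), (None, 4), (None, 4)]
```

Matching equal items as soon as they meet is what pairs the first 2 in a with the 2 in b, leaving the second 2 unmatched.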
sortedjoin
sortedjoin(leftseq, rightseq, leftkey=None, rightkey=None, leftdefault=no_default, rightdefault=no_default) joins two sequences, optionally according to leftkey and rightkey, respectively. Outer join is also supported.
- The two sequences must already be sorted according to leftkey and rightkey, respectively.
Success
sortedjoin is more efficient than seqtools.join and the underlying toolz.itertoolz.join.
list(sortedjoin( [ -1, -1, -2, -4, -5, -6], [0, 1, 1, 2, 3, 4, 5, 5], leftkey=abs, leftdefault=None )) # [(None, 0), (-1, 1), (-1, 1), (-1, 1), (-1, 1), (-2, 2), (None, 3), (-4, 4), (-5, 5), (-5, 5)]
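A join over sorted inputs is commonly done as a merge join: group each side by key, then walk the two group streams in step. The sketch below illustrates that idea under stated assumptions and is not the library's implementation. The names sortedjoin_sketch, _groups and _missing are hypothetical; _missing stands in for the no_default sentinel so that None can be used as a real default value.

```python
from itertools import groupby, product

_missing = object()  # hypothetical stand-in for no_default


def _groups(seq, key):
    # Materialize each run of equal keys; valid because seq is sorted by key.
    return ((k, list(g)) for k, g in groupby(seq, key))


def sortedjoin_sketch(left, right, leftkey=None, rightkey=None,
                      leftdefault=_missing, rightdefault=_missing):
    lit, rit = _groups(left, leftkey), _groups(right, rightkey)
    l, r = next(lit, None), next(rit, None)
    while l is not None or r is not None:
        if r is None or (l is not None and l[0] < r[0]):
            # Left group has no partner: emit only for an outer join.
            if rightdefault is not _missing:
                for x in l[1]:
                    yield x, rightdefault
            l = next(lit, None)
        elif l is None or r[0] < l[0]:
            # Right group has no partner: emit only for an outer join.
            if leftdefault is not _missing:
                for y in r[1]:
                    yield leftdefault, y
            r = next(rit, None)
        else:
            # Keys match: every left item pairs with every right item.
            yield from product(l[1], r[1])
            l, r = next(lit, None), next(rit, None)


result = list(sortedjoin_sketch(
    [-1, -1, -2, -4, -5, -6],
    [0, 1, 1, 2, 3, 4, 5, 5],
    leftkey=abs, leftdefault=None,
))
print(result)
# [(None, 0), (-1, 1), (-1, 1), (-1, 1), (-1, 1), (-2, 2),
#  (None, 3), (-4, 4), (-5, 5), (-5, 5)]
```

In this sketch, -6 is dropped because rightdefault is not supplied, so unmatched left items are not emitted.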