sortedtools
Danger
For most tools here, except issorted, each sequence must already be sorted.
Info
Tools in seqtools can also be applied here. sortedtools only contains tools that are either unique to the concept of a sorted sequence or have more efficient implementations for sorted input.
Sequence Check
issorted
issorted(seq, key=None) returns whether the sequence seq is already sorted, optionally according to the key function key.
issorted([1, 2, 2, 3]) # True
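The check above can be sketched as a single pass over adjacent pairs. This is only an illustration: the name issorted_sketch is hypothetical, and treating "sorted" as non-decreasing order is an assumption based on the example, where duplicates are accepted.

```python
from itertools import tee


def issorted_sketch(seq, key=None):
    """Return True if seq is in non-decreasing order under key (assumed semantics)."""
    keys = seq if key is None else map(key, seq)
    left, right = tee(keys)
    next(right, None)  # shift the second iterator by one position
    # zip stops at the shorter iterator, so each adjacent pair is compared once
    return all(x <= y for x, y in zip(left, right))


print(issorted_sketch([1, 2, 2, 3]))          # True
print(issorted_sketch([3, 1, 2]))             # False
print(issorted_sketch([-1, 2, -3], key=abs))  # True
```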
Sequence Matching
Tools for matching sorted sequences.
issubsorted
issubsorted(a, b, key=None) checks if a is a sorted sub-sequence of b, optionally according to the key function key.
- When both a and b are sorted sets with no duplicate elements, this is equal to set(a) <= set(b) but more efficient.
issubsorted( [1, 2, 2, 3], [1, 2, 2, 3, 4, 4] ) # True
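One way such a check could work is to walk b once while consuming a, which avoids building any set. The sketch below is an assumption about the approach, not the library's code; issubsorted_sketch is a hypothetical name, and duplicates in a are assumed to need matching duplicates in b.

```python
def issubsorted_sketch(a, b, key=None):
    """Check that every element of sorted a can be matched, in order, in sorted b."""
    key = key or (lambda x: x)
    it = iter(b)  # shared iterator: b is never rewound
    for x in a:
        kx = key(x)
        for y in it:
            ky = key(y)
            if ky == kx:
                break         # x is matched; move on to the next element of a
            if ky > kx:
                return False  # b has moved past where x would have to be
        else:
            return False      # b ran out before x was matched
    return True


print(issubsorted_sketch([1, 2, 2, 3], [1, 2, 2, 3, 4, 4]))  # True
print(issubsorted_sketch([1, 2, 2], [1, 2, 3]))              # False
```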
sortedall
sortedall(a, b, key=None) returns the elements in either a or b, optionally according to the key function key.
Success
When both a and b are sorted multisets, this is equal to the union of a and b but more efficient.
list(sortedall( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [1, 2, 2, 3, 4, 4]
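A multiset union of two sorted inputs can be produced lazily with a two-pointer merge. The following is a minimal sketch under that assumption; sortedall_sketch is a hypothetical name, and the key parameter is omitted for brevity.

```python
def sortedall_sketch(a, b):
    """Lazily yield the multiset union of two sorted iterables."""
    ita, itb = iter(a), iter(b)
    end = object()  # sentinel marking an exhausted iterator
    x, y = next(ita, end), next(itb, end)
    while x is not end and y is not end:
        if x < y:
            yield x
            x = next(ita, end)
        elif y < x:
            yield y
            y = next(itb, end)
        else:
            # Equal elements are paired up and emitted once
            yield x
            x, y = next(ita, end), next(itb, end)
    # At most one of the two tails below is non-empty
    while x is not end:
        yield x
        x = next(ita, end)
    while y is not end:
        yield y
        y = next(itb, end)


print(list(sortedall_sketch([1, 2, 2, 3], [2, 3, 4, 4])))  # [1, 2, 2, 3, 4, 4]
```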
sortedcommon
sortedcommon(a, b, key=None) returns the common elements between a and b, optionally according to the key function key.
Success
When both a and b are sorted multisets, this is equal to the intersection of a and b but more efficient.
list(sortedcommon( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [2, 3]
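The intersection can be sketched with the same kind of single pass, emitting an element only when both inputs currently hold it. sortedcommon_sketch is a hypothetical name, and the key parameter is left out to keep the sketch short.

```python
def sortedcommon_sketch(a, b):
    """Lazily yield the multiset intersection of two sorted iterables."""
    ita, itb = iter(a), iter(b)
    end = object()  # sentinel marking an exhausted iterator
    x, y = next(ita, end), next(itb, end)
    while x is not end and y is not end:
        if x < y:
            x = next(ita, end)  # x cannot appear later in b; skip it
        elif y < x:
            y = next(itb, end)  # y cannot appear later in a; skip it
        else:
            yield x             # present in both; consume one copy from each
            x, y = next(ita, end), next(itb, end)


print(list(sortedcommon_sketch([1, 2, 2, 3], [2, 3, 4, 4])))  # [2, 3]
```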
sorteddiff
sorteddiff(a, b, key=None) returns the elements only in a and not in b, optionally according to the key function key.
Success
When both a and b are sorted multisets, this is equal to the difference between a and b but more efficient.
list(sorteddiff( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [1, 2]
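To make the multiset reading of "difference" concrete, the expected result can be reproduced with collections.Counter. This is a reference for the semantics only, not the streaming approach the library presumably uses; multiset_difference is a hypothetical helper.

```python
from collections import Counter


def multiset_difference(a, b):
    """Reference semantics: what remains of a after removing one copy per element of b."""
    # Counter subtraction drops counts that fall to zero or below
    return sorted((Counter(a) - Counter(b)).elements())


print(multiset_difference([1, 2, 2, 3], [2, 3, 4, 4]))  # [1, 2]
```

Note that one copy of 2 survives because a holds two copies and b only one.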
sortedalone
sortedalone(a, b, key=None) returns the elements that are in a or b but not in both, optionally according to the key function key.
Success
When both a and b are sorted multisets, this is equal to the difference between the union of a and b and the intersection of a and b, but more efficient.
list(sortedalone( [1, 2, 2, 3], [ 2, 3, 4, 4] )) # [1, 2, 4, 4]
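A symmetric difference can be sketched as a merge that drops matched pairs and keeps everything else. As before, sortedalone_sketch is a hypothetical name and the key parameter is omitted.

```python
def sortedalone_sketch(a, b):
    """Lazily yield the elements left unpaired between two sorted iterables."""
    ita, itb = iter(a), iter(b)
    end = object()  # sentinel marking an exhausted iterator
    x, y = next(ita, end), next(itb, end)
    while x is not end and y is not end:
        if x < y:
            yield x
            x = next(ita, end)
        elif y < x:
            yield y
            y = next(itb, end)
        else:
            # A matched pair cancels out and is not emitted
            x, y = next(ita, end), next(itb, end)
    while x is not end:
        yield x
        x = next(ita, end)
    while y is not end:
        yield y
        y = next(itb, end)


print(list(sortedalone_sketch([1, 2, 2, 3], [2, 3, 4, 4])))  # [1, 2, 4, 4]
```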
Sequence Alignment and Join
Tools for aligning and joining sorted sequences.
matchingfrequencies
matchingfrequencies(*seqs, key=None) returns each item together with the number of sequences in seqs that contain it.
- An optional key function key can be specified.
Success
This implementation is space efficient. If there are n sequences, only O(n) space is used.
sortedtools.matchingfrequencies is more efficient than seqtools.matchingfrequencies.
Tip
For the frequency of each item within a single sequence, use toolz.itertoolz.frequencies.
list(matchingfrequencies( [1, 2, 2, 3], [ 2, 3, 4, 5], [1, 3, 3, 4] )) # [(1, 2), (2, 2), (3, 3), (4, 2), (5, 1)]
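One plausible way to get this behaviour with little memory is a k-way merge: heapq.merge keeps one pending item per input, and itertools.groupby counts runs of equal items. The sketch below assumes that approach; matchingfrequencies_sketch is a hypothetical name, and yielding the first item of each run (rather than its key) is also an assumption.

```python
from heapq import merge
from itertools import groupby


def matchingfrequencies_sketch(*seqs, key=None):
    """Yield (item, number of sorted sequences containing it)."""

    def distinct(seq):
        # Collapse runs of equal items so each sequence counts an item once
        return (next(group) for _, group in groupby(seq, key=key))

    # heapq.merge holds only one pending item per input sequence
    merged = merge(*(distinct(seq) for seq in seqs), key=key)
    for _, group in groupby(merged, key=key):
        run = list(group)  # at most one entry per sequence
        yield run[0], len(run)


print(list(matchingfrequencies_sketch(
    [1, 2, 2, 3],
    [2, 3, 4, 5],
    [1, 3, 3, 4],
)))
# [(1, 2), (2, 2), (3, 3), (4, 2), (5, 1)]
```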
sortedmatch
sortedmatch(a, b, default=None) matches two sorted sequences a and b in pairs, such that the total number of matching pairs is maximized.
- If multiple alignments have the same number of matching pairs, the leftmost one is returned.
Success
sortedmatch is more efficient than seqtools.match.
list(sortedmatch(
    [1, 2, 2, 3],
    [   2, 3, 4, 4]
))
# [(1, None),
#  (2, 2),
#  (2, None),
#  (3, 3),
#  (None, 4),
#  (None, 4)]
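For sorted inputs, a greedy merge that pairs equal elements as soon as they meet reproduces the output shown in this example. The sketch below illustrates that idea and is not taken from the library; sortedmatch_sketch is a hypothetical name.

```python
def sortedmatch_sketch(a, b, default=None):
    """Align two sorted iterables, pairing equal elements and padding the rest."""
    ita, itb = iter(a), iter(b)
    end = object()  # sentinel marking an exhausted iterator
    x, y = next(ita, end), next(itb, end)
    while x is not end and y is not end:
        if x < y:
            yield x, default  # x has no partner left in b
            x = next(ita, end)
        elif y < x:
            yield default, y  # y has no partner left in a
            y = next(itb, end)
        else:
            yield x, y        # matched pair
            x, y = next(ita, end), next(itb, end)
    while x is not end:
        yield x, default
        x = next(ita, end)
    while y is not end:
        yield default, y
        y = next(itb, end)


print(list(sortedmatch_sketch([1, 2, 2, 3], [2, 3, 4, 4])))
# [(1, None), (2, 2), (2, None), (3, 3), (None, 4), (None, 4)]
```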
sortedjoin
sortedjoin(leftseq, rightseq, leftkey=None, rightkey=None, leftdefault=no_default, rightdefault=no_default) joins two sequences, optionally according to leftkey and rightkey, respectively. Outer join is also supported.
- The two sequences must already be sorted according to leftkey and rightkey, respectively.
Success
sortedjoin is more efficient than seqtools.join and the underlying toolz.itertoolz.join.
list(sortedjoin(
    [   -1, -1, -2,    -4, -5,    -6],
    [0,  1,  1,  2, 3,  4,  5,  5],
    leftkey=abs, leftdefault=None
))
# [(None, 0),
#  (-1, 1),
#  (-1, 1),
#  (-1, 1),
#  (-1, 1),
#  (-2, 2),
#  (None, 3),
#  (-4, 4),
#  (-5, 5),
#  (-5, 5)]
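The join above behaves like a classic sort-merge join: both inputs are grouped by key, and the groups are advanced in lockstep. The sketch below is an illustration under that assumption. sortedjoin_sketch, _groups and _missing are hypothetical names, with _missing standing in for the library's no_default; the reading that leftdefault fills the left slot for unmatched right items is inferred from the example.

```python
from itertools import groupby, product

_missing = object()  # hypothetical stand-in for the library's no_default


def _groups(seq, key):
    # Materialize each run of equal keys; runs are usually short
    for k, group in groupby(seq, key=key):
        yield k, list(group)


def sortedjoin_sketch(leftseq, rightseq, leftkey=None, rightkey=None,
                      leftdefault=_missing, rightdefault=_missing):
    """Sort-merge join of two iterables already sorted by their keys."""
    lgroups, rgroups = _groups(leftseq, leftkey), _groups(rightseq, rightkey)
    end = (None, None)  # a group of None marks an exhausted side
    lk, lg = next(lgroups, end)
    rk, rg = next(rgroups, end)
    while lg is not None or rg is not None:
        if rg is None or (lg is not None and lk < rk):
            # Left key has no partner: emit only if a right default was given
            if rightdefault is not _missing:
                for left in lg:
                    yield left, rightdefault
            lk, lg = next(lgroups, end)
        elif lg is None or rk < lk:
            # Right key has no partner: emit only if a left default was given
            if leftdefault is not _missing:
                for right in rg:
                    yield leftdefault, right
            rk, rg = next(rgroups, end)
        else:
            # Keys match: every left item pairs with every right item
            yield from product(lg, rg)
            lk, lg = next(lgroups, end)
            rk, rg = next(rgroups, end)


result = list(sortedjoin_sketch(
    [-1, -1, -2, -4, -5, -6],
    [0, 1, 1, 2, 3, 4, 5, 5],
    leftkey=abs, leftdefault=None,
))
print(result)
```

In this run, -6 is dropped because no rightdefault is supplied, while 0 and 3 are kept and paired with None.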