Sub-Sequence without Gap
Sequence Matching without Gap¶
Tools for matching sequences (including strings), without gaps allowed between matching items.
issubseq¶
issubseq(a, b) checks if a is a sub-sequence of b.
Warning
This function reads the first sequence at once.
issubseq( [ 0, 1, 0], [0, 0, 1, 0, 1, 0] ) # True
bestsubseq¶
bestsubseq(a, key) finds the best sub-sequence of a that maximizes the key function key.
Warning
This function reads the sequence at once.
bestsubseq([1, -2, 3, -4, 5, -6], sum) # [5]
findallsubseqs¶
findallsubseqs(a, b, overlap=False) returns all the positions where a is a sub-sequence of b.
-
In default, no overlapping is allowed. You can change this behavior by specify
overlap. -
Unlike other function in
seqtools, empty list is returned whenais empty.
Warning
This function reads the first sequence at once.
list(findallsubseqs( [ 0, 1, 0], [0, 0, 1, 0, 1, 0] )) # [1] list(findallsubseqs( [ 0, 1, 0], # [ 0, 1, 0], [0, 0, 1, 0, 1, 0], overlap=True )) # [1, 3]
findsubseq¶
findsubseq(a, b) returns the first position where a is a sub-sequence of b, or -1 when not found.
Warning
This function reads the first sequence at once.
findsubseq( [ 0, 1, 0], [0, 0, 1, 0, 1, 0] ) # 1
commonsubseq¶
commonsubseq(a, b) finds the longest common sub-sequence among two sequences a and b.
Warning
This function reads all sequences at once.
list(commonsubseq( [ 0, 1, 1, 0, 1], [0, 0, 1, 1, 1] )) # [0, 1, 1]
Sub-Sequence Enumeration without Gap¶
Tools for enumerating sub-sequences without gap.
enumeratesubseqs¶
enumeratesubseqs(seq) enumerates all of seq‘s non-empty sub-sequences in lexicographical order.
- Although
seqis a sub-sequence of itself, it is not returned.
Warning
This function reads the sequence at once.
list(enumeratesubseqs([0, 1, 0, 2])) # [[0], # [0, 1], # [0, 1, 0], # [1], # [1, 0], # [1, 0, 2], # [0], # [0, 2], # [2]]
nonsharingsubseqs¶
nonsharingsubseqs(*seqs, closed=True) finds all the non-sharing non-empty sub-sequences among seqs, such that the item of each sub-sequence only appears in any sequence of seqs containing that respective sub-sequence.
-
Each sub-sequence is a tuple of items.
-
closed=Truecan be specified to return only the longest sub-sequences, i.e. no sub-sequence of is a sub-sequence of another sub-sequence.
Warning
This function reads all sequences at once.
db = [ [0, 1, 2, 3, 4], [1, 1, 1, 3, 4], [2, 1, 2, 2, 0], [1, 1, 1, 2, 2], ] nonsharingsubseqs(db) # {(0,): 2, # (1,): 4, # (2,): 3, # (3, 4): 2} nonsharingsubseqs(db, closed=False) # {(0,): 2, # (1,): 4, # (2,): 3, # (3,): 2, # (3, 4): 2, # (4,): 2}
Sequence Partition¶
Tools for sequence partition.
partitionbysubseqs¶
partitionbysubseqs(subseqs, seq) finds the partitions of sequence seq, according to a known sets of sub-sequences subseqs.
- For unknown sub-sequences, the longest ones are outputted.
Warning
This function reads the sequence seq at once.
list(partitionbysubseqs( [ (0,), (1,), (2,), (3, 4) ], [-1, 0, 1, 1, 5, 6, 3, 4, 7] )) # [[-1], [0], [1], [1], [5, 6], [3, 4], [7]]