Sub-Sequence without Gap
Sequence Matching without Gap¶
Tools for matching sequences (including strings), without gaps allowed between matching items.
issubseq
¶
issubseq(a, b)
checks if a
is a sub-sequence of b
.
Warning
This function reads the first sequence at once.
issubseq( [ 0, 1, 0], [0, 0, 1, 0, 1, 0] ) # True
bestsubseq
¶
bestsubseq(a, key)
finds the best sub-sequence of a
that maximizes the key function key
.
Warning
This function reads the sequence at once.
bestsubseq([1, -2, 3, -4, 5, -6], sum) # [5]
findallsubseqs
¶
findallsubseqs(a, b, overlap=False)
returns all the positions where a
is a sub-sequence of b
.
-
In default, no overlapping is allowed. You can change this behavior by specify
overlap
. -
Unlike other function in
seqtools
, empty list is returned whena
is empty.
Warning
This function reads the first sequence at once.
list(findallsubseqs( [ 0, 1, 0], [0, 0, 1, 0, 1, 0] )) # [1] list(findallsubseqs( [ 0, 1, 0], # [ 0, 1, 0], [0, 0, 1, 0, 1, 0], overlap=True )) # [1, 3]
findsubseq
¶
findsubseq(a, b)
returns the first position where a
is a sub-sequence of b
, or -1
when not found.
Warning
This function reads the first sequence at once.
findsubseq( [ 0, 1, 0], [0, 0, 1, 0, 1, 0] ) # 1
commonsubseq
¶
commonsubseq(a, b)
finds the longest common sub-sequence among two sequences a
and b
.
Warning
This function reads all sequences at once.
list(commonsubseq( [ 0, 1, 1, 0, 1], [0, 0, 1, 1, 1] )) # [0, 1, 1]
Sub-Sequence Enumeration without Gap¶
Tools for enumerating sub-sequences without gap.
enumeratesubseqs
¶
enumeratesubseqs(seq)
enumerates all of seq
‘s non-empty sub-sequences in lexicographical order.
- Although
seq
is a sub-sequence of itself, it is not returned.
Warning
This function reads the sequence at once.
list(enumeratesubseqs([0, 1, 0, 2])) # [[0], # [0, 1], # [0, 1, 0], # [1], # [1, 0], # [1, 0, 2], # [0], # [0, 2], # [2]]
nonsharingsubseqs
¶
nonsharingsubseqs(*seqs, closed=True)
finds all the non-sharing non-empty sub-sequences among seqs
, such that the item of each sub-sequence only appears in any sequence of seqs
containing that respective sub-sequence.
-
Each sub-sequence is a tuple of items.
-
closed=True
can be specified to return only the longest sub-sequences, i.e. no sub-sequence of is a sub-sequence of another sub-sequence.
Warning
This function reads all sequences at once.
db = [ [0, 1, 2, 3, 4], [1, 1, 1, 3, 4], [2, 1, 2, 2, 0], [1, 1, 1, 2, 2], ] nonsharingsubseqs(db) # {(0,): 2, # (1,): 4, # (2,): 3, # (3, 4): 2} nonsharingsubseqs(db, closed=False) # {(0,): 2, # (1,): 4, # (2,): 3, # (3,): 2, # (3, 4): 2, # (4,): 2}
Sequence Partition¶
Tools for sequence partition.
partitionbysubseqs
¶
partitionbysubseqs(subseqs, seq)
finds the partitions of sequence seq
, according to a known sets of sub-sequences subseqs
.
- For unknown sub-sequences, the longest ones are outputted.
Warning
This function reads the sequence seq
at once.
list(partitionbysubseqs( [ (0,), (1,), (2,), (3, 4) ], [-1, 0, 1, 1, 5, 6, 3, 4, 7] )) # [[-1], [0], [1], [1], [5, 6], [3, 4], [7]]