Sub-Sequence with Gap
Sequence Matching with Gap¶
Tools for matching sequences (including strings), with gaps allowed between matching items.
issubseqwithgap
¶
issubseqwithgap(a, b)
checks if a
is a sub-sequence of b
, where gaps are allowed.
list(issubseqwithgap( [0, 1, 1], [0, 0, 1, 0, 1, 0] )) # True
bestsubseqwithgap
¶
bestsubseqwithgap(a, key)
finds the best sub-sequence of a
that maximizes the key function key
, where gaps are allowed.
Warning
This function reads the sequence at once.
bestsubseqwithgap([1, -2, 3, -4, 5, -6], sum) # [1, 3, 5]
findallsubseqswithgap
¶
findallsubseqswithgap(a, b, overlap=False)
returns all the positions where a
is a sub-sequence of b
.
-
In default, no overlapping is allowed. You can change this behavior by specify
overlap
. -
Unlike other function in
seqtools
, empty list is returned whena
is empty.
Warning
This function reads all sequences at once.
list(findallsubseqswithgap( [0, 1, 1], # [ 0, 1, 1], [0, 0, 1, 0, 1, 0, 1, 1] )) # [[0, 2, 4], [1, 6, 7]] # Enumerates all the possible matchings. list(findallsubseqswithgap( [0, 1, 1], # [0, 1, 1], # ... # [ 0, 1, 1], [0, 0, 1, 0, 1, 0, 1, 1], overlap=True )) # [[0, 2, 4], # [0, 2, 6], # [0, 2, 7], # [0, 4, 6], # [0, 4, 7], # [0, 6, 7], # [1, 2, 4], # [1, 2, 6], # [1, 2, 7], # [1, 4, 6], # [1, 4, 7], # [1, 6, 7], # [3, 4, 6], # [3, 4, 7], # [3, 6, 7], # [5, 6, 7]]
findsubseqwithgap
¶
findsubseqwithgap(a, b)
returns the matching positions where a
is a sub-sequence of b
, where gaps are allowed, or None
when not found.
list(findsubseqwithgap( [0, 1, 1], [0, 0, 1, 0, 1, 0] )) # [0, 2, 4]
commonsubseqwithgap
¶
commonsubseqwithgap(a, b)
finds the longest common sub-sequence among two sequences a
and b
, where gaps are allowed.
Warning
This function reads all sequences at once.
Tip
To work on more than two sequences, please refer to PrefixSpan-py using the following snippet.
max((p for _, p in PrefixSpan(seqs).frequent(len(seqs))), key=len)
list(commonsubseqwithgap( [0, 1, 1, 0, 1], [0, 0, 1, 1, 1] )) # [0, 1, 1, 1]
Sub-Sequence Enumeration with Gap¶
Tools for enumerating sub-sequences with gap.
enumeratesubseqswithgap
¶
enumeratesubseqswithgap(seq)
enumerates all of seq
‘s non-empty sub-sequences in lexicographical order.
- Although
seq
is a sub-sequence of itself, it is not returned.
Warning
This function reads the sequence at once.
list(enumeratesubseqswithgap([0, 1, 0, 2])) # [(0,), # (1,), # (0,), # (2,), # (0, 1), # (0, 0), # (0, 2), # (1, 0), # (1, 2), # (0, 2), # (0, 1, 0), # (0, 1, 2), # (0, 0, 2), # (1, 0, 2)]