

Sub-Sequence without Gap

Sequence Matching without Gap¶

Tools for matching sequences (including strings), without gaps allowed between matching items.

`issubseq`¶

issubseq(a, b) checks if a is a sub-sequence of b.

Warning

This function reads the first sequence at once.

issubseq(
    [   0, 1, 0],
    [0, 0, 1, 0, 1, 0]
)
# True

`bestsubseq`¶

bestsubseq(a, key) finds the best sub-sequence of a that maximizes the key function key.

Warning

This function reads the sequence at once.

bestsubseq([1, -2, 3, -4, 5, -6], sum)
# [5]

`findallsubseqs`¶

findallsubseqs(a, b, overlap=False) returns all the positions where a is a sub-sequence of b.

In default, no overlapping is allowed. You can change this behavior by specify overlap.
Unlike other function in seqtools, empty list is returned when a is empty.

Warning

This function reads the first sequence at once.

list(findallsubseqs(
    [   0, 1, 0],
    [0, 0, 1, 0, 1, 0]
))
# [1]

list(findallsubseqs(
    [   0, 1, 0],
  # [         0, 1, 0],
    [0, 0, 1, 0, 1, 0],
    overlap=True
))
# [1, 3]

`findsubseq`¶

findsubseq(a, b) returns the first position where a is a sub-sequence of b, or -1 when not found.

Warning

This function reads the first sequence at once.

findsubseq(
    [   0, 1, 0],
    [0, 0, 1, 0, 1, 0]
)
# 1

`commonsubseq`¶

commonsubseq(a, b) finds the longest common sub-sequence among two sequences a and b.

Warning

This function reads all sequences at once.

list(commonsubseq(
    [   0, 1, 1,   0, 1],
    [0, 0, 1, 1, 1]
))
# [0, 1, 1]

Sub-Sequence Enumeration without Gap¶

Tools for enumerating sub-sequences without gap.

`enumeratesubseqs`¶

enumeratesubseqs(seq) enumerates all of seq‘s non-empty sub-sequences in lexicographical order.

Although seq is a sub-sequence of itself, it is not returned.

Warning

This function reads the sequence at once.

list(enumeratesubseqs([0, 1, 0, 2]))
# [[0],
#  [0, 1],
#  [0, 1, 0],
#  [1],
#  [1, 0],
#  [1, 0, 2],
#  [0],
#  [0, 2],
#  [2]]

`nonsharingsubseqs`¶

nonsharingsubseqs(*seqs, closed=True) finds all the non-sharing non-empty sub-sequences among seqs, such that the item of each sub-sequence only appears in any sequence of seqs containing that respective sub-sequence.

Each sub-sequence is a tuple of items.
closed=True can be specified to return only the longest sub-sequences, i.e. no sub-sequence of is a sub-sequence of another sub-sequence.

Warning

This function reads all sequences at once.

db = [
    [0, 1, 2, 3, 4],
    [1, 1, 1, 3, 4],
    [2, 1, 2, 2, 0],
    [1, 1, 1, 2, 2],
]

nonsharingsubseqs(db)
# {(0,): 2,
#  (1,): 4,
#  (2,): 3,
#  (3, 4): 2}

nonsharingsubseqs(db, closed=False)
# {(0,): 2,
#  (1,): 4,
#  (2,): 3,
#  (3,): 2,
#  (3, 4): 2,
#  (4,): 2}

Sequence Partition¶

Tools for sequence partition.

`partitionbysubseqs`¶

partitionbysubseqs(subseqs, seq) finds the partitions of sequence seq, according to a known sets of sub-sequences subseqs.

For unknown sub-sequences, the longest ones are outputted.

Warning

This function reads the sequence seq at once.

list(partitionbysubseqs(
    [
        (0,),
        (1,),
        (2,),
        (3, 4)
    ],
    [-1, 0, 1, 1, 5, 6, 3, 4, 7]
))
# [[-1], [0], [1], [1], [5, 6], [3, 4], [7]]

Sub-Sequence without Gap

Sequence Matching without Gap¶

issubseq¶

bestsubseq¶

findallsubseqs¶

findsubseq¶

commonsubseq¶

Sub-Sequence Enumeration without Gap¶

enumeratesubseqs¶

nonsharingsubseqs¶

Sequence Partition¶

partitionbysubseqs¶

`issubseq`¶

`bestsubseq`¶

`findallsubseqs`¶

`findsubseq`¶

`commonsubseq`¶

`enumeratesubseqs`¶

`nonsharingsubseqs`¶

`partitionbysubseqs`¶