README
Featured on GitHub’s Trending Python repos on May 25, 2018. Thank you so much for your support!
145+ extra higher-level functional tools that go beyond standard library’s itertools, functools, etc. and popular third-party libraries like toolz, fancy, and more-itertools.
- Like toolz and others, most of the tools are designed to be efficient, pure, and lazy. Several useful yet non-functional tools are also included.
- While toolz and others target basic scenarios, this library targets more advanced and higher-level scenarios.
- A few useful CLI tools for respective functions are also installed. They are available as extratools-[func].
Full documentation is available here.
Why this library?
Typical pseudocode has fewer than 20 lines, where each line is a higher-level description. However, when implementing it, many lower-level details have to be filled in.
This library reduces the burden of writing and refining the lower-level details again and again, by including an extensive set of carefully designed general purpose higher-level tools.
Current status and future plans?
There are currently 140+ functions among 17 categories, 3 data structures, and 3 CLI tools.
- Currently adopted by TopSim and PrefixSpan-py.
This library is under active development, and new tools are added on a weekly basis.
- Any idea or contribution is highly welcome.
Besides many other interesting ideas, I am planning to make the following updates in the coming days/weeks/months.
- Add dicttools.unflatten and jsontools.unflatten.
- Add trie and suffixtree (according to generalized suffix tree).
- Update seqtools.align to support more than two sequences.
No plan to implement tools that are well covered by other popular libraries.
Which tools are available?
- Function Categories:
  - debugtools
  - dicttools
  - gittools
  - graphtools
  - htmltools
  - jsontools
  - mathtools
  - misctools
  - printtools
  - rangetools
  - recttools
  - seqtools
  - settools
  - sortedtools
  - stattools
  - strtools
  - tabletools
- Data Structures:
  - defaultlist
  - disjointsets
  - segmenttree
- CLI Tools:
  - dicttools.remap
  - jsontools.flatten
  - stattools.teststats
Any example?
Here are ten examples out of well over a hundred tools.
jsontools.flatten(data, force=False) flattens a JSON object by returning all the tuples, each with a path and the respective value.

    import json
    from extratools.jsontools import flatten

    flatten(json.loads("""{
      "name": "John",
      "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York"
      },
      "phoneNumbers": [
        {
          "type": "home",
          "number": "212 555-1234"
        },
        {
          "type": "office",
          "number": "646 555-4567"
        }
      ],
      "children": [],
      "spouse": null
    }"""))
    # {'name': 'John',
    #  'address.streetAddress': '21 2nd Street',
    #  'address.city': 'New York',
    #  'phoneNumbers[0].type': 'home',
    #  'phoneNumbers[0].number': '212 555-1234',
    #  'phoneNumbers[1].type': 'office',
    #  'phoneNumbers[1].number': '646 555-4567',
    #  'children': [],
    #  'spouse': None}
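To give a feel for what flattening involves, here is a minimal pure-Python sketch that reproduces the path format shown above. It is not the library's implementation: `flatten_sketch` is a hypothetical name, and treating empty containers as leaf values is an assumption inferred only from the `'children': []` entry in the example output.

```python
import json


def flatten_sketch(data, prefix=""):
    """Yield (path, value) pairs for every leaf of a parsed JSON value."""
    if isinstance(data, dict) and data:
        for key, value in data.items():
            # Nested keys are joined with a dot.
            path = f"{prefix}.{key}" if prefix else key
            yield from flatten_sketch(value, path)
    elif isinstance(data, list) and data:
        for index, value in enumerate(data):
            # List positions are appended in brackets.
            yield from flatten_sketch(value, f"{prefix}[{index}]")
    else:
        # Scalars, None, and empty containers are kept as leaf values (assumption).
        yield prefix, data


doc = json.loads('{"a": {"b": 1}, "c": [{"d": 2}, 3], "e": []}')
flat = dict(flatten_sketch(doc))
print(flat)
```

The sketch does not escape keys that themselves contain dots or brackets, so two different documents could in principle produce the same paths.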
rangetools.gaps(covered, whole=(-inf, inf)) computes the uncovered ranges of the whole range whole, given the covered ranges covered.

    from math import inf
    from extratools.rangetools import gaps

    list(gaps(
        [(-inf, 0), (0.1, 0.2), (0.5, 0.7), (0.6, 0.9)],
        (0, 1)
    ))
    # [(0, 0.1), (0.2, 0.5), (0.9, 1)]
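One way to compute such gaps is a single sweep over the covered ranges in sorted order. The sketch below is an illustration under assumptions, not the library's code: `gaps_sketch` is a hypothetical name, ranges are plain `(start, end)` tuples, and no distinction is made between open and closed endpoints.

```python
from math import inf


def gaps_sketch(covered, whole=(-inf, inf)):
    """Yield the parts of `whole` that no range in `covered` overlaps."""
    cursor, end = whole
    for lo, hi in sorted(covered):
        if cursor >= end:
            # Everything up to the end of `whole` is already covered.
            return
        if lo > cursor:
            # The stretch between the cursor and this range is uncovered.
            yield (cursor, min(lo, end))
        # Overlapping or nested ranges never move the cursor backwards.
        cursor = max(cursor, hi)
    if cursor < end:
        yield (cursor, end)


result = list(gaps_sketch(
    [(-inf, 0), (0.1, 0.2), (0.5, 0.7), (0.6, 0.9)],
    (0, 1)
))
print(result)
```

Sorting first means overlapping ranges such as `(0.5, 0.7)` and `(0.6, 0.9)` are handled without a separate merge step.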
recttools.heatmap(rect, rows, cols, points, usepos=False) computes the heatmap within rectangle rect by a grid of rows rows and cols columns.

    from extratools.recttools import heatmap

    heatmap(
        ((1, 1), (3, 4)),
        3, 4,
        [(1.5, 1.25), (1.5, 1.75), (2.75, 2.75), (2.75, 3.5), (3.5, 2.5)]
    )
    # {1: 2, 7: 1, 11: 1, None: 1}

    heatmap(
        ((1, 1), (3, 4)),
        3, 4,
        [(1.5, 1.25), (1.5, 1.75), (2.75, 2.75), (2.75, 3.5), (3.5, 2.5)],
        usepos=True
    )
    # {(0, 1): 2, (1, 3): 1, (2, 3): 1, None: 1}
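The grid arithmetic behind the counts above can be sketched in a few lines. This is an illustration only, with several assumptions: `heatmap_sketch` is a hypothetical name, `rect` is read as `((x1, y1), (x2, y2))`, columns follow x and rows follow y (which is what the example output suggests), and the rectangle is treated as half-open, so points on the far edges count as outside.

```python
from collections import Counter


def heatmap_sketch(rect, rows, cols, points, usepos=False):
    """Count points per grid cell; points outside `rect` are counted under None."""
    (x1, y1), (x2, y2) = rect
    cell_w = (x2 - x1) / cols
    cell_h = (y2 - y1) / rows
    counts = Counter()
    for x, y in points:
        if x1 <= x < x2 and y1 <= y < y2:
            # Clamp to guard against floating-point rounding at the far edge.
            row = min(int((y - y1) // cell_h), rows - 1)
            col = min(int((x - x1) // cell_w), cols - 1)
            # Cells are numbered row by row unless positions are requested.
            counts[(row, col) if usepos else row * cols + col] += 1
        else:
            counts[None] += 1
    return dict(counts)


pts = [(1.5, 1.25), (1.5, 1.75), (2.75, 2.75), (2.75, 3.5), (3.5, 2.5)]
by_index = heatmap_sketch(((1, 1), (3, 4)), 3, 4, pts)
by_pos = heatmap_sketch(((1, 1), (3, 4)), 3, 4, pts, usepos=True)
print(by_index, by_pos)
```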
settools.setcover(whole, covered, key=len) solves the set cover problem by covering the universe set whole as best as possible, using a subset of the covering sets covered.

    from extratools.settools import setcover

    list(setcover(
        {1, 2, 3, 4, 5},
        [{1, 2, 3}, {2, 3, 4}, {2, 4, 5}]
    ))
    # [{1, 2, 3}, {2, 4, 5}]
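Set cover is NP-hard, so practical solvers usually rely on an approximation. The sketch below shows the classic greedy heuristic, which repeatedly picks the set covering the most still-uncovered elements. Whether the library uses this exact strategy is an assumption, and `greedy_setcover` is a hypothetical name.

```python
def greedy_setcover(whole, covered):
    """Greedily yield sets from `covered` until `whole` is covered as far as possible."""
    remaining = set(whole)
    candidates = [set(s) for s in covered]
    while remaining:
        # Pick the candidate that covers the most uncovered elements;
        # ties go to the earliest candidate.
        best = max(candidates, key=lambda s: len(s & remaining), default=None)
        if best is None or not (best & remaining):
            # Nothing left can cover any remaining element.
            break
        yield best
        remaining -= best


cover = list(greedy_setcover(
    {1, 2, 3, 4, 5},
    [{1, 2, 3}, {2, 3, 4}, {2, 4, 5}]
))
print(cover)
```

The greedy choice is not always optimal, but it is simple and stops cleanly when some elements cannot be covered at all.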
seqtools.compress(data, key=None) compresses the sequence data by encoding consecutive identical items to a tuple of item and count, according to run-length encoding.

    from extratools.seqtools import compress

    list(compress([1, 2, 2, 3, 3, 3, 4, 4, 4, 4]))
    # [(1, 1), (2, 2), (3, 3), (4, 4)]
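Run-length encoding itself is easy to express with the standard library. The following sketch uses `itertools.groupby` and adds a matching decoder. The names `rle_encode` and `rle_decode` are hypothetical, and the sketch leaves out the `key` parameter, whose exact semantics are not described here.

```python
from itertools import chain, groupby, repeat


def rle_encode(data):
    """Yield (item, count) for each run of consecutive equal items."""
    for item, run in groupby(data):
        yield item, sum(1 for _ in run)


def rle_decode(pairs):
    """Expand (item, count) pairs back into the original sequence."""
    return chain.from_iterable(repeat(item, count) for item, count in pairs)


encoded = list(rle_encode([1, 2, 2, 3, 3, 3, 4, 4, 4, 4]))
decoded = list(rle_decode(encoded))
print(encoded)
print(decoded)
```

Both functions are lazy, so they also work on long or streamed inputs.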
seqtools.mergeseqs(seqs, default=None, key=None) merges the sequences of equal length in seqs into a single sequence. Returns None if there is a conflict in any position.

    from extratools.seqtools import mergeseqs

    seqs = [
        (0   , 0   , None, 0   ),
        (None, 1   , 1   , None),
        (2   , None, None, None),
        (None, None, None, None)
    ]

    list(mergeseqs(seqs[1:]))
    # [2,
    #  1,
    #  1,
    #  None]

    mergeseqs(seqs)
    # None
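The merge rule can be pictured column by column: each position may hold at most one distinct non-default value. Here is a minimal sketch of that idea, assuming hashable items and ignoring the `key` parameter. `mergeseqs_sketch` is a hypothetical name and not the library's implementation.

```python
def mergeseqs_sketch(seqs, default=None):
    """Merge equal-length sequences; return None if any position conflicts."""
    merged = []
    for column in zip(*seqs):
        # Collect the distinct non-default values at this position.
        values = {v for v in column if v != default}
        if len(values) > 1:
            # Two sequences disagree here, so no merge exists.
            return None
        merged.append(values.pop() if values else default)
    return merged


seqs = [
    (0, 0, None, 0),
    (None, 1, 1, None),
    (2, None, None, None),
    (None, None, None, None),
]
print(mergeseqs_sketch(seqs[1:]))
print(mergeseqs_sketch(seqs))
```

In the full input, the first position holds both 0 and 2, which is why the merge fails.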
strtools.smartsplit(s) finds the best delimiter to automatically split string s. Returns a tuple of delimiter and split substrings.

    from extratools.strtools import smartsplit

    smartsplit("abcde")
    # (None,
    #  ['abcde'])

    smartsplit("a b c d e")
    # (' ',
    #  ['a', 'b', 'c', 'd', 'e'])

    smartsplit("/usr/local/lib/")
    # ('/',
    #  ['', 'usr', 'local', 'lib', ''])

    smartsplit("a ::b:: c :: d")
    # ('::',
    #  ['a ', 'b', ' c ', ' d'])

    smartsplit("{1, 2, 3, 4, 5}")
    # (', ',
    #  ['{1', '2', '3', '4', '5}'])
strtools.learnrewrite(src, dst, minlen=3) learns the respective regular expression and template to rewrite src to dst.

    from extratools.strtools import learnrewrite

    learnrewrite(
        "Elisa likes Apple.",
        "Apple is Elisa's favorite."
    )
    # ('(.*) likes (.*).',
    #  "{1} is {0}'s favorite.")
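A learned pair like the one above can then be applied to new strings using only the standard library. The sketch below hard-codes the pattern and template from the example output. Applying them with `re.fullmatch` and `str.format` is an assumption about the intended use, and `apply_rewrite` is a hypothetical helper.

```python
import re

# Pattern and template copied from the example output above.
pattern = "(.*) likes (.*)."
template = "{1} is {0}'s favorite."


def apply_rewrite(pattern, template, text):
    """Rewrite `text` with the template, or return None if the pattern does not match."""
    match = re.fullmatch(pattern, text)
    if match is None:
        return None
    # Captured groups fill the positional fields {0}, {1}, ... of the template.
    return template.format(*match.groups())


print(apply_rewrite(pattern, template, "Elisa likes Apple."))
print(apply_rewrite(pattern, template, "Bob likes Pears."))
```

Note that the trailing `.` in the learned pattern is an unescaped regular expression wildcard, so it matches any final character, not only a period.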
tabletools.parsebymarkdown(text) parses a text of multiple lines to a table, according to Markdown format.

    from extratools.tabletools import parsebymarkdown

    list(parsebymarkdown("""
    | foo | bar |
    | --- | --- |
    | baz | bim |
    """))
    # [['foo', 'bar'],
    #  ['baz', 'bim']]
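For simple pipe tables, the parsing step amounts to splitting each line on `|` and dropping the separator row. The sketch below illustrates this under assumptions: `parse_markdown_table` is a hypothetical name, escaped pipes inside cells are not handled, and empty cells at the edges of a row are lost. The library may well cover more of the Markdown table syntax.

```python
import re


def parse_markdown_table(text):
    """Yield each row of a simple Markdown pipe table as a list of cell strings."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header separator row, e.g. | --- | :-: |
        if all(re.fullmatch(r":?-+:?", cell) for cell in cells):
            continue
        yield cells


table = list(parse_markdown_table("""
| foo | bar |
| --- | --- |
| baz | bim |
"""))
print(table)
```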
tabletools.hasheader(data) returns the confidence (between 0 and 1) of whether the first row of the table data is a header.

    from extratools.tabletools import hasheader

    t = [
        ['Los Angeles'  , '34°03′'   , '118°15′'  ],
        ['New York City', '40°42′46″', '74°00′21″'],
        ['Paris'        , '48°51′24″', '2°21′03″' ]
    ]

    hasheader(t)
    # 0.0

    hasheader([
        ['City', 'Latitude', 'Longitude']
    ] + t)
    # 0.6666666666666666

    hasheader([
        ['C1', 'C2', 'C3']
    ] + t)
    # 1.0
How to install?
This package is available on PyPI. Just use pip3 install -U extratools to install it.
How to cite?
When using this library for research purposes, please cite it as follows.
    @misc{extratools,
      author = {Chuancong Gao},
      title = {{extratools}},
      howpublished = "\url{https://github.com/chuanconggao/extratools}",
      year = {2018}
    }
Any recommended library?
There are several great libraries recommended for use together with extratools:
- regex
- sortedcontainers
- toolz
- sh