Python API
This page provides detailed documentation for the tszip
Python API.
Usage example
Tszip can be used directly in Python to provide seamless compression and
decompression of tree sequence files. Here, we run an msprime simulation
and write the output to a .trees.tsz file:
import msprime
import tszip
ts = msprime.simulate(10, random_seed=1)
tszip.compress(ts, "simulation.trees.tsz")
# Later, we load the same tree sequence from the compressed file.
ts = tszip.decompress("simulation.trees.tsz")
Note
For very small simulations like this example, the tszip file may be larger than the original uncompressed file.
API
tszip.compress(ts, destination, variants_only=False)
Compresses the specified tree sequence and writes it to the specified path or file-like object. By default, fully lossless compression is used so that tree sequences are identical before and after compression. By specifying the variants_only option, a lossy compression can be used, which discards any information that is not needed to represent the variants (which are stored losslessly).
Parameters:
- ts (tskit.TreeSequence) – The input tree sequence.
- destination (str) – The string, pathlib.Path or file-like object we should write the compressed file to.
- variants_only (bool) – If True, discard all information not necessary to represent the variants in the input file.
tszip.decompress(path)
Decompresses the tszip compressed file and returns a tskit tree sequence instance.
Parameters:
- path (str) – The location of the tszip compressed file to load.
Return type: tskit.TreeSequence
Returns: A tskit.TreeSequence instance corresponding to the specified file.