

Let's split the compressors into categories: the slow, the medium and the fast.

The slow are in the 0 - 10 MB/s range at compression. It's mostly LZMA derivatives (LZMA, LZMA2, XZ, 7-zip default), bzip2 and brotli (from Google).

The medium are in the 10 - 500 MB/s range at compression. It's mostly deflate (used by gzip) and zstd (Facebook). Note that deflate is on the lower end while zstd is on the higher end.

The fast algorithms are around 1 GB/s and above (yes, a whole gigabyte per second) at both compression and decompression. It's mostly lzo, lz4 (by Yann Collet, also the author of zstd) and snappy (Google).

The strongest and slowest algorithms are ideal to compress a single time and decompress many times. For example, Linux packages have been distributed with LZMA compression for the last few years. They used to be tar.gz historically; the switch to stronger compression must have saved a lot of bandwidth on the Linux mirrors.
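To get a feel for these categories on your own data, here is a minimal sketch that times a few compressors from the Python standard library. It only covers deflate, bzip2 and LZMA (the fast family, lz4 and snappy, and zstd are not assumed to be installed), and the function names and sample data are made up for illustration. Python overhead and the choice of sample will skew the numbers, so treat the output as a rough indication rather than a benchmark.

```python
import bz2
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip and report ratio and speed."""
    start = time.perf_counter()
    packed = compress(data)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = decompress(packed)
    decompress_seconds = time.perf_counter() - start

    assert unpacked == data, f"{name} round trip failed"
    megabytes = len(data) / 1e6
    return {
        "name": name,
        "ratio": len(data) / len(packed),
        # max() guards against a zero reading from a coarse timer
        "compress_mb_s": megabytes / max(compress_seconds, 1e-9),
        "decompress_mb_s": megabytes / max(decompress_seconds, 1e-9),
    }


def run_benchmark(data):
    """Run every candidate on the same input and return one row per candidate."""
    candidates = [
        ("deflate (zlib, level 6)", lambda d: zlib.compress(d, 6), zlib.decompress),
        ("bzip2 (level 9)", lambda d: bz2.compress(d, 9), bz2.decompress),
        ("lzma/xz (preset 6)", lambda d: lzma.compress(d, preset=6), lzma.decompress),
    ]
    return [measure(name, c, d, data) for name, c, d in candidates]


if __name__ == "__main__":
    # Hypothetical sample: repetitive log lines. Replace with your real data.
    sample = b"timestamp=1700000000 level=INFO msg=request served in 12ms\n" * 20000
    for row in run_benchmark(sample):
        print(
            f"{row['name']:<26} ratio {row['ratio']:8.1f}  "
            f"compress {row['compress_mb_s']:8.1f} MB/s  "
            f"decompress {row['decompress_mb_s']:8.1f} MB/s"
        )
```

Running it on a file that looks like your production data is more informative than the synthetic sample used here.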

gzip has similar results to everything else that is based on deflate (particularly the zlib library). There are some bugs or edge cases to account for, so you should always test your implementation against your use case. For instance, Kafka has offered snappy compression for a few years (off by default), but the buffers are misconfigured and it cannot achieve any meaningful compression.
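One way to test a setup against a use case is to measure the ratio it really achieves, with the buffer size it really uses. The sketch below is not a reproduction of the Kafka problem; it uses zlib as a stand-in (snappy is not in the Python standard library) and assumes a pipeline that compresses each small buffer independently. The function names, the sample records and the thresholds are all hypothetical.

```python
import zlib


def ratio_whole(data, level=6):
    """Compression ratio when the whole input is compressed in one call."""
    return len(data) / len(zlib.compress(data, level))


def ratio_chunked(data, chunk_size, level=6):
    """Compression ratio when every chunk is compressed on its own,
    as a pipeline with small, independent buffers would do."""
    packed = 0
    for offset in range(0, len(data), chunk_size):
        packed += len(zlib.compress(data[offset:offset + chunk_size], level))
    return len(data) / packed


def check_compression(data, chunk_size, minimum_ratio):
    """Return (ok, achieved_ratio) for the given buffer size."""
    achieved = ratio_chunked(data, chunk_size)
    return achieved >= minimum_ratio, achieved


if __name__ == "__main__":
    # Hypothetical records; substitute a sample of your real messages.
    records = b"".join(
        f"id={i:06d} status=ok latency_ms={i % 97}\n".encode() for i in range(20000)
    )
    print("whole input:", round(ratio_whole(records), 2))
    for size in (64, 1024, 65536):
        ok, achieved = check_compression(records, size, minimum_ratio=2.0)
        print(f"{size:>6}-byte buffers: ratio {achieved:.2f}, acceptable: {ok}")
```

With very small buffers the compressor never sees enough history to find repetitions, so the ratio collapses even though the algorithm itself is fine. A check like this in a test suite catches that kind of misconfiguration early.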
The following benchmark covers the most common compression methods. It's tested against the low-level C libraries with the available flags. The large number of compressors and the similarity between them can cause confusion. It's easier to understand the comparison once you realize that a compressor is just the combination of the following 3 things: an algorithm, with adjustable settings; an archive format; and a tool or a library, also known as the implementation (gzip, tar, 7-zip, zlib, liblzma, libdeflate, etc.).

The algorithm family is the most defining characteristic by far, then comes the implementation. A C library well-optimized over a decade should do a bit better than a random Java lib from GitHub. For example, gzip designates both the tool and its archive format (specific to that tool), but it's based on deflate.
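The split between algorithm and format can be seen directly with Python's zlib and gzip modules. The sketch below compresses the same payload with deflate three times and only changes the wrapper: a bare deflate stream, the zlib format, and the gzip format. The helper names and the sample payload are made up for illustration.

```python
import gzip
import zlib


def wrap_three_ways(payload, level=6):
    """Compress the payload with deflate, wrapped in three different formats."""
    # wbits=-15 asks zlib for a bare deflate stream: no header, no checksum.
    raw_compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    raw = raw_compressor.compress(payload) + raw_compressor.flush()
    return {
        "raw deflate": raw,
        "zlib": zlib.compress(payload, level),
        "gzip": gzip.compress(payload, compresslevel=level),
    }


def unwrap(name, blob):
    """Decode any of the three formats with the same zlib library."""
    # The wbits value tells zlib which wrapper to expect around the stream.
    wbits = {"raw deflate": -15, "zlib": 15, "gzip": 31}[name]
    return zlib.decompress(blob, wbits)


if __name__ == "__main__":
    payload = b"the same payload, three different wrappers\n" * 1000
    for name, blob in wrap_three_ways(payload).items():
        assert unwrap(name, blob) == payload
        print(f"{name:<12} {len(blob)} bytes")
```

The three sizes should differ only by a few bytes of header and checksum, which is why tools built on deflate end up with such similar results.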
