Metrics from Compressing the Human Genome with Six Programs

2017-05-09T21:59:32Z (GMT) by John Lees-Miller

Metrics from running six compression programs (brotli, bzip2, gzip, 7z, xz and zstd) on the human genome from the Human Genome Project, stored in 2bit binary format. For each program, the dataset records

  1. the time it takes to compress the data, in seconds,
  2. the resulting compressed size, in bytes,
  3. the time it takes to decompress the data, in seconds, and
  4. peak memory usage, in kilobytes.

Timings were measured on an Amazon Web Services m3.medium instance.
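
For readers who want to collect comparable numbers themselves, the four metrics listed above could be gathered along the following lines. This is only a minimal sketch and not necessarily how the dataset was produced: the helper names `run_measured` and `measure` are hypothetical, the sketch is Unix-only because it relies on `os.wait4`, and it reports peak memory separately for compression and decompression since the description does not say which one was recorded.

```python
import os
import subprocess
import time


def run_measured(cmd):
    """Run cmd; return (wall-clock seconds, peak resident set size).

    Hypothetical helper. The peak RSS comes from the operating system's
    rusage record for this one child process: kilobytes on Linux,
    bytes on macOS.
    """
    start = time.perf_counter()
    # Standard output is discarded, so commands must write their own files.
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
    _, status, usage = os.wait4(proc.pid, 0)
    elapsed = time.perf_counter() - start
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    # The child was reaped by os.wait4, so record its exit code on the
    # Popen object to stop it from trying to wait again later.
    proc.returncode = exit_code
    if exit_code != 0:
        raise RuntimeError(f"command failed: {cmd}")
    return elapsed, usage.ru_maxrss


def measure(compress_cmd, decompress_cmd, compressed_path):
    """Collect the four metrics for one program (hypothetical helper)."""
    compress_seconds, compress_peak = run_measured(compress_cmd)
    compressed_bytes = os.path.getsize(compressed_path)
    decompress_seconds, decompress_peak = run_measured(decompress_cmd)
    return {
        "compress_seconds": compress_seconds,
        "compressed_bytes": compressed_bytes,
        "decompress_seconds": decompress_seconds,
        "compress_peak_rss": compress_peak,
        "decompress_peak_rss": decompress_peak,
    }
```

As a usage example, assuming the genome is stored in a file named `hg.2bit` (a placeholder name) and a gzip version that supports `-k`, one could call `measure(["gzip", "-k", "-f", "hg.2bit"], ["gzip", "-d", "-c", "hg.2bit.gz"], "hg.2bit.gz")`. Wall-clock timings obtained this way depend heavily on the machine, so they are only comparable across programs run on the same instance type.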