Fast zips

The goal: “process” a zip file as fast as possible.

There are 3 ways to read zip files in the Java standard library:

ZipInputStream, dates from Java 1.1

ZipFile, dates from Java 1.1

The zip file system provider (“ZipFileSystem”), reached through java.nio.file.FileSystems, dates from Java 7
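A minimal sketch of the three read paths, each just listing entry names. The class name ZipReadDemo and the method names are made up for illustration; only the java.util.zip and java.nio.file calls are real API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical demo class: lists entry names through each of the three APIs.
public class ZipReadDemo {

    // 1. ZipInputStream: walks the archive front to back, one entry at a time.
    static List<String> namesViaStream(Path zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (InputStream in = Files.newInputStream(zip);
             ZipInputStream zin = new ZipInputStream(in)) {
            for (ZipEntry e = zin.getNextEntry(); e != null; e = zin.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    // 2. ZipFile: random access, entries come from the central directory.
    static List<String> namesViaZipFile(Path zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    // 3. Zip file system: the archive is mounted and walked like a directory tree.
    static List<String> namesViaFileSystem(Path zip) throws IOException {
        // The cast picks the (Path, ClassLoader) overload, which exists since Java 7.
        try (FileSystem fs = FileSystems.newFileSystem(zip, (ClassLoader) null);
             Stream<Path> walk = Files.walk(fs.getPath("/"))) {
            return walk.filter(Files::isRegularFile)
                       .map(p -> p.toString().substring(1)) // drop the leading "/"
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path zip = Paths.get(args[0]);
        System.out.println(namesViaStream(zip));
        System.out.println(namesViaZipFile(zip));
        System.out.println(namesViaFileSystem(zip));
    }
}
```

The three lists are not guaranteed to agree in order, and the file system variant here skips directory entries, so treat this as a way to compare the APIs rather than as a benchmark harness.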

There are, as far as I can tell, 2 ways to write them:

ZipOutputStream

ZipFileSystem (again)
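A rough sketch of both write paths, each producing an archive with a single entry. ZipWriteDemo and its method names are invented for this note. The "create" property is the documented way to ask the zip file system provider for a new archive, and it assumes the target file does not exist yet.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: writes one entry through each of the two APIs.
public class ZipWriteDemo {

    // 1. ZipOutputStream: entries are written strictly one after another.
    static void writeViaStream(Path zip, String name, byte[] data) throws IOException {
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            out.putNextEntry(new ZipEntry(name));
            out.write(data);
            out.closeEntry();
        }
    }

    // 2. Zip file system: mount a new archive and write into it with Files.
    static void writeViaFileSystem(Path zip, String name, byte[] data) throws IOException {
        URI uri = URI.create("jar:" + zip.toUri());
        Map<String, String> env = new HashMap<>();
        env.put("create", "true"); // create the archive if it is missing
        try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
            Path entry = fs.getPath("/").resolve(name);
            // Parent directories inside the archive have to exist first.
            Files.createDirectories(entry.getParent());
            Files.write(entry, data);
        } // closing the file system is what flushes the archive to disk
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        Path dir = Files.createTempDirectory("zipwrite");
        writeViaStream(dir.resolve("stream.zip"), "dir/hello.txt", data);
        writeViaFileSystem(dir.resolve("fs.zip"), "dir/hello.txt", data);
        System.out.println("wrote archives under " + dir);
    }
}
```

Both paths deflate on the calling thread, one entry at a time.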

This thing

Someone figured out how to reach into ZipOutputStream internals so that the deflating step can be parallelized:

https://github.com/gregsh/parallel-zip

Basically the trick:

How can it be improved?

On zip

Java’s ZipInputStream is not actually conforming: it reads local file headers sequentially from the start of the stream and never consults the central directory, which is supposed to be the source of truth for a zip file. If the central directory doesn’t refer to an entry, that entry is supposed to count as “deleted”. In practice, though: meh.
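To make the discrepancy concrete, here is a sketch that builds an archive containing an entry the central directory does not mention, then lists it with both readers. Everything named here (GhostEntryDemo, ghost.txt, real.txt) is invented for illustration. The construction assumes small archives with no comment, so the end-of-central-directory record is the last 22 bytes, and it relies on ZipFile tolerating data prepended to an archive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo: an entry that exists on disk but not in the central directory.
public class GhostEntryDemo {

    // Builds a complete one-entry archive in memory.
    static byte[] singleEntryZip(String name, String content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            out.putNextEntry(new ZipEntry(name));
            out.write(content.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Local entry of one archive, followed by a second, complete archive.
    static byte[] archiveWithGhost() throws IOException {
        byte[] ghost = singleEntryZip("ghost.txt", "deleted");
        byte[] real = singleEntryZip("real.txt", "kept");
        // Assumption: no archive comment, so the central directory offset
        // sits 6 bytes before the end of the 22-byte end record.
        int cdOffset = ByteBuffer.wrap(ghost).order(ByteOrder.LITTLE_ENDIAN)
                                 .getInt(ghost.length - 6);
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        joined.write(ghost, 0, cdOffset);   // ghost's local header + data only
        joined.write(real, 0, real.length); // the only central directory is real's
        return joined.toByteArray();
    }

    // Sequential read: follows local headers, ignores the central directory.
    static List<String> namesViaStream(byte[] zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zip))) {
            for (ZipEntry e = zin.getNextEntry(); e != null; e = zin.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    // Central directory read: ZipFile needs a real file, so use a temp file.
    static List<String> namesViaZipFile(byte[] zip) throws IOException {
        Path tmp = Files.createTempFile("ghost", ".zip");
        try {
            Files.write(tmp, zip);
            List<String> names = new ArrayList<>();
            try (ZipFile zf = new ZipFile(tmp.toFile())) {
                Enumeration<? extends ZipEntry> entries = zf.entries();
                while (entries.hasMoreElements()) {
                    names.add(entries.nextElement().getName());
                }
            }
            return names;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = archiveWithGhost();
        System.out.println("ZipInputStream: " + namesViaStream(zip));
        System.out.println("ZipFile:        " + namesViaZipFile(zip));
    }
}
```

The stream reader should report both names while ZipFile reports only real.txt, which is the “deleted entry” case from the paragraph above.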