jazzlib – an alternative for reading ZIP files in Java

Java has had zip-reading capabilities for a long time, naturally, because jar files are simply zip files with some metadata. The needed classes reside in the java.util.zip package, namely ZipInputStream and ZipEntry.

Recently, however, ZipInputStream gave me a huge headache. My use case was as simple as this:

  • read the zip entries of a list of zip files (each varying in size, but usually around 20MB)
  • skip to the zip entry that has a certain name (a single text file with only two bytes of contents)
  • read the contents of this zip entry and close the zip
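
The steps above can be sketched roughly as follows. This is not my original code, just a minimal reconstruction with java.util.zip; the class name ZipEntryReader, the entry names and the in-memory sample archive are made up for illustration. Note that getNextEntry() has to read past the data of every entry in front of the one you want, which, as far as I can tell, is where the time goes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipEntryReader {

    /** Returns the bytes of the entry called entryName, or null if there is no such entry. */
    static byte[] readEntry(InputStream in, String entryName) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            // getNextEntry() first skips whatever is left of the previous entry.
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buffer = new byte[4096];
                    int n;
                    while ((n = zip.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                    return out.toByteArray();
                }
            }
        }
        return null;
    }

    /** Hypothetical test data: a filler entry followed by a two-byte text entry. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("payload.bin"));
            zip.write(new byte[64 * 1024]);
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("version.txt"));
            zip.write("42".getBytes(StandardCharsets.US_ASCII));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] content = readEntry(new ByteArrayInputStream(sampleZip()), "version.txt");
        System.out.println(new String(content, StandardCharsets.US_ASCII)); // prints 42
    }
}
```

For real files you would pass a FileInputStream (ideally wrapped in a BufferedInputStream) instead of the in-memory stream.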

Doing this for about 25 files took my Pentium D (2GHz) with 3GB of RAM roughly 20 seconds. Wow, 20 seconds, really? I created a test case and profiled the code in question separately with YourKit (which is a really great tool, by the way!).

The code spent quite a bit of its time in java.util.zip.Inflater.inflateBytes, but that is a native method, so I couldn’t profile any further.

So I went on and searched for an alternative to java.util.zip, and luckily I found one in jazzlib, which provides a pure Java implementation of ZIP compression and decompression. This library is GPL-licensed (with a small exception clause that keeps the GPL from spreading to code that merely uses the library) and comes in two versions: one that duplicates the library classes under java.util.zip (as a drop-in replacement for JDK versions where this package is missing) and one that comes in its own namespace, net.sf.jazzlib.

I went for the second version and reran my test, and it only took about 7 seconds this time. At first I thought that there must be some downside to this approach, so I checked the timings for a complete decompression of the archive, but the timings here were on par with the ones from java.util.zip (roughly 5 seconds for a single 20MB file).
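
To make the two measurements concrete, here is a small sketch of how such a comparison could be set up: one pass that only walks to a named entry, and one that inflates everything. It uses java.util.zip so that it runs without extra jars; assuming the net.sf.jazzlib classes carry the same names (I have not re-checked every signature), switching implementations should mostly be a matter of changing the three zip imports. The class name ZipTiming and the sample archive are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipTiming {

    /** Walks the archive until entryName is reached; returns whether it was found. */
    static boolean seekEntry(byte[] archive, String entryName) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Inflates every entry and returns the total number of uncompressed bytes. */
    static long inflateAll(byte[] archive) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            while (zip.getNextEntry() != null) {
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    total += n;
                }
            }
        }
        return total;
    }

    /** Hypothetical test data: a 1 MB filler entry followed by a two-byte text entry. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("payload.bin"));
            zip.write(new byte[1024 * 1024]);
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("version.txt"));
            zip.write("42".getBytes(StandardCharsets.US_ASCII));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = sampleZip();

        long start = System.nanoTime();
        boolean found = seekEntry(archive, "version.txt");
        long seekMicros = (System.nanoTime() - start) / 1_000;

        start = System.nanoTime();
        long bytes = inflateAll(archive);
        long inflateMicros = (System.nanoTime() - start) / 1_000;

        System.out.println("seek: found=" + found + ", " + seekMicros + " us");
        System.out.println("inflate all: " + bytes + " bytes, " + inflateMicros + " us");
    }
}
```

A single run like this is noisy (JIT warm-up, disk cache), so for numbers worth comparing you would repeat each measurement a few times and use real archives rather than a synthetic one.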

I haven’t tested compression speed, because it doesn’t matter much for my use case, but the decompression speed alone is astonishing. I wonder why nobody else stumbled upon these performance problems before…