{"id":1023,"date":"2011-05-03T15:24:17","date_gmt":"2011-05-03T14:24:17","guid":{"rendered":"http:\/\/www.thomaskeller.biz\/blog\/?p=1023"},"modified":"2011-05-03T15:35:35","modified_gmt":"2011-05-03T14:35:35","slug":"jazzliban-alternative-for-reading-zip-files-in-java","status":"publish","type":"post","link":"https:\/\/www.thomaskeller.biz\/blog\/2011\/05\/03\/jazzliban-alternative-for-reading-zip-files-in-java\/","title":{"rendered":"jazzlib &#8211; an alternative for reading ZIP files in Java"},"content":{"rendered":"<p>Java has had zip-reading capabilities for a long time, naturally because `jar` files are simply compressed zip files with some metadata. The needed classes reside in the `java.util.zip` namespace and are `ZipInputStream` and `ZipEntry`.<\/p>\n<p>Recently, however, `ZipInputStream` gave me a huge headache. My use case was as simple as<\/p>\n<p>* read the zip entries of a list of zip files (each varying in size, but usually around 20MB)<br \/>\n* skip to the zip entry that has a certain name (a single text file with only two bytes of contents)<br \/>\n* read the contents of this zip entry and close the zip<\/p>\n<p>Doing this for about 25 files took my Pentium D (2GHz) with 3GB of RAM roughly **20 seconds**. Wow, 20 seconds, really? 
I created a test case and profiled the code in question separately with [YourKit](http:\/\/www.yourkit.com) (which is a really great tool, by the way!):<\/p>\n<p><a href=\"http:\/\/www.thomaskeller.biz\/blog\/wp-content\/uploads\/2011\/05\/yourkit.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.thomaskeller.biz\/blog\/wp-content\/uploads\/2011\/05\/yourkit-450x114.png\" alt=\"\" title=\"yourkit\" width=\"450\" height=\"114\" class=\"alignright size-medium wp-image-1024\" srcset=\"https:\/\/www.thomaskeller.biz\/blog\/wp-content\/uploads\/2011\/05\/yourkit-450x114.png 450w, https:\/\/www.thomaskeller.biz\/blog\/wp-content\/uploads\/2011\/05\/yourkit.png 647w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>It got stuck quite a bit in `java.util.zip.Inflater.inflateBytes` &#8211; but that seemed to use native code, so I couldn&#8217;t profile any further.<\/p>\n<p>So I went on and searched for an alternative to `java.util.zip` &#8211; and luckily I found one with [jazzlib](http:\/\/jazzlib.sourceforge.net), which provides a pure Java implementation of ZIP compression and decompression. This library is GPL-licensed (with a small exception clause to prevent the pervasiveness of the GPL) and comes in two versions: one that duplicates the library classes underneath `java.util.zip` (as a drop-in replacement for JDK versions where this is missing) and one that comes in its own namespace, `net.sf.jazzlib`.<\/p>\n<p>After I went for the second version, I restarted my test and it only took about **7 seconds** this time. At first I thought that there must be some downside to this approach, so I checked the timings for a complete decompression of the archive, but the timings here were on par with the ones from `java.util.zip` (roughly 5 seconds for a single 20MB file).<\/p>\n<p>I haven&#8217;t tested compression speed, because it doesn&#8217;t matter much for my use case, but the decompression speed alone is astonishing. 
I wonder why nobody else has stumbled upon these performance problems before&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Java has had zip-reading capabilities for a long time, naturally because `jar` files are simply compressed zip files with some metadata. The needed classes reside in the `java.util.zip` namespace and are `ZipInputStream` and `ZipEntry`. Recently, however, `ZipInputStream` gave me a huge headache. My use case was as simple as * read the zip entries of &hellip; <a href=\"https:\/\/www.thomaskeller.biz\/blog\/2011\/05\/03\/jazzliban-alternative-for-reading-zip-files-in-java\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">jazzlib &#8211; an alternative for reading ZIP files in Java<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,11],"tags":[],"class_list":["post-1023","post","type-post","status-publish","format-standard","hentry","category-coding","category-work"],"_links":{"self":[{"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/posts\/1023","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/comments?post=1023"}],"version-history":[{"count":11,"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/posts\/1023\/revisions"}],"predecessor-version":[{"id":1035,"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/posts\/1023\/revisions\/1035"}],"wp:attachment":[{"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/media?parent=1023"}],"wp:term":[{"taxonomy":"category","embedd
able":true,"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/categories?post=1023"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.thomaskeller.biz\/blog\/wp-json\/wp\/v2\/tags?post=1023"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}