java.util.zip - ZipInputStream v.s. ZipFile
- by lucho
Hello, community!
I have some general questions regarding the java.util.zip library.
What we basically do is an import and an export of many small components. Previously these components were imported and exported using a single big file, e.g.:
<component-type-a id="1"/>
<component-type-a id="2"/>
<component-type-a id="N"/>
<component-type-b id="1"/>
<component-type-b id="2"/>
<component-type-b id="N"/>
Please note that the order of the components during import is relevant.
Now every component should occupy its own file which should be externally versioned, QA-ed, bla, bla. We decided that the output of our export should be a zip file (with all these files in) and the input of our import should be a similar zip file. We do not want to explode the zip in our system. We do not want opening separate streams for each of the small files. My current questions:
Q1. May the ZipInputStream guarantee that the zip entries (the little files) will be read in the same order in which they were inserted by our export that uses ZipOutputStream? I assume reading is something like:
ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
ZipEntry entry;
while((entry = zis.getNextEntry()) != null)
{
//read from zis until available
}
I know that the central zip directory is put at the end of the zip file but nevertheless the file entries inside have sequential order. I also know that relying on the order is an ugly idea but I just want to have all the facts in mind.
Q2. If I use ZipFile (which I prefer) what is the performance impact of calling getInputStream() hundreds of times? Will it be much slower than the ZipInputStream solution? The zip is opened only once and ZipFile is backed by RandomAccessFile - is this correct?
I assume reading is something like:
ZipFile zipfile = new ZipFile(argv[0]);
Enumeration e = zipfile.entries();//TODO: assure the order of the entries
while(e.hasMoreElements()) {
entry = (ZipEntry) e.nextElement();
is = zipfile.getInputStream(entry));
}
Q3. Are the input streams retrieved from the same ZipFile thread safe (e.g. may I read different entries in different threads simultaneously)? Any performance penalties?
Thanks for your answers!