Inconsistent file downloads of (what should be) the same file
- by Austin A.
I'm working on a system that archives large collections of timestamped images. Part of the system saves each image to a growing .zip file. This morning I noticed that the log said an image had been successfully downloaded and placed in the zip file, but when I downloaded the .zip (through an Apache alias running on our server), its contents didn't match the log. For example, although the log said that camera 3484 captured an image on January 17, 2011, the zip file I downloaded through the Apache alias only contains images up to January 14.
So I SSHed into the server and unzipped the file in its own directory, and that zip file has images from January 14 to today (January 17). What strikes me as odd is that this should be the exact same file as the one I downloaded through the Apache alias.
Other experiments: when I copy the file from the server to my local machine with command-line scp, the zip file has the newer images. But when I use a graphical SCP client (in this case, Fugu for OS X), I get a zip file with the older images.
In short: unzipping the file on the server, or after downloading it with scp or wget, gives one set of images, but unzipping the file after downloading it with Chrome, Firefox, or the Fugu SCP client gives a different set, when all of these should be exactly the same file.
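One way to check whether the downloads really are byte-for-byte different is to compare their sizes and checksums before unzipping anything. The following is only a minimal sketch: the function names `fingerprint` and `same_file` are made up for illustration, and it assumes each download has been saved under its own name.

```python
import hashlib
import os


def fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so a large archive is never loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def same_file(path_a, path_b):
    """True when both files have the same size and the same SHA-256 digest."""
    return fingerprint(path_a) == fingerprint(path_b)
```

For example, `same_file("scp/2011.01.zip", "firefox/2011.01.zip")` (hypothetical local paths) would show whether the two downloads are identical. Because the archive keeps growing on the server, copies fetched at different times can legitimately differ, so the comparison is only meaningful for downloads made close together.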
Unzipping on the server...
[user@server ~]$ cd /export1/amos/images/2011/84/3484/00003484/
[user@server 00003484]$ ls -la
total 6180
drwxr-sr-x 2 user groupname 24 Jan 17 11:20 .
drwxr-sr-x 4 user groupname 36 Jan 11 19:58 ..
-rw-r--r-- 1 user groupname 6309980 Jan 17 12:05 2011.01.zip
[user@server 00003484]$ unzip 2011.01.zip
Archive: 2011.01.zip
extracting: 20110114_140547.jpg
extracting: 20110114_143554.jpg
replace 20110114_143554.jpg? [y]es, [n]o, [A]ll, [N]one, [r]ename: y
extracting: 20110114_143554.jpg
extracting: 20110114_153458.jpg
(...bunch of files...)
extracting: 20110117_170459.jpg
extracting: 20110117_173458.jpg
extracting: 20110117_180501.jpg
Using wget through the Apache alias:
local:~ user$ wget http://example.com/zipfiles/2011/84/3484/00003484/2011.01.zip
--12:38:13-- http://example.com/zipfiles/2011/84/3484/00003484/2011.01.zip
=> `2011.01.zip'
Resolving example.com... ip.ip.ip.ip
Connecting to example.com|ip.ip.ip.ip|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6,327,747 (6.0M) [application/zip]
100% [=====================================================================================================>] 6,327,747 1.03M/s ETA 00:00
12:38:56 (143.23 KB/s) - `2011.01.zip' saved [6327747/6327747]
local:~ user$ unzip 2011.01.zip
Archive: 2011.01.zip
extracting: 20110114_140547.jpg
(... same as before...)
extracting: 20110117_183459.jpg
Using scp to grab the zip:
local:~ user$ scp user@server:/export1/amos/images/2011/84/3484/00003484/2011.01.zip .
2011.01.zip 100% 6179KB 475.3KB/s 00:13
local:~ user$ unzip 2011.01.zip
Archive: 2011.01.zip
extracting: 20110114_140547.jpg
(...same as before...)
extracting: 20110117_183459.jpg
Using Fugu to download 2011.01.zip from /export1/amos/images/2011/84/3484/00003484/ gives images 20110113_090457.jpg through 20110114_010554.jpg
Using Firefox to download 2011.01.zip from http://example.com/zipfiles/2011/84/3484/00003484/2011.01.zip gives images 20110113_090457.jpg through 20110114_010554.jpg
Using Chrome gives same results as Firefox.
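To compare what each copy contains without extracting it (and without depending on which unzip tool a given client uses), the archive's entry list can be read directly. This is a rough sketch using Python's standard `zipfile` module. The helper name `summarize_zip` is hypothetical, and it relies on the `YYYYMMDD_HHMMSS.jpg` naming seen above, so that sorting names alphabetically also sorts them by capture time.

```python
import zipfile
from collections import Counter


def summarize_zip(source):
    """Summarize a zip archive given a path or a binary file-like object."""
    with zipfile.ZipFile(source) as archive:
        # namelist() has one item per archive entry, so repeated names show up twice.
        names = archive.namelist()
    counts = Counter(names)
    return {
        "entries": len(names),
        "first": min(names) if names else None,
        "last": max(names) if names else None,
        # Names stored more than once, like the entry that triggered
        # the "replace ...?" prompt in the server-side unzip above.
        "duplicates": sorted(name for name, n in counts.items() if n > 1),
    }
```

Running `summarize_zip("2011.01.zip")` on each downloaded copy and comparing the `first` and `last` values would show which date range each copy exposes.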
Relevant section from Apache httpd.conf:
# ScriptAlias: This controls which directories contain server scripts.
# ScriptAliases are essentially the same as Aliases, except that
# documents in the realname directory are treated as applications and
# run by the server when requested rather than as documents sent to the client.
# The same rules about trailing "/" apply to ScriptAlias directives as to
# Alias.
#
ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
Alias /zipfiles/ /export1/amos/images/
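As a sanity check that the URL and the filesystem path refer to the same file, the prefix substitution performed by the `Alias` line can be sketched as below. This only illustrates the mapping for this one directive: the function `resolve_alias` is hypothetical, not part of Apache, and it ignores everything else the server does when handling a request.

```python
from typing import Optional

# Values taken from the Alias directive in the httpd.conf excerpt.
ALIAS_PREFIX = "/zipfiles/"
ALIAS_TARGET = "/export1/amos/images/"


def resolve_alias(url_path: str,
                  prefix: str = ALIAS_PREFIX,
                  target: str = ALIAS_TARGET) -> Optional[str]:
    """Map a URL path to a filesystem path, or None if the alias does not apply."""
    if not url_path.startswith(prefix):
        return None
    # Swap the URL prefix for the directory it is aliased to.
    return target + url_path[len(prefix):]


print(resolve_alias("/zipfiles/2011/84/3484/00003484/2011.01.zip"))
# prints /export1/amos/images/2011/84/3484/00003484/2011.01.zip
```

The result matches the path used in the scp and server-side unzip steps, which is why all of the downloads are expected to return the same archive.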