Incorrect gzipping of http requests, can't find who's doing it
- by Ned Batchelder
We're seeing some very strange mangling of HTTP responses, and we can't figure out what is doing it. We have an app server handling JSON requests. Occasionally, the response is returned gzipped, but with incorrect headers that prevent the browser from interpreting it correctly.
The problem is intermittent, and changes behavior over time. Yesterday morning it seemed to fail 50% of the time, and in fact, seemed tied to one of our two load-balanced servers. Later in the afternoon, it was failing only 20 times out of 1000, and didn't correlate with an app server.
The two app servers are running Apache 2.2 with mod_wsgi and a Django app stack. They have identical Apache configs and source trees, and even identical packages installed on Red Hat. There's a hardware load balancer in front, I don't know the make or model.
Akamai is also part of the food chain, though we removed Akamai and still had the problem.
Here's a good request and response:
* Connected to example.com (97.7.79.129) port 80 (#0)
> POST /claim/ HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15
> Host: example.com
> Accept: */*
> Referer: http://example.com/apps/
> Accept-Encoding: gzip,deflate
> Content-Length: 29
> Content-Type: application/x-www-form-urlencoded
>
} [data not shown]
< HTTP/1.1 200 OK
< Server: Apache/2
< Content-Language: en-us
< Content-Encoding: identity
< Content-Length: 47
< Content-Type: application/x-javascript
< Connection: keep-alive
< Vary: Accept-Encoding
<
{ [data not shown]
* Connection #0 to host example.com left intact
* Closing connection #0
{"msg": "", "status": "OK", "printer_name": ""}
And here's a bad one:
* Connected to example.com (97.7.79.129) port 80 (#0)
> POST /claim/ HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15
> Host: example.com
> Accept: */*
> Referer: http://example.com/apps/
> Accept-Encoding: gzip,deflate
> Content-Length: 29
> Content-Type: application/x-www-form-urlencoded
>
} [data not shown]
< HTTP/1.1 200 OK
< Server: Apache/2
< Content-Language: en-us
< Content-Encoding: identity
< Content-Type: application/x-javascript
< Content-Encoding: gzip
< Content-Length: 59
< Connection: keep-alive
< Vary: Accept-Encoding
< X-N: S
<
{ [data not shown]
* Connection #0 to host example.com left intact
* Closing connection #0
?V?-NW?RPR?QP*.I,)-???A??????????T??Z?
??/
There are two things to notice about the bad response:
It has two Content-Encoding headers, and the browsers seem to use the first. So they see an identity encoding header, and gzipped content, so they can't interpret the response.
The bad response has an extra "X-N: S" header.
Perhaps if I could find out what intermediary adds "X-N: S" headers to responses, I could track down the culprit...