How to enable caching on Apache / Ubuntu Linux?
- by Jim Mischel
I have a large (several megabytes) XML file that's updated rather frequently (every 10 minutes or less) and gets a lot of traffic. I'd like to implement some caching to reduce bandwidth and server load. Looking at the Apache documents, I see a dizzying array of configuration options that involve various combinations of mod_expires, mod_headers, and mod_cache (and variants). I end up running in circles and the results aren't what I expect.
I'm comfortable editing the various configuration files if I have some idea what I'm supposed to change. But at the moment I'm poking around in the dark and that's never a comfortable feeling. So, perhaps if I describe what I want, somebody here can take me by the hand and say, "This is what you need to do."
Periodically, this file, call it "stuff.xml" is updated and a new version copied to the directory. The external url would be, for example, http://example.com/stuff.xml. Understand, this part works. Whenever I request the file, I get the expected result. But the file is big and I want to save bandwidth, so first I'd like to implement conditional GET semantics with the If-Modified-Since header. How do I do this? I've enabled mod_headers and mod_expired and added the <FilesMatching> section in my httpd.conf as recommended in countless examples I've seen online, but that didn't change the behavior when made a conditional GET request. I always get a status 200 with the entire document. So how the heck do I implement this?
That'll cut down on neeless transfers. I'd also like to limit the amount of data transferred. Seeing as this is XML, gzipping it should save me 50% or more. My next step would be to somehow gzip the file and, if it's not too difficult, store it in memory. That'll cut down on per-access data transfer, and also reduce disk transfers. So how do I implement this type of caching?
Thanks in advance.