We're using a Squid cache to off-load traffic from our web servers, ie. it's setup as a reverse-proxy responding to inbound requests before they hit our web servers.
When we get blitzed with concurrent requests for the same request that's not in the cache, Squid proxies all the requests through to our web ("origin") servers. For us, this behavior isn't ideal: our origin servers gets bogged down trying to fulfill N identical requests concurrently.
Instead, we'd like the first request to proxy through to the origin server, the rest of the requests to queue at the Squid layer, and then all be fulfilled by Squid when the origin server has responded to that first request.
Does anyone know how to configure Squid to do this?
We've read through the documentation multiple times and thoroughly web-searched the topic, but can't figure out how to do it.
We use Akamai too and, interestingly, this is its default behavior. (However, Akamai has so many nodes that we still see lots of concurrent requests in certain traffic spike scenarios, even with Akamai's super-node feature enabled.)
This behavior is clearly configurable for some other caches, eg. the Ehcache documentation offers the option "Concurrent Cache Misses: A cache miss will cause the filter chain, upstream of the caching filter to be processed. To avoid threads requesting the same key to do useless duplicate work, these threads block behind the first thread."
Some folks call this behavior a "blocking cache," since the subsequent concurrent requests block behind the first request until it's fulfilled or timed-out.
Thx for looking over my noob question!
Oliver