Understand "hit for pass" cache objects
Justin Pasher
justinp at newmediagateway.com
Mon Feb 15 22:56:01 CET 2010
Hello,
I have just started using Varnish 2.0.6 in the past week as a
replacement for Squid. So far, I love the fine grained control you have
over what goes into cache (as opposed to Squid's "I'll cache it when I
feel it's supposed to be cached, but not tell you why" approach). That
said, I'm trying to better understand the "hit for pass" cache objects
that Varnish will sometimes create. Here is the basic flow of my VCL (much
of it is based on the concepts on the intro page:
http://varnish-cache.org/wiki/Introduction)
vcl_recv:
Default action is "lookup". Action changes to "pass" if ...
* Cache-Control or Pragma header contains "no-cache"
* HTTP auth is in use (Authorization header)
* Request contains cookie "bypass_cache=true"
* Request type is not GET, HEAD, POST, PUT, TRACE, OPTIONS, DELETE
vcl_fetch:
Default action is "deliver". Action changes to "pass" if ...
* Response is deemed uncacheable (!obj.cacheable)
* Response contains Cache-Control headers that say "no-cache"
* HTTP auth is in use (Authorization header)
* Request contains cookie "bypass_cache=true"
* Response contains Set-Cookie header
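
To make the two rule lists above concrete, here is a minimal sketch of them as a toy Python model. It is not VCL and not how Varnish evaluates anything internally; the function names (recv_action, fetch_action, wants_bypass) and the use of plain dictionaries for headers are assumptions made for illustration, and header names are matched case-sensitively to keep it short.

```python
# Toy Python model of the rules listed above; this is not VCL.
# All names here are hypothetical and chosen only for illustration.

KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "TRACE", "OPTIONS", "DELETE"}


def wants_bypass(req_headers):
    """True if the request uses HTTP auth or carries bypass_cache=true."""
    return ("Authorization" in req_headers
            or "bypass_cache=true" in req_headers.get("Cookie", ""))


def recv_action(method, req_headers):
    """Decision on request arrival: "lookup" unless one of the rules fires."""
    if method not in KNOWN_METHODS:
        return "pass"
    directives = (req_headers.get("Cache-Control", "") + " "
                  + req_headers.get("Pragma", ""))
    if "no-cache" in directives:
        return "pass"
    if wants_bypass(req_headers):
        return "pass"
    return "lookup"


def fetch_action(req_headers, resp_headers, cacheable=True):
    """Decision after the backend responds: "deliver" unless a rule fires."""
    if not cacheable:  # stands in for the !obj.cacheable check
        return "pass"
    if "no-cache" in resp_headers.get("Cache-Control", ""):
        return "pass"
    if wants_bypass(req_headers):
        return "pass"
    if "Set-Cookie" in resp_headers:
        return "pass"
    return "deliver"
```

In this model, a first-time visitor whose response carries a Set-Cookie header gets "lookup" from recv_action and then "pass" from fetch_action, which is the case discussed below.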
Now on to the problem at hand. My understanding (please correct any
errors) of the "hit for pass" object is that any time the action is
"pass" within vcl_fetch, Varnish will create a "hit for pass" object to
make future requests for the same URL hash go straight to the back end
instead of lining them up serially and waiting for a response from the
first request. Until that object's TTL expires, the "hit for pass"
object will remain in cache and never be replaced with a fresh object
from the backend.
Here is what is happening in my example.
Client A visits the URL http://www.example.com/. Since this is the first
time they visit the site, the backend code tries to start a session (PHP
code), which sends a Set-Cookie header in the response. In vcl_fetch,
Varnish sees the Set-Cookie header and issues the "pass" action. Now
there is a "hit for pass" cache object with a TTL based upon the
Cache-Control/Expires headers or the default TTL (let's assume 120 seconds).
Client B visits the same URL http://www.example.com/. Varnish finds a
"hit for pass" object in the cache, so it sends the request directly to
the backend. This same thing will continue for any future clients until
120 seconds have elapsed.
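
The Client A / Client B sequence can be sketched as a toy Python model. This only encodes my understanding as stated above (it may be wrong, and it is not Varnish's implementation); the ToyCache class, its method names, and the returned strings are all made up for illustration, and the 120 second TTL is the assumed default from the example.

```python
# Toy model of the "hit for pass" behavior described above.
# Not Varnish internals; every name here is hypothetical.

HIT_FOR_PASS = "hit-for-pass"
OBJECT = "object"


class ToyCache:
    def __init__(self, ttl=120):
        self.ttl = ttl
        self.store = {}  # url -> (kind, expires_at)

    def request(self, url, now, sets_cookie):
        """Handle one request at time `now` (in seconds).

        `sets_cookie` says whether the backend response for this client
        would carry a Set-Cookie header (for example, a new PHP session).
        """
        entry = self.store.get(url)
        if entry is not None and now < entry[1]:
            if entry[0] == HIT_FOR_PASS:
                return "pass"  # straight to the backend, nothing is cached
            return "hit"       # served from cache
        # Miss or expired entry: fetch, then decide as vcl_fetch would.
        if sets_cookie:
            self.store[url] = (HIT_FOR_PASS, now + self.ttl)
            return "miss, hit-for-pass stored"
        self.store[url] = (OBJECT, now + self.ttl)
        return "miss, cached"


cache = ToyCache(ttl=120)
url = "http://www.example.com/"
print(cache.request(url, 0, sets_cookie=True))     # Client A
print(cache.request(url, 10, sets_cookie=False))   # Client B, within the TTL
print(cache.request(url, 120, sets_cookie=False))  # first request after expiry
```

In this model, Client B is passed to the backend even though its own response would have been cacheable, and the page is only cached once the marker has expired and the next request happens not to trigger a Set-Cookie.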
Herein lies my dilemma. A request for the same URL
(http://www.example.com/) is sometimes cacheable and sometimes not
cacheable (it usually depends on whether it's the first time a user
visits the site and the Set-Cookie header has to be sent). What this
means is that if I have a heavily hit URL as a landing page from Google,
most of the time there will be a "hit for pass" cache object in Varnish,
since most people going to that page will receive a Set-Cookie header. The
only time it will cache the page is if I'm lucky and someone visits the
page while there is no "hit for pass" cache object and their request
doesn't result in a "pass" action from vcl_fetch.
In my situation, I think I could avoid this problem altogether if I
could make Varnish store a DIFFERENT set of headers in the cache object
than the headers returned to the client. For example, if I receive a
response with a Set-Cookie header, I would remove the Set-Cookie header
from the soon-to-be-cached object (so it wouldn't serve that header up
for everyone), but LEAVE the Set-Cookie header for the individual that
made the original request. This would allow the page to cache normally
even if the only requests going to that page result in a Set-Cookie
header. However, from what I've been able to see, there is no way to do
this.
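
To be clear about what I am asking for, here is a minimal Python sketch of the behavior I would like. It is a description of the wish, not of anything I have found Varnish able to do; the function name split_response and the dictionary representation of headers are assumptions for illustration.

```python
# Sketch of the desired behavior: one backend response, two header sets.
# Hypothetical helper; not a Varnish feature or API.

def split_response(backend_headers):
    """Return (headers_to_cache, headers_to_deliver) for one response.

    The copy destined for the cache drops Set-Cookie so the cookie is
    never served to other clients; the copy for the original requester
    keeps every header untouched.
    """
    to_deliver = dict(backend_headers)
    to_cache = {name: value
                for name, value in backend_headers.items()
                if name.lower() != "set-cookie"}
    return to_cache, to_deliver


cached, delivered = split_response({
    "Content-Type": "text/html",
    "Set-Cookie": "PHPSESSID=abc123; path=/",
})
print(cached)     # no Set-Cookie in the stored copy
print(delivered)  # original client still receives its cookie
```

With something like this, the first visitor would still get their session cookie, while everyone after them could be served the stored copy.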
Does anyone have any recommendations to get around this? In a perfect
world, my caching server would work this way:
* vcl_recv: If any criteria from A through D are met, don't pull this
request from cache and go to the backend
* vcl_fetch: If any criteria from E through G are met, send the object
straight to the client without touching the cache.
The "without touching the cache" portion seems to be where I am falling
down.
--
Justin Pasher