abnormally high load?
Ken Brownfield
kb+varnish at slide.com
Wed Aug 12 21:25:11 CEST 2009
My first guess is that you're seeing varnish spawn a lot of threads
because your back-end isn't keeping up with the miss rate. My second
guess is that these misses are large files that are taking a long time
for clients to download, therefore piling up active client connections
(and thus worker threads).
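
A quick way to check the thread theory on Linux is to ask the kernel how many threads the varnishd child process currently has. This is only a minimal sketch: it assumes a Linux /proc filesystem, and the function name thread_count is made up for illustration. Pass it the pid of your varnishd child.

```shell
# Print the number of threads the kernel reports for a given pid.
# Assumes Linux, where /proc/<pid>/status carries a "Threads:" line.
thread_count() {
    awk '$1 == "Threads:" { print $2 }' "/proc/$1/status"
}

# Example: thread count of the current shell.
thread_count $$
```

Sampling this during a spike and again during normal traffic should show whether thread growth lines up with the load.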
I'm guessing your load is going high because you're swapping? In top,
are your CPUs saturated, or fairly idle?
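
If top is hard to read during a spike, the kernel's cumulative swap counters give a more direct answer. The following is a rough sketch, assuming Linux and its /proc/vmstat file; the function names swap_counters and swapping_during are hypothetical helpers, not part of any tool.

```shell
# Print the cumulative swap-in and swap-out page counters from a file in
# /proc/vmstat format (Linux). Defaults to /proc/vmstat itself.
swap_counters() {
    awk '$1 == "pswpin"  { in_pages = $2 }
         $1 == "pswpout" { out_pages = $2 }
         END { print in_pages + 0, out_pages + 0 }' "${1:-/proc/vmstat}"
}

# Sample the counters twice, N seconds apart (default 5). If either
# counter changed between samples, pages were swapped in that window.
swapping_during() {
    before=$(swap_counters)
    sleep "${1:-5}"
    after=$(swap_counters)
    if [ "$before" != "$after" ]; then
        echo "swapping"
    else
        echo "not swapping"
    fi
}
```

Running swapping_during 10 while the load is high covers a ten-second window; counters that stay flat point away from swap and toward CPU or the backend.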
If you're seeing CPU saturation, this is possibly an internal Varnish
issue. Your VCL seems sane, but we haven't seen the inline C.
It's a fact of life that you may periodically need more back-end or
worker threads than you would normally see. If you're on Linux (at
least), each of those threads will use 8MB of RAM (the default stack
size) which adds up quickly. You can reduce the thread stack size to
dramatically decrease how much memory Varnish uses as it scales threads.
We run a patch here that adds a startup parameter to change the stack
size of backend and worker pthreads, but you could emulate this by
reducing stack size before running Varnish:
    ulimit -s 256          (sh/bash; the value is in KB)
or
    limit stacksize 256    (csh/tcsh)
We run pretty heavy traffic (including inline C) with 256KB stack with
no problem. This adjustment should decrease memory usage as thread
counts increase, and if you're swapping it might help alleviate the
spikes. But if the spikes are due to a slow backend, that's probably
where I'd look first.
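
To put rough numbers on the stack size point, here is a small sketch of the arithmetic plus a wrapper that lowers the limit only for the process being launched. The thread count of 1000 is an illustrative figure, not something measured on your system, and the helper names stack_reserve_mb and with_stack_kb are made up for this example.

```shell
# Worst-case stack reservation in MB for a number of threads at a given
# per-thread stack size in KB (integer arithmetic).
stack_reserve_mb() {
    echo $(( $1 * $2 / 1024 ))
}

stack_reserve_mb 1000 8192   # 1000 threads at the 8MB default
stack_reserve_mb 1000 256    # 1000 threads at 256KB

# Lower the soft stack limit inside a subshell only, then run the given
# command there, so the calling shell keeps its original limit.
with_stack_kb() {
    ( ulimit -s "$1" && shift && "$@" )
}

# Hypothetical usage; substitute your own varnishd options:
#   with_stack_kb 256 varnishd <your usual options>
```

Keeping the ulimit call in a subshell means an init script can launch Varnish with the smaller stack without changing the limit for anything else started from the same shell.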
Hope it helps,
--
Ken
On Aug 12, 2009, at 11:08 AM, Jeremy Hinegardner wrote:
> On Wed, Aug 12, 2009 at 05:41:52PM +0100, Rob S wrote:
>> Jeremy Hinegardner wrote:
>>> Hi all,
>>>
>>> I'm trying to figure out if this is a normal situation or not. We have a
>>> varnish instance in front of 12 tokyo tyrant instances with some inline C
>>> in the VCL to determine which backend to talk to.
>>>
>>>
>>
>> If you restart varnish during one of these spikes, does it instantly
>> disappear? I've seen this happen (though only spiking to about 12), and
>> this is when Varnish has munched through far more memory than we've
>> allocated it. This problem is one I've been looking into with Ken
>> Brownfield, and touches on
>> http://projects.linpro.no/pipermail/varnish-misc/2009-April/002743.html
>> and
>> http://projects.linpro.no/pipermail/varnish-misc/2009-June/002840.html
>>
>> Do any of these tie up with your experience?
>
> Possibly. The correlation I can see with those instances is this section
> of our VCL:
>
> sub vcl_recv {
>     ...
>     } else if ( req.request == "PUT" || req.request == "PURGE" ) {
>         purge( "req.url == " req.url );
>         if ( req.request == "PUT" ) {
>             pass;
>         } else {
>             error 200 "PURGE Success";
>         }
>     }
>     ...
> }
>
> We do a consistent stream of PUT operations; it's probably 10-15% of all
> our operations. So our ban list would get fairly large, I'm guessing?
>
> I'm not seeing evidence of a memory leak, and the pmap of the process does
> show 4G in the varnish_storage.bin mapping.
>
> I've attached the output of 'varnishstat -1' if that helps. This is after
> I've diverted some traffic around varnish because of the load.
>
> If this purge() is the culprit, should I make this change?
>
> sub vcl_recv {
>     ...
>     } else if ( req.request == "PUT" || req.request == "PURGE" ) {
>         lookup;
>     }
>     ...
> }
>
> sub vcl_hit {
>     if ( req.request == "PUT" || req.request == "PURGE" ) {
>         set obj.ttl = 0s;
>         if ( req.request == "PURGE" ) {
>             error 200 "PURGE Success";
>         }
>         pass;
>     }
> }
>
> sub vcl_miss {
>     if ( req.request == "PUT" ) {
>         pass;
>     }
>     if ( req.request == "PURGE" ) {
>         error 404 "Not in cache.";
>     }
> }
>
> enjoy,
>
> -jeremy
>
> --
> ========================================================================
> Jeremy Hinegardner jeremy at hinegardner.org
>
> <varnishstat.txt>