Varnish Lurker is getting slower / Ban list keeps increasing

Olivier Hanesse olivier.hanesse at gmail.com
Wed Nov 29 13:53:55 UTC 2017


Hello,

I am still working on this issue. Today I disabled load balancing to one
specific Varnish server whose ban list was around 500K entries (n_object is
around 600K).

After 4 hours without any requests other than "BAN", the ban list is still
growing, and the system load is around 1.5.

Running "top" with threads displayed shows that the ban_lurker thread is
using 100% of one CPU (on an 8-CPU machine):

top - 14:47:53 up 91 days, 22:36,  2 users,  load average: 1.57, 1.45, 1.49
Threads: 702 total,   2 running, 700 sleeping,   0 stopped,   0 zombie
%Cpu(s): 11.6 us,  0.2 sy,  0.0 ni, 88.1 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem:   8196832 total,  5532436 used,  2664396 free,    98028 buffers
KiB Swap:   499708 total,        0 used,   499708 free.  1077884 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 4606 vcache    20   0 4212992 3.553g  85300 R 99.9 45.5   6991:50 ban-lurker
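
To cross-check the top output above without relying on top itself, the
per-thread CPU counters under /proc can be sampled twice and compared. What
follows is only a rough sketch for Linux. The function names are made up for
illustration, and it assumes the usual /proc/<pid>/task/<tid>/stat layout in
which utime and stime are fields 14 and 15, counted in clock ticks.

```python
import os
import sys
import time


def parse_task_stat(stat_line):
    """Return (thread_name, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # The thread name sits between the first '(' and the last ')'.
    start = stat_line.index("(")
    end = stat_line.rindex(")")
    name = stat_line[start + 1:end]
    rest = stat_line[end + 1:].split()
    # rest[0] is field 3 (state), so utime/stime (fields 14/15) are rest[11]/rest[12].
    return name, int(rest[11]) + int(rest[12])


def thread_cpu_ticks(pid):
    """Map thread id -> (name, cpu_ticks) for every thread of a process."""
    result = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as fh:
                result[int(tid)] = parse_task_stat(fh.read())
        except OSError:
            continue  # thread exited between listdir() and open()
    return result


def busiest(before, after, top_n=5):
    """Return up to top_n (delta_ticks, tid, name) tuples, largest delta first."""
    deltas = []
    for tid, (name, ticks) in after.items():
        if tid in before:
            deltas.append((ticks - before[tid][1], tid, name))
    deltas.sort(reverse=True)
    return deltas[:top_n]


if __name__ == "__main__":
    pid = int(sys.argv[1])  # PID of the Varnish child process
    first = thread_cpu_ticks(pid)
    time.sleep(10)
    for delta, tid, name in busiest(first, thread_cpu_ticks(pid)):
        print(tid, name, delta)
```

If a single thread such as ban-lurker accounts for nearly all of the ticks
consumed during the interval, that matches what top reports.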





Is it possible that the ban_lurker is stuck on a lock or in an infinite loop
(I know it is single-threaded)? What kind of dump can I provide to help
diagnose this issue?
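
One piece of data that is easy to collect is whether the ban counters move
at all between two varnishstat snapshots. The sketch below is an
illustration, not an official tool. It assumes that "varnishstat -1" prints
one counter per line as name, value, rate and description, and that the
ban-related counters share the MAIN.bans prefix. The helper names are
hypothetical.

```python
import subprocess
import time


def parse_varnishstat(text):
    """Parse 'varnishstat -1' style output into {counter_name: int_value}."""
    counters = {}
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 2:
            continue
        try:
            counters[parts[0]] = int(parts[1])
        except ValueError:
            continue  # not a "name value ..." line, skip it
    return counters


def snapshot():
    """Run varnishstat once, restricted to the ban counters (assumed prefix)."""
    out = subprocess.run(
        ["varnishstat", "-1", "-f", "MAIN.bans*"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_varnishstat(out)


def changed(before, after):
    """Return {name: delta} for counters whose value differs between snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


if __name__ == "__main__":
    first = snapshot()
    time.sleep(60)
    for name, delta in sorted(changed(first, snapshot()).items()):
        print(name, delta)
```

As noted in the quoted discussion below, the counters may not be committed
promptly, so an empty result here does not by itself prove that the lurker
is idle.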

Regards

Olivier

2017-09-04 16:54 GMT+02:00 Olivier Hanesse <olivier.hanesse at gmail.com>:

> In this case, that means that as long as the ban lurker is working, no
> statistics are updated, right?
>
> So if I don't see any updates to statistics such as "bans_deleted" or
> "bans_lurker_obj_killed_cutoff" for a long period, it doesn't mean
> that the lurker is sleeping, hung, or waiting for a lock; it means that
> the lurker worker is working pretty "hard". Is that correct?
>
>
> 2017-09-04 16:11 GMT+02:00 Dridi Boukelmoune <dridi at varni.sh>:
>
>> On Mon, Sep 4, 2017 at 2:12 PM, Olivier Hanesse
>> <olivier.hanesse at gmail.com> wrote:
>> > Are the stats used by varnishstat about the lurker really updated
>> > "every minute"? The fact that the statistics were only updated once is
>> > kind of strange: the ban list size is higher than the cutoff value
>> > every day :(
>>
>> No, that's a limitation of the statistics: serving HTTP traffic has
>> higher priority than committing updates of the counters.
>>
>> See this for reference:
>>
>> https://github.com/varnishcache/varnish-cache/pull/2290
>>
>> Dridi
>>
>
>