Varnish hangs / requests time out

Bartek Perwenis b.perwenis at gmail.com
Thu Mar 5 12:40:51 CET 2009


Same here. I have encountered this problem after migrating from Linux+2.0.1 to
Solaris+2.0.2.

With 2.0.2 it happens randomly. Sometimes Varnish runs flawlessly for days,
and sometimes it locks up a couple of times in a short period.

Maybe you should try 2.0.1 on one of your test servers and compare how they
behave?

Best regards,
Bartek

2009/3/4 Ross Brown <ross at trademe.co.nz>

> Hi all
>
> We are hoping to use Varnish for serving image content on our reasonably
> busy auction site here in New Zealand, but are having an interesting problem
> during testing.
>
> We are using the latest Varnish (2.0.3) on Ubuntu 8.10 Server (64-bit) and have
> built two servers for testing - both are in the same datacentre, behind an F5
> hardware load balancer. We want to keep all images cached in RAM and are using
> Varnish with jemalloc to achieve this. For the most part, Varnish is working well
> for us and performance is great.
>
> However, we have seen both our Varnish servers lock up at precisely the
> same time and stop processing incoming HTTP requests until Varnishd is
> manually restarted. This has happened twice and seems to occur at random -
> the last time was after 5 days of uptime and a significant amount of
> processed traffic (<1TB).
>
> When this problem happens, the backend is still reachable and happily
> serving images. It is not a particularly busy period for us (600
> requests/sec/Varnish server - approx 350Mbps outbound each - we got up to
> nearly 3 times that level without incident previously) but for some reason
> unknown to us, the servers just suddenly stop processing requests and the number
> of worker threads increases dramatically.
>
> After the lockup happened last time, I tried firing up varnishlog and
> hitting the server directly - my requests were not showing up at all. The
> *only* entries in the varnish log were related to worker threads being killed
> over time - no PINGs, PONGs, load balancer healthchecks or anything
> related to 'normal' varnish activity. It's as if varnishd has completely
> locked up, but we can't understand what causes both our varnish servers to
> exhibit this behaviour at exactly the same time, nor why varnish does not
> detect it and attempt a restart. After a restart, varnish is fine and
> behaves itself.
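>
> In case it helps anyone reproduce this or capture more detail, something like
> the following should grab useful state from a hung instance (a sketch only;
> the flags are from the stock 2.0.x tools, and the child PID is a placeholder):
>
> # client-side log records only, grouped per request (nothing showed up for us)
> varnishlog -c -o
>
> # one-shot dump of every counter, for comparison with the stats below
> varnishstat -1
>
> # thread backtraces from the cache child, to see where the workers are stuck
> gdb -p <pid-of-varnishd-child> -batch -ex 'thread apply all bt'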
>
> There is nothing to indicate an error with the backend, nor anything in
> syslog to indicate a Varnish problem. Pointers of any kind would be
> appreciated :)
>
> Best regards
>
> Ross Brown
> Trade Me
> www.trademe.co.nz
>
> *** Startup Options (as per the hints in the wiki for caching millions of objects):
> -a 0.0.0.0:80 -f /usr/local/etc/default.net.vcl -T 0.0.0.0:8021 -t 86400
> -h classic,1200007 -p thread_pool_max=4000 -p thread_pools=4 -p
> listen_depth=4096 -p lru_interval=3600 -p obj_workspace=4096 -s malloc,10G
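>
> The same options annotated, in case that is useful (descriptions paraphrased
> from the varnishd(1) man page, so treat them as approximate):
>
> # -a 0.0.0.0:80                       address:port varnishd listens on for client traffic
> # -f /usr/local/etc/default.net.vcl   our VCL file
> # -T 0.0.0.0:8021                     management (telnet) interface
> # -t 86400                            default TTL for cached objects: 24 hours
> # -h classic,1200007                  classic hash with ~1.2 million buckets
> # -p thread_pool_max=4000             ceiling on the number of worker threads
> # -p thread_pools=4                   number of worker thread pools
> # -p listen_depth=4096                TCP listen backlog
> # -p lru_interval=3600                move an object to the LRU front at most once an hour
> # -p obj_workspace=4096               per-object header workspace, in bytes
> # -s malloc,10G                       malloc storage (backed by jemalloc here), 10 GB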
>
> *** Running VCL:
> backend default {
>        .host = "10.10.10.10";
>        .port = "80";
> }
>
> sub vcl_recv {
>        # Don't cache objects requested with query string in URI.
>        # Needed for newsletter headers (openrate) and health checks.
>        if (req.url ~ "\?.*") {
>                pass;
>        }
>
>        # Force lookup if the request is a no-cache request from the client.
>        if (req.http.Cache-Control ~ "no-cache") {
>                unset req.http.Cache-Control;
>                lookup;
>        }
>
>        # By default, Varnish will not serve requests that come with a cookie
>        # from its cache.
>        unset req.http.cookie;
>        unset req.http.authenticate;
>
>        # No action here, continue into default vcl_recv{}
> }
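>
> For reference, the built-in vcl_recv that execution then falls through to
> looks roughly like this (quoted from memory of the stock 2.x default.vcl, so
> treat it as approximate):
>
> sub vcl_recv {
>         if (req.request != "GET" &&
>             req.request != "HEAD" &&
>             req.request != "PUT" &&
>             req.request != "POST" &&
>             req.request != "TRACE" &&
>             req.request != "OPTIONS" &&
>             req.request != "DELETE") {
>                 /* Non-RFC2616 or CONNECT, which is weird. */
>                 pipe;
>         }
>         if (req.request != "GET" && req.request != "HEAD") {
>                 /* We only deal with GET and HEAD by default */
>                 pass;
>         }
>         if (req.http.Authorization || req.http.Cookie) {
>                 /* Not cacheable by default, hence the unset of req.http.cookie above */
>                 pass;
>         }
>         lookup;
> }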
>
>
> ***Stats
>      458887  Client connections accepted
>   170714631  Client requests received
>   133012763  Cache hits
>        3715  Cache hits for pass
>    27646213  Cache misses
>    37700868  Backend connections success
>           0  Backend connections not attempted
>           0  Backend connections too many
>          40  Backend connections failures
>    37512808  Backend connections reuses
>    37514682  Backend connections recycles
>           0  Backend connections unused
>        1339  N struct srcaddr
>          16  N active struct srcaddr
>         756  N struct sess_mem
>          12  N struct sess
>      761152  N struct object
>      761243  N struct objecthead
>           0  N struct smf
>           0  N small free smf
>           0  N large free smf
>         322  N struct vbe_conn
>         345  N struct bereq
>          20  N worker threads
>        2331  N worker threads created
>           0  N worker threads not created
>           0  N worker threads limited
>           0  N queued work requests
>       35249  N overflowed work requests
>           0  N dropped work requests
>           1  N backends
>          44  N expired objects
>    26886639  N LRU nuked objects
>           0  N LRU saved objects
>    15847787  N LRU moved objects
>           0  N objects on deathrow
>           3  HTTP header overflows
>           0  Objects sent with sendfile
>   164595318  Objects sent with write
>           0  Objects overflowing workspace
>      458886  Total Sessions
>   170715215  Total Requests
>         306  Total pipe
>    10054413  Total pass
>    37700586  Total fetch
>  49458782160  Total header bytes
> 1151144727614  Total body bytes
>       89464  Session Closed
>           0  Session Pipeline
>           0  Session Read Ahead
>           0  Session Linger
>   170622902  Session herd
>  7875546129  SHM records
>   380705819  SHM writes
>         138  SHM flushes due to overflow
>      763205  SHM MTX contention
>        2889  SHM cycles through buffer
>           0  allocator requests
>           0  outstanding allocations
>           0  bytes allocated
>           0  bytes free
>   101839895  SMA allocator requests
>     1519005  SMA outstanding allocations
>  10736616112  SMA outstanding bytes
> 562900737623  SMA bytes allocated
> 552164121511  SMA bytes free
>          56  SMS allocator requests
>           0  SMS outstanding allocations
>           0  SMS outstanding bytes
>       25712  SMS bytes allocated
>       25712  SMS bytes freed
>    37700490  Backend requests made
>           3  N vcl total
>           3  N vcl available
>           0  N vcl discarded
>           1  N total active purges
>           1  N new purges added
>           0  N old purges deleted
>           0  N objects tested
>           0  N regexps tested against
>           0  N duplicate purges removed
>           0  HCB Lookups without lock
>           0  HCB Lookups with lock
>           0  HCB Inserts
>           0  Objects ESI parsed (unlock)
>           0  ESI parse errors (unlock)
>
>
>