Varnish constantly restarting
Alexander Staubo
alex at bengler.no
Fri Jul 2 13:53:39 CEST 2010
Varnish (2.0.6 on Ubuntu Hardy, also tested 2.1.2 from source; Linux
x64) restarts many times per day with the following messages being
logged:
varnishd[9010]: child (9011) Started
varnishd[9010]: Child (9011) said Closed fds: 3 4 5 9 10 12 13
varnishd[9010]: Child (9011) said Child starts
varnishd[9010]: Child (9011) said Ready
varnishd[9010]: Child (9011) not responding to ping, killing it.
varnishd[9010]: Child (9011) not responding to ping, killing it.
varnishd[9010]: Child (9011) died signal=3
varnishd[9010]: Child cleanup complete
varnishd[9010]: child (9315) Started
varnishd[9010]: Child (9315) said Closed fds: 3 4 5 10 11 13 14
varnishd[9010]: Child (9315) said Child starts
varnishd[9010]: Child (9315) said Ready
Tried modifying defaults and switching from file-based cache to
malloc-based; it seems to be hanging less often now, but it still
happens a couple of times a day. Here is my current command line:
varnishd -P /var/run/varnishd.pid \
-a :6081 \
-f /etc/varnish/default.vcl \
-T 127.0.0.1:6082 \
-t 120 \
-w 4,30,120 \
-s malloc,1G \
-p thread_pool_max=300 \
-p ping_interval=5 \
-p cli_timeout=120s \
-u varnish \
-g varnish
At the point when the Varnish child stops responding to ping, CPU
usage blows through the roof. I started a screen running on the box
that logs varnishstat every 10s; this is the last output just abefore
it dies:
uptime 71589 . Child uptime
client_conn 4751731 66.38 Client connections accepted
client_drop 0 0.00 Connection dropped, no sess
client_req 4747499 66.32 Client requests received
cache_hit 4647642 64.92 Cache hits
cache_hitpass 775 0.01 Cache hits for pass
cache_miss 86706 1.21 Cache misses
backend_conn 87236 1.22 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 247 0.00 Backend conn. failures
backend_reuse 0 0.00 Backend conn. reuses
backend_toolate 0 0.00 Backend conn. was closed
backend_recycle 0 0.00 Backend conn. recycles
backend_unused 0 0.00 Backend conn. unused
fetch_head 2 0.00 Fetch head
fetch_length 87222 1.22 Fetch with Length
fetch_chunked 0 0.00 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 6 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
n_srcaddr 0 . N struct srcaddr
n_srcaddr_act 0 . N active struct srcaddr
n_sess_mem 129 . N struct sess_mem
n_sess 1899 . N struct sess
n_object 28283 . N struct object
n_objecthead 28288 . N struct objecthead
n_smf 0 . N struct smf
n_smf_frag 0 . N small free smf
n_smf_large 0 . N large free smf
n_vbe_conn 18446744073709551598 . N struct vbe_conn
n_bereq 30 . N struct bereq
n_wrk 9 . N worker threads
n_wrk_create 682 0.01 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_queue 119 0.00 N queued work requests
n_wrk_overflow 5138 0.07 N overflowed work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 1 . N backends
n_expired 39742 . N expired objects
n_lru_nuked 19046 . N LRU nuked objects
n_lru_saved 0 . N LRU saved objects
n_lru_moved 2610711 . N LRU moved objects
n_deathrow 0 . N objects on deathrow
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 3055715 42.68 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 4751602 66.37 Total Sessions
s_req 4751602 66.37 Total Requests
s_pipe 0 0.00 Total pipe
s_pass 777 0.01 Total pass
s_fetch 87230 1.22 Total fetch
s_hdrbytes 2066945543 28872.39 Total header bytes
s_bodybytes 16985467831 237263.66 Total body bytes
sess_closed 4751602 66.37 Session Closed
sess_pipeline 0 0.00 Session Pipeline
sess_readahead 0 0.00 Session Read Ahead
sess_linger 0 0.00 Session Linger
sess_herd 0 0.00 Session herd
shm_records 215002154 3003.28 SHM records
shm_writes 19146311 267.45 SHM writes
shm_flushes 112 0.00 SHM flushes due to overflow
shm_cont 105277 1.47 SHM MTX contention
shm_cycles 92 0.00 SHM cycles through buffer
sm_nreq 0 0.00 allocator requests
sm_nobj 0 . outstanding allocations
sm_balloc 0 . bytes allocated
sm_bfree 0 . bytes free
sma_nreq 208247 2.91 SMA allocator requests
sma_nobj 55341 . SMA outstanding allocations
sma_nbytes 1073656763 . SMA outstanding bytes
sma_balloc 2560359776 . SMA bytes allocated
sma_bfree 1486703013 . SMA bytes free
sms_nreq 14533 0.20 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 6363722 . SMS bytes allocated
sms_bfree 6363722 . SMS bytes freed
backend_req 87236 1.22 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_purge 1 . N total active purges
n_purge_add 1 0.00 N new purges added
n_purge_retire 0 0.00 N old purges deleted
n_purge_obj_test 0 0.00 N objects tested
n_purge_re_test 0 0.00 N regexps tested against
n_purge_dups 0 0.00 N duplicate purges removed
hcb_nolock 0 0.00 HCB Lookups without lock
hcb_lock 0 0.00 HCB Lookups with lock
hcb_insert 0 0.00 HCB Inserts
esi_parse 0 0.00 Objects ESI parsed (unlock)
esi_errors 0 0.00 ESI parse errors (unlock)
The odd one out is n_vbe_conn, don't know if this is benign.
More information about the varnish-misc
mailing list