Varnish constantly restarting

Alexander Staubo alex at bengler.no
Fri Jul 2 13:53:39 CEST 2010


Varnish (2.0.6 on Ubuntu Hardy, also tested 2.1.2 from source; Linux
x64) restarts many times per day with the following messages being
logged:

varnishd[9010]: child (9011) Started
varnishd[9010]: Child (9011) said Closed fds: 3 4 5 9 10 12 13
varnishd[9010]: Child (9011) said Child starts
varnishd[9010]: Child (9011) said Ready
varnishd[9010]: Child (9011) not responding to ping, killing it.
varnishd[9010]: Child (9011) not responding to ping, killing it.
varnishd[9010]: Child (9011) died signal=3
varnishd[9010]: Child cleanup complete
varnishd[9010]: child (9315) Started
varnishd[9010]: Child (9315) said Closed fds: 3 4 5 10 11 13 14
varnishd[9010]: Child (9315) said Child starts
varnishd[9010]: Child (9315) said Ready

Tried modifying defaults and switching from file-based cache to
malloc-based; it seems to be hanging less often now, but it still
happens a couple of times a day. Here is my current command line:

  varnishd -P /var/run/varnishd.pid \
    -a :6081 \
    -f /etc/varnish/default.vcl \
    -T 127.0.0.1:6082 \
    -t 120 \
    -w 4,30,120 \
    -s malloc,1G \
    -p thread_pool_max=300 \
    -p ping_interval=5 \
    -p cli_timeout=120s \
    -u varnish \
    -g varnish

At the point when the Varnish child stops responding to ping, CPU
usage blows through the roof. I started a screen running on the box
that logs varnishstat every 10s; this is the last output just abefore
it dies:

uptime                  71589          .   Child uptime
client_conn           4751731        66.38 Client connections accepted
client_drop                 0         0.00 Connection dropped, no sess
client_req            4747499        66.32 Client requests received
cache_hit             4647642        64.92 Cache hits
cache_hitpass             775         0.01 Cache hits for pass
cache_miss              86706         1.21 Cache misses
backend_conn            87236         1.22 Backend conn. success
backend_unhealthy            0         0.00 Backend conn. not attempted
backend_busy                0         0.00 Backend conn. too many
backend_fail              247         0.00 Backend conn. failures
backend_reuse               0         0.00 Backend conn. reuses
backend_toolate             0         0.00 Backend conn. was closed
backend_recycle             0         0.00 Backend conn. recycles
backend_unused              0         0.00 Backend conn. unused
fetch_head                  2         0.00 Fetch head
fetch_length            87222         1.22 Fetch with Length
fetch_chunked               0         0.00 Fetch chunked
fetch_eof                   0         0.00 Fetch EOF
fetch_bad                   0         0.00 Fetch had bad headers
fetch_close                 6         0.00 Fetch wanted close
fetch_oldhttp               0         0.00 Fetch pre HTTP/1.1 closed
fetch_zero                  0         0.00 Fetch zero len
fetch_failed                0         0.00 Fetch failed
n_srcaddr                   0          .   N struct srcaddr
n_srcaddr_act               0          .   N active struct srcaddr
n_sess_mem                129          .   N struct sess_mem
n_sess                   1899          .   N struct sess
n_object                28283          .   N struct object
n_objecthead            28288          .   N struct objecthead
n_smf                       0          .   N struct smf
n_smf_frag                  0          .   N small free smf
n_smf_large                 0          .   N large free smf
n_vbe_conn       18446744073709551598          .   N struct vbe_conn
n_bereq                    30          .   N struct bereq
n_wrk                       9          .   N worker threads
n_wrk_create              682         0.01 N worker threads created
n_wrk_failed                0         0.00 N worker threads not created
n_wrk_max                   0         0.00 N worker threads limited
n_wrk_queue               119         0.00 N queued work requests
n_wrk_overflow           5138         0.07 N overflowed work requests
n_wrk_drop                  0         0.00 N dropped work requests
n_backend                   1          .   N backends
n_expired               39742          .   N expired objects
n_lru_nuked             19046          .   N LRU nuked objects
n_lru_saved                 0          .   N LRU saved objects
n_lru_moved           2610711          .   N LRU moved objects
n_deathrow                  0          .   N objects on deathrow
losthdr                     0         0.00 HTTP header overflows
n_objsendfile               0         0.00 Objects sent with sendfile
n_objwrite            3055715        42.68 Objects sent with write
n_objoverflow               0         0.00 Objects overflowing workspace
s_sess                4751602        66.37 Total Sessions
s_req                 4751602        66.37 Total Requests
s_pipe                      0         0.00 Total pipe
s_pass                    777         0.01 Total pass
s_fetch                 87230         1.22 Total fetch
s_hdrbytes         2066945543     28872.39 Total header bytes
s_bodybytes       16985467831    237263.66 Total body bytes
sess_closed           4751602        66.37 Session Closed
sess_pipeline               0         0.00 Session Pipeline
sess_readahead              0         0.00 Session Read Ahead
sess_linger                 0         0.00 Session Linger
sess_herd                   0         0.00 Session herd
shm_records         215002154      3003.28 SHM records
shm_writes           19146311       267.45 SHM writes
shm_flushes               112         0.00 SHM flushes due to overflow
shm_cont               105277         1.47 SHM MTX contention
shm_cycles                 92         0.00 SHM cycles through buffer
sm_nreq                     0         0.00 allocator requests
sm_nobj                     0          .   outstanding allocations
sm_balloc                   0          .   bytes allocated
sm_bfree                    0          .   bytes free
sma_nreq               208247         2.91 SMA allocator requests
sma_nobj                55341          .   SMA outstanding allocations
sma_nbytes         1073656763          .   SMA outstanding bytes
sma_balloc         2560359776          .   SMA bytes allocated
sma_bfree          1486703013          .   SMA bytes free
sms_nreq                14533         0.20 SMS allocator requests
sms_nobj                    0          .   SMS outstanding allocations
sms_nbytes                  0          .   SMS outstanding bytes
sms_balloc            6363722          .   SMS bytes allocated
sms_bfree             6363722          .   SMS bytes freed
backend_req             87236         1.22 Backend requests made
n_vcl                       1         0.00 N vcl total
n_vcl_avail                 1         0.00 N vcl available
n_vcl_discard               0         0.00 N vcl discarded
n_purge                     1          .   N total active purges
n_purge_add                 1         0.00 N new purges added
n_purge_retire              0         0.00 N old purges deleted
n_purge_obj_test            0         0.00 N objects tested
n_purge_re_test             0         0.00 N regexps tested against
n_purge_dups                0         0.00 N duplicate purges removed
hcb_nolock                  0         0.00 HCB Lookups without lock
hcb_lock                    0         0.00 HCB Lookups with lock
hcb_insert                  0         0.00 HCB Inserts
esi_parse                   0         0.00 Objects ESI parsed (unlock)
esi_errors                  0         0.00 ESI parse errors (unlock)

The odd one out is n_vbe_conn, don't know if this is benign.



More information about the varnish-misc mailing list