Varnish 2.0.6 nuking all my objects?

Barry Abrahamson barry at automattic.com
Thu Feb 25 05:15:17 CET 2010


Howdy,

We are finally getting around to upgrading to the latest version of varnish and are running into quite a weird problem.  Everything works fine for a bit (1+day) , then all of a sudden Varnish starts nuking all of the objects from the cache:

About 4 hours ago there were 1 million objects in the cache, now there are just about 172k.  This looks a bit weird to me:

sms_nbytes       18446744073709548694          .   SMS outstanding bytes

Here are the options I am passing to varnishd:

/usr/local/sbin/varnishd -a 0.0.0.0:8888 -f /etc/varnish/varnish.vcl -P /var/run/varnishd.pid -T 0.0.0.0:47200 -t 600 -w 1,200,300 -p thread_pools 4 -p thread_pool_add_delay 2 -p lru_interval 60 -h classic,500009 -p obj_workspace 4096 -s file,/varnish/cache,150G

/varnish is 2 x 80GB Intel X-25M SSDs in a software RAID 0 array.  OS is Debian Lenny 64-bit.  There is plenty of space:

/dev/md0              149G   52G   98G  35% /varnish

Here is the output of varnishstat -1

uptime                 134971          .   Child uptime
client_conn          12051037        89.29 Client connections accepted
client_drop                 0         0.00 Connection dropped, no sess
client_req           12048672        89.27 Client requests received
cache_hit            10161272        75.28 Cache hits
cache_hitpass          133244         0.99 Cache hits for pass
cache_miss            1750857        12.97 Cache misses
backend_conn          1824594        13.52 Backend conn. success
backend_unhealthy            0         0.00 Backend conn. not attempted
backend_busy                0         0.00 Backend conn. too many
backend_fail             3644         0.03 Backend conn. failures
backend_reuse               0         0.00 Backend conn. reuses
backend_toolate             0         0.00 Backend conn. was closed
backend_recycle             0         0.00 Backend conn. recycles
backend_unused              0         0.00 Backend conn. unused
fetch_head               5309         0.04 Fetch head
fetch_length          1816422        13.46 Fetch with Length
fetch_chunked               0         0.00 Fetch chunked
fetch_eof                   0         0.00 Fetch EOF
fetch_bad                   0         0.00 Fetch had bad headers
fetch_close                 0         0.00 Fetch wanted close
fetch_oldhttp               0         0.00 Fetch pre HTTP/1.1 closed
fetch_zero                  0         0.00 Fetch zero len
fetch_failed               16         0.00 Fetch failed
n_srcaddr                   0          .   N struct srcaddr
n_srcaddr_act               0          .   N active struct srcaddr
n_sess_mem                578          .   N struct sess_mem
n_sess                    414          .   N struct sess
n_object               172697          .   N struct object
n_objecthead           173170          .   N struct objecthead
n_smf                  471310          .   N struct smf
n_smf_frag              62172          .   N small free smf
n_smf_large             67978          .   N large free smf
n_vbe_conn       18446744073709551611          .   N struct vbe_conn
n_bereq                   315          .   N struct bereq
n_wrk                      76          .   N worker threads
n_wrk_create             3039         0.02 N worker threads created
n_wrk_failed                0         0.00 N worker threads not created
n_wrk_max                   0         0.00 N worker threads limited
n_wrk_queue                 0         0.00 N queued work requests
n_wrk_overflow          25136         0.19 N overflowed work requests
n_wrk_drop                  0         0.00 N dropped work requests
n_backend                   4          .   N backends
n_expired              771687          .   N expired objects
n_lru_nuked            744693          .   N LRU nuked objects
n_lru_saved                 0          .   N LRU saved objects
n_lru_moved           8675178          .   N LRU moved objects
n_deathrow                  0          .   N objects on deathrow
losthdr                    25         0.00 HTTP header overflows
n_objsendfile               0         0.00 Objects sent with sendfile
n_objwrite           11749415        87.05 Objects sent with write
n_objoverflow               0         0.00 Objects overflowing workspace
s_sess               12051007        89.29 Total Sessions
s_req                12050184        89.28 Total Requests
s_pipe                   2661         0.02 Total pipe
s_pass                 134858         1.00 Total pass
s_fetch               1821721        13.50 Total fetch
s_hdrbytes         3932274894     29134.22 Total header bytes
s_bodybytes      894452020319   6626994.10 Total body bytes
sess_closed          12050925        89.29 Session Closed
sess_pipeline               0         0.00 Session Pipeline
sess_readahead              0         0.00 Session Read Ahead
sess_linger                 0         0.00 Session Linger
sess_herd                 160         0.00 Session herd
shm_records         610011852      4519.58 SHM records
shm_writes           50380066       373.27 SHM writes
shm_flushes              9387         0.07 SHM flushes due to overflow
shm_cont                47763         0.35 SHM MTX contention
shm_cycles                226         0.00 SHM cycles through buffer
sm_nreq               4449213        32.96 allocator requests
sm_nobj                341160          .   outstanding allocations
sm_balloc         11373072384          .   bytes allocated
sm_bfree         116602589184          .   bytes free
sma_nreq                    0         0.00 SMA allocator requests
sma_nobj                    0          .   SMA outstanding allocations
sma_nbytes                  0          .   SMA outstanding bytes
sma_balloc                  0          .   SMA bytes allocated
sma_bfree                   0          .   SMA bytes free
sms_nreq                63997         0.47 SMS allocator requests
sms_nobj                    0          .   SMS outstanding allocations
sms_nbytes       18446744073709548694          .   SMS outstanding bytes
sms_balloc           31161028          .   SMS bytes allocated
sms_bfree            31162489          .   SMS bytes freed
backend_req           1821961        13.50 Backend requests made
n_vcl                       1         0.00 N vcl total
n_vcl_avail                 1         0.00 N vcl available
n_vcl_discard               0         0.00 N vcl discarded
n_purge                     1          .   N total active purges
n_purge_add                 1         0.00 N new purges added
n_purge_retire              0         0.00 N old purges deleted
n_purge_obj_test            0         0.00 N objects tested
n_purge_re_test             0         0.00 N regexps tested against
n_purge_dups                0         0.00 N duplicate purges removed
hcb_nolock                  0         0.00 HCB Lookups without lock
hcb_lock                    0         0.00 HCB Lookups with lock
hcb_insert                  0         0.00 HCB Inserts
esi_parse                   0         0.00 Objects ESI parsed (unlock)
esi_errors                  0         0.00 ESI parse errors (unlock)

Obviously, this isn't good for my cache hit rate :)  It is also using a lot of CPU.  Has anyone seen this happen before?

--
Barry Abrahamson | Systems Wrangler | Automattic
Blog: http://barry.wordpress.com






More information about the varnish-misc mailing list