only 503s after a few hours (trunk-4384)
Fernando Papa
fernando.papa at liveinteractive.net
Wed Dec 9 14:51:04 CET 2009
Happened again. This time I left varnishlog running, and got...
989 TxStatus c 503
989 TxResponse c Service Unavailable
989 TxHeader c Server: Varnish
989 TxHeader c Date: Wed, 09 Dec 2009 00:38:07 GMT
989 TxHeader c Age: 0
989 TxHeader c Connection: close
989 ReqEnd c 1933950257 1260319087.004358053 1260319087.004553556
0.000067711 0.000121117 0.000074387
989 SessionClose c error
989 StatSess c 189.165.32.95 1480 0 1 1 0 0 0 117 0
5 Debug - "Too many open files when accept(2)ing. Sleeping."
<EOF>
That "Too many..." debug message appears 709 times. First one:
<line number 14292239>
1023 TxProtocol c HTTP/1.1
1023 TxStatus c 200
1023 TxResponse c OK
1023 TxHeader c Server: xxx
1023 TxHeader c Cache-Control: public, max-age=3600
1023 TxHeader c Content-Type: application/x-shockwave-flash
1023 TxHeader c Content-Length: 659
1023 TxHeader c Date: Wed, 09 Dec 2009 00:10:05 GMT
1023 TxHeader c Age: 972
1023 TxHeader c Connection: keep-alive
1023 ReqEnd c 1933931697 1260317405.392798424 1260317405.392909527
0.000062466 0.000048637 0.000062466
706 Debug c "herding"
695 Debug c "herding"
1007 Debug c "herding"
1023 Debug c "herding"
5 Debug - "Too many open files when accept(2)ing. Sleeping."
937 ReqStart c 190.13.3.173 51259 1933931698
937 RxRequest c GET
937 RxURL c xxx
937 RxProtocol c HTTP/1.1
937 RxHeader c Accept: */*
937 RxHeader c Referer: xxx
937 RxHeader c Accept-Language: es-co
937 RxHeader c UA-CPU: x86
937 RxHeader c Accept-Encoding: gzip, deflate
937 RxHeader c User-Agent: xxx
937 RxHeader c Host: xxx
937 RxHeader c Connection: Keep-Alive
937 RxHeader c Cookie: xxx=xxx
937 VCL_call c recv
937 VCL_return c lookup
937 VCL_call c hash
937 VCL_return c hash
937 VCL_call c miss
937 VCL_return c fetch
937 FetchError c no backend connection
937 VCL_call c error
937 VCL_return c deliver
Varnish account does not have limits (ulimit returns unlimited). Memory
usage (top): 403m/89m/81m.
# ps -AF | grep varnish
root 19223 1 0 21419 928 2 Dec08 ? 00:00:00 varnishd -a
*:80 -f xxx.vcl -p obj_workspace 4096 -s malloc,100M -p thread_pools 4
nobody 19224 19223 0 103378 91244 3 Dec08 ? 00:06:52 varnishd -a
*:80 -f xxx.vcl -p obj_workspace 4096 -s malloc,100M -p thread_pools 4
# kill 19224
# ps -AF | grep varnish
root 19223 1 0 21419 960 2 Dec08 ? 00:00:00 varnishd -a
*:80 -f xxx.vcl -p obj_workspace 4096 -s malloc,100M -p thread_pools 4
nobody 31319 19223 1 94673 3532 3 12:33 ? 00:00:00 varnishd -a
*:80 -f xxx.vcl -p obj_workspace 4096 -s malloc,100M -p thread_pools 4
That made it work again --as in processing requests--, except for varnishlog
that continues to show nothing.
----- Original Message -----
From: "Fernando Papa" <fernando.papa at liveinteractive.net>
To: <varnish-misc at projects.linpro.no>
Sent: Tuesday, December 08, 2009 2:01 PM
Subject: only 503s after a few hours (trunk-4384)
> Hello,
>
> After a few hours, Varnish trunk-4384 started replying with 503.
> New work-load, first day of use. Previously used Varnish only with
> internal
> traffic (eg no crazy headers, just carefully crafted requests).
>
> Facts:
>
> * Xeon 2.4 GHz (quad-core, 512KBx4 cache) server. CentOS 5.4.
> * Worked fine for ~9 hours, sending to backend 150+ K requests plus
> Varnish
> pings.
> * Backend is an application server. Payload is <1 KB plus minimal
> headers.
> * I can still access backend bypassing Varnish.
> * Varnish continues pinging backend, and receiving a timely 200 OK every
> time (backend logs).
> * req.backend.healthy = true. Tested with a codepath along the lines
> (req.request == "GET") && (req.url == "/") && (req.backend.healthy), that
> does an 'error 200 "OK"'.
> * varnishlog does not output anything at all. Not even the backend pings.
> * Tested every path within vcl_recv. Exiting with "error" works, but
> "lookup" results in a 503.
> * Very low CPU usage. Sorry, didn't check memory usage before restarting
> it.
>
> CFLAGS='-O2 -fomit-frame-pointer -march=pentium4 -mfpmath=sse'
> ./configure --prefix=$HOME/local --disable-largefile
>
> $HOME/local/sbin/varnishd -a *:80 -f $HOME/local/etc/varnish/xxx.vcl -p
> obj_workspace=4096 -s malloc,100M -p thread_pools=`grep -c 'model name'
> /proc/cpuinfo`
>
> backend fd1
> {
> .host="localhost";
> .port="85";
> .connect_timeout = 1s;
> .first_byte_timeout = 20s;
> .between_bytes_timeout =2s;
> .probe = {
> .request =
> "GET /xxx HTTP/1.1"
> "Host: yyy"
> "Connection: close";
> .timeout = 0.5 s;
> .window = 8;
> .threshold = 3;
> }
> }
>
> director fdcluster round-robin
> {
> { .backend = fd1; }
> }
>
> Restarted Varnish, now it works again.
>
> Any idea? Anything wrong in the command line?
>
> Thanks,
> Fernando.
>
More information about the varnish-misc
mailing list