Dropped connections with tcp_tw_recycle=1

Sven Ulland sveniu at opera.com
Sat Sep 19 19:46:19 CEST 2009


I was recently debugging an issue where several clients experienced
sporadic problems connecting to a website cached by varnish. Every now
and then (roughly one TCP connection in every 20 to 50), a connection
would time out, or take several SYNs before being accepted.

Here's a typical example. It's observed at the spot marked 'X' in this
network structure from the client network's perspective:

   [clients] -> [NAT gateway] -> [bridge firewall]X -> [Internet]

   0.00 natgw-extip varni-extip TCP 4292 > http [SYN] TSV=283647429 TSER=0 WS=6
   2.99 natgw-extip varni-extip TCP 4292 > http [SYN] TSV=283648179 TSER=0 WS=6
   8.99 natgw-extip varni-extip TCP 4292 > http [SYN] TSV=283649679 TSER=0 WS=6
  20.99 natgw-extip varni-extip TCP 4292 > http [SYN] TSV=283652679 TSER=0 WS=6
  44.99 natgw-extip varni-extip TCP 4292 > http [SYN] TSV=283658679 TSER=0 WS=6
  93.00 natgw-extip varni-extip TCP 4292 > http [SYN] TSV=283670679 TSER=0 WS=6
  93.00 varni-extip natgw-extip TCP http > 4292 [SYN, ACK] TSV=2342207123 TSER=283670679

Note: The NAT gateway didn't do port translation here. Also, the
timestamp values were not touched by the NAT gateway. The varnish node
is behind LVS-TUN, but the LVS was not the culprit.
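
As a sanity check on the trace above, here is a small sketch that
computes the gap and the TSV delta between consecutive SYNs. The
numbers are copied from the trace; the function names are just made up
for illustration.

```python
# (seconds since the first SYN, TSV) for the six client SYNs in the trace.
syns = [
    (0.00, 283647429),
    (2.99, 283648179),
    (8.99, 283649679),
    (20.99, 283652679),
    (44.99, 283658679),
    (93.00, 283670679),
]


def gaps(samples):
    """Return (time gap in seconds, TSV delta) between consecutive SYNs."""
    return [
        (round(t1 - t0, 2), v1 - v0)
        for (t0, v0), (t1, v1) in zip(samples, samples[1:])
    ]


def tick_rate(samples):
    """Estimate timestamp ticks per second across the whole trace."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


for dt, dv in gaps(syns):
    print(f"gap {dt:6.2f} s   TSV delta {dv:6d}")
print(f"approx. {tick_rate(syns):.0f} ticks/s")
```

The gaps roughly double each time (about 3, 6, 12, 24 and 48 seconds),
which looks like ordinary SYN retransmission backoff, and the TSV
advances at a steady 250 ticks per second. So the timestamps in these
retries increase monotonically, as one would expect from a single
client host.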

After troubleshooting with the website owner and tcpdumping at various
points on both sides, it was clear that the packets were reaching the
varnish node, but all of them except the last SYN were dropped. This
turned out to be because the varnish node had the tcp_tw_recycle sysctl
enabled. Switching it off fixed the problem.
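
To see whether a given box is affected, the relevant settings can be
read straight from /proc. This is only a minimal sketch: the helper
name read_sysctl is made up, and it assumes the usual
/proc/sys/net/ipv4 layout. A kernel that does not expose a setting
simply yields None.

```python
from pathlib import Path

PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_sysctl(name, base=PROC_IPV4):
    """Return a net.ipv4 sysctl as an int, or None if it cannot be read."""
    try:
        return int((base / name).read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected (non-integer) content.
        return None


if __name__ == "__main__":
    for name in ("tcp_tw_recycle", "tcp_tw_reuse", "tcp_timestamps"):
        print(f"net.ipv4.{name} = {read_sysctl(name)}")
```

If tcp_tw_recycle shows up as 1, it can be switched off at runtime with
"sysctl -w net.ipv4.tcp_tw_recycle=0" (as root), and the change made
permanent in /etc/sysctl.conf.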

The performance page on the varnish wiki has featured recommended Linux
sysctl settings, including enabling tcp_tw_recycle, since April 2008.
The recycle setting was removed from that page recently, but I would
think there are a lot of installations around the world that have it
enabled.

I tried to figure out exactly how the recycling mechanism works, but the
code is too complex to figure out without time or kernel network
experience. Recycling was introduced by David Miller in 2.3.15, ref
<URL:http://lxr.linux.no/#linux-old+v2.3.15/net/ipv4/tcp_ipv4.c#L324>
and e.g. <URL:http://lxr.linux.no/#linux+v2.6.31/net/ipv4/tcp_ipv4.c#L1255>.
Does anyone have a good grasp on how it works, its connection to the RFC
1323 PAWS mechanism, and its claimed incompatibility with NAT (ref
<URL:http://lkml.org/lkml/2008/11/15/83>)?

When I observed the same issue previously (dropped SYNs), I ditched
tcp_tw_recycle in favour of tcp_tw_reuse, which doesn't seem to cause
any problems (this was on a normal Apache system). It too is severely
underdocumented, so I was hoping to shed some light on both settings
and the exact circumstances in which they are suitable for use.

Sven


More information about the varnish-misc mailing list