Child panics on OpenSolaris

Paul Wright wrighty+varnishmisc at gmail.com
Wed Feb 17 12:10:58 CET 2010


On 17 February 2010 10:57, Poul-Henning Kamp <phk at phk.freebsd.dk> wrote:
> In message <282e72051002170230k7ae8e0c8hc2d5226ca9288d51 at mail.gmail.com>, Paul
> Wright writes:
>
>>I've compiled with the additional -mt flag, here's my current
>>compilation process:
>
> Please pull a brand new -trunk, I have added a check for errno
> working and I would like to make sure that passes for you also.

I have pulled the latest revision and can confirm that unlink("/")
failed as expected.  I ran it briefly and saw this panic:

Child (28554) died signal=6
Child (28554) Panic message: Assert error in http_copyheader(), cache_http.c line 647:
  Condition(n < to->shd) not true.
thread = (cache-worker)
ident = -smalloc,-hcritbit,poll
Backtrace:
  447adb: /opt/sbin/varnishd'pan_backtrace+0x1b [0x447adb]
  447de5: /opt/sbin/varnishd'pan_ic+0x1c5 [0x447de5]
  440791: /opt/sbin/varnishd'http_copyheader+0x1c1 [0x440791]
  4423b1: /opt/sbin/varnishd'http_FilterFields+0xdc1 [0x4423b1]
  429c4d: /opt/sbin/varnishd'cnt_fetch+0x11fd [0x429c4d]
  42cf3a: /opt/sbin/varnishd'CNT_Session+0x78a [0x42cf3a]
  44a7ef: /opt/sbin/varnishd'wrk_do_cnt_sess+0x1bf [0x44a7ef]
  449d62: /opt/sbin/varnishd'wrk_thread_real+0x882 [0x449d62]
  44a315: /opt/sbin/varnishd'wrk_thread+0x135 [0x44a315]
  fffffd7ff653acf5: /lib/amd64/libc.so.1'_thrp_setup+0x8d [0xfffffd7ff653acf5]
sp = 18c9b78 {
  fd = 38, id = 38, xid = 789049749,
  client = 82.71.124.65:58190,
  step = STP_FETCH,
  handling = pass,
  err_code = 206, err_reason = (null),
  restarts = 0, esis = 0
  ws = 18c9be8 {
    id = "sess",
    {s,f,r,e} = {18ca8f0,+884,0,+65536},
  },
  http[req] = {
    ws = 18c9be8[sess]
      "GET",
      "/xml/rss.top20.000.xml",
      "HTTP/1.1",
      "Host: www.firebox.com",
      "User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5;
en-GB; rv:1.9.2) Gecko/20100115 Firefox/3.6",
      "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
      "Accept-Language: en-gb,en;q=0.5",
      "Accept-Encoding: gzip,deflate",
      "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7",
      "Keep-Alive: 115",
      "Connection: keep-alive",
      "X-Moz: livebookmarks",
      "Cookie: user_session=XXXX;
locale=company_id%3A0%2Ccurrency_id%3A0%2Clanguage_id%3A0%2Ccountry%3AUnited+Kingdom;
xp_list=52%3D1;
__utma=64137912.1325032789.1266225177.1266397371.1266400910.11;
__utmc=64137912;
__utmz=64137912.1266225177.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)",
      "Range: bytes=15136-",
      "If-Range: "14b800b-678d-47fc91860e540"",
      "Cache-Control: max-age=0",
      "X-Forwarded-For: 82.71.124.65",
  },
  worker = fffffd7ff69fad30 {
    ws = fffffd7ff69fae78 {
      id = "wrk",
      {s,f,r,e} = {fffffd7ff69e8c40,+323,0,+65536},
    },
    http[bereq] = {
      ws = fffffd7ff69fae78[wrk]
        "GET",
        "/xml/rss.top20.000.xml",
        "HTTP/1.1",
        "Host: www.firebox.com",
        "User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5;
en-GB; rv:1.9.2) Gecko/20100115 Firefox/3.6",
        "Accept:
text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language: en-gb,en;q=0.5",
        "Accept-Encoding: gzip,deflate",
        "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7",
        "X-Moz: livebookmarks",
        "Cookie: user_session=XXXX;
locale=company_id%3A0%2Ccurrency_id%3A0%2Clanguage_id%3A0%2Ccountry%3AUnited+Kingdom;
xp_list=52%3D1;
__utma=64137912.1325032789.1266225177.1266397371.1266400910.11;
__utmc=64137912;
__utmz=64137912.1266225177.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)",
        "Range: bytes=15136-",
        "If-Range: "14b800b-678d-47fc91860e540"",
        "X-Forwarded-For: 82.71.124.65",
        "X-Varnish: 789049749",
    },
    http[beresp] = {
      ws = fffffd7ff69fae78[wrk]
        "HTTP/1.1",
        "206",
        "Partial Content",
        "Date: Wed, 17 Feb 2010 10:48:30 GMT",
        "Server: Apache",
        "Last-Modified: Wed, 17 Feb 2010 10:13:01 GMT",
        "ETag: "14b800b-678d-47fc91860e540"",
        "Accept-Ranges: bytes",
        "Content-Length: 11373",
        "Content-Range: bytes 15136-26508/26509",
        "Connection: close",
        "Content-Type: application/xml",
    },
    },
    vcl = {
      srcname = {
        "input",
        "Default",
      },
    },
  obj = 33336a0 {
    xid = 789049749,
    ws = 33336c0 {
      id = "obj",
      {s,f,r,e} = {33338a5,33338a5,0,+220},
    },
    http[obj] = {
      ws = 33336c0[obj]
        "HTTP/1.1",
        "206",
        "Partial Content",
        "Date: Wed, 17 Feb 2010 10:48:30 GMT",
        "Server: Apache",
        "Last-Modified: Wed, 17 Feb 2010 10:13:01 GMT",
        "ETag: "14b800b-678d-47fc91860e540"",
    },
    len = 0,
    store = {
    },
  },
},


>>Child (14052) Panic message: Assert error in TCP_nonblocking(), tcp.c line 172:
>>  Condition((ioctl(sock, ((int)((uint32_t)(0x80000000|(((sizeof
>>(int))&0xff)<<16)| ('f'<<8)|126))), &i)) == 0) not true.
>>errno = 131 (Connection reset by peer)
>
>
> Now, _this_ errno I can actually believe, because that matches
> the packet traces we have seen, and it is a plausible scenario.
>
> The fact that the Solaris docs do not mention ECONNRESET as a legal
> error return for ioctl is a minor detail in that context.
>
> The difference here is that the traditional BSD stack does not
> return ECONNRESET until you try to move data on the socket,
> giving you much simpler error checking on socket-state changes
> (ioctl, fcntl, setsockopt, getsockopt etc)
>
> So now that we have reached the root-cause, I need to go through
> and do complex error-checking for all the socket-state calls.
>
> Hopefully done later today...
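
For illustration, the more tolerant error checking described above could
look roughly like the sketch below. It is not the actual Varnish change:
the helper names sock_call_ok() and set_nonblocking() are hypothetical, and
the set of errno values treated as "peer already gone" (ECONNRESET,
ENOTCONN) is an assumption. The expanded ioctl request in the quoted panic
appears to be FIONBIO, so that is the call used here.

```c
#include <errno.h>
#include <sys/ioctl.h>	/* FIONBIO; on Solaris <sys/filio.h> may also be needed */

/*
 * Hypothetical helper: a socket-state call is "ok" if it succeeded,
 * or if it failed only because the peer has already gone away.
 * Which errno values to tolerate is an assumption of this sketch.
 */
static int
sock_call_ok(int rv)
{
	if (rv == 0)
		return (1);
	return (errno == ECONNRESET || errno == ENOTCONN);
}

/*
 * Hypothetical wrapper: put the socket in non-blocking mode.
 * Returns 0 on success or when the connection is already dead,
 * -1 on any other error, instead of asserting that ioctl() returns 0.
 */
static int
set_nonblocking(int sock)
{
	int i = 1;
	int rv = ioctl(sock, FIONBIO, &i);

	return (sock_call_ok(rv) ? 0 : -1);
}
```

With a check like this, a connection reset reported early by the Solaris
stack is left to be handled when data is next moved on the socket, while
other failures (a bad descriptor, for instance) are still reported.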

Grand.

Paul.

