Re: Segmentation fault in FwdState::serverClosed

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 09 May 2014 05:49:58 +1200

On 9/05/2014 3:34 a.m., Alex Rousskov wrote:
> On 05/08/2014 07:21 AM, Amos Jeffries wrote:
>
>> This is a side effect of trunk rev.13388 (standby connections)
>
> If this is something you can reproduce, please post/share an ALL,9
> cache.log.
>
>
>> Something is still calling serverConn->close() directly instead of via
>> the FwdState::closeServer() method. This appears to be happening in at
>> least the FwdState methods connectTimeout(), dispatch() and
>> connectDone() error cases.
>>
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x082234fa in FwdState::serverClosed (this=0x8c46488, fd=-1) at
>> ../../src/FwdState.cc:625
>> 625 fwdPconnPool->noteUses(fd_table[fd].pconn.uses);
>> (gdb) bt
>> #0 0x082234fa in FwdState::serverClosed (this=0x8c46488, fd=-1) at
>> ../../src/FwdState.cc:625
>> #1 0x082230e6 in fwdServerClosedWrapper (params=...) at
>> ../../src/FwdState.cc:539
>> #2 0x08373185 in CommCloseCbPtrFun::dial (this=0x8c3e12c) at
>> ../../src/CommCalls.cc:211
>
> Do you know how a connection close callback (fwdServerClosedWrapper) can
> be called with fd set to -1? It feels like there is a shared Connection
> object that two different jobs are updating/closing independently,
> stepping on each other's toes. In theory, any connection owned by
> FwdState should not be in the standby pool but perhaps there is a bug in
> getting some standby connections disassociated from the standby code
> when they are passed to FwdState (e.g., a stale close handler that
> should have been removed)?

AFAIK, for the FD to be -1 the connection has to have been closed by
Squid calling X->close() on it. And yes, the serverConn
Comm::Connection object is shared between FwdState, HttpStateData and
its Server parent.

The methods I mentioned above are the strongest candidates for causing
it, since they do not use the new server-closing function added by the
standby patch. There may be others outside of FwdState as well.
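
To make the suspected failure mode concrete, here is a toy model of it.
This is only a sketch: ToyConnection, ToyFwdState, directClose() and
wrappedClose() are hypothetical names, not Squid's real classes, and the
assumption that a direct close() invalidates the fd before the close
handler runs is my reading of the trace, not something confirmed in the
code. The closeServer() body shows one plausible shape for such a
wrapper, not what the standby patch actually does.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for a shared Comm::Connection; NOT Squid's real class.
struct ToyConnection {
    int fd = -1;
    std::function<void(int)> closeHandler; // at most one, for brevity

    // Direct close (assumed behaviour): invalidate fd first, then fire
    // whatever close handler is still registered.
    void close() {
        fd = -1;
        if (closeHandler) {
            auto handler = std::move(closeHandler);
            closeHandler = nullptr;
            handler(fd); // the handler is handed -1
        }
    }
};

// Toy stand-in for FwdState; only the parts relevant to the crash.
struct ToyFwdState {
    std::shared_ptr<ToyConnection> serverConn;
    std::vector<int> fdsSeenByHandler; // what serverClosed() was given
    int accountedFd = -1;              // fd used for pconn accounting

    explicit ToyFwdState(std::shared_ptr<ToyConnection> conn)
        : serverConn(std::move(conn)) {
        serverConn->closeHandler = [this](int fd) { serverClosed(fd); };
    }

    // The real method indexes fd_table[fd]; with fd == -1 that would be
    // the out-of-bounds access. Here we only record the value.
    void serverClosed(int fd) { fdsSeenByHandler.push_back(fd); }

    // Hypothetical closeServer()-style wrapper: detach the handler and
    // do the accounting while fd is still valid, then close.
    void closeServer() {
        serverConn->closeHandler = nullptr;
        accountedFd = serverConn->fd;
        serverConn->close();
    }
};

// Returns the fd values the close handler saw after a direct close().
std::vector<int> directClose(int fd) {
    auto conn = std::make_shared<ToyConnection>();
    conn->fd = fd;
    ToyFwdState fwd(conn);
    conn->close(); // bypasses the wrapper
    return fwd.fdsSeenByHandler;
}

// Returns the fd used for accounting when closing via the wrapper.
int wrappedClose(int fd) {
    auto conn = std::make_shared<ToyConnection>();
    conn->fd = fd;
    ToyFwdState fwd(conn);
    fwd.closeServer();
    assert(fwd.fdsSeenByHandler.empty()); // handler never ran
    return fwd.accountedFd;
}
```

In this model directClose(12) leaves the handler holding -1, while
wrappedClose(12) does its accounting against 12 and never runs the
handler at all. Any other owner of the shared object calling close()
directly would hit the first path.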

The only way I have found to reproduce it so far is to load a page that
is slowed by many components, then click a link to navigate away while
the page is still half-loaded.

cache.log for the below trace is at
<http://master.squid-cache.org/~amosjeffries/patches/rev13388_segfault.log>

Program received signal SIGSEGV, Segmentation fault.
0x0821fefc in FwdState::serverClosed (this=0x8b26190, fd=-1) at
../../src/FwdState.cc:622
622 debugs(17, 2, "FD " << fd << " " << entry->url() << " after " <<
(gdb) bt full
#0 0x0821fefc in FwdState::serverClosed (this=0x8b26190, fd=-1) at
../../src/FwdState.cc:622
        _dbo = @0x8b96948: <incomplete type>
        __FUNCTION__ = "serverClosed"
#1 0x0821fb9e in fwdServerClosedWrapper (params=...) at
../../src/FwdState.cc:539
        fwd = 0x8b26190
#2 0x0836ebe7 in CommCloseCbPtrFun::dial (this=0x8afa474) at
../../src/CommCalls.cc:211
No locals.
#3 0x0836de4a in CommCbFunPtrCallT<CommCloseCbPtrFun>::fire
(this=0x8afa458) at ../../src/CommCalls.h:379
No locals.
#4 0x0835df99 in AsyncCall::make (this=0x8afa458) at
../../../src/base/AsyncCall.cc:32
        __FUNCTION__ = "make"
#5 0x08361440 in AsyncCallQueue::fireNext (this=0x87086c8) at
../../../src/base/AsyncCallQueue.cc:52
        call = {p_ = 0x8afa458}
        __FUNCTION__ = "fireNext"
#6 0x083611d3 in AsyncCallQueue::fire (this=0x87086c8) at
../../../src/base/AsyncCallQueue.cc:38
        made = true
#7 0x081fc43d in EventLoop::dispatchCalls (this=0xbffff658) at
../../src/EventLoop.cc:165
        dispatchedSome = 8
#8 0x081fc2f2 in EventLoop::runOnce (this=0xbffff658) at
../../src/EventLoop.cc:142
        sawActivity = false
        waitingEngine = 0xbffff680
        __FUNCTION__ = "runOnce"
#9 0x081fc174 in EventLoop::run (this=0xbffff658) at
../../src/EventLoop.cc:104
No locals.
#10 0x08270170 in SquidMain (argc=2, argv=0xbffff7c4) at
../../src/main.cc:1510
        WIN32_init_err = 0
        __FUNCTION__ = "SquidMain"
        signalEngine = {<AsyncEngine> = {_vptr.AsyncEngine = 0x842adc8
<vtable for SignalEngine+8>}, <No data fields>}
        store_engine = {<AsyncEngine> = {_vptr.AsyncEngine = 0x842ade0
<vtable for StoreRootEngine+8>}, <No data fields>}
        comm_engine = {<AsyncEngine> = {_vptr.AsyncEngine = 0x8504c78
<vtable for CommSelectEngine+8>}, <No data fields>}
        mainLoop = {errcount = 0, static Running = 0xbffff658, last_loop
= false,
          engines = {<std::_Vector_base<AsyncEngine*,
std::allocator<AsyncEngine*> >> = {
              _M_impl = {<std::allocator<AsyncEngine*>> =
{<__gnu_cxx::new_allocator<AsyncEngine*>> = {<No data fields>}, <No data
fields>},
                _M_start = 0x8ab5688, _M_finish = 0x8ab5698,
_M_end_of_storage = 0x8ab5698}}, <No data fields>}, timeService =
0xbffff67c,
          primaryEngine = 0xbffff680, loop_delay = 0, error = false,
runOnceResult = false}
        time_engine = {_vptr.TimeEngine = 0x843b0d8 <vtable for
TimeEngine+8>}
#11 0x0826f5d6 in SquidMainSafe (argc=2, argv=0xbffff7c4) at
../../src/main.cc:1242
        __FUNCTION__ = "SquidMainSafe"
#12 0x0826f5bb in main (argc=2, argv=0xbffff7c4) at ../../src/main.cc:1234
No locals.
Received on Thu May 08 2014 - 17:50:14 MDT
