Details

    • Type: Bug
    • Status: Closed
    • Severity: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Target Release Version/s: None
    • Component/s: Core/General
    • Labels:
      None
    • Mantis ID:
      861
    • Regression:
      No

      Description

      This problem was originally reported on the mailing list about a week ago.
      http://lists.digium.com/pipermail/asterisk-users/2004-January/032455.html

      We have a lot more information regarding this issue now and we know how to reproduce it:

      1) from your local box telnet into the asterisk manager (port 5038) and log in
      2) disconnect your local workstation's network
      3) make about 20 phone calls (no matter if internal or voice-mail) and asterisk will hang (no dial-tone, no nothing)
      4) if you re-plug your network and wait (a minute or two) asterisk will wake up again

      Theory: the ast-man tries to send events over the network to the local workstation. Since the workstation was unplugged from the net, ast-man can't send its data. The data buffers at the server, and once the buffer is full the ast-man thread blocks waiting for the buffer to empty. It does this while it still holds a mutex, which causes the other *-threads to block. And the buffer doesn't empty, so asterisk hangs.

      Unfortunately this is only a theory.
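      The buffering half of the theory can be checked in isolation. Below is a minimal sketch, not taken from manager.c: fill_until_full is a hypothetical helper that queues data on a descriptor whose peer never reads and reports how much the kernel accepted before refusing more. It switches the descriptor to non-blocking mode only so the measurement itself cannot hang.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of Asterisk): queue data on fd until the
 * kernel refuses more. Returns the number of bytes accepted, or -1 on an
 * unexpected error. The peer is assumed never to read. */
long fill_until_full(int fd)
{
    char chunk[4096] = {0};
    long queued = 0;
    int flags;

    assert(fd >= 0);

    /* Non-blocking mode, so a full buffer yields EAGAIN instead of a stall. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    for (;;) {
        ssize_t n = write(fd, chunk, sizeof chunk);
        if (n > 0) {
            queued += n;            /* the kernel buffered this much */
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;               /* interrupted by a signal, retry */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return queued;          /* buffer is full */
        return -1;                  /* some other failure */
    }
}
```

      Called on the write end of a pipe or socket pair that nobody reads from, this returns a finite number. On a blocking descriptor, the write that returns EAGAIN here would instead sleep until the peer drains the buffer, and any mutex held across that write would stay locked for the same time.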

      ****** ADDITIONAL INFORMATION ******

      Attached is a gdb thread trace of the described situation.

        Activity

        mjohnston added a comment -

        I had a look at the code, and the problem is, from what I can tell, that it's not easy to drop connections when their buffers fill up. There's a linked list of managers, with a separate thread running to process input. However, output just goes directly to the manager's session. If you just tear down the session when the buffer fills up, the input thread will be stranded, so it has to be alerted somehow.

        I've modified my copy of manager.c to set the socket to non-blocking mode and, when the buffer fills up, to set a write_error flag on the session, but that's as far as I got - I don't know threads in C. The only extra thing needed is a way to send a signal to the input session, telling it that there's been a write error and it should destroy itself. Anyone?

        I'm attaching a quick patch against manager.c that should keep Asterisk from locking hard. It's far from ideal - if the buffer fills up, output will be discarded and the buffer maintained as long as the OS allows, and if the connection is reestablished (not reconnected, but reestablished), you'll get the buffered data, but you won't know that any data was lost. Beats a hang, IMO.

        Also, a proper fix would account for the other ways that a manager connection can be written, for completeness' sake - this is the only one that I can imagine ever hanging, though.
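        A rough sketch of the approach described in this comment, for illustration only. It is not the attached patch and not the code that went into CVS. The struct mgr_session and the functions mgr_set_nonblocking and mgr_send are hypothetical names; only the write_error flag comes from the comment above. The sketch also assumes that a short write counts as a write error, since the event would arrive truncated.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a manager session. */
struct mgr_session {
    int fd;           /* connection to the manager client */
    int write_error;  /* set once output could not be delivered in full */
};

/* Put fd into non-blocking mode. Returns 0 on success, -1 on failure. */
int mgr_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Try to send one event without ever blocking the caller.
 * Returns 0 if the whole event was queued. Otherwise the event is dropped,
 * write_error is set, and -1 is returned. Once the flag is set, later
 * events are discarded without touching the descriptor. */
int mgr_send(struct mgr_session *s, const char *buf, size_t len)
{
    ssize_t n;

    assert(s != NULL && buf != NULL);

    if (s->write_error)
        return -1;

    do {
        n = write(s->fd, buf, len);
    } while (n == -1 && errno == EINTR);   /* retry if interrupted */

    if (n == (ssize_t)len)
        return 0;

    /* EAGAIN (buffer full), a short write, or any other failure. */
    s->write_error = 1;
    return -1;
}
```

        The thread that reads input for the session would still need to notice write_error and tear the session down. That is the missing piece mentioned above, and the sketch does not attempt it.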

        markus added a comment -

        I haven't had a chance to really look into the code yet, but I hope to be able to have a glimpse at the manager code. That input session you're talking about: which function implements that thread? Maybe I can come up with some way to notify that thread if a read-error occurs.

        BTW: right now I'm playing around with Asterisk 1.0 on a test system. It seems to have fixed a few of the bugs that we found (and fixed quite horribly ourselves) in the previous releases.

        Brian West added a comment -

        Fixed in CVS

        Digium Subversion added a comment -

        Repository: asterisk
        Revision: 2138

        U trunk/Makefile
        U trunk/channels/chan_sip.c
        U trunk/configs/mgcp.conf.sample
        U trunk/manager.c

        ------------------------------------------------------------------------
        r2138 | markster | 2008-01-15 14:43:05 -0600 (Tue, 15 Jan 2008) | 5 lines

        Insert blank after REFER (bug ASTERISK-991)
        Correct path to VM sample (bug ASTERISK-988)
        Make manager interface non-blocking (bug ASTERISK-855)
        Don't bork on empty from in SIP (bug ASTERISK-881)

        ------------------------------------------------------------------------

        http://svn.digium.com/view/asterisk?view=rev&revision=2138

        Digium Subversion added a comment -

        Repository: asterisk
        Revision: 2139

        U branches/v1-0_stable/Makefile
        U branches/v1-0_stable/channels/chan_sip.c
        U branches/v1-0_stable/configs/mgcp.conf.sample
        U branches/v1-0_stable/manager.c

        ------------------------------------------------------------------------
        r2139 | markster | 2008-01-15 14:43:06 -0600 (Tue, 15 Jan 2008) | 5 lines

        Insert blank after REFER (bug ASTERISK-991)
        Correct path to VM sample (bug ASTERISK-988)
        Make manager interface non-blocking (bug ASTERISK-855)
        Don't bork on empty from in SIP (bug ASTERISK-881)

        ------------------------------------------------------------------------

        http://svn.digium.com/view/asterisk?view=rev&revision=2139

