Summary: | ASTERISK-15642: [patch] Deadlock between dahdi_exception and dahdi_indicate | ||
Reporter: | Michael J. Miller (shin-shoryuken) | Labels: | |
Date Opened: | 2010-02-16 15:16:42.000-0600 | Date Closed: | 2010-10-13 18:52:43 |
Priority: | Major | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_dahdi |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) backtrace.01.txt ( 1) callwaiting_deadlock_1.6.1.diff.txt ( 2) callwaiting_deadlock_1.6.2.diff.txt ( 3) callwaiting_deadlock_trunk.diff.txt ( 4) callwaiting_deadlock.txt ( 5) callwaiting_deadlock-2.txt ( 6) coreshowlocks.01.txt ( 7) issue_16847_v1.4.patch ( 8) issue_16847_v1.6.2.patch ( 9) issue_16847_v1.8_v2.patch (10) issue_16847_v1.8.patch | |
Description: | I've encountered a deadlock between dahdi_exception and dahdi_indicate while running Asterisk 1.6.1.6 (also confirmed in 1.6.1.14) on Debian Lenny x86_64. The deadlock occurs roughly once a week, usually under lower call volume (strangely enough). It seems to arise mostly when swapping between call-waiting lines on a DAHDI channel.
****** STEPS TO REPRODUCE ******
I've reproduced this most reliably in the shop by rapidly transferring calls from a SIP channel to a DAHDI channel that already has an active call on it. I've also seen it occur when simply swapping quickly between two lines, though not as frequently. | ||
Comments: | By: Alec Davis (alecdavis) 2010-06-11 21:23:23
I have seen exactly the same with trunk (Asterisk SVN-trunk-r270042M), with call waiting and swapping between the 2 calls. It can very easily be repeated by dialling the same FXS port from a SIP phone with 2 lines: at the FXS-connected phone, hook flash between calls, and it deadlocks. Patch for trunk: callwaiting_deadlock_trunk.diff.txt
By: Alec Davis (alecdavis) 2010-06-11 23:04:49
Patches for 1.6.1 and 1.6.2: callwaiting_deadlock_1.6.1.diff.txt, callwaiting_deadlock_1.6.2.diff.txt. Confirmed deadlock with Asterisk SVN-branch-1.6.1-r265519M.
By: Alec Davis (alecdavis) 2010-06-15 20:07:00
Is this still a problem for you? I'd rather have this tested by an additional party than just committing to trunk and the branches. Note: I've tested this, as I was experiencing the problem myself with trunk anyway.
By: Michael J. Miller (shin-shoryuken) 2010-06-16 15:49:16
I worked out a quick-and-dirty patch a while back to eliminate the issue, so it's been stable for a while. I'll load up a vanilla install of Asterisk in the lab and test your 1.6.1 patch, and let you know the results. Thanks!
By: Alec Davis (alecdavis) 2010-06-23 05:16:01
Up for review at https://reviewboard.asterisk.org/r/738/
By: Alec Davis (alecdavis) 2010-06-24 04:50:49
A couple of uploads: callwaiting_deadlock.txt, callwaiting_deadlock2.txt
Summary of callwaiting_deadlock.txt:
=== Thread ID: -1223439472 (do_devstate_changes started at [ 723] devicestate.c ast_device_state_engine_init())
=== ---> Lock #0 (astobj2.c): MUTEX 657 internal_ao2_callback c 0x84ae300 (1)
=== ---> Waiting for Lock #1 (channel.c): MUTEX 1389 ast_channel_cmp_cb chan 0x8684c20 (1)
=== --- ---> Locked Here: channel.c line 3280 (__ast_read)
=== -------------------------------------------------------------------
===
=== Thread ID: -1347757168 (do_monitor started at [11266] chan_dahdi.c restart_monitor())
=== ---> Lock #0 (chan_dahdi.c): MUTEX 10997 do_monitor &iflock 0xb6e5c060 (1)
=== ---> Waiting for Lock #1 (chan_dahdi.c): MUTEX 11015 do_monitor &i->lock 0xaffbf5e8 (1)
=== --- ---> Locked Here: chan_dahdi.c line 8340 (dahdi_exception)
=== -------------------------------------------------------------------
===
=== Thread ID: -1348744304 (pbx_thread started at [ 4963] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 3280 __ast_read chan 0x8684c20 (1)
=== ---> Lock #1 (chan_dahdi.c): MUTEX 8340 dahdi_exception &p->lock 0xaffbf5e8 (1)
=== ---> Waiting for Lock #2 (channel.c): MUTEX 1138 __ast_queue_frame chan 0x8676508 (1)
=== --- ---> Locked Here: channel.c line 3810 (ast_indicate_data)
=== -------------------------------------------------------------------
===
=== Thread ID: -1348494448 (pbx_thread started at [ 4963] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 3810 ast_indicate_data chan 0x8676508 (1)
=== ---> Waiting for Lock #1 (chan_dahdi.c): MUTEX 8802 dahdi_indicate &p->lock 0xaffbf5e8 (1)
=== --- ---> Locked Here: chan_dahdi.c line 8340 (dahdi_exception)
=== -------------------------------------------------------------------
===
=======================================================================
asterix*CLI>
By: Alec Davis (alecdavis) 2010-06-24 17:23:54
As has been said on reviewboard, callwaiting_deadlock_trunk.diff.txt is not the right way to fix the issue. For now, callwaiting_deadlock_trunk.diff.txt does avoid the deadlock, which helps.
By: Richard Mudgett (rmudgett) 2010-10-11 14:30:02
I have not been able to reproduce the deadlock. However, I can see it in the code. The issue_16847_v1.8.patch should fix the deadlock for Asterisk v1.8. I am working on backporting this patch to the earlier supported Asterisk version.
By: Richard Mudgett (rmudgett) 2010-10-11 15:07:48
Up for review at https://reviewboard.asterisk.org/r/971/
By: Alec Davis (alecdavis) 2010-10-12 01:57:57
After checking out r270042 again, compiled and installed.
Asterisk SVN-trunk-r270042M built by root @ asterix on a i686 running Linux on 2010-10-12 06:31:03 UTC
asterix*CLI> dahdi show version
DAHDI Version: SVN-trunk-r9329 Echo Canceller:
Symptoms without the patch using r270042: deadlock when flashing to answer the other incoming call. If it does not deadlock, the SIP phone has a continuous tone. I experienced both of these back in June; the only solution was to disable call waiting.
Now with the patch against r270042, I could not get it to deadlock, and was able to swap between calls from either end, with no continuous tone at the SIP phone and correct MOH after each swap.
Setup: This was using the same SIP phone.
XLITE Line 1 -------> Asterisk TDM800P FXS
XLITE Line 2 -------+
By: Richard Mudgett (rmudgett) 2010-10-12 14:02:28
I updated the issue_16847_v1.8_v2.patch with some minor changes. I added the issue_16847_v1.6.2.patch.
By: Alec Davis (alecdavis) 2010-10-12 14:49:27
I couldn't get trunk to deadlock before the patch. Using issue_16847_v1.8.patch: not yet tested the later patch. IIRC this never deadlocked when swapping between calls after the call was successfully answered.
The key to getting this to deadlock in earlier versions is to hook flash on the FXS port just as you hear the CW beep.
---
Not sure if this is related to one of the changes, but the following scenario didn't give MOH when I answered.
Established call from SIP -> FXS
Call 2 from FXO -> (SIP & FXS), used DIAL(SIP/xyz&DAHDI/35)
Both the SIP phone and the FXS port received CW beeps, but when answered by flashing with the FXS port, the SIP phone didn't have MOH. I need to check 'trunk' without the patch.
By: Alec Davis (alecdavis) 2010-10-13 03:58:54
Follow on ~127904, trunk with and without issue_16847_v1.8_v2.patch exhibits the no-MOH issue to the SIP device. So not related.
By: Richard Mudgett (rmudgett) 2010-10-13 18:10:23
Added the issue_16847_v1.4.patch version.
By: Digium Subversion (svnbot) 2010-10-13 18:30:00
Repository: asterisk
Revision: 291643
U branches/1.4/channels/chan_dahdi.c
------------------------------------------------------------------------
r291643 | rmudgett | 2010-10-13 18:29:59 -0500 (Wed, 13 Oct 2010) | 20 lines
Deadlock between dahdi_exception() and dahdi_indicate().
There is a deadlock between dahdi_exception() and dahdi_indicate() for analog ports. The call-waiting and three-way-calling feature can experience deadlock if these features are trying to do something and an event from the bridged channel happens at the same time.
Deadlock avoidance code added to obtain necessary channel locks before attempting an operation with call-waiting and three-way-calling.
(closes issue ASTERISK-15642)
Reported by: shin-shoryuken
Patches: issue_16847_v1.4.patch uploaded by rmudgett (license 664) issue_16847_v1.6.2.patch uploaded by rmudgett (license 664) issue_16847_v1.8_v2.patch uploaded by rmudgett (license 664)
Tested by: alecdavis, rmudgett
Review: https://reviewboard.asterisk.org/r/971/
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=291643
By: Digium Subversion (svnbot) 2010-10-13 18:36:52
Repository: asterisk
Revision: 291655
_U branches/1.6.2/
U branches/1.6.2/channels/chan_dahdi.c
------------------------------------------------------------------------
r291655 | rmudgett | 2010-10-13 18:36:52 -0500 (Wed, 13 Oct 2010) | 27 lines
Merged revisions 291643 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r291643 | rmudgett | 2010-10-13 18:29:58 -0500 (Wed, 13 Oct 2010) | 20 lines
Deadlock between dahdi_exception() and dahdi_indicate().
........
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=291655
By: Digium Subversion (svnbot) 2010-10-13 18:45:12
Repository: asterisk
Revision: 291656
_U branches/1.8/
U branches/1.8/channels/chan_dahdi.c
U branches/1.8/channels/sig_analog.c
U branches/1.8/channels/sig_analog.h
------------------------------------------------------------------------
r291656 | rmudgett | 2010-10-13 18:45:12 -0500 (Wed, 13 Oct 2010) | 34 lines
Merged revisions 291655 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2
................
r291655 | rmudgett | 2010-10-13 18:36:50 -0500 (Wed, 13 Oct 2010) | 27 lines
Merged revisions 291643 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r291643 | rmudgett | 2010-10-13 18:29:58 -0500 (Wed, 13 Oct 2010) | 20 lines
Deadlock between dahdi_exception() and dahdi_indicate().
........
................
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=291656
By: Digium Subversion (svnbot) 2010-10-13 18:52:42
Repository: asterisk
Revision: 291658
_U trunk/
U trunk/channels/chan_dahdi.c
U trunk/channels/sig_analog.c
U trunk/channels/sig_analog.h
------------------------------------------------------------------------
r291658 | rmudgett | 2010-10-13 18:52:42 -0500 (Wed, 13 Oct 2010) | 41 lines
Merged revisions 291656 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.8
................
r291656 | rmudgett | 2010-10-13 18:45:11 -0500 (Wed, 13 Oct 2010) | 34 lines
Merged revisions 291655 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2
................
r291655 | rmudgett | 2010-10-13 18:36:50 -0500 (Wed, 13 Oct 2010) | 27 lines
Merged revisions 291643 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r291643 | rmudgett | 2010-10-13 18:29:58 -0500 (Wed, 13 Oct 2010) | 20 lines
Deadlock between dahdi_exception() and dahdi_indicate().
........
................
................
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=291658 |
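Illustration (not part of the original issue): the lock report in the comments shows one pbx_thread holding the DAHDI private lock taken in dahdi_exception while waiting for a channel lock, and another pbx_thread holding that channel lock (taken in ast_indicate_data) while waiting for the same private lock in dahdi_indicate. The commit message describes the fix only as "deadlock avoidance code" that obtains the necessary channel locks before attempting the operation. The sketch below shows the general try-lock-and-back-off idea with plain POSIX mutexes. It is an assumption about the shape of such code, not the committed patch: pvt_lock, chan_lock, event_thread, indicate_thread, and run_demo are hypothetical names, and none of Asterisk's own locking wrappers are used.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-ins for the two locks in the report:
 *   pvt_lock  ~ the DAHDI private lock (&p->lock)
 *   chan_lock ~ the other channel's lock */
static pthread_mutex_t pvt_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t chan_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter; /* only modified while pvt_lock is held */

enum { ITERATIONS = 20000 };

/* "Event" side: holds pvt_lock first and then needs chan_lock.
 * It never blocks on chan_lock while holding pvt_lock. */
static void *event_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&pvt_lock);
        while (pthread_mutex_trylock(&chan_lock) != 0) {
            /* Back off: release our lock so the other side can
             * finish, yield, then re-acquire and try again. */
            pthread_mutex_unlock(&pvt_lock);
            sched_yield();
            pthread_mutex_lock(&pvt_lock);
        }
        shared_counter++;
        pthread_mutex_unlock(&chan_lock);
        pthread_mutex_unlock(&pvt_lock);
    }
    return NULL;
}

/* "Indicate" side: takes the locks in the opposite order and blocks
 * on pvt_lock while holding chan_lock, as in the report. */
static void *indicate_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&chan_lock);
        pthread_mutex_lock(&pvt_lock);
        shared_counter++;
        pthread_mutex_unlock(&pvt_lock);
        pthread_mutex_unlock(&chan_lock);
    }
    return NULL;
}

/* Runs both threads to completion; returns the final counter,
 * or -1 if a thread could not be created. */
long run_demo(void)
{
    pthread_t a, b;

    shared_counter = 0;
    if (pthread_create(&a, NULL, event_thread, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&b, NULL, indicate_thread, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

If event_thread instead called pthread_mutex_lock(&chan_lock) while still holding pvt_lock, the two threads could block on each other exactly as the two pbx_thread entries do in the lock report. With the back-off loop, the thread that cannot get its second lock gives up its first one, so the other side can always make progress.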