Summary: ASTERISK-15642: [patch] Deadlock between dahdi_exception and dahdi_indicate
Reporter: Michael J. Miller (shin-shoryuken)
Labels:
Date Opened: 2010-02-16 15:16:42.000-0600
Date Closed: 2010-10-13 18:52:43
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_dahdi
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) backtrace.01.txt
( 1) callwaiting_deadlock_1.6.1.diff.txt
( 2) callwaiting_deadlock_1.6.2.diff.txt
( 3) callwaiting_deadlock_trunk.diff.txt
( 4) callwaiting_deadlock.txt
( 5) callwaiting_deadlock-2.txt
( 6) coreshowlocks.01.txt
( 7) issue_16847_v1.4.patch
( 8) issue_16847_v1.6.2.patch
( 9) issue_16847_v1.8_v2.patch
(10) issue_16847_v1.8.patch
Description: I've encountered a deadlock between dahdi_exception and dahdi_indicate, running Asterisk 1.6.1.6 (also confirmed in 1.6.1.14) on Debian Lenny x86_64.  The deadlock occurs roughly once a week, usually under lower call volume (strangely enough).  It seems to arise mostly when swapping between call-waiting lines on a DAHDI channel.

****** STEPS TO REPRODUCE ******

I've reproduced this most reliably in the shop by transferring calls rapidly from a SIP channel to a DAHDI channel with an active call on it.  I've also seen it occur when simply swapping quickly between two lines, though not as frequently.
Comments:

By: Alec Davis (alecdavis) 2010-06-11 21:23:23

I have seen exactly the same with trunk Asterisk SVN-trunk-r270042M, with callwaiting and swapping between the 2 calls.

Can very easily be repeated by dialling the same FXS port from a SIP phone with 2 lines. At the FXS connected phone, hook flash between calls, and deadlock.

patch for trunk: callwaiting_deadlock_trunk.diff.txt



By: Alec Davis (alecdavis) 2010-06-11 23:04:49

patch for 1.6.1 and 1.6.2
callwaiting_deadlock_1.6.1.diff.txt
callwaiting_deadlock_1.6.2.diff.txt

Confirmed deadlock with Asterisk SVN-branch-1.6.1-r265519M

By: Alec Davis (alecdavis) 2010-06-15 20:07:00

Is this still a problem for you?
I'd rather have this tested by an additional party than just committing it to trunk and the branches.

Note: I've tested this, as I was experiencing the problem myself with trunk anyway.



By: Michael J. Miller (shin-shoryuken) 2010-06-16 15:49:16

I worked out a quick-and-dirty patch a while back to eliminate the issue, so it's been stable for a while.  I'll load up a vanilla install of Asterisk in the lab and test your 1.6.1 patch, and let you know the results.  Thanks!

By: Alec Davis (alecdavis) 2010-06-23 05:16:01

up for review at https://reviewboard.asterisk.org/r/738/

By: Alec Davis (alecdavis) 2010-06-24 04:50:49

A couple of uploads;
   callwaiting_deadlock.txt
   callwaiting_deadlock-2.txt

Summary of callwaiting_deadlock.txt:
=== Thread ID: -1223439472 (do_devstate_changes  started at [  723] devicestate.c ast_device_state_engine_init())
=== ---> Lock #0 (astobj2.c): MUTEX 657 internal_ao2_callback c 0x84ae300 (1)
=== ---> Waiting for Lock #1 (channel.c): MUTEX 1389 ast_channel_cmp_cb chan 0x8684c20 (1)
=== --- ---> Locked Here: channel.c line 3280 (__ast_read)
=== -------------------------------------------------------------------
===

=== Thread ID: -1347757168 (do_monitor           started at [11266] chan_dahdi.c restart_monitor())
=== ---> Lock #0 (chan_dahdi.c): MUTEX 10997 do_monitor &iflock 0xb6e5c060 (1)
=== ---> Waiting for Lock #1 (chan_dahdi.c): MUTEX 11015 do_monitor &i->lock 0xaffbf5e8 (1)
=== --- ---> Locked Here: chan_dahdi.c line 8340 (dahdi_exception)
=== -------------------------------------------------------------------
===

=== Thread ID: -1348744304 (pbx_thread           started at [ 4963] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 3280 __ast_read chan 0x8684c20 (1)
=== ---> Lock #1 (chan_dahdi.c): MUTEX 8340 dahdi_exception &p->lock 0xaffbf5e8 (1)
=== ---> Waiting for Lock #2 (channel.c): MUTEX 1138 __ast_queue_frame chan 0x8676508 (1)
=== --- ---> Locked Here: channel.c line 3810 (ast_indicate_data)
=== -------------------------------------------------------------------
===

=== Thread ID: -1348494448 (pbx_thread           started at [ 4963] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 3810 ast_indicate_data chan 0x8676508 (1)
=== ---> Waiting for Lock #1 (chan_dahdi.c): MUTEX 8802 dahdi_indicate &p->lock 0xaffbf5e8 (1)
=== --- ---> Locked Here: chan_dahdi.c line 8340 (dahdi_exception)
=== -------------------------------------------------------------------
===
=======================================================================

asterix*CLI>
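
To make the cycle in the dump above easier to see, here is a minimal sketch in C that follows the "waiting for" chain from each thread. The table is transcribed by hand from the callwaiting_deadlock.txt summary; the struct, the function names and the thread labels "(read)" and "(indicate)" are hypothetical and are not part of Asterisk. It only walks a single wait-for edge per thread, which is all this dump needs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 4

/* One entry per thread in the dump: the locks it holds and the lock it waits for. */
struct thread_locks {
    const char *name;
    const char *held[MAX_HELD]; /* lock addresses; unused slots are NULL */
    const char *waiting;        /* address of the lock this thread is blocked on */
};

/* Transcribed by hand from the callwaiting_deadlock.txt summary. */
static const struct thread_locks dump[] = {
    { "do_devstate_changes",   { "0x84ae300" },               "0x8684c20"  },
    { "do_monitor",            { "0xb6e5c060" },              "0xaffbf5e8" },
    { "pbx_thread (read)",     { "0x8684c20", "0xaffbf5e8" }, "0x8676508"  },
    { "pbx_thread (indicate)", { "0x8676508" },               "0xaffbf5e8" },
};
static const size_t dump_len = sizeof(dump) / sizeof(dump[0]);

/* Return the index of the thread holding `lock`, or -1 if nobody holds it. */
static int holder_of(const struct thread_locks *t, size_t n, const char *lock)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < MAX_HELD && t[i].held[j] != NULL; j++) {
            if (strcmp(t[i].held[j], lock) == 0) {
                return (int)i;
            }
        }
    }
    return -1;
}

/*
 * Follow the wait-for chain starting at thread `start`.
 * Returns 1 if the chain leads back to `start` (the thread is part of a
 * deadlock cycle), 0 if it ends or only runs into someone else's cycle.
 */
static int in_wait_cycle(const struct thread_locks *t, size_t n, size_t start)
{
    size_t cur = start;

    for (size_t steps = 0; steps < n; steps++) {
        int next = holder_of(t, n, t[cur].waiting);
        if (next < 0) {
            return 0;
        }
        if ((size_t)next == start) {
            return 1;
        }
        cur = (size_t)next;
    }
    return 0;
}
```

Calling in_wait_cycle(dump, dump_len, i) for each index reports the two pbx_thread entries as members of the cycle: one holds &p->lock 0xaffbf5e8 and wants chan 0x8676508, the other holds chan 0x8676508 and wants &p->lock 0xaffbf5e8. The do_devstate_changes and do_monitor threads are blocked behind that cycle but are not part of it.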



By: Alec Davis (alecdavis) 2010-06-24 17:23:54

As has been said on reviewboard, callwaiting_deadlock_trunk.diff.txt is not the right way to fix the issue.

Currently callwaiting_deadlock_trunk.diff.txt avoids the deadlock, which helps.

By: Richard Mudgett (rmudgett) 2010-10-11 14:30:02

I have not been able to reproduce the deadlock.  However, I can see it in the code.  The issue_16847_v1.8.patch should fix the deadlock for Asterisk v1.8.  I am working on backporting this patch to the earlier supported Asterisk versions.

By: Richard Mudgett (rmudgett) 2010-10-11 15:07:48

Up for review at https://reviewboard.asterisk.org/r/971/

By: Alec Davis (alecdavis) 2010-10-12 01:57:57

After re-checking out r270042, compiled and installed.
Asterisk SVN-trunk-r270042M built by root @ asterix on a i686 running Linux on 2010-10-12 06:31:03 UTC
asterix*CLI> dahdi show version
DAHDI Version: SVN-trunk-r9329 Echo Canceller:

Symptoms without patch using r270042:
deadlock when flashing to answer the other incoming call.
if not deadlock then SIP phone has continuous tone.
experienced both of these back in June, only solution was to disable callwaiting.

Now with patch against r270042, could not get to deadlock, and able to swap between calls, from either end, and no continuous tone at SIP phone, correct MOH after each swap.

setup:
This was using the same SIP phone.
 XLITE Line 1 -------> Asterisk TDM800P FXS
 XLITE Line 2 -------+



By: Richard Mudgett (rmudgett) 2010-10-12 14:02:28

I updated the issue_16847_v1.8_v2.patch with some minor changes.
I added the issue_16847_v1.6.2.patch.

By: Alec Davis (alecdavis) 2010-10-12 14:49:27

I couldn't get trunk to deadlock before the patch.

using issue_16847_v1.8.patch: Not yet tested later patch.

IIRC this never deadlocked when swapping between calls after the call was successfully answered.

The key to getting this to deadlock in earlier versions is to hook flash on the FXS port just as you hear the CW beep.


---

Not sure if this is related to one of the changes, but the following scenario didn't give MOH when I answered.

Established call from SIP -> FXS
Call 2 from FXO -> (SIP & FXS) used DIAL(SIP/xyz&DAHDI/35)
Both SIP phone and FXS port received CW beeps, but when answered by flashing with the FXS port, the SIP phone didn't have MOH.

I need to check 'trunk' without the patch.

By: Alec Davis (alecdavis) 2010-10-13 03:58:54

Following up on ~127904: trunk both with and without issue_16847_v1.8_v2.patch exhibits the no-MOH issue on the SIP device, so it is not related to the patch.

By: Richard Mudgett (rmudgett) 2010-10-13 18:10:23

Added the issue_16847_v1.4.patch version.

By: Digium Subversion (svnbot) 2010-10-13 18:30:00

Repository: asterisk
Revision: 291643

U   branches/1.4/channels/chan_dahdi.c

------------------------------------------------------------------------
r291643 | rmudgett | 2010-10-13 18:29:59 -0500 (Wed, 13 Oct 2010) | 20 lines

Deadlock between dahdi_exception() and dahdi_indicate().

There is a deadlock between dahdi_exception() and dahdi_indicate() for
analog ports.  The call-waiting and three-way-calling feature can
experience deadlock if these features are trying to do something and an
event from the bridged channel happens at the same time.

Deadlock avoidance code added to obtain necessary channel locks before
attempting an operation with call-waiting and three-way-calling.

(closes issue ASTERISK-15642)
Reported by: shin-shoryuken
Patches:
     issue_16847_v1.4.patch uploaded by rmudgett (license 664)
     issue_16847_v1.6.2.patch uploaded by rmudgett (license 664)
     issue_16847_v1.8_v2.patch uploaded by rmudgett (license 664)
Tested by: alecdavis, rmudgett

Review: https://reviewboard.asterisk.org/r/971/

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=291643
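
The commit message above describes the fix as deadlock avoidance: obtaining the needed channel locks before attempting the operation. As an illustration of that general technique only, here is a minimal sketch in plain pthreads; it is not the code from any of the attached patches. The names pvt_lock, chan_lock, exception_path, indicate_path and run_both_paths are hypothetical stand-ins, and sched_yield() is an assumption for the back-off step.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

static pthread_mutex_t pvt_lock = PTHREAD_MUTEX_INITIALIZER;  /* models p->lock */
static pthread_mutex_t chan_lock = PTHREAD_MUTEX_INITIALIZER; /* models the other channel's lock */
static long operations;                                       /* protected by pvt_lock */

/*
 * Called with pvt_lock held. Never blocks on chan_lock while holding
 * pvt_lock: if the try fails, pvt_lock is dropped so the other thread can
 * make progress, then both are retried. Because pvt_lock is released in
 * between, real code must re-validate any state it read before the call.
 */
static void lock_chan_while_holding_pvt(void)
{
    while (pthread_mutex_trylock(&chan_lock) != 0) {
        pthread_mutex_unlock(&pvt_lock);
        sched_yield();
        pthread_mutex_lock(&pvt_lock);
    }
}

/* Takes the locks in the order pvt -> channel, as the exception path does in the dump. */
static void *exception_path(void *arg)
{
    long rounds = *(long *)arg;

    for (long i = 0; i < rounds; i++) {
        pthread_mutex_lock(&pvt_lock);
        lock_chan_while_holding_pvt();
        operations++;
        pthread_mutex_unlock(&chan_lock);
        pthread_mutex_unlock(&pvt_lock);
    }
    return NULL;
}

/* Takes the locks in the opposite order, channel -> pvt, as the indicate path does. */
static void *indicate_path(void *arg)
{
    long rounds = *(long *)arg;

    for (long i = 0; i < rounds; i++) {
        pthread_mutex_lock(&chan_lock);
        pthread_mutex_lock(&pvt_lock);
        operations++;
        pthread_mutex_unlock(&pvt_lock);
        pthread_mutex_unlock(&chan_lock);
    }
    return NULL;
}

/* Run both paths concurrently; returns the number of completed operations, or -1 on error. */
static long run_both_paths(long rounds)
{
    pthread_t a, b;

    operations = 0;
    if (pthread_create(&a, NULL, exception_path, &rounds) != 0) {
        return -1;
    }
    if (pthread_create(&b, NULL, indicate_path, &rounds) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return operations;
}
```

With a plain pthread_mutex_lock() in place of lock_chan_while_holding_pvt(), the two paths can block each other exactly as the two pbx_thread entries do in the lock dump. With the trylock-and-back-off loop, run_both_paths() returns once both threads have finished all their rounds.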

By: Digium Subversion (svnbot) 2010-10-13 18:36:52

Repository: asterisk
Revision: 291655

_U  branches/1.6.2/
U   branches/1.6.2/channels/chan_dahdi.c

------------------------------------------------------------------------
r291655 | rmudgett | 2010-10-13 18:36:52 -0500 (Wed, 13 Oct 2010) | 27 lines

Merged revisions 291643 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r291643 | rmudgett | 2010-10-13 18:29:58 -0500 (Wed, 13 Oct 2010) | 20 lines
 
 Deadlock between dahdi_exception() and dahdi_indicate().
 
 There is a deadlock between dahdi_exception() and dahdi_indicate() for
 analog ports.  The call-waiting and three-way-calling feature can
 experience deadlock if these features are trying to do something and an
 event from the bridged channel happens at the same time.
 
 Deadlock avoidance code added to obtain necessary channel locks before
 attempting an operation with call-waiting and three-way-calling.
 
 (closes issue ASTERISK-15642)
 Reported by: shin-shoryuken
 Patches:
       issue_16847_v1.4.patch uploaded by rmudgett (license 664)
       issue_16847_v1.6.2.patch uploaded by rmudgett (license 664)
       issue_16847_v1.8_v2.patch uploaded by rmudgett (license 664)
 Tested by: alecdavis, rmudgett
 
 Review: https://reviewboard.asterisk.org/r/971/
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=291655

By: Digium Subversion (svnbot) 2010-10-13 18:45:12

Repository: asterisk
Revision: 291656

_U  branches/1.8/
U   branches/1.8/channels/chan_dahdi.c
U   branches/1.8/channels/sig_analog.c
U   branches/1.8/channels/sig_analog.h

------------------------------------------------------------------------
r291656 | rmudgett | 2010-10-13 18:45:12 -0500 (Wed, 13 Oct 2010) | 34 lines

Merged revisions 291655 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

................
 r291655 | rmudgett | 2010-10-13 18:36:50 -0500 (Wed, 13 Oct 2010) | 27 lines
 
 Merged revisions 291643 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r291643 | rmudgett | 2010-10-13 18:29:58 -0500 (Wed, 13 Oct 2010) | 20 lines
   
   Deadlock between dahdi_exception() and dahdi_indicate().
   
   There is a deadlock between dahdi_exception() and dahdi_indicate() for
   analog ports.  The call-waiting and three-way-calling feature can
   experience deadlock if these features are trying to do something and an
   event from the bridged channel happens at the same time.
   
   Deadlock avoidance code added to obtain necessary channel locks before
   attempting an operation with call-waiting and three-way-calling.
   
   (closes issue ASTERISK-15642)
   Reported by: shin-shoryuken
   Patches:
         issue_16847_v1.4.patch uploaded by rmudgett (license 664)
         issue_16847_v1.6.2.patch uploaded by rmudgett (license 664)
         issue_16847_v1.8_v2.patch uploaded by rmudgett (license 664)
   Tested by: alecdavis, rmudgett
   
   Review: https://reviewboard.asterisk.org/r/971/
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=291656

By: Digium Subversion (svnbot) 2010-10-13 18:52:42

Repository: asterisk
Revision: 291658

_U  trunk/
U   trunk/channels/chan_dahdi.c
U   trunk/channels/sig_analog.c
U   trunk/channels/sig_analog.h

------------------------------------------------------------------------
r291658 | rmudgett | 2010-10-13 18:52:42 -0500 (Wed, 13 Oct 2010) | 41 lines

Merged revisions 291656 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

................
 r291656 | rmudgett | 2010-10-13 18:45:11 -0500 (Wed, 13 Oct 2010) | 34 lines
 
 Merged revisions 291655 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.6.2
 
 ................
   r291655 | rmudgett | 2010-10-13 18:36:50 -0500 (Wed, 13 Oct 2010) | 27 lines
   
   Merged revisions 291643 via svnmerge from
   https://origsvn.digium.com/svn/asterisk/branches/1.4
   
   ........
     r291643 | rmudgett | 2010-10-13 18:29:58 -0500 (Wed, 13 Oct 2010) | 20 lines
     
     Deadlock between dahdi_exception() and dahdi_indicate().
     
     There is a deadlock between dahdi_exception() and dahdi_indicate() for
     analog ports.  The call-waiting and three-way-calling feature can
     experience deadlock if these features are trying to do something and an
     event from the bridged channel happens at the same time.
     
     Deadlock avoidance code added to obtain necessary channel locks before
     attempting an operation with call-waiting and three-way-calling.
     
     (closes issue ASTERISK-15642)
     Reported by: shin-shoryuken
     Patches:
           issue_16847_v1.4.patch uploaded by rmudgett (license 664)
           issue_16847_v1.6.2.patch uploaded by rmudgett (license 664)
           issue_16847_v1.8_v2.patch uploaded by rmudgett (license 664)
     Tested by: alecdavis, rmudgett
     
     Review: https://reviewboard.asterisk.org/r/971/
   ........
 ................
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=291658