Summary: ASTERISK-14611: [patch] Stuck channel using FEATD_MF if caller hangs up at the right time
Reporter: jcromes (jcromes)
Labels:
Date Opened: 2009-08-06 21:29:14
Date Closed: 2011-04-11 10:47:19
Priority: Minor
Regression?: No
Status: Closed/Complete
Components: Channels/chan_dahdi
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) issue15671.patch
Description:
CentOS 5.2 x86_64
Asterisk 1.4.26
Zaptel 1.4.12.1
Digium TE412P

This box's purpose is to be a conferencing server, so it takes almost entirely inbound calls.  All 4 spans of my TE412P are configured as FEATD_MF, and are connected to a Nortel DMS-100 switch.  (I can't do PRI.)  The system has worked great for quite a long time, except every once in a while, a channel would get stuck in "Ring" state. For example:

>> show channels
Channel       Location        State    Application(Data)    
Zap/77-1      s@tdma:1        Ring     (None)              
1 active channel
0 active calls

The DMS-100 would show this channel "idle" while it was stuck in Asterisk. It would remain stuck like this permanently until I restarted Asterisk. Once I did that (and nothing else), the channel would clear again with no problems.
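
As a rough aid for spotting this symptom, the sketch below checks whether one row of "show channels" output matches the pattern shown above: State "Ring" with application "(None)". The function name row_looks_stuck is hypothetical, and the whitespace-separated four-column layout is an assumption taken only from the sample output in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Asterisk.
 * Returns true when a "show channels" row has State "Ring" and
 * Application "(None)", as in the stuck Zap/77-1 example.
 * Assumes four whitespace-separated columns per channel row.
 */
static bool row_looks_stuck(const char *row)
{
    char chan[64], location[64], state[32], app[64];

    assert(row != NULL);

    /* Rows with fewer than four columns (e.g. the summary lines) are ignored. */
    if (sscanf(row, "%63s %63s %31s %63s", chan, location, state, app) != 4)
        return false;

    return strcmp(state, "Ring") == 0 && strcmp(app, "(None)") == 0;
}
```

A channel that is still collecting digits may match this pattern briefly, so a match is only a candidate. It becomes interesting when the same channel keeps matching across repeated checks while the switch reports the trunk as idle.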

-----

The cause, I discovered, was a caller hanging up just at the end of the Feature Group D DTMF tones that set up the call.  The reason for this is a "guard timer" that's implemented using ast_safe_sleep(100) on line 6189 of chan_dahdi.c.  If the caller happens to hang up AFTER the final tone of the DTMF string but BEFORE the end of that ast_safe_sleep(), then ast_safe_sleep() will return non-zero.  This causes the code to jump to the end of ss_thread, but it does NOT tear down the call properly, so the channel is never released.

This should be a rare occurrence because the caller has to hang up at EXACTLY the right time.  Nonetheless, it was happening quite regularly on my system.  It's not easily reproducible unless you purposely increase the guard time to 2000 or more.  Once you do that, you can reproduce it every time by watching the DTMF debug and hanging up just as it ends.

The solution was to change line 6189 from this:
if (ast_safe_sleep(chan,100)) goto quit;

...to this:
if (ast_safe_sleep(chan,100)) { ast_hangup(chan);  goto quit; }
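
To make the difference between the two versions concrete, here is a minimal, self-contained sketch of the control flow. It does not use the real Asterisk API: struct mock_chan, mock_safe_sleep(), mock_hangup(), mock_ss_thread() and channel_left_stuck() are all hypothetical stand-ins, and the "normal path" is reduced to a single release call as an assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a channel; not the real struct ast_channel. */
struct mock_chan {
    bool caller_hung_up; /* caller disconnects during the guard timer */
    bool released;       /* set once the channel has been torn down */
};

/* Mimics the behaviour described above: non-zero if the caller hung up. */
static int mock_safe_sleep(struct mock_chan *chan, int ms)
{
    (void)ms; /* no real waiting in this sketch */
    return chan->caller_hung_up ? -1 : 0;
}

/* Stand-in for ast_hangup(): marks the channel as released. */
static void mock_hangup(struct mock_chan *chan)
{
    chan->released = true;
}

/* hangup_on_abort == false models the original line, true the patched line. */
static void mock_ss_thread(struct mock_chan *chan, bool hangup_on_abort)
{
    assert(chan != NULL);

    if (mock_safe_sleep(chan, 100)) {
        if (hangup_on_abort)
            mock_hangup(chan); /* the fix: tear down before leaving */
        goto quit;
    }

    /* Normal path (simplified assumption): the call runs and is released later. */
    mock_hangup(chan);

quit:
    return;
}

/* Returns true if the channel is still allocated after the thread exits. */
static bool channel_left_stuck(bool caller_hung_up, bool hangup_on_abort)
{
    struct mock_chan chan = { caller_hung_up, false };

    mock_ss_thread(&chan, hangup_on_abort);
    return !chan.released;
}
```

In this model, channel_left_stuck(true, false) reports a leftover channel, which corresponds to the "Ring" entry that never clears, while the patched variant releases the channel on every path.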

I'm not very familiar with Asterisk channel handling conventions, so there may be a better way to fix this problem.  I did, however, put some extra logging code in my local system, and I have evidence of the situation occurring and exiting cleanly.  I have 6 weeks of runtime with 0 stuck channels at the moment.

Thanks.
Comments:

By: Leif Madsen (lmadsen) 2009-09-18 07:27:03

Thanks for the report and thorough analysis! Hopefully someone can pick this up and resolve it in the near future. Thanks!

By: Paul Belanger (pabelanger) 2010-05-05 15:19:06

Attached a patch based on the reporter's information. Please re-test so we can close out this issue.

By: jcromes (jcromes) 2010-06-17 15:49:32

Unfortunately, I no longer have easy access to this FEATD system and can't test your exact patch.  I CAN say that the code you posted is identical to what's running on the system (I have some past copies of source files for comparison).

The code you posted is exactly what's needed.  I suggest it be committed and the bug closed.  Thanks.

By: Leif Madsen (lmadsen) 2010-06-23 14:22:22

jcromes: thanks for the feedback -- pending some other developer approving the patch, I'd suggest we get this in.

By: Digium Subversion (svnbot) 2011-04-11 10:27:55

Repository: asterisk
Revision: 313188

U   branches/1.4/channels/chan_dahdi.c

------------------------------------------------------------------------
r313188 | rmudgett | 2011-04-11 10:27:54 -0500 (Mon, 11 Apr 2011) | 25 lines

Stuck channel using FEATD_MF if caller hangs up at the right time.

The cause was actually a caller hanging up just at the end of the Feature
Group D DTMF tones that setup the call.  The reason for this is a "guard
timer" that's implemented using ast_safe_sleep(100).  If the caller
happens to hang up AFTER the final tone of the DTMF string but BEFORE the
end of that ast_safe_sleep(), then ast_safe_sleep() will return non-zero.
This causes the code to bounce to the end of ss_thread(), but it does NOT
tear down the call properly.

This should be a rare occurrence because the caller has to hang up at
EXACTLY the right time.  Nonetheless, it was happening quite regularly on
the reporter's system.  It's not easily reproducible, unless you purposely
increase the guard-time to 2000 or more.  Once you do that, you can
reproduce it every time by watching the DTMF debug and hanging up just as
it ends.

Simply add an ast_hangup() before goto quit.

(closes issue ASTERISK-14611)
Reported by: jcromes
Patches:
     issue15671.patch uploaded by pabelanger (license 224)
Tested by: jcromes

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=313188

By: Digium Subversion (svnbot) 2011-04-11 10:32:55

Repository: asterisk
Revision: 313189

_U  branches/1.6.2/
U   branches/1.6.2/channels/chan_dahdi.c

------------------------------------------------------------------------
r313189 | rmudgett | 2011-04-11 10:32:55 -0500 (Mon, 11 Apr 2011) | 32 lines

Merged revisions 313188 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=313189

By: Digium Subversion (svnbot) 2011-04-11 10:40:31

Repository: asterisk
Revision: 313190

_U  branches/1.8/
U   branches/1.8/channels/chan_dahdi.c
U   branches/1.8/channels/sig_analog.c

------------------------------------------------------------------------
r313190 | rmudgett | 2011-04-11 10:40:31 -0500 (Mon, 11 Apr 2011) | 39 lines

Merged revisions 313189 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=313190

By: Digium Subversion (svnbot) 2011-04-11 10:47:18

Repository: asterisk
Revision: 313191

_U  trunk/
U   trunk/channels/chan_dahdi.c
U   trunk/channels/sig_analog.c

------------------------------------------------------------------------
r313191 | rmudgett | 2011-04-11 10:47:18 -0500 (Mon, 11 Apr 2011) | 46 lines

Merged revisions 313190 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=313191