Summary: ASTERISK-24600: Stuck IAX channels, Asterisk stops responding to most traffic, potential deadlock
Reporter: Jeff Collell (JeffC_NN)
Labels:
Date Opened: 2014-12-08 14:25:44.000-0600
Date Closed: 2015-01-20 10:49:12.000-0600
Priority: Major
Regression?
Status: Closed/Complete
Components: Channels/chan_iax2
Versions: 13.0.1
Frequency of Occurrence:
Related Issues:
Environment: Ubuntu 14.04.1 LTS (trusty) x86_64
Attachments:
( 0) asterisk-bug.PNG
( 1) backtrace-threads.txt
( 2) core-show-locks.txt
( 3) iax_other_side.conf
( 4) iax.conf
( 5) iax.txt
( 6) threads.txt
( 7) threads-of-interest.txt
Description:

Setup:

20 endpoints, register locally over SIP (no NAT)
1 IAX trunk to another remote Asterisk box (unauthenticated, IP restricted on firewalls)
The server randomly gets 2 IAX channels "stuck" (most likely after the call has ended, since we haven't had complaints of dropped calls) and then stops responding to IAX, SIP registrations, etc. CPU usage is high on 2 threads (see attached).

The issue seems to happen randomly with calls over the trunk (SIP to IAX and IAX to SIP). I can reproduce it by simply creating lots of channels over the trunk. It is not a load issue: more channels just make it happen sooner. It happens every few dozen calls made by the client (about daily, since this is a small office).

Our next steps will be to configure a SIP trunk between sites to isolate if the issue is IAX specific.
Comments:By: Jeff Collell (JeffC_NN) 2014-12-08 14:31:01.261-0600

files needed for debug

By: Rusty Newton (rnewton) 2014-12-09 09:41:13.980-0600

Can we get the iax.conf for both servers and the dialplan used to dial across?

You might see if you can reproduce it by generating calls over the IAX trunk without SIP involved. Say, callfiles, or originations.

By: Rusty Newton (rnewton) 2014-12-09 09:45:09.712-0600

Additionally, see if you can get an [Asterisk log with "DEBUG" logger type turned up to 5|https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information] and an IAX trace running. We would want to see the last few minutes up until Asterisk stops responding. Then after that moment, if you can get the output of "core show channels" and "iax2 show channels".

By: Matt Jordan (mjordan) 2014-12-09 09:46:24.615-0600

I think I know what's happening here. You can't hold the channel lock while reaching across a bridge for a peer:

{code}
iax2_lock_owner(fr->callno);
if (!iaxs[fr->callno]) {
    /* The call disappeared so discard this frame that we could not send. */
    iax2_frame_free(fr);
    return -1;
}
if ((owner = iaxs[fr->callno]->owner)) {
    /* Still holding the owner channel lock while reaching across the bridge. */
    bridge = ast_channel_bridge_peer(owner);
}
{code}

That's a deadlock waiting to happen. Which, incidentally, happened.
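
To illustrate the general problem (this is not the chan_iax2 code and not the fix that was eventually committed), here is a minimal pthreads sketch of the safer ordering: read what you need under your own lock, release it, and only then take the lock on the other side. The `struct chan`, `note_frame`, `worker`, and `run_demo` names are hypothetical stand-ins invented for this example, and the sketch assumes both objects outlive the threads. Real code would also need to hold a reference on the peer so it cannot be destroyed after the first lock is dropped.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

/* Hypothetical stand-in for a channel; not an Asterisk type. */
struct chan {
    pthread_mutex_t lock;
    struct chan *peer;   /* the other side of the "bridge" */
    int frames_seen;     /* state protected by lock */
};

/* Update a channel while holding only that channel's lock. */
static void note_frame(struct chan *c)
{
    pthread_mutex_lock(&c->lock);
    c->frames_seen++;
    pthread_mutex_unlock(&c->lock);
}

/*
 * Each worker owns one channel and repeatedly touches its peer.
 * If it kept self->lock held while calling note_frame(peer), two
 * workers running in opposite directions could each hold one lock
 * and wait forever for the other (an ABBA deadlock). Dropping our
 * own lock first means no thread ever holds two locks at once.
 */
static void *worker(void *arg)
{
    struct chan *self = arg;

    for (int i = 0; i < ITERATIONS; i++) {
        struct chan *peer;

        pthread_mutex_lock(&self->lock);
        peer = self->peer;               /* copy what we need */
        pthread_mutex_unlock(&self->lock);

        assert(peer != NULL);
        note_frame(peer);                /* takes only the peer's lock */
    }
    return NULL;
}

/* Runs two workers against each other; returns total frames counted. */
int run_demo(void)
{
    struct chan a, b;
    pthread_t ta, tb;

    pthread_mutex_init(&a.lock, NULL);
    pthread_mutex_init(&b.lock, NULL);
    a.peer = &b;
    b.peer = &a;
    a.frames_seen = 0;
    b.frames_seen = 0;

    if (pthread_create(&ta, NULL, worker, &a) != 0) {
        return -1;
    }
    if (pthread_create(&tb, NULL, worker, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    pthread_mutex_destroy(&a.lock);
    pthread_mutex_destroy(&b.lock);
    return a.frames_seen + b.frames_seen;
}
```

Calling `run_demo()` from a `main` (built with `-pthread`) should return once both workers finish. Moving the `note_frame(peer)` call inside the locked region of `worker` would reproduce the kind of hang described in this issue.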

By: Jeff Collell (JeffC_NN) 2014-12-09 15:01:40.371-0600

This is the other IAX conf, as requested. It's on my Asterisk 11.7.0 box on Ubuntu 12.10 x86_64, connected over the internet.

By: Jeff Collell (JeffC_NN) 2014-12-15 14:56:51.625-0600

Here is "core show threads" and "iax2 show channels" (no channels shown this time, but Asterisk was at 25% on 2 threads for this occurrence of the bug).

By: Richard Mudgett (rmudgett) 2015-01-14 16:19:33.443-0600

Patch for v13 up on reviewboard: https://reviewboard.asterisk.org/r/4342/

By: Richard Mudgett (rmudgett) 2015-01-14 16:23:42.318-0600

The deadlock would happen on chan_iax2 because the jitter buffer is enabled in iax.conf. Please test the patch and report back.

By: Bobby Hakimi (bobbymc) 2017-08-07 11:29:03.647-0500

I'm having the same issue on version 11.25. Is there a way to backport this fix?

By: Richard Mudgett (rmudgett) 2017-08-07 11:45:07.813-0500

[~bobbymc] There is no way you could be having this particular deadlock, as it was v13+ specific. Also, v11 does not receive bug fixes [1] any more and in about two months will be completely unsupported.

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions