Summary: ASTERISK-27911: Deadlock, likely when delegating calls from queue with Redirect
Reporter: Örn Arnarson (orn)
Labels:
Date Opened: 2018-06-12 06:49:53
Date Closed: 2020-01-14 11:13:35.000-0600
Priority: Critical
Regression? No
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 13.1.0
Frequency of Occurrence: Frequent
Related Issues:
Environment: Ubuntu 16.04.4 LTS, running on VMware
Attachments:
( 0) backtrace-threads.txt
( 1) backtrace-threads.txt
( 2) core-show-locks.txt
( 3) core-show-threads.txt
Description: We have a PBX that usually handles approx. 16 concurrent calls. It is not a call center, so the calls are produced by a fair number of different users.

We are experiencing several deadlocks per week with this PBX, likely in chan_sip: the PBX stops processing new INVITEs, yet the CLI still seems to work, and the only way to stop Asterisk is kill -9.

We suspect that this issue is triggered when "picking up" calls from a call queue using FOP2, which actually uses Redirect via AMI with an auto-answer SIP header added. It may or may not also happen when we use Originate via AMI to call the PickupChan application; we haven't been able to confirm that.
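For readers unfamiliar with the mechanism, an AMI action is a block of `Key: Value` lines terminated by a blank line, and Redirect takes Channel, Context, Exten and Priority fields. The sketch below only builds such a message as a string. The channel, context, and extension values are hypothetical placeholders, not values from this report, and the helper name `build_ami_action` is made up for illustration. It does not show how FOP2 adds the auto-answer SIP header.

```python
def build_ami_action(action, **fields):
    """Serialize an AMI action as CRLF-separated 'Key: Value' lines.

    The message ends with an empty line, which is how AMI delimits actions.
    """
    lines = [f"Action: {action}"]
    lines += [f"{key}: {value}" for key, value in fields.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical values, for illustration only.
redirect = build_ami_action(
    "Redirect",
    Channel="SIP/trunk-00000001",  # channel to move (placeholder name)
    Context="from-internal",       # destination dialplan context (placeholder)
    Exten="100",                   # destination extension (placeholder)
    Priority=1,
)
print(redirect)
```

Sending the string would additionally require an authenticated AMI session (a Login action on the manager port), which is omitted here.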

Asterisk was installed from the Ubuntu packaging system. We believe we have a recorded incident with backtraces and locks, captured after rebuilding the Ubuntu package with the unoptimized (DONT_OPTIMIZE) and debug threads (DEBUG_THREADS) compile flags set. We are not certain, though, because this build made Asterisk too slow for production and we were dropping calls, so it may have been an unrelated issue. But it was likely the same deadlock.

The reason we haven't upgraded to a newer version yet is that we are running from an Ubuntu package and would prefer to keep it that way if possible, but we are of course willing to update if it turns out that this is a known and fixed bug.
Comments:

By: Asterisk Team (asteriskteam) 2018-06-12 06:49:55.575-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Örn Arnarson (orn) 2018-06-12 06:50:48.092-0500

CLI output from 'core show threads'

By: Örn Arnarson (orn) 2018-06-12 06:51:06.950-0500

CLI output of 'core show locks'

By: Örn Arnarson (orn) 2018-06-12 06:51:29.662-0500

GDB backtrace
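The three attachments above correspond to two CLI commands and one GDB invocation. A minimal sketch of how such files are commonly collected is shown below. It only builds and prints the command lines and does not run them. The binary path `/usr/sbin/asterisk` and the PID are assumptions, the helper name `diagnostic_commands` is hypothetical, and `core show locks` is only available when Asterisk was built with DEBUG_THREADS.

```python
import shlex


def diagnostic_commands(pid, binary="/usr/sbin/asterisk"):
    """Map each output file name to the command that would produce it.

    The binary path default is an assumption; adjust it for the local install.
    """
    return {
        "core-show-threads.txt": ["asterisk", "-rx", "core show threads"],
        "core-show-locks.txt": ["asterisk", "-rx", "core show locks"],
        "backtrace-threads.txt": [
            "gdb", "-ex", "thread apply all bt", "--batch", binary, str(pid),
        ],
    }


# 1234 is a placeholder PID; use the PID of the running asterisk process.
commands = diagnostic_commands(1234)
for filename, argv in commands.items():
    # Print a shell line that redirects each command's output to its file.
    print(f"{shlex.join(argv)} > {filename}")
```

Each printed line can be reviewed and then run by hand while the system is in the deadlocked state.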

By: Joshua C. Colp (jcolp) 2018-06-12 07:18:40.758-0500

It appears the bug you have submitted is against a rather old version of a supported branch of Asterisk. There have been many issues fixed between the version you are using and the current version of your branch. Please test with the latest version in your Asterisk branch and report whether the issue persists.

Please see the Asterisk Versions [1] wiki page for info on which versions of Asterisk are supported.
[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions

Specifically 13.1.0 was released:

Date:   Mon Dec 15 15:37:36 2014 +0000

Trying to track down a specific change from then to now for your particular problem would take quite a long time, so updating is really needed before we would look into such a thing.

By: Örn Arnarson (orn) 2018-06-12 07:26:35.357-0500

I might do that.

I do think, however, that since this version of Asterisk is the current version in Ubuntu 16.04 LTS, some investigation is warranted, if only to make sure that Ubuntu upgrades its package to a version that doesn't have this bug. Or at the very least to make sure that we document what causes this behavior, so that it's googlable and people can avoid the bug with workarounds.

Just my two cents.

By: Örn Arnarson (orn) 2018-06-12 07:27:44.541-0500

Further, since ceasing to use Redirect, we haven't had a deadlock in 3 working days, so it's looking like it might be related.

By: Joshua C. Colp (jcolp) 2018-06-12 07:29:45.234-0500

The Asterisk project doesn't control or have any say over the Ubuntu packaging of the project. We also don't have a ton of resources to do such things like the investigation you mention. Since 13.1.0 was released there have been a TON of fixed bugs, and generally they all have an issue on here so someone with the time could try to narrow down your issue but that is time taken away from fixing current bugs or helping other users. If someone would like to do such a thing, they can, but I don't guarantee anyone will.

By: Joshua C. Colp (jcolp) 2018-06-12 07:30:14.843-0500

If you believe you've narrowed that down you can search JIRA using that information and see if you find anything for your problem.

By: Örn Arnarson (orn) 2018-06-12 07:37:07.533-0500

Understood. Thanks for your response.

By: Asterisk Team (asteriskteam) 2018-06-26 12:00:02.238-0500

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines