Summary: ASTERISK-28561: Asterisk Deadlocks
Reporter: Aheliotech (aheliotech)
Labels:
Date Opened: 2019-10-02 06:45:23
Date Closed: 2019-10-14 12:24:30
Priority: Major
Regression?:
Status: Closed/Complete
Components: Channels/chan_pjsip
Versions: 13.29.0, 16.3.0, 16.4.1
Frequency of Occurrence: Frequent
Related Issues:
Environment: VMWare, FreePBX
Attachments:
( 0) core-asterisk-running-2019-09-30T13-33-22-0400-brief.txt
( 1) core-asterisk-running-2019-09-30T13-33-22-0400-full.txt
( 2) core-asterisk-running-2019-09-30T13-33-22-0400-locks.txt
( 3) core-asterisk-running-2019-09-30T13-33-22-0400-thread1.txt
( 4) crash9302019.pcap
( 5) full.zip
Description: Asterisk becomes unresponsive and stops processing any kind of SIP requests.

We were asked by Sangoma to open a bug report here as they were not able to fix the issue.

Not sure if you have access to this link.

https://support.sangoma.com/index.php?/Default/Tickets/Ticket/View/909954/false/0
Comments:

By: Asterisk Team (asteriskteam) 2019-10-02 06:45:24.376-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Aheliotech (aheliotech) 2019-10-02 06:46:09.896-0500

Pcap During Crash

By: Aheliotech (aheliotech) 2019-10-02 06:46:57.386-0500

Core Dump during crash

By: Aheliotech (aheliotech) 2019-10-02 06:47:15.144-0500

Core Dump During Crash

By: Aheliotech (aheliotech) 2019-10-02 06:47:32.656-0500

Core Dump During Crash

By: Aheliotech (aheliotech) 2019-10-02 06:47:44.329-0500

Core Dump During Crash

By: Aheliotech (aheliotech) 2019-10-02 06:48:24.132-0500

Asterisk Full Log Before, During and after Crash

By: George Joseph (gjoseph) 2019-10-02 09:58:31.035-0500

It appears that there's a deadlock with the internal container we use to store channels. If you still have the actual coredump (probably /tmp/core-asterisk-running-2019-09-30T13-33-22-0400), it would help us greatly if you could run ast_coredumper again with the following options:

{{/var/lib/asterisk/scripts/ast_coredumper --tarball-coredumps --no-default-search /tmp/core-asterisk-running-2019-09-30T13-33-22-0400}}

That packages the actual Asterisk binaries installed so we can examine the state of the process further. The resulting file will probably be too large to attach to this issue, so if you host it on Google Drive, Dropbox, or another sharing site and paste the link here, that would be fine.


By: Aheliotech (aheliotech) 2019-10-02 10:09:06.400-0500

Here is a link to the actual core dump file

https://drive.google.com/file/d/1FiaswrnswzbLTfwYsvozeMN9qbhM5g_2/view?usp=sharing

By: George Joseph (gjoseph) 2019-10-02 14:19:26.140-0500

We need the actual tarball produced by ast_coredumper. It has the core dump plus the exact Asterisk binaries needed to examine it.


By: Aheliotech (aheliotech) 2019-10-02 14:56:38.938-0500

Here is a link to the tarball produced by running the command provided.

https://drive.google.com/file/d/1JIv22J7_8nZZeoYrIakbSaccVsHEj2jb/view?usp=sharing

By: Friendly Automation (friendly-automation) 2019-10-14 06:51:59.088-0500

Change 13032 merged by Friendly Automation:
pbx: deadlock when outgoing dialed channel hangs up too quickly

[https://gerrit.asterisk.org/c/asterisk/+/13032|https://gerrit.asterisk.org/c/asterisk/+/13032]

By: Friendly Automation (friendly-automation) 2019-10-14 06:52:41.155-0500

Change 13031 merged by Friendly Automation:
pbx: deadlock when outgoing dialed channel hangs up too quickly

[https://gerrit.asterisk.org/c/asterisk/+/13031|https://gerrit.asterisk.org/c/asterisk/+/13031]

By: Friendly Automation (friendly-automation) 2019-10-14 06:59:19.610-0500

Change 13033 merged by Friendly Automation:
pbx: deadlock when outgoing dialed channel hangs up too quickly

[https://gerrit.asterisk.org/c/asterisk/+/13033|https://gerrit.asterisk.org/c/asterisk/+/13033]

By: Friendly Automation (friendly-automation) 2019-10-14 07:19:20.476-0500

Change 13034 merged by Friendly Automation:
pbx: deadlock when outgoing dialed channel hangs up too quickly

[https://gerrit.asterisk.org/c/asterisk/+/13034|https://gerrit.asterisk.org/c/asterisk/+/13034]

By: Friendly Automation (friendly-automation) 2019-10-14 07:22:07.961-0500

Change 13052 merged by Joshua Colp:
pbx: deadlock when outgoing dialed channel hangs up too quickly

[https://gerrit.asterisk.org/c/asterisk/+/13052|https://gerrit.asterisk.org/c/asterisk/+/13052]

By: Aheliotech (aheliotech) 2019-10-14 12:12:57.687-0500

Will this resolution be added to an upcoming release? In the interim, would our option be to make the changes to the referenced source and then recompile?

By: Asterisk Team (asteriskteam) 2019-10-14 12:12:57.858-0500

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Kevin Harwell (kharwell) 2019-10-14 12:24:12.411-0500

Yes, this should go out in the next releases of Asterisk, which I believe are 13.30.0 and 16.7.0.

And you're correct. If you'd like to use the patch before then you'll need to apply the patch to the source, recompile, reinstall, and restart Asterisk.
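
Editorial sketch, not part of the original comment: the apply, recompile, reinstall, and restart sequence described above could look roughly like the shell function below. The thread does not give exact commands, so everything here is an assumption: the source tree location, the patch file name, the plain `./configure && make && make install` build, and restarting through the Asterisk CLI (a FreePBX system may need a different build and restart procedure). The function defaults to a dry run that only prints each step so the sequence can be reviewed first.

```shell
#!/bin/sh
# Sketch only: apply a patch to an Asterisk source tree, rebuild, reinstall,
# and restart. All paths and the restart command are hypothetical.

# Print the command in dry-run mode (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_and_rebuild() {
    src_dir=$1      # assumed: unpacked Asterisk source matching the installed version
    patch_file=$2   # assumed: absolute path to a patch saved from the Gerrit change

    run cd "$src_dir" || return 1
    run patch -p1 -i "$patch_file" || return 1   # apply the fix to the source
    run ./configure || return 1                  # recompile ...
    run make || return 1
    run make install || return 1                 # ... reinstall ...
    run asterisk -rx "core restart now" || return 1   # ... and restart Asterisk
}
```

Calling `apply_and_rebuild /usr/src/asterisk /tmp/fix.patch` (both paths are placeholders) prints the six steps without changing anything. Setting `DRY_RUN=0` beforehand executes them, which should only be done after confirming the patch matches the installed Asterisk version.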