
Summary: ASTERISK-29636: pjsip deadlock
Reporter: Mark Murawski (kobaz)
Labels:
Date Opened: 2021-09-09 08:59:22
Date Closed: 2021-12-03 12:00:01.000-0600
Priority: Minor
Regression?
Status: Closed/Complete
Components: Resources/res_pjsip
Versions: 16.20.0 18.6.0
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) 24054.1637327596.asterisk.core-brief.txt
( 1) 24054.1637327596.asterisk.core-full.txt
( 2) 24054.1637327596.asterisk.core-info.txt
( 3) 24054.1637327596.asterisk.core-locks.txt
( 4) 24054.1637327596.asterisk.core-thread1.txt
( 5) asterisk-18-locks.txt
( 6) thread-all-bt.txt
Description: Deadlocks occurring under 'normal load'

Comments:
By: Asterisk Team (asteriskteam) 2021-09-09 08:59:25.747-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Mark Murawski (kobaz) 2021-09-09 09:00:30.224-0500

core show locks upload

By: Mark Murawski (kobaz) 2021-09-09 09:03:23.691-0500

thread apply all bt


By: Sean Bright (seanbright) 2021-09-09 09:26:09.198-0500

reattached with extensions so we can view them in-browser

By: George Joseph (gjoseph) 2021-09-09 09:38:26.909-0500

How did you get the output? From the CLI and a live gdb session?

Any chance you have an actual coredump?  If so, can you run ast_coredumper with the --tarball-coredumps option so we can see the full state?  The file will be big because it will contain the actual coredump so you'll need to host it on google drive, dropbox, etc and email us the link.

If you don't have the coredump now, the next time it happens, run ast_coredumper with the --running, --no-default-search and  --tarball-coredumps options while asterisk is locked.
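
As a rough illustration of the invocation described above, the sketch below assembles the command from the three options named in this comment. The script path is an assumption (the usual install location for the Asterisk helper scripts) and may differ on your system. The `COREDUMPER` and `DRY_RUN` variables and the `build_cmd` function are hypothetical helpers for this example, not part of ast_coredumper.

```shell
#!/bin/sh
# Assumed install location of the script; override COREDUMPER if yours differs.
COREDUMPER="${COREDUMPER:-/var/lib/asterisk/scripts/ast_coredumper}"

# Assemble the command line from the options named in the comment above.
build_cmd() {
    printf '%s --running --no-default-search --tarball-coredumps' "$COREDUMPER"
}

# DRY_RUN=1 (the default in this sketch) only prints the command.
# Set DRY_RUN=0 to actually run it while Asterisk is locked up.
if [ "${DRY_RUN:-1}" = "1" ]; then
    build_cmd
    echo
else
    $(build_cmd)
fi
```

Dumping a running process will most likely need root privileges, so run the non-dry-run form accordingly.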


By: George Joseph (gjoseph) 2021-09-09 09:51:06.619-0500

Oh, and make sure that DONT_OPTIMIZE is set in the asterisk compile options.


By: Mark Murawski (kobaz) 2021-09-09 10:19:54.700-0500

Will do. I'm currently building up a test case in the lab so we can work on consistently reproducing this.

By: Asterisk Team (asteriskteam) 2021-09-23 12:00:01.169-0500

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

By: Mark Murawski (kobaz) 2021-11-19 07:30:37.080-0600

Uploaded the ast_coredumper output.

By: Asterisk Team (asteriskteam) 2021-11-19 07:30:38.757-0600

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Mark Murawski (kobaz) 2021-11-19 07:34:23.350-0600

This is what I have so far: two lockups this morning.

The Asterisk console was responsive, and 'core show channels' was functional.
AMI was functional, but the only events being passed around were user events generated outside Asterisk (not from the dialplan).

Dialplan execution completely halted.

By: Joshua C. Colp (jcolp) 2021-11-19 07:36:14.844-0600

The backtrace on this doesn't really make sense; there are impossible code paths in it. It also has references to "/home/markm/asterisk/16.16.1-custom".

By: Mark Murawski (kobaz) 2021-11-19 07:41:44.604-0600

I saw some corrupted stack frame messages in the core dump... something else going on?

Unfortunately, I've only caught this in production so far. I'm going to make a concerted effort to replicate this in the lab and get a non-optimized capture with debug locks enabled.

I'm sorry that the dumping has not gone well for this issue.

By: Joshua C. Colp (jcolp) 2021-11-19 07:50:48.498-0600

Some things are optimized out, some threads just make no sense. It's suspect and incomplete enough to only say that it seems like a channel is deadlocked. That's about it.

By: Kevin Harwell (kharwell) 2021-11-19 09:35:30.986-0600

We do not support or debug custom builds or older releases of Asterisk:
{quote}
Asterisk 16.16.1-intellasoft-2021-09-22-fdba00d2e17dd30fa2e500cf3ab7a284d501eb84
{quote}
In order for us to further investigate you'll need to replicate the issue in a current release of Asterisk that is free of any custom code/patches. Once you have done that then please use {{ast_coredumper}} to [get a backtrace|https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace], and upload the results here.

Thanks!

By: Mark Murawski (kobaz) 2021-11-19 09:40:20.117-0600

Hi Kevin,

We tag our builds even if they don't have any actual code changes, so we can tie the configuration and build options back to the exact source build.

As far as the version goes, I did notice after the upload that this was an older version. We've updated this box, and we'll continue to monitor and try to reproduce this in a lab environment.



By: Asterisk Team (asteriskteam) 2021-12-03 12:00:01.188-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines