Summary: ASTERISK-26686: res_pjsip: Lock inversion in transport management
Reporter: Ross Beer (rossbeer)
Labels: pjsip
Date Opened: 2017-01-03 01:44:23.000-0600
Date Closed: 2018-07-09 06:56:08
Priority: Major
Regression?:
Status: Closed/Complete
Components: Resources/res_pjsip
Versions: 13.13.1
Frequency of Occurrence: Frequent
Related Issues: is related to ASTERISK-27347 [patch] pjproject_bundled: Disable TCP/TLS keep-alives.
Environment: Fedora Server 23 SQLite 3.11.0
Attachments: ( 0) backtrace_20160103.txt
Description: Asterisk has a lock inversion in the PJSIP transport management code that keeps transports alive.

The workaround is to set 'keep_alive_interval=0'.
Comments:

By: Asterisk Team (asteriskteam) 2017-01-03 01:44:25.193-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Joshua C. Colp (jcolp) 2017-01-03 06:12:28.134-0600

This has nothing to do with astdb. It's a lock inversion in the PJSIP transport management code we have for keeping transports alive. One thread has our lock and is trying to get a transport lock, another thread has the transport lock and is trying to get our lock.
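To illustrate the pattern described above, here is a minimal sketch in C with pthreads. It is not the Asterisk code and not necessarily what the eventual fix does. The names `container_lock`, `transport_lock`, `state_callback_thread`, `keepalive_thread` and `run_keepalive_demo` are hypothetical stand-ins. One thread takes the transport-side lock and then "our" lock, as a callback would. The other thread avoids the inversion by copying what it needs while holding our lock, releasing it, and only then taking the transport-side lock.

```c
#include <pthread.h>
#include <string.h>

#define MAX_TRANSPORTS 8
#define ROUNDS 1000

/* Hypothetical stand-ins for the two locks involved in the inversion. */
static pthread_mutex_t container_lock = PTHREAD_MUTEX_INITIALIZER; /* "our" lock */
static pthread_mutex_t transport_lock = PTHREAD_MUTEX_INITIALIZER; /* transport-side lock */

static int monitored[MAX_TRANSPORTS]; /* guarded by container_lock */
static int monitored_count;           /* guarded by container_lock */
static long keepalives_sent;          /* guarded by transport_lock */

/* Models a state callback: transport_lock is held first, then container_lock. */
static void *state_callback_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&transport_lock);
        pthread_mutex_lock(&container_lock);
        if (monitored_count < MAX_TRANSPORTS) {
            monitored[monitored_count++] = i;
        }
        pthread_mutex_unlock(&container_lock);
        pthread_mutex_unlock(&transport_lock);
    }
    return NULL;
}

/*
 * Models the keep-alive loop. Taking transport_lock while still holding
 * container_lock would be the inverted order. Instead, snapshot the
 * container, drop container_lock, then touch each transport.
 */
static void *keepalive_thread(void *arg)
{
    int snapshot[MAX_TRANSPORTS];

    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&container_lock);
        int n = monitored_count;
        memcpy(snapshot, monitored, sizeof(int) * (size_t)n);
        pthread_mutex_unlock(&container_lock);

        for (int j = 0; j < n; j++) {
            pthread_mutex_lock(&transport_lock);
            keepalives_sent++; /* stands in for sending a keep-alive */
            pthread_mutex_unlock(&transport_lock);
        }
    }
    return NULL;
}

/* Runs both threads to completion; returns the monitored count, or -1 on error. */
int run_keepalive_demo(void)
{
    pthread_t cb, ka;

    monitored_count = 0;
    keepalives_sent = 0;
    if (pthread_create(&cb, NULL, state_callback_thread, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&ka, NULL, keepalive_thread, NULL) != 0) {
        pthread_join(cb, NULL);
        return -1;
    }
    pthread_join(cb, NULL);
    pthread_join(ka, NULL);
    return monitored_count;
}
```

Because the keep-alive loop never holds both locks at once, the two threads cannot end up each waiting on a lock the other holds. The trade-off is that the snapshot may be slightly stale by the time it is used.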

By: Ross Beer (rossbeer) 2017-01-03 08:07:18.891-0600

As a temporary fix, would the following setting resolve the issue?

keep_alive_interval=0

By: Joshua C. Colp (jcolp) 2017-01-03 08:12:03.365-0600

Yes, that should disable the functionality which causes the problem.
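For reference, a minimal sketch of where this setting would go, assuming it lives in the global section of pjsip.conf and that the rest of the configuration is unchanged:

```ini
[global]
type=global
; 0 disables the keep-alive functionality discussed in this issue
keep_alive_interval=0
```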

By: Ross Beer (rossbeer) 2017-02-13 10:13:30.667-0600

The temporary fix has stopped the issue from occurring; however, the underlying problem remains.

By: Richard Mudgett (rmudgett) 2018-07-03 11:22:10.343-0500

Threads 70 and 71 are deadlocked. The locks involved are the pjproject transport manager group lock and the monitored_transports container lock. The deadlocking code is still present in 13.21.0, even though that code has moved to a new file.

By: Ross Beer (rossbeer) 2018-07-04 06:35:52.522-0500

Do the PJSIP config options PJSIP_TCP_KEEP_ALIVE_INTERVAL and PJSIP_TLS_KEEP_ALIVE_INTERVAL need to be defined and set to 0 to stop PJSIP from also sending keepalives every 90 seconds?

According to the documentation, these values default to 90, so PJSIP will send its own keepalives as well. See:

http://www.pjsip.org/pjsip/docs/html/group__PJSIP__CONFIG.htm#ga02217f4919a7c575d71eed407be63d04


By: Richard Mudgett (rmudgett) 2018-07-06 14:30:07.922-0500

https://blogs.asterisk.org/2018/01/27/wanted-dead-or-alive/

By: Ross Beer (rossbeer) 2018-07-06 17:17:46.234-0500

My point exactly... Does the PJSIP implementation of the keep alive need to be disabled with the bundled version?

By: Ross Beer (rossbeer) 2018-07-09 04:49:21.986-0500

Looking through the issue tracker, there is also an open ticket about PJSIP sending its own keepalives, which means that more keepalives are sent than expected. See ASTERISK-27347.
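For illustration, the two pjproject settings asked about earlier in this thread are compile-time macros. A minimal sketch of overriding them, assuming they are placed in pjproject's `config_site.h` and that a value of 0 disables the keep-alive as the linked PJSIP documentation describes:

```c
/* Hypothetical excerpt of a pjproject config_site.h (assumed location). */

/* Disable pjproject's own keep-alive packets on TCP transports. */
#define PJSIP_TCP_KEEP_ALIVE_INTERVAL 0

/* Disable pjproject's own keep-alive packets on TLS transports. */
#define PJSIP_TLS_KEEP_ALIVE_INTERVAL 0
```

These are build-time values, so changing them requires rebuilding pjproject. They are separate from the 'keep_alive_interval' option in Asterisk.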

By: Friendly Automation (friendly-automation) 2018-07-09 06:56:10.944-0500

Change 9330 merged by Jenkins2:
res_pjsip/pjsip_transport_management.c: Fix deadlock with transport keep alive.

[https://gerrit.asterisk.org/9330|https://gerrit.asterisk.org/9330]

By: Friendly Automation (friendly-automation) 2018-07-09 07:11:49.810-0500

Change 9331 merged by Joshua Colp:
res_pjsip/pjsip_transport_management.c: Fix deadlock with transport keep alive.

[https://gerrit.asterisk.org/9331|https://gerrit.asterisk.org/9331]

By: Friendly Automation (friendly-automation) 2018-07-09 07:16:10.471-0500

Change 9332 merged by Joshua Colp:
res_pjsip/pjsip_transport_management.c: Fix deadlock with transport keep alive.

[https://gerrit.asterisk.org/9332|https://gerrit.asterisk.org/9332]

By: Friendly Automation (friendly-automation) 2018-08-28 11:58:40.842-0500

Change 10003 merged by Kevin Harwell:
res_pjsip/pjsip_transport_management.c: Fix deadlock with transport keep alive.

[https://gerrit.asterisk.org/10003|https://gerrit.asterisk.org/10003]