Summary: ASTERISK-27090: PJSIP: Deadlock using TCP transport
Reporter: Richard Mudgett (rmudgett)
Labels:
Date Opened: 2017-06-28 13:16:16
Date Closed: 2017-07-07 11:10:26
Priority: Major
Regression?:
Status: Closed/Complete
Components: Resources/res_pjsip
Versions: 13.16.0
Frequency of Occurrence: Occasional
Related Issues:
Environment:
Attachments: (0) backtrace.txt
Description: Deadlock due to a lock inversion. In the attached backtrace you can see that thread 177 is holding a lock on the transport:
{noformat}
#14 0x05a11a62 in ioqueue_dispatch_read_event (ioqueue=0x971de70, h=0x9716588) at ../src/pj/ioqueue_common_abs.c:591
       read_op = 0xa94e499c
       bytes_read = 3278
       has_lock = 1
       rc = 0
{noformat}
This same thread then requests a lock on the dialog 0xaf3a428c:
{noformat}
#5  0x08ed4468 in pjsip_dlg_inc_lock (dlg=0xaf3a428c) at ../src/pjsip/sip_dialog.c:847
{noformat}
However, the lock on dialog 0xaf3a428c is already held by thread 53, as shown by the following frames:
{noformat}
#17 0x08ed53ba in pjsip_dlg_send_response (dlg=0xaf3a428c, tsx=0xb6684ec4, tdata=0xb180be84) at ../src/pjsip/sip_dialog.c:1478
       status = -1234852344
......
#25 0x08ed58c9 in pjsip_dlg_on_rx_request (dlg=0xaf3a428c, rdata=0xb18c7e3c) at ../src/pjsip/sip_dialog.c:1660
       status = 0
       tsx = 0xb6684ec4
       processed = 0
       i = 1
{noformat}
Thread 53 is then waiting on the transport lock, which is held by thread 177.

This deadlock happens because the SIP transport in use is TCP and a message on a SIP dialog is being received at the same time as a message on the same SIP dialog is being sent. If the transport were UDP, the deadlock would not happen.
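
The lock inversion described above can be reproduced in miniature. The following is only an illustrative sketch in Python, not PJSIP or Asterisk code, and it does not represent how this issue was resolved. The names `transport`, `dialog`, `inverted_order`, and `consistent_order` are hypothetical stand-ins for the two locks and the two code paths. Timeouts are used in place of indefinite blocking so that the demonstration terminates instead of hanging like the threads in the backtrace.

```python
import threading


def inverted_order(rx_timeout=0.5, tx_timeout=0.1):
    """Receive path locks transport then dialog; send path locks dialog then transport.

    Returns (rx_got_dialog, tx_got_transport). With real blocking locks both
    paths would wait forever; here each wait gives up after its timeout.
    """
    transport = threading.Lock()  # stand-in for the TCP transport lock
    dialog = threading.Lock()     # stand-in for the dialog lock
    rx_holds_transport = threading.Event()
    outcome = {}

    def receive_path():
        # Like thread 177: holds the transport, then wants the dialog.
        with transport:
            rx_holds_transport.set()
            got = dialog.acquire(timeout=rx_timeout)
            outcome["rx_got_dialog"] = got
            if got:
                dialog.release()

    # Like thread 53: holds the dialog, then wants the transport.
    with dialog:
        rx = threading.Thread(target=receive_path)
        rx.start()
        rx_holds_transport.wait()
        got = transport.acquire(timeout=tx_timeout)
        outcome["tx_got_transport"] = got
        if got:
            transport.release()
        rx.join()  # the dialog is still held here, so the receive path times out

    return outcome["rx_got_dialog"], outcome["tx_got_transport"]


def consistent_order(workers=2, rounds=1000):
    """Every path locks transport first, then dialog, so no cycle can form."""
    transport = threading.Lock()
    dialog = threading.Lock()
    completed = [0]

    def path():
        for _ in range(rounds):
            with transport:
                with dialog:
                    completed[0] += 1  # protected by both locks

    threads = [threading.Thread(target=path) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed[0]


if __name__ == "__main__":
    print("inverted order:", inverted_order())
    print("consistent order completed:", consistent_order())
```

In the inverted case neither path obtains its second lock, which mirrors threads 177 and 53 each waiting on the lock the other holds. When both paths take the locks in the same order, every round completes.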
Comments:

By: Richard Mudgett (rmudgett) 2017-06-28 13:21:48.147-0500

[^backtrace.txt] - The backtrace only shows the two threads involved in the deadlock.

By: Ross Beer (rossbeer) 2017-06-29 03:41:44.021-0500

Is issue ASTERISK-26686 related?

By: Richard Mudgett (rmudgett) 2017-06-29 10:45:44.725-0500

[~rossbeer] No.  Different locks are involved.  The ASTERISK-26686 deadlock appears to involve the transport lock and an Asterisk ao2 container lock.  I wasn't able to find the thread holding the transport lock in my brief perusal of the backtrace.  The deadlock in this issue involves the transport lock and a call's dialog lock.