[Home]

Summary:ASTERISK-28098: ATTENDED_TRANSFER_COMPLETE_SOUND deadlocks in Local channels (Asterisk 11)
Reporter:Walter Doekes (wdoekes)Labels:
Date Opened:2018-10-11 04:12:04Date Closed:2020-01-14 11:24:14.000-0600
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Channels/chan_sip/Transfers
Versions:11.25.3 Frequency of
Occurrence
Constant
Related
Issues:
Environment:Attachments:
Description:Hi. Just filing this here for the sake of reference. _As Asterisk 11 is EOL, I'm not suggesting that anyone should fix this. Please close this ticket._

After enabling ATTENDED_TRANSFER_COMPLETE_SOUND today, Asterisken started locking up. It appears to be two threads waiting on each other:

{noformat}
[Switching to thread 12 (Thread 0x7f44d4118700 (LWP 10318))]

(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f45015b2c06 in __GI___pthread_mutex_lock (mutex=0x56414ab24110) at ../nptl/pthread_mutex_lock.c:115
#2  0x00005641494fdad0 in ast_do_masquerade (original=original@entry=0x7f44b42174c8) at channel.c:6914
#3  0x000056414950473d in __ast_read (chan=0x7f44b42174c8, dropaudio=0) at channel.c:3887
#4  0x000056414954cf92 in waitstream_core (c=<optimized out>, c@entry=0x7f44b42174c8, breakon=breakon@entry=0x7f44be87918d "", forward=0x5641495f3bda "", forward@entry=0x0, reverse=0x5641495f3bda "", reverse@entry=0x0, skip_ms=skip_ms@entry=0, audiofd=audiofd@entry=-1, cmdfd=<optimized out>, context=0x0, cb=0x0) at file.c:1326
#5  0x000056414954f62c in ast_waitstream (c=c@entry=0x7f44b42174c8, breakon=breakon@entry=0x7f44be87918d "") at file.c:1436
#6  0x00007f44be822ad8 in local_attended_transfer (transferer=transferer@entry=0x7f449c0664c8, current=current@entry=0x7f44d41158e0, seqno=seqno@entry=104, nounlock=nounlock@entry=0x7f44d4117274, req=0x7f44d4117340) at chan_sip.c:26550
#7  0x00007f44be832f0d in handle_request_refer (p=p@entry=0x7f449c0664c8, req=req@entry=0x7f44d4117340, seqno=104, nounlock=nounlock@entry=0x7f44d4117274) at chan_sip.c:26886
#8  0x00007f44be8684ad in handle_incoming (p=p@entry=0x7f449c0664c8, req=req@entry=0x7f44d4117340, addr=addr@entry=0x7f44d41172b0, recount=recount@entry=0x7f44d4117270, nounlock=nounlock@entry=0x7f44d4117274) at chan_sip.c:28798
#9  0x00007f44be86a853 in handle_request_do (req=req@entry=0x7f44d4117340, addr=addr@entry=0x7f44d41172b0) at chan_sip.c:28984
#10 0x00007f44be86bfa3 in sipsock_read (id=<optimized out>, fd=<optimized out>, events=<optimized out>, ignore=<optimized out>) at chan_sip.c:28915

(gdb) print *(pthread_mutex_t*)0x56414ab24110
$5 = {__data = {__lock = 2, __count = 1, __owner = 26036, __nusers = 1, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\001\000\000\000\264e\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}
{noformat}
^-- 12 waiting on 236 (LWP 26036)

{noformat}
[Switching to thread 236 (Thread 0x7f43eaf0b700 (LWP 26036))]

(gdb) bt
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f45015b2c06 in __GI___pthread_mutex_lock (mutex=0x7f44b0cddfa0) at ../nptl/pthread_mutex_lock.c:115
#2  0x00005641494fdaf5 in ast_do_masquerade (original=original@entry=0x7f44b0cddff8) at channel.c:6920
#3  0x000056414950473d in __ast_read (chan=0x7f44b0cddff8, dropaudio=0) at channel.c:3887
#4  0x000056414950777a in ast_read (chan=0x7f44b0cddff8) at channel.c:4411
#5  ast_generic_bridge (rc=0x7f43eaf05708, fo=0x7f43eaf05700, config=0x7f43eaf05c60, c1=0x7f44b42174c8, c0=0x7f44b0cddff8) at channel.c:7648
#6  ast_channel_bridge (c0=c0@entry=0x7f44b0cddff8, c1=c1@entry=0x7f44b42174c8, config=config@entry=0x7f43eaf05c60, fo=fo@entry=0x7f43eaf05700, rc=rc@entry=0x7f43eaf05708) at channel.c:8123
#7  0x00005641495456f0 in ast_bridge_call (chan=chan@entry=0x7f44b0cddff8, peer=peer@entry=0x7f44b42174c8, config=config@entry=0x7f43eaf05c60) at features.c:4487
#8  0x00007f44c8caa0e6 in dial_exec_full (chan=0x7f44b0cddff8, data=<optimized out>, peerflags=peerflags@entry=0x7f43eaf066e0, continue_exec=continue_exec@entry=0x0) at app_dial.c:3065
#9  0x00007f44c8cab1c6 in dial_exec (chan=<optimized out>, data=<optimized out>) at app_dial.c:3148
#10 0x0000564149594dfe in pbx_exec (c=c@entry=0x7f44b0cddff8, app=app@entry=0x56414acd20f0, data=data@entry=0x7f43eaf06c60 "Local/IDxxx@osvpi_route_phoneaccount,20") at pbx.c:1680

(gdb) print *(pthread_mutex_t*)0x7f44b0cddfa0
$3 = {__data = {__lock = 2, __count = 1, __owner = 10318, __nusers = 1, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\001\000\000\000N(\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}
{noformat}
^-- 236 waiting on 12 (LWP 10318)

Without diving too deeply into this, our theory is:
- both threads are trying to do the same masquerade, and/or
- the refer (sipsock_read) thread should not call ast_waitstream while it still has channel locks

The deadlock happens when the audio is played on a Local channel:
{noformat}
# grep Playing.*xfer debug.log   | sed -e 's/.*-- <//;s/\/.*//' | uniq -c
    30 SIP
     1 Local
{noformat}
^-- one Local channel -> deadlock

The relevant Asterisk 11 code looks like this:
{noformat}
               if (target.chan2 && !ast_strlen_zero(xfersound) && ast_streamfile(target.chan2, xfersound, ast_channel_language(target.chan2)) >= 0) {
                       ast_waitstream(target.chan2, ""); // deadlock here
               }
...
               ast_channel_ref(current->chan1);
               ast_channel_unlock(current->chan1); /* current.chan1 is p->owner before the masq, it was locked by socket_read()*/
               ast_channel_unlock(target.chan1);
               *nounlock = 1;  /* we just unlocked the dialog's channel and have no plans of locking it again. */
               sip_pvt_unlock(targetcall_pvt);
               sip_pvt_unlock(transferer);
{noformat}
^-- lots of unlocks _after_ the the playing, instead of before

In Asterisk 13, the relevant code looks like this:
{noformat}
               ast_channel_unlock(to_transfer_target);
               if (!target_complete_sound) {
                       ast_channel_lock(to_transferee);
...
                       ast_channel_unlock(to_transferee);
               }
               if (target_complete_sound) {
                       ast_bridge_channel_write_playfile(to_target_bridge_channel, NULL,
                               target_complete_sound, NULL);
               }
{noformat}
^-- unlocking of channel target _before_ playing

We'll assume things are good in 13. Will report back if it isn't.
Comments:By: Asterisk Team (asteriskteam) 2018-10-11 04:12:05.569-0500

Per the Asterisk versions page [1], the maintenance (bug fix) support for the Asterisk branch you are using has ended. For continued maintenance support please move to a supported branch of Asterisk. After testing with a supported branch, if you find this problem has not been resolved, please open a new issue against the latest version of that Asterisk branch.

Thanks!

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions

By: Asterisk Team (asteriskteam) 2018-10-11 04:12:06.942-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].