[Home]

Summary:ASTERISK-05926: Calling MixMonitor twice drives Asterisk into deadlock
Reporter:jcomellas (jcomellas)Labels:
Date Opened:2005-12-29 09:06:04.000-0600Date Closed:2006-01-03 13:30:04.000-0600
Priority:BlockerRegression?No
Status:Closed/CompleteComponents:Applications/app_mixmonitor
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20051229__bug6084.diff.txt
( 1) app_mixmonitor_stopmixmonitor.patch
Description:If I call MixMonitor from the dialplan, stop MixMonitor via the command line (mixmonitor stop <channel>) and then start MixMonitor again, Asterisk goes into an infinite loop because a lock is still present on the channel and it shows this message in what looks like an infinite loop:

Dec 29 11:09:23 DEBUG[26451]: chan_sip.c:11110 sipsock_read: Failed to grab lock, trying again...



****** ADDITIONAL INFORMATION ******

Here are the backtraces I could get from the channels with activity while this is going on:



--- Backtrace for MixMonitor thread
#0  0xffffe410 in ?? ()
No symbol table info available.
#1  0xb69a9458 in ?? ()
No symbol table info available.
#2  0x00504a5b in ?? ()
No symbol table info available.
#3  0x00000000 in ?? ()
No symbol table info available.
#4  0xb7f9dd86 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/i686/cmov/libpthread.so.0
No symbol table info available.
ASTERISK-1  0x08066447 in ast_channel_spy_trigger_wait (spy=0xfffffffc) at lock.h:416
       __PRETTY_FUNCTION__ = "ast_channel_spy_trigger_wait"
ASTERISK-2  0xb6c8c054 in mixmonitor_thread (obj=0x81d03d0) at app_mixmonitor.c:222
       next = (struct ast_frame *) 0x0
       spy = {list = {next = 0xb69a9330}, lock = {mutex = {__m_reserved = 0, __m_count = 0, __m_owner = 0x0, __m_kind = 1, __m_lock = {__status = 1,
       __spinlock = 0}}, file = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, lineno = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, reentrancy = 0, func = {
     0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, thread = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}}, trigger = {__c_lock = {__status = 0,
     __spinlock = 5261915}, __c_waiting = 0x28252e,
   __padding = "\000\000\000\000-%(\000\000\000\000\000-%(\000\000\000\000\0004\223\232ΒΆ\002\000\000", __align = 0}, read_queue = {head = 0x8232a90,
   samples = 4000, format = 64}, write_queue = {head = 0x0, samples = 0, format = 64}, flags = 25, status = CHANSPY_RUNNING,
 type = 0xb6c8f3dc "MixMonitor", read_vol_adjustment = 0, write_vol_adjustment = 0}
       fs = (struct ast_filestream *) 0x8232860
       ext = 0x0
       f = (struct ast_frame *) 0x0
       post_process = '\0' <repeats 1023 times>
       __PRETTY_FUNCTION__ = "mixmonitor_thread"
ASTERISK-3  0xb7f9be70 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
No symbol table info available.
ASTERISK-4  0xb7e75c0e in clone () from /lib/tls/i686/cmov/libc.so.6



--- Backtrace for thread that is sending frames to the channel spy:
#0  0x0806c595 in queue_frame_to_spies (chan=0x81ec638, f=0x81e9d20, dir=SPY_READ) at channel.c:1168
       translated_frame = (struct ast_frame *) 0x81f64f0
       spy = (struct ast_channel_spy *) 0xb69a9330
       queue = (struct ast_channel_spy_queue *) 0xb69a9420
       trans = (struct channel_spy_trans *) 0x8181350
       last = (struct ast_frame *) 0x82a49a8
       __PRETTY_FUNCTION__ = "queue_frame_to_spies"
#1  0x08070da4 in ast_read (chan=0x81ec638) at channel.c:1909
       f = (struct ast_frame *) 0x81e9d20
       blah = 0
       prestate = 6
       null_frame = {frametype = 5, subclass = 0, datalen = 0, samples = 0, mallocd = 0, offset = 0, src = 0x0, data = 0x0, delivery = {tv_sec = 0,
   tv_usec = 0}, prev = 0x0, next = 0x0}
       __PRETTY_FUNCTION__ = "ast_read"
#2  0xb6c41657 in aptela_event_loop (session=0x81ee3c8) at src/event.c:99
       rc = -1
       chan = (struct ast_channel *) 0x81ec638
       frame = (struct ast_frame *) 0x81e9d20
       waitfd = {28, 26}
       exception = 0
       outfd = -99999
       inactivity_timeout = 20000
       timeout = 19982
       __PRETTY_FUNCTION__ = "aptela_event_loop"
#3  0xb6c4c5a5 in aptela_exec (chan=0x81ec638, data=0xb6924fc8) at src/app_aptela.c:189
       rc = -1
       u = (struct localuser *) 0x81bfa00
       session = (aptela_session *) 0x81ee3c8
       __PRETTY_FUNCTION__ = "aptela_exec"
#4  0x0809d91e in pbx_extension_helper (c=0x81ec638, con=Variable "con" is not available.
) at pbx.c:544
       e = Variable "e" is not available.
Comments:By: jcomellas (jcomellas) 2005-12-29 09:44:32.000-0600

I made a mistake when reporting one of the backtraces. I have changed the apps/app_mixmonitor.c to add the StopMixMonitor() application (patch attached to the bug), but it was not used to trigger the problem. The backtrace reports line 222 for the call to ast_channel_spy_trigger_wait() in mixmonitor_thread(). In the unpatched file this call is in line 212.

By: jcomellas (jcomellas) 2005-12-29 11:02:01.000-0600

You can easily trigger this bug by using the following dialplan:

exten => _ZXX,1,Answer()
exten => _ZXX,2,DigitTimeout(5)
exten => _ZXX,3,ResponseTimeout(10)
exten => _ZXX,4,MixMonitor(mon1.wav)
exten => _ZXX,5,Wait(15)
exten => _ZXX,6,MixMonitor(mon2.wav)
exten => _ZXX,7,Wait(15)
exten => _ZXX,8,Hangup()

During the first call to Wait() execute the following command:

mixmonitor stop <channel_name>

Where 'channel_name' is the name of the newly created channel. After the second call to MixMonitor() (i.e. after you see the '== Begin MixMonitor Recording mon2.wav' message) hangup your phone. You should now see Asterisk deadlock, entering and infinite loop when trying to lock a channel mutex.

By: Tilghman Lesher (tilghman) 2005-12-29 11:12:12.000-0600

Try this patch and see if it resolves your issue.

By: jcomellas (jcomellas) 2005-12-30 09:31:07.000-0600

Yes, this change apparently fixes the problem. All the tests that were failing before now work.

Thanks!

By: Tilghman Lesher (tilghman) 2006-01-02 00:42:17.000-0600

Patch committed to 1.2, merged to trunk

By: Kevin P. Fleming (kpfleming) 2006-01-03 10:23:47.000-0600

The patch applied to SVN was not the right fix... the actual issue was that CHANSPY_DONE was used in two different scenarios, and caused the spy to not be removed from the channel when the 'stop' CLI command was used. This bug also existed in app_chanspy.c.

I have reverted the patch that was applied and put in what I believe is the proper fix; please retest and ensure that your problem does not occur with this new code.

By: jcomellas (jcomellas) 2006-01-03 13:28:19.000-0600

I have retested and Asterisk with your fix does not present this problem.

By: Kevin P. Fleming (kpfleming) 2006-01-03 13:29:54.000-0600

OK, thanks for the quick report. I'll re-close this bug now...