Summary: ASTERISK-17388: [patch] Deadlock sip_read check_rtp_timeout #16608
Reporter: Gregory Hinton Nietsky (irroot)
Labels:
Date Opened: 2011-02-11 04:14:34.000-0600
Date Closed: 2011-05-16 08:12:45
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 1.6.2.15
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) chan_sip.rtptimeout.patch
( 1) chan_sip-1.6.2_rtptimeout.patch
( 2) chan_sip-1.6.2_rtptimeout2.patch
Description: This is an alternate patch to ASTERISK-15432; the problem is still an issue in 1.6.2.

****** ADDITIONAL INFORMATION ******

=======================================================================

=== Currently Held Locks ==============================================

=======================================================================

===

=== <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)

===

=== Thread ID: -1335886992 (do_monitor           started at [22802] chan_sip.c restart_monitor())

=== ---> Lock #0 (astobj2.c): MUTEX 164 ao2_lock &p->priv_data.lock 0x959bdf8 (1)

       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]

       /usr/sbin/asterisk() [0x8080295]

       /usr/sbin/asterisk(ao2_lock+0x4c) [0x8080df0]

       /usr/sbin/asterisk() [0x8081931]

       /usr/sbin/asterisk(_ao2_callback+0x56) [0x8081cd9]

       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x62a1c) [0xb0ae0a1c]

       /usr/sbin/asterisk() [0x817cdf8]

       /lib/libpthread.so.0(+0x5afe) [0xb72e0afe]

       /lib/libc.so.6(clone+0x5e) [0xb751e64e]

=== ---> Tried and failed to get Lock #1 (chan_sip.c): MUTEX 22682 check_rtp_timeout (channel lock) 0xcf3e000 (0)

       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]

       /usr/sbin/asterisk() [0x809a9f8]

       /usr/sbin/asterisk(__ast_channel_trylock+0xa1) [0x80ad13e]

       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x62692) [0xb0ae0692]

       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x40a4d) [0xb0abea4d]

       /usr/sbin/asterisk() [0x80819fb]

       /usr/sbin/asterisk(_ao2_callback+0x56) [0x8081cd9]

       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x62a1c) [0xb0ae0a1c]

       /usr/sbin/asterisk() [0x817cdf8]

       /lib/libpthread.so.0(+0x5afe) [0xb72e0afe]

       /lib/libc.so.6(clone+0x5e) [0xb751e64e]

=== -------------------------------------------------------------------

===

=== Thread ID: -1371231376 (pbx_thread           started at [ 4627] pbx.c ast_pbx_start())

=== ---> Lock #0 (channel.c): MUTEX 2744 __ast_read (channel lock) 0xcf3e000 (1)

       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]

       /usr/sbin/asterisk() [0x809a9f8]

       /usr/sbin/asterisk(__ast_channel_trylock+0xa1) [0x80ad13e]

       /usr/sbin/asterisk() [0x80a1e8e]

       /usr/sbin/asterisk(ast_read+0x19) [0x80a3b66]

       /usr/sbin/asterisk(ast_waitfordigit_full+0x1ed) [0x80a16c0]

       /usr/sbin/asterisk(ast_waitfordigit+0x28) [0x80a1366]

       /usr/sbin/asterisk() [0x8121a31]

       /usr/sbin/asterisk() [0x8122a01]

       /usr/sbin/asterisk() [0x812354d]

       /usr/sbin/asterisk() [0x817cdf8]

       /lib/libpthread.so.0(+0x5afe) [0xb72e0afe]

       /lib/libc.so.6(clone+0x5e) [0xb751e64e]

=== -------------------------------------------------------------------

===

=======================================================================
Comments:

By: Stefan Schmidt (schmidts) 2011-02-11 15:54:12.000-0600

Hello Greg,

The second part of this patch is a backport from 1.8, right?

I ask because this could lead to dialogs that exist longer than they should. 1.6 checks every dialog for rtp_timeout on each loop of do_monitor, while 1.8 uses a separate container for the dialogs that need the rtp_timeout check, so 1.8 gets back to any one dialog much faster than 1.6 does.

But it should work and do what you expect ;)

Best regards,

Stefan

By: Gregory Hinton Nietsky (irroot) 2011-02-12 02:58:28.000-0600

Sorry, I haven't looked at 1.8 just yet. This deadlock was causing me untold grief when upgrading from 1.4[36] to 1.6. I'm a bit slow of late, but I will try to move to 1.8 as soon as possible. It looks promising.

By: Gregory Hinton Nietsky (irroot) 2011-02-12 03:36:27.000-0600

The first part of this patch is not 100% correct, but it is working for me at the moment. I am working on a better solution; ASTERISK-15432 is more correct than not updating on ast_frame_null.

By: Gregory Hinton Nietsky (irroot) 2011-02-12 04:16:24.000-0600

New patch takes rtp bridging into account.

By: Gregory Hinton Nietsky (irroot) 2011-02-15 09:15:20.000-0600

The underlying cause of this seems to be ASTERISK-17407. This patch does resolve this deadlock and removes a deadlock path, so it may be useful to keep it around.

By: Gregory Hinton Nietsky (irroot) 2011-02-17 06:19:42.000-0600

OK, it makes sense that the channel is locked on faxdetect. I reworked the patch so that the RTP time is left un-updated on a null frame only when the channel is not bridged, and so that the RTP timeout check does not wait for the lock if the channel has blown up. There is no use locking up the monitor on a bad channel.
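The rule described in the comment above can be sketched in a few lines of C. This is only one reading of the comment, not the code from the attached patches: `struct rtp_state`, `note_frame`, `rtp_timed_out`, and the `FRAME_*` constants are hypothetical names invented for illustration, and the real chan_sip structures differ.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical frame types; Asterisk's real frame type enum is larger. */
enum frame_type { FRAME_NULL, FRAME_VOICE };

/* Hypothetical per-dialog RTP bookkeeping (an assumption, not chan_sip's layout). */
struct rtp_state {
    time_t last_rx;   /* last time a frame counted as RTP activity */
    bool bridged;     /* true while the channel is bridged */
};

/*
 * Record an incoming frame. A null frame on an unbridged channel is not
 * treated as activity, so it does not push the timeout further out.
 * Returns true if last_rx was updated.
 */
static bool note_frame(struct rtp_state *s, enum frame_type type, time_t now)
{
    if (type == FRAME_NULL && !s->bridged) {
        return false;
    }
    s->last_rx = now;
    return true;
}

/* A timeout of 0 or less disables the check in this sketch. */
static bool rtp_timed_out(const struct rtp_state *s, time_t now, int timeout_secs)
{
    return timeout_secs > 0 && (now - s->last_rx) > timeout_secs;
}
```

With this shape, a stream of null frames on an unbridged channel still lets the timeout fire, while the same frames on a bridged channel keep the dialog alive.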

By: Gregory Hinton Nietsky (irroot) 2011-02-28 04:07:06.000-0600

OK, this path is removed in trunk but not in the other branches. Should this issue be closed, or should the patch be applied to the other branches?

By: Digium Subversion (svnbot) 2011-05-06 14:46:50

Repository: asterisk
Revision: 317865

U   branches/1.8/channels/chan_sip.c

------------------------------------------------------------------------
r317865 | russell | 2011-05-06 14:46:49 -0500 (Fri, 06 May 2011) | 11 lines

chan_sip: fix a deadlock in check_rtp_timeout.

Don't block doing silly deadlock avoidance.  Just return and try again later.
The function gets called often enough that it's fine.  Also, this change was
already made in trunk.

(closes issue ASTERISK-17388)
Reported by: irroot
Patches:
     chan_sip.rtptimeout.patch uploaded by irroot (license 52)

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=317865
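The "just return and try again later" approach in the commit message above is a common way to break a lock-order inversion like the one in the lock dump, where the monitor thread holds a dialog lock and wants the channel lock while another thread holds the channel lock. A minimal sketch using plain POSIX mutexes follows. It is not the chan_sip code: `struct channel`, `struct dialog`, and `check_timeout_nonblocking` are hypothetical stand-ins, and Asterisk uses its own locking wrappers rather than raw pthread calls.

```c
#include <pthread.h>

/* Hypothetical stand-in for a channel protected by its own lock. */
struct channel {
    pthread_mutex_t lock;
};

/* Hypothetical stand-in for a SIP dialog that points at its owning channel. */
struct dialog {
    struct channel *owner;
    int timed_out;
};

/*
 * Called periodically by a monitor loop, with the dialog already locked
 * by the caller. Instead of spinning or blocking until the channel lock
 * is free, try once and give up if another thread holds it. The next
 * pass of the loop will simply try again.
 *
 * Returns 1 if the check ran, 0 if it was skipped because the lock was busy.
 */
static int check_timeout_nonblocking(struct dialog *d)
{
    if (pthread_mutex_trylock(&d->owner->lock) != 0) {
        return 0; /* channel lock busy: skip this round, no waiting */
    }

    d->timed_out = 1; /* placeholder for the real timeout handling */

    pthread_mutex_unlock(&d->owner->lock);
    return 1;
}
```

The trade-off is the one raised earlier in the thread: a skipped check delays the timeout until a later pass, so a dialog may live slightly longer than it should, in exchange for the monitor thread never stalling on a contended channel.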

By: Digium Subversion (svnbot) 2011-05-06 14:48:07

Repository: asterisk
Revision: 317866

_U  trunk/
U   trunk/channels/chan_sip.c

------------------------------------------------------------------------
r317866 | russell | 2011-05-06 14:48:07 -0500 (Fri, 06 May 2011) | 18 lines

Merged revisions 317865 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

........
 r317865 | russell | 2011-05-06 14:46:49 -0500 (Fri, 06 May 2011) | 11 lines
 
 chan_sip: fix a deadlock in check_rtp_timeout.
 
 Don't block doing silly deadlock avoidance.  Just return and try again later.
 The function gets called often enough that it's fine.  Also, this change was
 already made in trunk.
 
 (closes issue ASTERISK-17388)
 Reported by: irroot
 Patches:
       chan_sip.rtptimeout.patch uploaded by irroot (license 52)
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=317866