[Home]

Summary:ASTERISK-17666: [patch] Deadlock in ast_remove_hint Race with ao2_callback
Reporter:Gregory Hinton Nietsky (irroot)Labels:
Date Opened:2011-04-11 01:34:00Date Closed:2011-07-09 03:58:25
Priority:CriticalRegression?No
Status:Closed/CompleteComponents:Channels/General
Versions:1.8.3 Frequency of
Occurrence
Related
Issues:
is related toASTERISK-17760 [patch] deadlock in chan_sip
Environment:Attachments:( 0) deadlocks.patch
( 1) deadlocks2.patch
Description:Abridged Core Show Locks
=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 0xb3c41b70 (tps_processing_function started at [  451] taskprocessor.c ast_taskprocessor_get())
=== ---> Lock #0 (pbx.c): MUTEX 4269 handle_statechange hints 0x97a4b48 (1)
=== ---> Waiting for Lock #1 (pbx.c): MUTEX 4270 handle_statechange hint 0xb04cf7c0 (1)
=== --- ---> Locked Here: pbx.c line 4523 (ast_remove_hint)
=== -------------------------------------------------------------------
===
=== Thread ID: 0xb09f6b70 (do_monitor           started at [24621] chan_sip.c restart_monitor())
=== ---> Lock #0 (chan_sip.c): MUTEX 24119 handle_request_do &netlock 0xb309fdc0 (1)
=== ---> Lock #1 (chan_sip.c): MUTEX 7593 find_call p 0xffb8d48 (1)
=== ---> Waiting for Lock #2 (astobj2.c): MUTEX 657 internal_ao2_callback c 0x97a4b48 (1)
=== --- ---> Locked Here: pbx.c line 4269 (handle_statechange)
=== -------------------------------------------------------------------
===
=== Thread ID: 0xb05c3b70 (netconsole           started at [ 1344] asterisk.c listener())
=== ---> Lock #0 (loader.c): MUTEX 673 ast_module_reload &reloadlock 0x81f5280 (1)
=== ---> Lock #1 (loader.c): MUTEX 711 ast_module_reload &(&module_list)->lock 0x81f5208 (1)
=== ---> Lock #2 (pbx.c): MUTEX 4523 ast_remove_hint hint 0xb04cf7c0 (1)
=== ---> Waiting for Lock #3 (astobj2.c): MUTEX 657 internal_ao2_callback c 0x97a4b48 (1)
=== --- ---> Locked Here: pbx.c line 4269 (handle_statechange)
=== -------------------------------------------------------------------
===
=======================================================================
Comments:By: Gregory Hinton Nietsky (irroot) 2011-04-13 09:34:43

More appropriate patch

By: Richard Mudgett (rmudgett) 2011-07-08 11:12:57.690-0500

Look to ASTERISK-17760 for a better patch.

By: Gregory Hinton Nietsky (irroot) 2011-07-08 15:10:03.864-0500

Yeah awesome thx ill be testing it soonest

By: Gregory Hinton Nietsky (irroot) 2011-07-09 03:57:21.025-0500

Im closing this issue due to the new patch been a patch and the patch i put up is more a partial hack that is not a good idea.

By: Gregory Hinton Nietsky (irroot) 2011-07-09 03:58:25.879-0500

The patch was a nasty hack at best

By: Richard Mudgett (rmudgett) 2011-07-22 15:48:36.495-0500

Committed to -r329332 trunk.

Merged revisions 329331 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/10

................
 r329331 | rmudgett | 2011-07-22 15:43:07 -0500 (Fri, 22 Jul 2011) | 55 lines

 Merged revisions 329299 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.8

 ........
   r329299 | rmudgett | 2011-07-22 10:44:58 -0500 (Fri, 22 Jul 2011) | 48 lines

   Deadlocks dealing with dialplan hints during reload.

   There are two remaining different deadlocks reported dealing with dialplan
   hints.

   The deadlock in ASTERISK-17666 is caused by invalid locking order in
   ast_remove_hint().  The hints container must be locked before the hint
   object.

   The deadlock in ASTERISK-17760 is caused by a catch-22 situation in
   handle_statechange().  The deadlock is caused by not having the conlock
   before calling the watcher callbacks.  Unfortunately, having that lock
   causes a different deadlock as reported in ASTERISK-16961.

   * Fixed ast_remove_hint() locking order.

   * Made handle_statechange() no longer call the watcher callbacks holding
   any locks that matter.

   * Made hint ao2 destructor do the watcher callbacks for extension
   deactivation to guarantee that they get called.

   * Fixed hint reference leak in ast_add_hint() if the callback container
   constructor failed.

   * Fixed hint reference leak in complete_core_show_hint() for every hint it
   found for CLI tab completion.

   * Adjusted locking in ast_merge_contexts_and_delete() for safety.

   * Added context_merge_lock to prevent ast_merge_contexts_and_delete() and
   handle_statechange() from interfering with each other.

   * Fixed ast_change_hint() not taking into account that the extension is
   used for the hash key.

   (closes issue ASTERISK-17666)
   Reported by: irroot
   Tested by: irroot
   JIRA SWP-3318

   (closes issue ASTERISK-17760)
   Reported by: Byron Clark
   Tested by: irroot
   JIRA SWP-3393

   Review: https://reviewboard.asterisk.org/r/1313/
 ........
................