[Home]

Summary:ASTERISK-13784: [patch] Deadlock when manipulating module_list over AMI and CLI
Reporter:James McCoy (jamessan)Labels:
Date Opened:2009-03-19 13:36:50Date Closed:2009-04-09 13:55:08
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Core/ManagerInterface
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20090320__bug14705.diff.txt
( 1) blacklist_module_reload.diff
Description:AMI "Action: Command" commands funnel through ast_cli_command as do actual commands from the cli.  The difference being that when the command is initiated from the AMI, process_message is holding the actionlock before ast_cli_command is called.

A "module reload" command from the AMI first locks actionlock (process_message)  and then locks the module_list (ast_module_reload).
A "module load"/"module unload" command from the CLI first locks the module_list (ast_load_resource/ast_unload_resource) and then locks actionlock (ast_manager_register*/ast_manager_unregister).

Therefore a properly timed "module load"/"module unload" issued from the CLI and a "module reload" from the AMI can cause the AMI and CLI threads to deadlock.

Releasing actionlock in process_message as soon as we find the command we need to run is, unfortunately, not a viable solution.  If the AMI thread were to get interrupted just after releasing the lock but before calling the function it found, that could be removed out from under us by a CLI command.

It would seem that blacklisting all module commands from the AMI, instead of purely ones that would have the potential to modify the manager_action list is the only surefire fix.

****** ADDITIONAL INFORMATION ******

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 376855 (session_do           started at [ 2449] manager.c accept_thread())
=== ---> Lock #0 (manager.c): RDLOCK 2209 process_message &actionlock 0x155b2c (1)
=== ---> Lock #1 (loader.c): MUTEX 551 ast_module_reload &reloadlock 0x154f00 (1)
=== ---> Waiting for Lock #2 (loader.c): MUTEX 570 ast_module_reload &(&module_list)->lock 0x154da8 (1)
=== --- ---> Locked Here: loader.c line 709 (ast_load_resource)
=== -------------------------------------------------------------------
===
=== Thread ID: 409625 (netconsole           started at [ 1026] asterisk.c listener())
=== ---> Lock #0 (loader.c): MUTEX 709 ast_load_resource &(&module_list)->lock 0x154da8 (1)
=== ---> Waiting for Lock #1 (manager.c): WRLOCK 2570 ast_manager_register_struct &actionlock 0x155b2c (1)
=== -------------------------------------------------------------------
===
=======================================================================
Comments:By: James McCoy (jamessan) 2009-03-19 15:14:44

Attached patch to blacklist "module reload".

By: Tilghman Lesher (tilghman) 2009-03-20 17:58:14

Timed locking may be the way to fix this potential deadlock.  If while registering or unregistering, a command is unable to obtain the lock after a period of time, the register/unregister will simply fail, thus releasing the locks.

By: Tilghman Lesher (tilghman) 2009-03-26 17:42:06

jamessan: does this patch fix this potential deadlock for you?

By: Leif Madsen (lmadsen) 2009-04-03 14:07:13

I think a crossed out name means he is no longer registered in the system...

By: James McCoy (jamessan) 2009-04-08 08:21:37

Sorry, for the delay.  I was on vacation and had to catch up on things.  I haven't been able to reproduce the lockup with this patch.

By: Leif Madsen (lmadsen) 2009-04-09 12:32:01

jamesson: thanks for the update!

Marking this as Ready For Review!

By: Digium Subversion (svnbot) 2009-04-09 13:08:21

Repository: asterisk
Revision: 187428

U   branches/1.4/include/asterisk/lock.h
U   branches/1.4/main/manager.c

------------------------------------------------------------------------
r187428 | tilghman | 2009-04-09 13:08:21 -0500 (Thu, 09 Apr 2009) | 8 lines

Race condition between ast_cli_command() and 'module unload' could cause a deadlock.
Add lock timeouts to avoid this potential deadlock.
(closes issue ASTERISK-13784)
Reported by: jamessan
Patches:
      20090320__bug14705.diff.txt uploaded by tilghman (license 14)
Tested by: jamessan

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=187428

By: Digium Subversion (svnbot) 2009-04-09 13:40:03

Repository: asterisk
Revision: 187483

_U  trunk/
U   trunk/include/asterisk/linkedlists.h
U   trunk/include/asterisk/lock.h
U   trunk/main/manager.c

------------------------------------------------------------------------
r187483 | tilghman | 2009-04-09 13:40:02 -0500 (Thu, 09 Apr 2009) | 15 lines

Merged revisions 187428 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r187428 | tilghman | 2009-04-09 13:08:20 -0500 (Thu, 09 Apr 2009) | 8 lines
 
 Race condition between ast_cli_command() and 'module unload' could cause a deadlock.
 Add lock timeouts to avoid this potential deadlock.
 (closes issue ASTERISK-13784)
  Reported by: jamessan
  Patches:
        20090320__bug14705.diff.txt uploaded by tilghman (license 14)
  Tested by: jamessan
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=187483

By: Digium Subversion (svnbot) 2009-04-09 13:52:58

Repository: asterisk
Revision: 187485

_U  branches/1.6.0/
U   branches/1.6.0/include/asterisk/linkedlists.h
U   branches/1.6.0/include/asterisk/lock.h
U   branches/1.6.0/main/manager.c

------------------------------------------------------------------------
r187485 | tilghman | 2009-04-09 13:52:58 -0500 (Thu, 09 Apr 2009) | 22 lines

Merged revisions 187483 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r187483 | tilghman | 2009-04-09 13:40:01 -0500 (Thu, 09 Apr 2009) | 15 lines
 
 Merged revisions 187428 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r187428 | tilghman | 2009-04-09 13:08:20 -0500 (Thu, 09 Apr 2009) | 8 lines
   
   Race condition between ast_cli_command() and 'module unload' could cause a deadlock.
   Add lock timeouts to avoid this potential deadlock.
   (closes issue ASTERISK-13784)
    Reported by: jamessan
    Patches:
          20090320__bug14705.diff.txt uploaded by tilghman (license 14)
    Tested by: jamessan
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=187485

By: Digium Subversion (svnbot) 2009-04-09 13:54:08

Repository: asterisk
Revision: 187486

_U  branches/1.6.1/
U   branches/1.6.1/include/asterisk/linkedlists.h
U   branches/1.6.1/include/asterisk/lock.h
U   branches/1.6.1/main/manager.c

------------------------------------------------------------------------
r187486 | tilghman | 2009-04-09 13:54:08 -0500 (Thu, 09 Apr 2009) | 22 lines

Merged revisions 187483 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r187483 | tilghman | 2009-04-09 13:40:01 -0500 (Thu, 09 Apr 2009) | 15 lines
 
 Merged revisions 187428 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r187428 | tilghman | 2009-04-09 13:08:20 -0500 (Thu, 09 Apr 2009) | 8 lines
   
   Race condition between ast_cli_command() and 'module unload' could cause a deadlock.
   Add lock timeouts to avoid this potential deadlock.
   (closes issue ASTERISK-13784)
    Reported by: jamessan
    Patches:
          20090320__bug14705.diff.txt uploaded by tilghman (license 14)
    Tested by: jamessan
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=187486

By: Digium Subversion (svnbot) 2009-04-09 13:55:07

Repository: asterisk
Revision: 187487

_U  branches/1.6.2/
U   branches/1.6.2/include/asterisk/linkedlists.h
U   branches/1.6.2/include/asterisk/lock.h
U   branches/1.6.2/main/manager.c

------------------------------------------------------------------------
r187487 | tilghman | 2009-04-09 13:55:07 -0500 (Thu, 09 Apr 2009) | 22 lines

Merged revisions 187483 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r187483 | tilghman | 2009-04-09 13:40:01 -0500 (Thu, 09 Apr 2009) | 15 lines
 
 Merged revisions 187428 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r187428 | tilghman | 2009-04-09 13:08:20 -0500 (Thu, 09 Apr 2009) | 8 lines
   
   Race condition between ast_cli_command() and 'module unload' could cause a deadlock.
   Add lock timeouts to avoid this potential deadlock.
   (closes issue ASTERISK-13784)
    Reported by: jamessan
    Patches:
          20090320__bug14705.diff.txt uploaded by tilghman (license 14)
    Tested by: jamessan
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=187487