Summary: | ASTERISK-13784: [patch] Deadlock when manipulating module_list over AMI and CLI | ||
Reporter: | James McCoy (jamessan) | Labels: | |
Date Opened: | 2009-03-19 13:36:50 | Date Closed: | 2009-04-09 13:55:08 |
Priority: | Major | Regression? | No |
Status: | Closed/Complete | Components: | Core/ManagerInterface |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) 20090320__bug14705.diff.txt ( 1) blacklist_module_reload.diff | |
Description: | AMI "Action: Command" commands funnel through ast_cli_command as do actual commands from the cli. The difference being that when the command is initiated from the AMI, process_message is holding the actionlock before ast_cli_command is called. A "module reload" command from the AMI first locks actionlock (process_message) and then locks the module_list (ast_module_reload). A "module load"/"module unload" command from the CLI first locks the module_list (ast_load_resource/ast_unload_resource) and then locks actionlock (ast_manager_register*/ast_manager_unregister). Therefore a properly timed "module load"/"module unload" issued from the CLI and a "module reload" from the AMI can cause the AMI and CLI threads to deadlock. Releasing actionlock in process_message as soon as we find the command we need to run is, unfortunately, not a viable solution. If the AMI thread were to get interrupted just after releasing the lock but before calling the function it found, that could be removed out from under us by a CLI command. It would seem that blacklisting all module commands from the AMI, instead of purely ones that would have the potential to modify the manager_action list is the only surefire fix. ****** ADDITIONAL INFORMATION ****** ======================================================================= === Currently Held Locks ============================================== ======================================================================= === === <file> <line num> <function> <lock name> <lock addr> (times locked) === === Thread ID: 376855 (session_do started at [ 2449] manager.c accept_thread()) === ---> Lock #0 (manager.c): RDLOCK 2209 process_message &actionlock 0x155b2c (1) === ---> Lock #1 (loader.c): MUTEX 551 ast_module_reload &reloadlock 0x154f00 (1) === ---> Waiting for Lock #2 (loader.c): MUTEX 570 ast_module_reload &(&module_list)->lock 0x154da8 (1) === --- ---> Locked Here: loader.c line 709 (ast_load_resource) === ------------------------------------------------------------------- === === Thread ID: 409625 (netconsole started at [ 1026] asterisk.c listener()) === ---> Lock #0 (loader.c): MUTEX 709 ast_load_resource &(&module_list)->lock 0x154da8 (1) === ---> Waiting for Lock #1 (manager.c): WRLOCK 2570 ast_manager_register_struct &actionlock 0x155b2c (1) === ------------------------------------------------------------------- === ======================================================================= | ||
Comments: | By: James McCoy (jamessan) 2009-03-19 15:14:44 Attached patch to blacklist "module reload". By: Tilghman Lesher (tilghman) 2009-03-20 17:58:14 Timed locking may be the way to fix this potential deadlock. If while registering or unregistering, a command is unable to obtain the lock after a period of time, the register/unregister will simply fail, thus releasing the locks. By: Tilghman Lesher (tilghman) 2009-03-26 17:42:06 jamessan: does this patch fix this potential deadlock for you? By: Leif Madsen (lmadsen) 2009-04-03 14:07:13 I think a crossed out name means he is no longer registered in the system... By: James McCoy (jamessan) 2009-04-08 08:21:37 Sorry, for the delay. I was on vacation and had to catch up on things. I haven't been able to reproduce the lockup with this patch. By: Leif Madsen (lmadsen) 2009-04-09 12:32:01 jamesson: thanks for the update! Marking this as Ready For Review! By: Digium Subversion (svnbot) 2009-04-09 13:08:21 Repository: asterisk Revision: 187428 U branches/1.4/include/asterisk/lock.h U branches/1.4/main/manager.c ------------------------------------------------------------------------ r187428 | tilghman | 2009-04-09 13:08:21 -0500 (Thu, 09 Apr 2009) | 8 lines Race condition between ast_cli_command() and 'module unload' could cause a deadlock. Add lock timeouts to avoid this potential deadlock. (closes issue ASTERISK-13784) Reported by: jamessan Patches: 20090320__bug14705.diff.txt uploaded by tilghman (license 14) Tested by: jamessan ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=187428 By: Digium Subversion (svnbot) 2009-04-09 13:40:03 Repository: asterisk Revision: 187483 _U trunk/ U trunk/include/asterisk/linkedlists.h U trunk/include/asterisk/lock.h U trunk/main/manager.c ------------------------------------------------------------------------ r187483 | tilghman | 2009-04-09 13:40:02 -0500 (Thu, 09 Apr 2009) | 15 lines Merged revisions 187428 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r187428 | tilghman | 2009-04-09 13:08:20 -0500 (Thu, 09 Apr 2009) | 8 lines Race condition between ast_cli_command() and 'module unload' could cause a deadlock. Add lock timeouts to avoid this potential deadlock. (closes issue ASTERISK-13784) Reported by: jamessan Patches: 20090320__bug14705.diff.txt uploaded by tilghman (license 14) Tested by: jamessan ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=187483 By: Digium Subversion (svnbot) 2009-04-09 13:52:58 Repository: asterisk Revision: 187485 _U branches/1.6.0/ U branches/1.6.0/include/asterisk/linkedlists.h U branches/1.6.0/include/asterisk/lock.h U branches/1.6.0/main/manager.c ------------------------------------------------------------------------ r187485 | tilghman | 2009-04-09 13:52:58 -0500 (Thu, 09 Apr 2009) | 22 lines Merged revisions 187483 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r187483 | tilghman | 2009-04-09 13:40:01 -0500 (Thu, 09 Apr 2009) | 15 lines Merged revisions 187428 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r187428 | tilghman | 2009-04-09 13:08:20 -0500 (Thu, 09 Apr 2009) | 8 lines Race condition between ast_cli_command() and 'module unload' could cause a deadlock. Add lock timeouts to avoid this potential deadlock. (closes issue ASTERISK-13784) Reported by: jamessan Patches: 20090320__bug14705.diff.txt uploaded by tilghman (license 14) Tested by: jamessan ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=187485 By: Digium Subversion (svnbot) 2009-04-09 13:54:08 Repository: asterisk Revision: 187486 _U branches/1.6.1/ U branches/1.6.1/include/asterisk/linkedlists.h U branches/1.6.1/include/asterisk/lock.h U branches/1.6.1/main/manager.c ------------------------------------------------------------------------ r187486 | tilghman | 2009-04-09 13:54:08 -0500 (Thu, 09 Apr 2009) | 22 lines Merged revisions 187483 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r187483 | tilghman | 2009-04-09 13:40:01 -0500 (Thu, 09 Apr 2009) | 15 lines Merged revisions 187428 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r187428 | tilghman | 2009-04-09 13:08:20 -0500 (Thu, 09 Apr 2009) | 8 lines Race condition between ast_cli_command() and 'module unload' could cause a deadlock. Add lock timeouts to avoid this potential deadlock. (closes issue ASTERISK-13784) Reported by: jamessan Patches: 20090320__bug14705.diff.txt uploaded by tilghman (license 14) Tested by: jamessan ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=187486 By: Digium Subversion (svnbot) 2009-04-09 13:55:07 Repository: asterisk Revision: 187487 _U branches/1.6.2/ U branches/1.6.2/include/asterisk/linkedlists.h U branches/1.6.2/include/asterisk/lock.h U branches/1.6.2/main/manager.c ------------------------------------------------------------------------ r187487 | tilghman | 2009-04-09 13:55:07 -0500 (Thu, 09 Apr 2009) | 22 lines Merged revisions 187483 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r187483 | tilghman | 2009-04-09 13:40:01 -0500 (Thu, 09 Apr 2009) | 15 lines Merged revisions 187428 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r187428 | tilghman | 2009-04-09 13:08:20 -0500 (Thu, 09 Apr 2009) | 8 lines Race condition between ast_cli_command() and 'module unload' could cause a deadlock. Add lock timeouts to avoid this potential deadlock. (closes issue ASTERISK-13784) Reported by: jamessan Patches: 20090320__bug14705.diff.txt uploaded by tilghman (license 14) Tested by: jamessan ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=187487 |