Summary: | ASTERISK-22854: [patch] - Deadlock between cel_pgsql unload and core_event_dispatcher taskprocessor thread | ||
Reporter: | Etienne Lessard (hexanol) | Labels: | |
Date Opened: | 2013-11-13 07:17:38.000-0600 | Date Closed: | 2013-12-31 15:27:03.000-0600 |
Priority: | Major | Regression? | |
Status: | Closed/Complete | Components: | CEL/cel_pgsql |
Versions: | 11.6.0 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ( 0) cel_pgsql_fix_deadlock_event.patch | |
Description: | A deadlock can happen between a thread unloading or reloading the cel_pgsql module and the core_event_dispatcher taskprocessor thread.
When the core_event_dispatcher taskprocessor thread is deadlocked, everything that uses the event system stops working, for example:
* queue member statuses are not updated
* BLF on SIP phones is not updated
* etc.

Observed and reproducible on Asterisk 11.6.0.

Description of what is happening:

Thread 1 (for example, a netconsole thread):
# a "module reload cel_pgsql" is launched
# the thread enters the "my_unload_module" function (cel_pgsql.c)
# the thread acquires the write lock on psql_columns
# the thread enters the "ast_event_unsubscribe" function (event.c)
# the thread tries to acquire the write lock on ast_event_subs[sub->type]

Thread 2 (core_event_dispatcher taskprocessor thread):
# the taskprocessor pops a CEL event
# the thread enters the "handle_event" function (event.c)
# the thread acquires the read lock on ast_event_subs[sub->type]
# the thread calls back the "pgsql_log" function (cel_pgsql.c), since it is a subscriber of CEL events
# the thread tries to acquire a read lock on psql_columns

Each thread now holds the lock the other one is waiting for, so neither can make progress.

To reproduce the problem, I use sipp to generate calls on Asterisk, and at the same time, I do a 'while sleep 0.1; do echo "$(date) Reloading..."; asterisk -rx "module reload cel_pgsql.so"; done' | ||
Comments: | By: Etienne Lessard (hexanol) 2013-11-13 07:19:29.140-0600 I've attached a patch fixing the problem. |