
Summary: ASTERISK-22976: app_queue function queue_show() and find_queue_by_name_rt() cause deadlock
Reporter: Aaron An (aaron)
Labels:
Date Opened: 2013-12-11 21:07:04.000-0600
Date Closed: 2021-01-04 04:56:42.000-0600
Priority: Critical
Regression?
Status: Closed/Complete
Components: Applications/app_queue
Versions: 1.8.24.0 13.18.4
Frequency of Occurrence: Frequent
Related Issues: is duplicated by ASTERISK-29155 app_queue: Deadlock between queues container and individual queues
Environment: CentOS5.8 X64 DellR410 Asterisk 1.8 trunk version
Attachments:
Description: I use "queue show xxxx" to monitor queue status, together with realtime queues. Concurrency is about 100 calls. A deadlock occurs after 10-30 minutes.
Analysis result:
find_queue_by_name_rt() first locks the single queue with "ao2_lock(q);" and then locks the global queues container via "queues_t_unlink(queues, q, "Unused; removing from container");".
__queues_show() first locks the global queues container with "ao2_lock(queues);" and then locks the single queue with "ao2_lock(q);".

The two functions take the same two locks in opposite order, so they can deadlock.
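
The lock-order inversion described above can be illustrated outside Asterisk. The following is only a minimal sketch using plain pthread mutexes, not app_queue code: container_lock, queue_lock, show_worker, unlink_worker and run_ordered are hypothetical names standing in for the queues container lock, a single queue's lock, and the two code paths. It shows one common remedy, where the removal path releases the queue lock before taking the container lock, so both paths acquire the locks in the same order (container first, then queue).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins: one lock for the queues container, one for a single queue. */
static pthread_mutex_t container_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int in_container = 1;    /* protected by container_lock */
static long completed_ops = 0;  /* protected by container_lock */

/* "Show" path: container lock first, then queue lock. */
static void *show_worker(void *arg)
{
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++) {
        pthread_mutex_lock(&container_lock);
        pthread_mutex_lock(&queue_lock);
        completed_ops++;
        pthread_mutex_unlock(&queue_lock);
        pthread_mutex_unlock(&container_lock);
    }
    return NULL;
}

/* "Unlink" path, reordered so it never waits for the container lock
 * while still holding the queue lock. */
static void *unlink_worker(void *arg)
{
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++) {
        int unused;

        pthread_mutex_lock(&queue_lock);
        unused = (i % 2 == 0);              /* decide under the queue lock */
        pthread_mutex_unlock(&queue_lock);  /* drop it before touching the container */

        /* Same order as the show path: container, then queue.
         * Real code would have to re-check the decision here. */
        pthread_mutex_lock(&container_lock);
        pthread_mutex_lock(&queue_lock);
        in_container = unused ? 0 : 1;
        completed_ops++;
        pthread_mutex_unlock(&queue_lock);
        pthread_mutex_unlock(&container_lock);
    }
    return NULL;
}

/* Runs both paths concurrently and returns the number of completed operations. */
long run_ordered(int iterations)
{
    pthread_t a, b;
    int rc;

    completed_ops = 0;
    rc = pthread_create(&a, NULL, show_worker, &iterations);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, unlink_worker, &iterations);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return completed_ops;
}
```

Calling run_ordered(1000) lets both threads finish because neither ever holds the queue lock while waiting for the container lock. If unlink_worker instead kept queue_lock held while acquiring container_lock, the two threads could block each other indefinitely, which is the pattern reported in this issue.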
Comments:
By: Paul Belanger (pabelanger) 2014-01-10 15:24:26.507-0600

Debugging deadlocks: Please select DEBUG_THREADS and DONT_OPTIMIZE in the Compiler Flags section of menuselect. Recompile and install Asterisk (i.e. make install). This will then give you the console command "core show locks". When the symptoms of the deadlock present themselves again, please provide output of the deadlock via:

# asterisk -rx "core show locks" | tee /tmp/core-show-locks.txt
# gdb -se "asterisk" <pid of asterisk> | tee /tmp/backtrace.txt
gdb> bt
gdb> bt full
gdb> thread apply all bt

Then attach the core-show-locks.txt and backtrace.txt files to this issue. Thanks!



By: Joshua C. Colp (jcolp) 2017-12-18 11:13:04.539-0600

Have you had this occur under recent versions of asterisk?

By: Aaron An (aaron) 2017-12-19 02:39:43.914-0600

Hi Joshua Colp, I haven't used app_queue for the past 2 years, so I don't know whether this is resolved in recent versions. But I guess the deadlock is still there, especially when using realtime queues. Realtime involves external I/O, so the chance of hitting the race condition is larger than with static queues.

By: Leandro Dardini (ldardini) 2019-03-11 13:03:53.637-0500

I confirm the problem is still happening at least on Asterisk 13.24.1. After 4 hours of queue monitoring with "queue show xxxx", there is a lock and the command gives no output.

By: Leif Einar Aune (leifeinar) 2020-09-14 13:04:32.812-0500

We are running 13.29.2 and also observe a deadlock in app_queue.c during the busy hour for incoming calls, in combination with extensive use of queue_show.

Seeing that __queues_show() locks the global queues pointer mutex for a long time (effectively blocking other threads from iterating or finding queues), we tried removing the lock and using a locking iterator instead of ao2_iterator_init(queues, AO2_ITERATOR_DONTLOCK).

With that change the deadlock problem vanished!

It seems safe to use the locking iterator in the __queues_show() function instead of locking the queues mutex during the complete iteration, and we are considering providing a patch with this fix via gerrit.
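
The idea behind this change can be sketched without astobj2. The code below is an illustration with plain pthread mutexes, not the actual patch. struct queue, struct container, struct iter, iter_next, show_all and demo_show_total are hypothetical names. The point is that a locking iterator holds the container lock only while fetching the next element, so the show path never holds the container lock and a queue lock at the same time.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NQUEUES 3

/* Hypothetical stand-in for a single queue with its own lock. */
struct queue {
    pthread_mutex_t lock;
    int callers;
};

/* Hypothetical stand-in for the global queues container. */
struct container {
    pthread_mutex_t lock;
    struct queue *items[NQUEUES];
};

struct iter {
    struct container *c;
    size_t pos;
};

/* Locking iterator: takes the container lock only for the fetch itself.
 * A reference-counted container would also take a reference on the
 * returned object here; this sketch omits reference counting. */
static struct queue *iter_next(struct iter *it)
{
    struct queue *q = NULL;

    pthread_mutex_lock(&it->c->lock);
    if (it->pos < NQUEUES) {
        q = it->c->items[it->pos++];
    }
    pthread_mutex_unlock(&it->c->lock);
    return q;
}

/* Show path: the container lock is already released when a queue is locked. */
int show_all(struct container *c)
{
    struct iter it = { c, 0 };
    struct queue *q;
    int total = 0;

    while ((q = iter_next(&it)) != NULL) {
        pthread_mutex_lock(&q->lock);
        total += q->callers;
        pthread_mutex_unlock(&q->lock);
    }
    return total;
}

/* Builds a small container with 1, 2 and 3 callers and sums them. */
int demo_show_total(void)
{
    static struct queue queues[NQUEUES];
    struct container c;
    int rc;

    rc = pthread_mutex_init(&c.lock, NULL);
    assert(rc == 0);
    for (int i = 0; i < NQUEUES; i++) {
        rc = pthread_mutex_init(&queues[i].lock, NULL);
        assert(rc == 0);
        queues[i].callers = i + 1;
        c.items[i] = &queues[i];
    }
    (void)rc;
    return show_all(&c);
}
```

The trade-off of this approach is that the container may change between two fetches, so the listing is no longer a consistent snapshot of all queues. For a status display that is usually acceptable.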

By: Leif Einar Aune (leifeinar) 2020-12-31 09:11:23.235-0600

Probably a duplicate of ASTERISK-29155