
Summary: ASTERISK-22835: pbx_realtime: deadlock with channel in autoservice while calling realtime switch
Reporter: Walter Klomp (walter.klomp)
Labels:
Date Opened: 2013-11-10 03:17:33.000-0600
Date Closed: 2013-11-10 20:33:06.000-0600
Priority: Major
Regression?:
Status: Closed/Complete
Components: PBX/pbx_realtime
Versions: 11.6.0
Frequency of Occurrence: Frequent
Related Issues:
duplicates ASTERISK-21040 Deadlock involving chan_sip.c, pbx.c and autoservice.c, locking on chan and &conclock
is duplicated by ASTERISK-21228 Deadlock in pbx_find_extension when attempting an autoservice stop due to holding the context lock
Environment: Ubuntu 12.04 64 bit - latest updates
Attachments:
(0) asterisk-11.6.0.random.crash.txt
(1) backtrace.txt
(2) core-show-locks.txt
Description: Asterisk regularly stops processing calls; this happens at random intervals during the day. At that point we can still connect but can no longer issue any commands - it just doesn't respond. Only a killall -9 asterisk will kill the process and the AGI processes associated with it.

Needless to say, this is very disruptive to my service (about 2000 registered users).

I tried different versions, all with the same result.
Comments:

By: Walter Klomp (walter.klomp) 2013-11-10 03:19:46.149-0600

Attached is a backtrace taken while the system was in an unusable state.

By: Matt Jordan (mjordan) 2013-11-10 18:16:32.176-0600

Debugging deadlocks: Please select DEBUG_THREADS and DONT_OPTIMIZE in the Compiler Flags section of menuselect. Recompile and install Asterisk (i.e. make install). This will then give you the console command "core show locks". When the symptoms of the deadlock present themselves again, please provide output of the deadlock via:

# asterisk -rx "core show locks" | tee /tmp/core-show-locks.txt
# gdb -se "asterisk" <pid of asterisk> | tee /tmp/backtrace.txt
gdb> bt
gdb> bt full
gdb> thread apply all bt

Then attach the core-show-locks.txt and backtrace.txt files to this issue. Thanks!
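
The interactive steps above can also be run unattended, which may help when a deadlock has to be captured quickly on a production server. The following is a minimal sketch, not an official Asterisk tool: the function name collect_deadlock_info and its output-directory argument are hypothetical, and it assumes that pidof, awk and gdb are installed, that Asterisk was built with DEBUG_THREADS, and that the caller has permission to attach to the process.

```shell
#!/bin/sh
# Sketch only: gathers the two files requested above in one step.
# collect_deadlock_info and its argument are illustrative names, not part of Asterisk.

collect_deadlock_info() {
    out_dir="${1:-/tmp}"   # directory for the output files (assumed default: /tmp)

    # Take the first PID reported for the asterisk process.
    pid="$(pidof asterisk | awk '{print $1}')"
    if [ -z "$pid" ]; then
        echo "asterisk is not running" >&2
        return 1
    fi

    # Lock state; requires a build with DEBUG_THREADS enabled.
    asterisk -rx "core show locks" > "$out_dir/core-show-locks.txt"

    # Same gdb commands as the interactive session, run in batch mode
    # so gdb attaches, prints the backtraces, and detaches on its own.
    gdb -p "$pid" -batch \
        -ex "bt" -ex "bt full" -ex "thread apply all bt" \
        > "$out_dir/backtrace.txt" 2>&1
}
```

With the function loaded into a root shell, running collect_deadlock_info /tmp would write core-show-locks.txt and backtrace.txt to /tmp. If the remote console itself hangs, the first command may never return; in that case the gdb step can be run on its own.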



By: Walter Klomp (walter.klomp) 2013-11-10 20:08:02.247-0600

Attached as requested...

By: Walter Klomp (walter.klomp) 2013-11-10 20:12:40.329-0600

It took only 20 minutes to crash - not very good for a production server. The only thing that has changed is that I updated all the packages on Ubuntu 12.04, but I recompiled after that, so it should not be an issue; even then, it did not start showing this behaviour until 4 days later. Meanwhile the customer list keeps growing and we now have more than 2000 subscribers. Is there a limit to what Asterisk can handle?

By: Matt Jordan (mjordan) 2013-11-10 20:33:00.398-0600

This is a duplicate of ASTERISK-21228. There's more information on that issue; suffice it to say, using {{pbx_realtime}} can, on a sufficiently busy system, result in a deadlock. As this is a duplicate, I'm going to close it out in favor of that issue.

By: Walter Klomp (walter.klomp) 2013-11-11 05:20:59.963-0600

So, is there no way to fix this? I have now removed the realtime extensions, so I am no longer using "switch". I still have the SIP clients in the SQL database, though. Crashes are much less frequent, but they still happen.