Summary: | ASTERISK-22859: Deadlock random | ||
Reporter: | Walter Klomp (walter.klomp) | Labels: | |
Date Opened: | 2013-11-16 20:44:11.000-0600 | Date Closed: | 2013-12-07 20:55:10.000-0600 |
Priority: | Major | Regression? | |
Status: | Closed/Complete | Components: | Channels/chan_sip/Subscriptions |
Versions: | 11.6.0 | Frequency of Occurrence | Frequent |
Related Issues: | |||
Environment: | Ubuntu 12.04 latest updates | Attachments: | ( 0) backtrace-threads-201311162357.txt ( 1) backtrace-threads-201311170115.txt ( 2) backtrace-threads-201311170140.txt ( 3) backtrace-threads-201311170208.txt ( 4) backtrace-threads-201311170305.txt ( 5) backtrace-threads-201311170413.txt ( 6) core-show-locks-201311162357.txt ( 7) core-show-locks-201311170115.txt ( 8) core-show-locks-201311170140.txt ( 9) core-show-locks-201311170208.txt (10) core-show-locks-201311170305.txt (11) core-show-locks-201311170413.txt |
Description: | Happened with 11.6.0.x and now also svn of asterisk-11... | ||
Comments: | By: Walter Klomp (walter.klomp) 2013-11-16 20:48:35.503-0600

I have auto-backtrace and core show locks generated when the server doesn't write anything in full.log anymore (indicating a deadlock, as I register with sipsak every minute to check). When this happens (at the weirdest hours of the day/night) customers can't register anymore.

Previous crashes occurred due to locks caused by rtupdate in res_config_mysql. I have since changed to config_odbc and set rtupdate to "no", but the deadlocks are not going away. Please have a look at a few reports attached.

By: Walter Klomp (walter.klomp) 2013-11-16 20:51:33.068-0600

Except for the deadlock at 23:57 (due to res_config_mysql, I think), all the other ones seem to have a common denominator at chan_sip.so. But I may be wrong.

By: Walter Klomp (walter.klomp) 2013-11-18 00:02:30.704-0600

Just now it happened again, and this time I saw something in the full.log:

[Nov 18 13:49:00] ERROR[25224][C-000029b7] res_timing_timerfd.c: Failed to create timerfd timer: Too many open files
[Nov 18 13:49:00] NOTICE[25224][C-000029b7] chan_sip.c: Unable to create/find SIP channel for this INVITE

and a lot of these:

[Nov 18 13:49:00] ERROR[25224] acl.c: Cannot create socket
[Nov 18 13:49:00] ERROR[25224] acl.c: Cannot create socket
[Nov 18 13:49:00] ERROR[25224] acl.c: Cannot create socket
[Nov 18 13:49:01] ERROR[25224] acl.c: Cannot create socket
[Nov 18 13:49:01] ERROR[25224] acl.c: Cannot create socket
[Nov 18 13:49:01] ERROR[25224] acl.c: Cannot create socket
[Nov 18 13:49:01] ERROR[25224] acl.c: Cannot create socket
[Nov 18 13:49:01] ERROR[25224] acl.c: Cannot create socket

Here is my ulimit -a output:

root@asterisk1:/var/log/asterisk# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 31501
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 31501
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

What can I do?

By: Matt Jordan (mjordan) 2013-11-18 09:00:38.474-0600

Hi Walter:

My guess is this is not a deadlock.

{noformat}
[Nov 18 13:49:00] ERROR[25224][C-000029b7] res_timing_timerfd.c: Failed to create timerfd timer: Too many open files
{noformat}

There are two scenarios that could be occurring here:

# You have a rather small number of file descriptors allowed on your system, 1024.
{quote}
open files (-n) 1024
{quote}
Asterisk tends to require a rather substantial number of file descriptors. You should probably increase this limit.
# If increasing the limit of file descriptors does not resolve the problem, then you may be running into a more systemic problem wherein the file descriptors are not getting released by Asterisk. With such a low limit currently in place, however, it is difficult to tell.

I would increase the number of open files allowed and see if this resolves your problem. If not, we can look at how to get information about what is occurring.

By: Walter Klomp (walter.klomp) 2013-11-18 09:31:33.823-0600

Yes, I did some snooping around after I saw the error messages in full.log. I also noticed the ulimit to be small, so I increased it to 99999. Let's monitor what happens next.

Meanwhile, asterisk insists on staying in the stream, taking up CPU. It is not doing transcoding, but 20 concurrent calls take 100% CPU. Would directrtpsetup help in this matter? Clients are behind NAT.

By: Matt Jordan (mjordan) 2013-12-07 20:54:48.165-0600

Suspended due to lack of activity. Please request a bug marshal in #asterisk-bugs on the IRC network irc.freenode.net to reopen the issue should you have the additional information requested.

Further information can be found at http://www.asterisk.org/developers/bug-guidelines |
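
The two scenarios described in the thread (a limit that is simply too low versus descriptors that are never released) can be told apart by comparing a process's open-descriptor count against its limit over time. The following is a minimal sketch, not something taken from the thread: it assumes Linux with a mounted /proc, and the helper names `fd_limits`, `count_open_fds`, and `read_proc_limit` are hypothetical. Note that `ulimit -a` in a shell reports that shell's limits; a daemon that was started earlier may be running with different ones, which is why the sketch also reads /proc/<pid>/limits.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid="self"):
    """Count a process's open descriptors by listing /proc/<pid>/fd (Linux only).

    When pid is "self", the count includes the descriptor that os.listdir
    itself holds on the directory while it reads it.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def read_proc_limit(pid="self"):
    """Return the (soft, hard) 'Max open files' values from /proc/<pid>/limits.

    Values are returned as strings because the kernel may print 'unlimited'.
    """
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                parts = line.split()
                return parts[3], parts[4]
    return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft={soft} hard={hard} open={count_open_fds()}")
```

To inspect a running daemon rather than the script itself, pass its process ID (as a string or integer) to `count_open_fds` and `read_proc_limit`; this generally requires running as the same user or as root. A count that keeps climbing toward the limit while call volume stays flat would point to the second scenario.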