Summary: ASTERISK-18166: Deadlock: asterisk isn't responding to any sip package anymore
Reporter: Jacco van Tuijl (jacco)
Labels:
Date Opened: 2011-07-22 03:23:08
Date Closed: 2011-08-17 14:19:31
Priority: Major
Regression?:
Status: Closed/Complete
Components:
Versions: 1.8.5.0
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) astdebug_2011.07.17_1333.log (1) bt2011-07-28T221834+0200.txt (2) tshark-log_2011.07.17_1333.rar
Description: The asterisk process is still running. The Asterisk CLI is still responding to commands. Asterisk isn't responding to any SIP packets anymore.
Comments:
By: Jacco van Tuijl (jacco) 2011-07-22 03:25:56.059-0500
This file contains: gdb, netstat, asterisk log, top, process list, df, monit log.
By: Jacco van Tuijl (jacco) 2011-07-22 03:31:35.221-0500
This is a Wireshark trace (look at the end and see how Asterisk is not responding any more).
By: Gregory Hinton Nietsky (irroot) 2011-07-22 05:11:23.340-0500
I've looked at the BT. There is no "core show locks" or "thread apply all bt full", so it is not easy to see, but it does look like it is possibly a statechange issue. See reviewboard 1313.
By: Ole Kaas (ole.kaas) 2011-07-24 04:52:15.375-0500
This could be the same as bug 18142.
By: Leif Madsen (lmadsen) 2011-07-26 09:53:43.635-0500
We'll definitely need a 'core show locks' to move this forward. Please supply.
By: Leif Madsen (lmadsen) 2011-07-26 09:56:17.377-0500
This is likely a res_timing_timerfd issue, so the workaround for now is to use res_timing_dahdi. Hopefully we'll get res_timing_timerfd fixed up here shortly.
By: caspy (caspy) 2011-07-27 07:48:14.943-0500
I'm getting a deadlock like this with res_timing_dahdi. On the next lock I'll try to supply a BT.
By: Ole Kaas (ole.kaas) 2011-07-28 15:59:53.922-0500
After adding "noload => res_timing_timerfd.so", Asterisk now uses DAHDI for timing. No crash/deadlock for almost 2 days until now. This seems to be another issue though. BT attached (Asterisk compiled with all the debug stuff).
EDIT: My memory is a bit vague about this right now, but to clarify: Asterisk failed to respond to SIP requests as before, but "core show channels" reported active calls. No time to verify if there were RTP streams - the process was killed with -11 to have a core dump to make a backtrace from. I was hoping the BT could reveal something.
EDIT2: I suspect this to be a "full reload under load" issue. This server is not quite as busy as the other server where I've posted a backtrace (bug 18142 backtrace2.txt), so the deadlock is "unreliable" on this server.
By: Leif Madsen (lmadsen) 2011-08-05 15:38:09.827-0500
I don't understand. If there is no crash/deadlock after using res_timing_dahdi, how is there a backtrace being provided? What is the issue?
By: Leif Madsen (lmadsen) 2011-08-11 13:22:21.204-0500
Assigned to reporter for feedback.
By: Jacco van Tuijl (jacco) 2011-08-16 07:28:22.086-0500
After deleting res_timing_timerfd.so from disk, I've had no problems with Asterisk no longer responding to SIP packets.
By: Terry Wilson (twilson) 2011-08-17 14:19:31.688-0500
Fix committed to 1.8 r332320.
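Workaround sketch: for builds that predate the fix in r332320, the "noload" approach mentioned in the comments can be applied in the module loader configuration instead of deleting the module from disk. The fragment below is a minimal illustration, not a confirmed configuration from any of the reporters. It assumes the usual /etc/asterisk/modules.conf location and an existing [modules] section with autoload enabled; only the noload line itself comes from this issue.

```ini
; modules.conf (path and surrounding settings are assumptions)
[modules]
autoload=yes

; Keep res_timing_timerfd from loading so that another timing
; module (for example res_timing_dahdi, if installed) is used.
noload => res_timing_timerfd.so
```

A restart of Asterisk is probably the safest way to make the change take effect. Afterwards, "module show like res_timing" on the CLI should list only the timing modules that were actually loaded.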