Summary: | ASTERISK-25388: chan_sip partial unresponsive, one permanent lock | ||
Reporter: | Stefan Tichy (st) | Labels: | |
Date Opened: | 2015-09-10 09:34:36 | Date Closed: | 2020-01-14 11:13:52.000-0600 |
Priority: | Major | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_sip/Registration Resources/res_config_odbc |
Versions: | 13.5.0 | Frequency of Occurrence | Frequent |
Related Issues: | |||
Environment: | Debian Wheezy amd64; PostgreSQL 9.1; ODBC used for database connection; chan_sip loaded, pjsip not loaded | Attachments: | ( 0) core-show-locks-2015-09-19.txt ( 1) issue25388example.tar.gz ( 2) lock-one.txt ( 3) lock-two.txt |
Description: | Nagios/sipsak is sending SIP OPTIONS requests to UDP 5060;
Another Asterisk instance (11.18.0) registers using transport TLS (realtime config); the local Asterisk (13.5.0) registers to another Asterisk 13.5.0 using UDP; no other SIP traffic and no phone calls.
Scenario A: static configuration for the second peer (outgoing registration): works fine.
Scenario B: realtime configuration (rtcachefriends=yes) for the second peer: after some time Asterisk stops responding to the monitoring requests, outgoing registration stops, and Recv-Q for UDP 5060 fills up, but incoming registration still works. "core show locks" constantly shows one lock (see attachment). Phone calls from the first peer do work. "sip reload" never finishes. "core show locks" output is in the second attachment. | ||
Comments: | By: Asterisk Team (asteriskteam) 2015-09-10 09:34:38.309-0500 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].
By: Stefan Tichy (st) 2015-09-10 09:39:33.925-0500 Output from "core show locks" after the SIP OPTIONS answers stopped.
By: Stefan Tichy (st) 2015-09-10 09:41:40.895-0500 Tried "sip reload"; another lock showed up.
By: Rusty Newton (rnewton) 2015-09-11 08:33:04.426-0500 [~st] Can you provide the configuration and describe how to set it up to reproduce the scenario where the locks occur?
By: Stefan Tichy (st) 2015-09-14 07:57:56.193-0500 Sorry for the delay, but several attempts were necessary until I could reproduce the problem with a small configuration. There is at least one error in it: the peer that registers to another Asterisk server has a mailbox configuration. It was a mistake in the original configuration, and it seems to be necessary to reproduce the problem. The names for hosts, users and secrets just have dummy values. With incoming and outgoing registration and incoming OPTIONS requests the problem should show up within a few hours (at most 4 or 5 in my tests, sometimes after only a few minutes).
By: Stefan Tichy (st) 2015-09-14 08:00:04.651-0500 tar.gz containing Asterisk config and SQL statements.
By: Stefan Tichy (st) 2015-09-19 09:48:17.149-0500 Locked again; it showed up when dialing from an IAX channel to SIP/TLS.
By: Rusty Newton (rnewton) 2015-10-16 16:44:54.245-0500 Reattaching lock output as .txt for accessibility.
By: Rusty Newton (rnewton) 2015-10-16 17:14:06.967-0500 Stefan - I thought we had a debug log on here showing the traffic right up until the lock. Can you reproduce the lock and post a debug log that includes "sip set debug on" output? Preferably with verbose and debug both turned up. We don't need everything, just the last several minutes before the lock happens and a few minutes into it.
By: Rusty Newton (rnewton) 2015-10-16 17:17:45.916-0500 Oh, for that same debug log it would be great to have the corresponding "core show locks" output and, if possible, a backtrace.
By: Stefan Tichy (st) 2015-10-26 05:59:18.749-0500 I had to install a test system on a smaller box, because the server is in use for other tests. There seem to be other problems, but so far no deadlock. Asterisk causes high CPU load with just two peers. When more information is available I will upload the files.
By: Rusty Newton (rnewton) 2015-10-30 17:19:25.979-0500 Okay. Also, if you can demonstrate how to reproduce the high CPU load issue with two peers, open a separate issue for that. We can link the two as related.
By: Asterisk Team (asteriskteam) 2015-11-14 12:00:22.271-0600 Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1]. [1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines |
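For anyone trying to reproduce the symptom in the description (monitoring requests to UDP 5060 going unanswered), the following is a minimal sketch of a SIP OPTIONS probe similar in spirit to the Nagios/sipsak check. It is not the reporter's actual check: the function names `build_options_request` and `probe`, the `probe@...` user part, the default local address and the timeout are all assumptions made for illustration, and the request carries only the basic headers a SIP request needs.

```python
import socket
import uuid


def build_options_request(target_host, target_port=5060,
                          local_host="127.0.0.1", local_port=5070):
    """Build a minimal SIP OPTIONS request as bytes.

    local_host/local_port are placeholders (assumption); with ;rport the
    server is asked to reply to the packet's real source address.
    """
    branch = "z9hG4bK" + uuid.uuid4().hex   # branch must start with this cookie
    tag = uuid.uuid4().hex[:8]
    call_id = uuid.uuid4().hex
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:probe@{local_host}>;tag={tag}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {call_id}@{local_host}",
        "CSeq: 1 OPTIONS",
        f"Contact: <sip:probe@{local_host}:{local_port}>",
        "Content-Length: 0",
        "",   # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(target_host, target_port=5060, timeout=2.0):
    """Send one OPTIONS request over UDP.

    Returns the first line of the reply, or None on timeout or socket error.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("", 0))                      # let the OS pick a source port
        local_port = sock.getsockname()[1]
        request = build_options_request(target_host, target_port,
                                        local_port=local_port)
        try:
            sock.sendto(request, (target_host, target_port))
            data, _ = sock.recvfrom(4096)
        except OSError:                         # includes socket.timeout
            return None
    return data.decode("ascii", errors="replace").split("\r\n", 1)[0]
```

Calling `probe("192.0.2.10")` (a documentation address, substitute the server under test) in a loop and logging each `None` result would give a timestamp for when the server stopped answering, which could then be lined up with the "core show locks" output and debug log requested above.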