Summary: ASTERISK-25388: chan_sip partial unresponsive, one permanent lock
Reporter: Stefan Tichy (st)
Labels:
Date Opened: 2015-09-10 09:34:36
Date Closed: 2020-01-14 11:13:52.000-0600
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_sip/Registration, Resources/res_config_odbc
Versions: 13.5.0
Frequency of Occurrence: Frequent
Related Issues:
Environment: Debian Wheezy amd64; PostgreSQL 9.1; ODBC used for database connection; chan_sip loaded, pjsip not loaded
Attachments:
( 0) core-show-locks-2015-09-19.txt
( 1) issue25388example.tar.gz
( 2) lock-one.txt
( 3) lock-two.txt
Description: Nagios/sipsak sends SIP OPTIONS requests to UDP 5060;
another Asterisk instance (11.18.0) registers using transport TLS (realtime config);
the local Asterisk (13.5.0) registers to another Asterisk 13.5.0 using UDP;
there is no other SIP traffic and there are no phone calls.
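
For readers who want to reproduce the monitoring traffic without Nagios, the following is a minimal sketch of a SIP OPTIONS probe over UDP. It is not what sipsak sends byte for byte; the header set, the `probe` user name, and the function names `build_options_request` and `probe` are assumptions made for illustration.

```python
import socket
import uuid


def build_options_request(target_host, target_port, local_host, local_port,
                          user="probe"):
    """Build a minimal SIP OPTIONS request (illustrative header set only)."""
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]  # RFC 3261 branch magic cookie
    call_id = uuid.uuid4().hex
    tag = uuid.uuid4().hex[:8]
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{local_host}>;tag={tag}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {call_id}@{local_host}",
        "CSeq: 1 OPTIONS",
        f"Contact: <sip:{user}@{local_host}:{local_port}>",
        "Content-Length: 0",
        "",  # the two empty entries produce the blank line
        "",  # that terminates the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(target_host, target_port=5060, timeout=2.0):
    """Send one OPTIONS request; return the status line, or None if no reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on a UDP socket only fixes the peer and lets us read
        # the local address the kernel picked for this route.
        sock.connect((target_host, target_port))
        local_host, local_port = sock.getsockname()
        sock.send(build_options_request(target_host, target_port,
                                        local_host, local_port))
        try:
            data = sock.recv(4096)
        except OSError:  # timeout or ICMP port unreachable
            return None
    return data.split(b"\r\n", 1)[0].decode("ascii", "replace")


# Example (hypothetical address): probe("192.0.2.10") returns a status line
# such as a "SIP/2.0 ..." response, or None when nothing answers in time.
```

Running `probe` in a loop every few seconds approximates the monitoring load described above; a run of `None` results would correspond to the point where Asterisk stops answering.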

Scenario A: Static configuration for second peer (outgoing registration): works fine

Scenario B: Realtime configuration (rtcachefriends=yes) for the second peer: after some time Asterisk stops responding to the monitoring requests, outgoing registration stops, and Recv-Q for UDP 5060 fills up, but incoming registration still works.
"core show locks" constantly shows one lock (see attachment).
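
The growing Recv-Q can also be watched programmatically. The sketch below parses the text of `/proc/net/udp` on Linux and returns the receive-queue size in bytes for every IPv4 UDP socket bound to a given port. The function name `udp_recv_queue` is hypothetical, and the sketch assumes the usual column layout in which the fifth whitespace-separated field is `tx_queue:rx_queue` in hexadecimal; IPv6 sockets (listed in `/proc/net/udp6`) are not covered.

```python
def udp_recv_queue(proc_net_udp_text, port):
    """Return rx_queue sizes (bytes) for IPv4 UDP sockets bound to `port`.

    `proc_net_udp_text` is the full content of /proc/net/udp.
    """
    sizes = []
    for line in proc_net_udp_text.splitlines():
        fields = line.split()
        # Data rows start with a slot number such as "447:"; the header
        # row starts with "sl" and is skipped by this check.
        if len(fields) < 5 or not fields[0].endswith(":"):
            continue
        try:
            # local_address is HEXIP:HEXPORT
            local_port = int(fields[1].rsplit(":", 1)[1], 16)
            # the fifth field is tx_queue:rx_queue, both hexadecimal
            rx_queue = int(fields[4].split(":")[1], 16)
        except (IndexError, ValueError):
            continue  # ignore rows that do not match the assumed layout
        if local_port == port:
            sizes.append(rx_queue)
    return sizes


# Example usage on a Linux host:
#     with open("/proc/net/udp") as f:
#         print(udp_recv_queue(f.read(), 5060))
```

A value that keeps rising while OPTIONS requests go unanswered would be consistent with the SIP socket no longer being read, which is the symptom reported for Scenario B.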

Phone calls from the first peer do work. "sip reload" never finishes. The "core show locks" output is in the second attachment.

Comments:By: Asterisk Team (asteriskteam) 2015-09-10 09:34:38.309-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Stefan Tichy (st) 2015-09-10 09:39:33.925-0500

Output from "core show locks" after the SIP OPTIONS responses had stopped.

By: Stefan Tichy (st) 2015-09-10 09:41:40.895-0500

Tried "sip reload" - another lock showed up

By: Rusty Newton (rnewton) 2015-09-11 08:33:04.426-0500

[~st] can you provide the configuration and describe how to set it up to reproduce the scenario where the locks occur?



By: Stefan Tichy (st) 2015-09-14 07:57:56.193-0500

Sorry for the delay, but several attempts were necessary until I could reproduce the problem with a small configuration. There is at least one error in it: the peer that registers to another Asterisk server has a mailbox configuration. It was a mistake in the original configuration, and it seems to be necessary to reproduce the problem.

The names for hosts, users and secrets just have dummy values. With incoming and outgoing registration and incoming OPTIONS requests, the problem should show up within a few hours (at most 4 or 5 in my tests, sometimes after only a few minutes).

By: Stefan Tichy (st) 2015-09-14 08:00:04.651-0500

tar.gz containing asterisk config and sql statements

By: Stefan Tichy (st) 2015-09-19 09:48:17.149-0500

Locked again - it showed up when dialing from an IAX channel to SIP/TLS.

By: Rusty Newton (rnewton) 2015-10-16 16:44:54.245-0500

Reattaching lock output as .txt for accessibility.

By: Rusty Newton (rnewton) 2015-10-16 17:14:06.967-0500

Stefan - I thought we had a debug log on here showing the traffic right up until the lock.

Can you reproduce the lock and post a debug log that includes "sip set debug on" output?

Preferably it has verbose and debug, both turned up. We don't need everything, just the last several minutes before the lock happens and a few minutes into it.

By: Rusty Newton (rnewton) 2015-10-16 17:17:45.916-0500

Oh, for that same debug log it would be great to have the corresponding "core show locks" output and a backtrace if possible of course.

By: Stefan Tichy (st) 2015-10-26 05:59:18.749-0500

I had to install a test system on a smaller box, because the server is in use for other tests. There seem to be other problems, but so far no deadlock. Asterisk causes high CPU load with just two peers. When more information is available I will upload the files.

By: Rusty Newton (rnewton) 2015-10-30 17:19:25.979-0500

Okay. Also if you can demonstrate how to reproduce the high CPU load issue with two peers... open a separate issue for that. We can link the two as related.

By: Asterisk Team (asteriskteam) 2015-11-14 12:00:22.271-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened, please create a new issue instead. If the new issue is related to this one, a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines