Summary: ASTERISK-22750: SIP TLS calls stop working after a period of no SIP TLS calls to a destination
Reporter: Dwayne Hubbard (dwayne)
Labels:
Date Opened: 2013-10-23 14:58:10
Date Closed:
Priority: Major
Regression?:
Status: Open/New
Components: Channels/chan_sip/TCP-TLS
Versions: SVN 1.8.23.1 13.18.4
Frequency of Occurrence: Frequent
Related Issues:
Environment: Asterisk 1.8.23.1 CentOS 6.4 x86_64 SIP TLS / SRTP
Attachments: (0) dw-asterisk-1.8.23.1-sip-tls.patch
(1) dw-asterisk-trunk-r401662-sip-tls.patch
Description: SIP TLS/SRTP calls to a SIP TLS destination set up a tcptls connection to that destination, which is visible with the Asterisk CLI command 'sip show tcp'.  Calls to the destination work until there is a period (~30 minutes) of no activity toward it, at which point the tcptls _sip_tcp_helper_thread function becomes blocked in the ast_poll() function with a timeout of -1 (wait indefinitely).  Once this happens, SIP TLS calls to that destination do not succeed until one of the following occurs:

 1)  Asterisk is restarted
 2)  The chan_sip.so module is reloaded
 3)  An 'SSL_shutdown failed: 5' ERROR occurs

The provided patch changes the _sip_tcp_helper_thread function timeout to 10 seconds.  If the ast_poll() function returns 0 (timeout) AND the tcptls AO2 reference count is greater than 2, then continue is called to return to the ast_poll() function for another timeout period.  If the ast_poll() function returns 0 (timeout) AND the tcptls AO2 reference count is 2 (or less), then the tcptls session is destroyed.
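
The timeout rule described above can be sketched as follows. This is a minimal standalone illustration, not the code from the attached patches: it uses plain poll() in place of ast_poll(), a plain integer in place of the AO2 reference count, and the names idle_action, decide_after_poll and wait_once are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical outcome codes for one pass of the helper-thread loop. */
enum idle_action { KEEP_WAITING, DESTROY_SESSION, DATA_READY, POLL_ERROR };

/*
 * Apply the rule from the description to a poll() result:
 *   result == 0 (timeout) and refcount >  2  -> wait for another period
 *   result == 0 (timeout) and refcount <= 2  -> destroy the session
 * The refcount argument stands in for the tcptls AO2 reference count.
 */
static enum idle_action decide_after_poll(int poll_result, int refcount)
{
    if (poll_result < 0) {
        return POLL_ERROR;
    }
    if (poll_result > 0) {
        return DATA_READY;
    }
    return refcount > 2 ? KEEP_WAITING : DESTROY_SESSION;
}

/* Wait on fd for at most timeout_ms (instead of -1, which blocks forever). */
static enum idle_action wait_once(int fd, int timeout_ms, int refcount)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    assert(fd >= 0);
    return decide_after_poll(poll(&pfd, 1, timeout_ms), refcount);
}
```

In a helper loop, a caller would invoke wait_once() with a timeout of 10000 ms, loop again on KEEP_WAITING, and tear the session down on DESTROY_SESSION. As the first comment below notes, deciding on a reference count is a workaround rather than a proper fix.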
Comments:
By: Dwayne Hubbard (dwayne) 2013-10-23 15:04:33.249-0500

Change the _sip_tcp_helper_thread timeout period from -1 to 10 seconds so the tcptls session does not become stale after a period of inactivity.

This (dw-asterisk-1.8.23.1-sip-tls.patch) is not a proper fix because, as pointed out by Mark Michelson on IRC, "In general, if you're having to check refcounts in order to know how to proceed, things are being done in a suboptimal way."

This patch at least provides a workaround until the proper solution is available.

By: Dwayne Hubbard (dwayne) 2013-10-23 15:09:57.090-0500

Add the trunk r401662 equivalent of the dw-asterisk-1.8.23.1-sip-tls.patch

By: Rusty Newton (rnewton) 2013-10-31 09:46:15.044-0500

Dwayne, if you can post some Asterisk full logs, with verbose, debug, sip debug, at least showing the flow of things, that may help some that look into this issue. Thanks!

By: Dwayne Hubbard (dwayne) 2013-11-10 08:18:04.914-0600

Rusty,
 OK, I will add the requested information this week.

By: Alexander Traud (traud) 2020-11-03 03:27:39.846-0600

While investigating the remaining SDES-sRTP related issues, I tried to reproduce this one here. I was not able to do so with Asterisk 13.37, Ubuntu 20.10, and chan_sip. Do you as one of the watchers of this issue still face this? If yes:
A) Before you call the first time, do you have a TLS client created by a registration {{sip show registry}}? Or did the very first call create that TLS client?
B) In Wireshark, do you see any packets within those 30 minutes when you filter for {{tcp.port == 5061}}?
C) Could this be related to your firewall? Some firewalls close the external port for unused TCP connections early; they do not wait the 7440 seconds which [are recommended…|https://stackoverflow.com/a/30386134]

In that latter case, if you cannot change your firewall (through a setting, port forwarding, or replacing it with another product), the SIP channel driver ‘chan_sip’ does not offer any means to keep alive a TCP/TLS client transport connection. The various configuration options apply only when Asterisk is the server side. However, you can work around such a firewall by changing a system-wide default: {{sudo sysctl -w net.ipv4.tcp_keepalive_time=295}}
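
For illustration only, the per-socket counterpart of that system-wide setting on Linux looks roughly like the sketch below. It is a generic example and not chan_sip code; the function name enable_keepalive is hypothetical. The sysctl value only affects sockets that have SO_KEEPALIVE enabled, and TCP_KEEPIDLE overrides it for a single socket.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable TCP keep-alive probes on one socket and set the idle time (in
 * seconds) before the first probe is sent. TCP_KEEPIDLE is Linux-specific.
 * Returns 0 on success, -1 if either setsockopt() call fails.
 */
static int enable_keepalive(int fd, int idle_seconds)
{
    int on = 1;

    assert(idle_seconds > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        return -1;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_seconds, sizeof(idle_seconds)) < 0) {
        return -1;
    }
    return 0;
}
```

Calling enable_keepalive(fd, 295) on a connected client socket would make the kernel start probing after 295 seconds of idle time, the same value as in the sysctl command above.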

Anyway, that is just a guess, and only for case C. If you still face this issue, please reply, because I have some spare time right now to look into it.