Summary: ASTERISK-25364: [patch] Issue a TCP connection (kernel) and thread of asterisk is not released
Reporter: Hiroaki Komatsu (Hiroaki Komatsu)
Labels:
Date Opened: 2015-09-01 00:36:34
Date Closed: 2015-12-11 12:10:57.000-0600
Priority: Major
Regression? No
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 11.2.0 11.6.0 11.20.0 13.5.0 13.6.0
Frequency of Occurrence:
Related Issues:
Environment: ・CentOS 6.5 (kernel-2.6.32-431.3.1.el6.x86_64) ・certified-asterisk-11.2-cert1
Attachments: ( 0) sequence.ppt
( 1) sysctl.txt
( 2) tcp_connection_and_threads.txt
( 3) tcp_keep_alive_on_service.diff
( 4) tcp_keepalive.patch
Description: I ran TLS communication through Asterisk over Wi-Fi and 3G networks. During this, a problem occurred in which unused TCP connections and threads were not released.
Specifically, resources on the Asterisk host remained unreleased in the following state.
 
・Commands executed
 # netstat -a | grep https
 → A TCP connection is present (kernel).
 # asterisk -rx "core show tcp"
 → No TCP connection is present (Asterisk).
 # asterisk -rx "core show threads" | grep ast_tcptls_server_root
 → The thread still exists (Asterisk).
 
The output of each command is attached (Attached: tcp_connection_and_threads.txt).
I then checked the following points.
 
・Trigger for the event
 → The TCP/TLS connection sequence was interrupted partway for some reason (※).
   (Attached: sequence.ppt)
   (※) After the TCP three-way handshake, the ClientHello message
     from the terminal did not reach Asterisk.
  
・Whether TCP keepalive probes are sent
 → No TCP keepalive was sent.
   The kernel configuration is attached. (Attached: sysctl.txt)
  
Based on these results, we focused on the point at which Asterisk registers the TCP connection and the point at which it enables TCP keepalive. We traced the source code and confirmed the behavior on a running system.
From this, I believe the cause of the problem is as follows.

・Asterisk appears to perform both "register the TCP connection" and "enable TCP keepalive" only after connection establishment.
 
 Therefore, if the TCP/TLS connection sequence is interrupted partway,
 the Asterisk thread stays in a wait state and TCP keepalive is never enabled.

As a fix, I think TCP keepalive should be enabled immediately after
the TCP/TLS socket is created. What do you think? (Attached: keepalive.patch)
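
As a rough illustration of the ordering proposed above, the following Python sketch enables keepalive on an accepted socket before any blocking handshake read. It is neither the attached patch nor Asterisk's C code. The names `enable_keepalive` and `accept_with_keepalive` and the timer values are assumptions made for the example, and the per-socket timer options are set only where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive for sock; the timer values are illustrative only."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timers are platform-specific, so set only those that exist.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


def accept_with_keepalive(listener):
    """Accept one client and enable keepalive before any handshake read."""
    conn, addr = listener.accept()
    # Enabled first, so a peer that stalls before the TLS handshake is still probed.
    enable_keepalive(conn)
    return conn, addr
```

With keepalive enabled at this point, a peer that disappears after the three-way handshake should eventually cause the blocked read to fail, which gives the waiting thread a chance to exit.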
Comments:
By: Asterisk Team (asteriskteam) 2015-09-01 00:36:36.610-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Rusty Newton (rnewton) 2015-09-03 18:13:10.084-0500

[~Hiroaki Komatsu] You set the affects version field to: 11.2.0

Can you verify whether this affects the latest 11 release (or even better the latest head of the Git branch) ?

Based on the nature of the fix I will presume that it does. However you understand the issue better than I, so I wanted to ask you.

Thank you!

By: Hiroaki Komatsu (Hiroaki Komatsu) 2015-09-04 01:02:51.442-0500

I downloaded and tried the following versions from
http://www.asterisk.org/downloads/asterisk/all-asterisk-versions
・asterisk-13.5.0
・certified-asterisk-11.6-cert11

I created a pseudo client (※) and used it to connect to the TLS port.
(※) A tool that performs only the TCP three-way handshake.
As a result, the same event occurred in both versions.
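
The tool used for this test is not attached to the issue. A minimal Python sketch of a client with the same behavior might look like the following; the function name `open_idle_connection` is hypothetical, and the host and port are whatever the TLS listener under test uses.

```python
import socket


def open_idle_connection(host, port, timeout=5.0):
    """Complete the TCP three-way handshake, then send nothing."""
    # create_connection() returns once the kernel has finished the handshake.
    sock = socket.create_connection((host, port), timeout=timeout)
    # No TLS ClientHello is written; the caller simply keeps the socket open.
    return sock
```

Holding the returned socket open without writing to it leaves the server waiting for a ClientHello that never arrives, which approximates the interrupted sequence described in the report.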


By: Rusty Newton (rnewton) 2015-09-04 09:51:46.780-0500

Thank you for the additional testing and verification!

By: Hiroaki Komatsu (Hiroaki Komatsu) 2015-10-06 04:37:52.411-0500

I would like your opinion on the following points.
・If there is a mistake in the cause we have identified, please point it out.
・Whether any other processing logic could also cause this event.

By: Jonathan Rose (jrose) 2015-11-11 18:13:09.676-0600

I don't think the accept_fd on the sip_tcp_desc and sip_tls_desc structs are using the same file descriptor that was already being used to set the tcptls session file descriptor.

It might be the case that we need to be setting the keep alive option on both the sip_tcp_desc accept_fd and the associated session file descriptor, but I'm not sure. I don't think that this is simply a matter of when we are setting the option as the issue suggests though because the file descriptor numbers are definitely different.

Right now I'm wondering if the errors you are running into are being solved by removing the keep alive option from the session file descriptor or if they are being solved by adding the keep alive option to the session args accept_fd.
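
To make the distinction between the two descriptors concrete, here is a small Python loopback sketch (not Asterisk code). It treats the listening socket as the counterpart of `accept_fd` and the accepted socket as the counterpart of the session descriptor, which is an assumption drawn from the comment above. The helper `keepalive_state` is hypothetical. The sketch reads the option back from each socket rather than assuming how the platform propagates it.

```python
import socket


def keepalive_state(listener, conn):
    """Report the fd number and SO_KEEPALIVE state of both sockets."""
    def is_on(s):
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    return {
        "listener_fd": listener.fileno(),
        "session_fd": conn.fileno(),
        "listener_keepalive": is_on(listener),
        "session_keepalive": is_on(conn),
    }


# Loopback demonstration: one listening socket, one accepted session socket.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.create_connection(listener.getsockname())
conn, _ = listener.accept()

before = keepalive_state(listener, conn)
# Enable keepalive on the session socket only; the listener is left untouched.
conn.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
after = keepalive_state(listener, conn)
print(before)
print(after)

for s in (client, conn, listener):
    s.close()
```

The two descriptor numbers differ, and setting the option on one socket after `accept()` does not change the other, so which socket a patch touches determines which connections are probed.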

By: Jonathan Rose (jrose) 2015-11-17 13:22:37.680-0600

Mr. Komatsu, while I haven't been able to fully reproduce your problem, I've spent a good deal of time studying the issue you saw and have a patch that I hope does solve your problem.  Would you be able to test and report if it works?

I've been studying this issue and my effort to reproduce the exact set of problems you experienced under lab conditions didn't get the same results. However, the patch you posted seems correct in that it sets the TCP keep alive socket option on a TCP socket that previously wasn't using keep alive. The patch seems to assume that this is the same socket that was in use for active sessions and removes the keep alive option from another socket though.

I've made a small change to the patch -- it no longer removes the TCP keep alive option from active sessions. Since I wasn't fully able to reproduce the issue, I request that you all review the new patch and make sure it solves your problem. There is a small chance that it was actually the presence of the keep alive option on the active session socket that caused your problem instead of the lack of a keep alive option on the TCP server socket and I want to be certain that the revised patch fixes your issue before I push it into the repository.

See:
https://issues.asterisk.org/jira/secure/attachment/53224/tcp_keep_alive_on_service.diff

By: Hiroaki Komatsu (Hiroaki Komatsu) 2015-11-20 01:04:10.807-0600

Dear Jonathan Rose,
Thank you for looking into this. I am very grateful.
I will apply your patch and check whether the problem still occurs.
As soon as that is confirmed, I will report back.


By: Jonathan Rose (jrose) 2015-12-09 11:55:54.056-0600

Greetings Mr Komatsu, I'm just checking in to see if you all have had an opportunity to test the patch yet.

By: Hiroaki Komatsu (Hiroaki Komatsu) 2015-12-09 19:19:13.965-0600

Dear Jonathan,
I'm sorry for my late reply.
With your patch applied, I have confirmed that the problem is resolved. Thank you.