[Home]

Summary:ASTERISK-17852: [patch] ast_tcptls_server_start fails second attempt to bind
Reporter:Walter Doekes (wdoekes)Labels:
Date Opened:2011-05-13 07:14:30Date Closed:2011-05-23 11:28:17
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Core/Netsock
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) issue_19289.diff
( 1) issue19289_delay_old_address_setting_tcptls_2.patch
( 2) issue19289_delay_old_address_setting_tcptls.patch
Description:Hi,

just now I had to kill -9 asterisk in a nightly job because of a deadlock. (Infinite trylock restart situation to be exact.) I suspected init.d/asterisk stop wouldn't complete and I was right.. kill -9 solved that.

But, unfortunately, it looked like SO_REUSEADDR didn't do its job properlywhen immediately starting asterisk after that. (I don't know how that is possible, but it was.) I missed out on my two TCP listening sockets: manager at 5038 and tcp sip at 5060.

This is when I found out that doing 'manager reload' or 'sip reload' manually at a later point in time doesn't work.

Why is that? Because the ast_tcptls_server_start() thinks that "nothing has changed".

****** STEPS TO REPRODUCE ******

# /etc/init.d/asterisk stop ; sleep 1 ; nc -l -p 5038 & nc -l -p 5060 & /etc/init.d/asterisk start
Stopping Asterisk PBX: asterisk.
[1] 22151
[2] 22152
Starting Asterisk PBX: asterisk.

# grep ERR /var/log/asterisk/debug.log | tail -n3
[2011-05-13 13:52:49] ERROR[22161] tcptls.c: Unable to bind AMI server to 0.0.0.0:5038: Address already in use
[2011-05-13 13:52:49] ERROR[22161] tcptls.c: Unable to bind SIP TCP server to 0.0.0.0:5060: Address already in use
[2011-05-13 13:52:49] ERROR[22161] chan_sip.c: SIP TCP Server start failed. Not listening on TCP socket.

# netstat -apnAinet | grep asterisk
udp        0      0 0.0.0.0:5060            0.0.0.0:*                           22161/asterisk  

# kill %1 ; kill %2

# asterisk -rx 'manager reload' ; asterisk -rx 'sip reload'

# netstat -apnAinet | grep asterisk
udp        0      0 0.0.0.0:5060            0.0.0.0:*                           22161/asterisk  

(while the expected output should be this)

# netstat -apnAinet | grep asterisk
tcp        0      0 0.0.0.0:5060            0.0.0.0:*               LISTEN      22161/asterisk  
tcp        0      0 0.0.0.0:5038            0.0.0.0:*               LISTEN      22161/asterisk  
udp        0      0 0.0.0.0:5060            0.0.0.0:*                           22161/asterisk  


****** ADDITIONAL INFORMATION ******

I changed ast_tcptls_server_start and ast_tcptls_client_start to only set sin_family=0 at the beginning of the function. First if the connection/listening-socket is set up completely do I overwrite old_address with local_address (or remote_address).

With the attached patch, manager reload and sip reload work like expected.

Regards,
Walter Doekes
OSSO B.V.
Comments:By: David Vossel (dvossel) 2011-05-20 11:27:06

There are already goto escapes in those function to handle errors.  It seems like the old_address should just be cleared there.  Does my patch do what you'd expect as well?

By: Walter Doekes (wdoekes) 2011-05-22 09:18:16

18:42 <@The_Boy_Wonder> wdoekes2: issue ASTERISK-17852  does it look like my change will work?
22:56 < wdoekes2> The_Boy_Wonder: for my particular case it would work indeed (failure on bind). if you would manage to get an error on the earlier socket call, it wouldn't. (I don't know if there are common socket errors, apart from running out of fd's)
23:11 <@The_Boy_Wonder> wdoekes2: ah alright. We'll go with your version
23:34 < wdoekes2> The_Boy_Wonder: we might want to add a desc->master = AST_PTHREADT_NULL after the if().. otherwise that code could be run several times and I'm not sure if it's wise to pthread_join() a thread that is already gone
23:35 < wdoekes2> or does someone else clear ->master?
23:41 <@The_Boy_Wonder> no clue


Ok. I checked, and no one clears ->master, but they should. In the unpatched version, you can pthread_cancel a thread multiple times by binding to (configuring) different used ports after it was bound to a valid one once. In the patched version, it's even easier, but the problem already existed.

Adding a desc->master = AST_PTHREADT_NULL solves this. (It probably works fine without, until that one moment when you actually do have a valid thread handle by accident and you shoot down a completely wrong thread. Note that there are other places in the source with a similar lack of AST_PTHREADT_NULL setting.)

New patch (issue19289_delay_old_address_setting_tcptls_2.patch) is against trunk, btw.

By: Digium Subversion (svnbot) 2011-05-23 09:31:05

Repository: asterisk
Revision: 320271

U   branches/1.6.2/main/tcptls.c

------------------------------------------------------------------------
r320271 | dvossel | 2011-05-23 09:31:05 -0500 (Mon, 23 May 2011) | 8 lines

Fixes issue with ast_tcptls_server_start failing on second attempt to bind.

(closes issue ASTERISK-17852)
Reported by: wdoekes
Patches:
     issue19289_delay_old_address_setting_tcptls.patch uploaded by wdoekes (license 717)


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=320271

By: Digium Subversion (svnbot) 2011-05-23 09:32:20

Repository: asterisk
Revision: 320338

_U  branches/1.8/
U   branches/1.8/main/tcptls.c

------------------------------------------------------------------------
r320338 | dvossel | 2011-05-23 09:32:20 -0500 (Mon, 23 May 2011) | 14 lines

Merged revisions 320271 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

........
 r320271 | dvossel | 2011-05-20 16:24:48 -0500 (Fri, 20 May 2011) | 8 lines
 
 Fixes issue with ast_tcptls_server_start failing on second attempt to bind.
 
 (closes issue ASTERISK-17852)
 Reported by: wdoekes
 Patches:
       issue19289_delay_old_address_setting_tcptls.patch uploaded by wdoekes (license 717)
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=320338

By: Digium Subversion (svnbot) 2011-05-23 09:32:25

Repository: asterisk
Revision: 320340

_U  trunk/
U   trunk/main/tcptls.c

------------------------------------------------------------------------
r320340 | dvossel | 2011-05-23 09:32:24 -0500 (Mon, 23 May 2011) | 21 lines

Merged revisions 320338 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

................
 r320338 | dvossel | 2011-05-20 16:39:36 -0500 (Fri, 20 May 2011) | 14 lines
 
 Merged revisions 320271 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.6.2
 
 ........
   r320271 | dvossel | 2011-05-20 16:24:48 -0500 (Fri, 20 May 2011) | 8 lines
   
   Fixes issue with ast_tcptls_server_start failing on second attempt to bind.
   
   (closes issue ASTERISK-17852)
   Reported by: wdoekes
   Patches:
         issue19289_delay_old_address_setting_tcptls.patch uploaded by wdoekes (license 717)
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=320340

By: Digium Subversion (svnbot) 2011-05-23 11:15:19

Repository: asterisk
Revision: 320562

U   branches/1.6.2/main/tcptls.c

------------------------------------------------------------------------
r320562 | dvossel | 2011-05-23 11:15:18 -0500 (Mon, 23 May 2011) | 9 lines

Adds missing part to the ast_tcptls_server_start fails second attempt to bind patch.

(closes issue ASTERISK-17852)
Reported by: wdoekes
Patches:
     issue19289_delay_old_address_setting_tcptls_2.patch uploaded by wdoekes (license 717)



------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=320562

By: Digium Subversion (svnbot) 2011-05-23 11:18:35

Repository: asterisk
Revision: 320568

_U  branches/1.8/
U   branches/1.8/main/tcptls.c

------------------------------------------------------------------------
r320568 | dvossel | 2011-05-23 11:18:35 -0500 (Mon, 23 May 2011) | 14 lines

Merged revisions 320562 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

........
 r320562 | dvossel | 2011-05-23 11:15:18 -0500 (Mon, 23 May 2011) | 9 lines
 
 Adds missing part to the ast_tcptls_server_start fails second attempt to bind patch.
 
 (closes issue ASTERISK-17852)
 Reported by: wdoekes
 Patches:
       issue19289_delay_old_address_setting_tcptls_2.patch uploaded by wdoekes (license 717)
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=320568

By: Digium Subversion (svnbot) 2011-05-23 11:28:17

Repository: asterisk
Revision: 320606

_U  trunk/
U   trunk/main/tcptls.c

------------------------------------------------------------------------
r320606 | dvossel | 2011-05-23 11:28:17 -0500 (Mon, 23 May 2011) | 21 lines

Merged revisions 320568 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

................
 r320568 | dvossel | 2011-05-23 11:18:33 -0500 (Mon, 23 May 2011) | 14 lines
 
 Merged revisions 320562 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.6.2
 
 ........
   r320562 | dvossel | 2011-05-23 11:15:18 -0500 (Mon, 23 May 2011) | 9 lines
   
   Adds missing part to the ast_tcptls_server_start fails second attempt to bind patch.
   
   (closes issue ASTERISK-17852)
   Reported by: wdoekes
   Patches:
         issue19289_delay_old_address_setting_tcptls_2.patch uploaded by wdoekes (license 717)
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=320606