Summary: | ASTERISK-17852: [patch] ast_tcptls_server_start fails second attempt to bind | ||
Reporter: | Walter Doekes (wdoekes) | Labels: | |
Date Opened: | 2011-05-13 07:14:30 | Date Closed: | 2011-05-23 11:28:17 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Core/Netsock |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) issue_19289.diff ( 1) issue19289_delay_old_address_setting_tcptls_2.patch ( 2) issue19289_delay_old_address_setting_tcptls.patch | |
Description: | Hi, just now I had to kill -9 asterisk in a nightly job because of a deadlock. (Infinite trylock restart situation to be exact.) I suspected init.d/asterisk stop wouldn't complete and I was right.. kill -9 solved that. But, unfortunately, it looked like SO_REUSEADDR didn't do its job properlywhen immediately starting asterisk after that. (I don't know how that is possible, but it was.) I missed out on my two TCP listening sockets: manager at 5038 and tcp sip at 5060. This is when I found out that doing 'manager reload' or 'sip reload' manually at a later point in time doesn't work. Why is that? Because the ast_tcptls_server_start() thinks that "nothing has changed". ****** STEPS TO REPRODUCE ****** # /etc/init.d/asterisk stop ; sleep 1 ; nc -l -p 5038 & nc -l -p 5060 & /etc/init.d/asterisk start Stopping Asterisk PBX: asterisk. [1] 22151 [2] 22152 Starting Asterisk PBX: asterisk. # grep ERR /var/log/asterisk/debug.log | tail -n3 [2011-05-13 13:52:49] ERROR[22161] tcptls.c: Unable to bind AMI server to 0.0.0.0:5038: Address already in use [2011-05-13 13:52:49] ERROR[22161] tcptls.c: Unable to bind SIP TCP server to 0.0.0.0:5060: Address already in use [2011-05-13 13:52:49] ERROR[22161] chan_sip.c: SIP TCP Server start failed. Not listening on TCP socket. # netstat -apnAinet | grep asterisk udp 0 0 0.0.0.0:5060 0.0.0.0:* 22161/asterisk # kill %1 ; kill %2 # asterisk -rx 'manager reload' ; asterisk -rx 'sip reload' # netstat -apnAinet | grep asterisk udp 0 0 0.0.0.0:5060 0.0.0.0:* 22161/asterisk (while the expected output should be this) # netstat -apnAinet | grep asterisk tcp 0 0 0.0.0.0:5060 0.0.0.0:* LISTEN 22161/asterisk tcp 0 0 0.0.0.0:5038 0.0.0.0:* LISTEN 22161/asterisk udp 0 0 0.0.0.0:5060 0.0.0.0:* 22161/asterisk ****** ADDITIONAL INFORMATION ****** I changed ast_tcptls_server_start and ast_tcptls_client_start to only set sin_family=0 at the beginning of the function. First if the connection/listening-socket is set up completely do I overwrite old_address with local_address (or remote_address). With the attached patch, manager reload and sip reload work like expected. Regards, Walter Doekes OSSO B.V. | ||
Comments: | By: David Vossel (dvossel) 2011-05-20 11:27:06 There are already goto escapes in those function to handle errors. It seems like the old_address should just be cleared there. Does my patch do what you'd expect as well? By: Walter Doekes (wdoekes) 2011-05-22 09:18:16 18:42 <@The_Boy_Wonder> wdoekes2: issue ASTERISK-17852 does it look like my change will work? 22:56 < wdoekes2> The_Boy_Wonder: for my particular case it would work indeed (failure on bind). if you would manage to get an error on the earlier socket call, it wouldn't. (I don't know if there are common socket errors, apart from running out of fd's) 23:11 <@The_Boy_Wonder> wdoekes2: ah alright. We'll go with your version 23:34 < wdoekes2> The_Boy_Wonder: we might want to add a desc->master = AST_PTHREADT_NULL after the if().. otherwise that code could be run several times and I'm not sure if it's wise to pthread_join() a thread that is already gone 23:35 < wdoekes2> or does someone else clear ->master? 23:41 <@The_Boy_Wonder> no clue Ok. I checked, and no one clears ->master, but they should. In the unpatched version, you can pthread_cancel a thread multiple times by binding to (configuring) different used ports after it was bound to a valid one once. In the patched version, it's even easier, but the problem already existed. Adding a desc->master = AST_PTHREADT_NULL solves this. (It probably works fine without, until that one moment when you actually do have a valid thread handle by accident and you shoot down a completely wrong thread. Note that there are other places in the source with a similar lack of AST_PTHREADT_NULL setting.) New patch (issue19289_delay_old_address_setting_tcptls_2.patch) is against trunk, btw. By: Digium Subversion (svnbot) 2011-05-23 09:31:05 Repository: asterisk Revision: 320271 U branches/1.6.2/main/tcptls.c ------------------------------------------------------------------------ r320271 | dvossel | 2011-05-23 09:31:05 -0500 (Mon, 23 May 2011) | 8 lines Fixes issue with ast_tcptls_server_start failing on second attempt to bind. (closes issue ASTERISK-17852) Reported by: wdoekes Patches: issue19289_delay_old_address_setting_tcptls.patch uploaded by wdoekes (license 717) ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=320271 By: Digium Subversion (svnbot) 2011-05-23 09:32:20 Repository: asterisk Revision: 320338 _U branches/1.8/ U branches/1.8/main/tcptls.c ------------------------------------------------------------------------ r320338 | dvossel | 2011-05-23 09:32:20 -0500 (Mon, 23 May 2011) | 14 lines Merged revisions 320271 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2 ........ r320271 | dvossel | 2011-05-20 16:24:48 -0500 (Fri, 20 May 2011) | 8 lines Fixes issue with ast_tcptls_server_start failing on second attempt to bind. (closes issue ASTERISK-17852) Reported by: wdoekes Patches: issue19289_delay_old_address_setting_tcptls.patch uploaded by wdoekes (license 717) ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=320338 By: Digium Subversion (svnbot) 2011-05-23 09:32:25 Repository: asterisk Revision: 320340 _U trunk/ U trunk/main/tcptls.c ------------------------------------------------------------------------ r320340 | dvossel | 2011-05-23 09:32:24 -0500 (Mon, 23 May 2011) | 21 lines Merged revisions 320338 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.8 ................ r320338 | dvossel | 2011-05-20 16:39:36 -0500 (Fri, 20 May 2011) | 14 lines Merged revisions 320271 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2 ........ r320271 | dvossel | 2011-05-20 16:24:48 -0500 (Fri, 20 May 2011) | 8 lines Fixes issue with ast_tcptls_server_start failing on second attempt to bind. (closes issue ASTERISK-17852) Reported by: wdoekes Patches: issue19289_delay_old_address_setting_tcptls.patch uploaded by wdoekes (license 717) ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=320340 By: Digium Subversion (svnbot) 2011-05-23 11:15:19 Repository: asterisk Revision: 320562 U branches/1.6.2/main/tcptls.c ------------------------------------------------------------------------ r320562 | dvossel | 2011-05-23 11:15:18 -0500 (Mon, 23 May 2011) | 9 lines Adds missing part to the ast_tcptls_server_start fails second attempt to bind patch. (closes issue ASTERISK-17852) Reported by: wdoekes Patches: issue19289_delay_old_address_setting_tcptls_2.patch uploaded by wdoekes (license 717) ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=320562 By: Digium Subversion (svnbot) 2011-05-23 11:18:35 Repository: asterisk Revision: 320568 _U branches/1.8/ U branches/1.8/main/tcptls.c ------------------------------------------------------------------------ r320568 | dvossel | 2011-05-23 11:18:35 -0500 (Mon, 23 May 2011) | 14 lines Merged revisions 320562 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2 ........ r320562 | dvossel | 2011-05-23 11:15:18 -0500 (Mon, 23 May 2011) | 9 lines Adds missing part to the ast_tcptls_server_start fails second attempt to bind patch. (closes issue ASTERISK-17852) Reported by: wdoekes Patches: issue19289_delay_old_address_setting_tcptls_2.patch uploaded by wdoekes (license 717) ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=320568 By: Digium Subversion (svnbot) 2011-05-23 11:28:17 Repository: asterisk Revision: 320606 _U trunk/ U trunk/main/tcptls.c ------------------------------------------------------------------------ r320606 | dvossel | 2011-05-23 11:28:17 -0500 (Mon, 23 May 2011) | 21 lines Merged revisions 320568 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.8 ................ r320568 | dvossel | 2011-05-23 11:18:33 -0500 (Mon, 23 May 2011) | 14 lines Merged revisions 320562 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2 ........ r320562 | dvossel | 2011-05-23 11:15:18 -0500 (Mon, 23 May 2011) | 9 lines Adds missing part to the ast_tcptls_server_start fails second attempt to bind patch. (closes issue ASTERISK-17852) Reported by: wdoekes Patches: issue19289_delay_old_address_setting_tcptls_2.patch uploaded by wdoekes (license 717) ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=320606 |