Summary: ASTERISK-22745: chan_sip call setup very slow or fails when STUN server not available
Reporter: Michael Walton (mike@farsouthnet.com)
Labels:
Date Opened: 2013-10-22 02:27:09
Date Closed:
Priority: Major
Regression? No
Status: Open/New
Components: Channels/chan_sip/General, Resources/res_rtp_asterisk
Versions: 12.0.0-beta1, 13.18.4
Frequency of Occurrence: Constant
Related Issues:
is duplicated by ASTERISK-25216 Asterisk periodic hangs. UDP Recv-Q greatly exceeds zero.
is related to ASTERISK-29507 STUN timeout is silently delaying calls
Environment: Ubuntu 10.04
Attachments:
( 0) ASTERISK-22745-gtalk-stun.r402438.patch
( 1) ASTERISK-22745-sip-stun.r402438.patch
Description: Asterisk 12 compiled with chan_pjsip and chan_sip enabled. Call setup to or from a chan_sip peer takes 10 seconds or more. To reproduce:
* Enable icesupport in rtp.conf
* Use an unreachable STUN server address for stunaddr, or disconnect the WAN
* Disable icesupport in sip.conf for a chan_sip peer that does not require STUN, e.g. a local phone
* Dial to or from the phone
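
For illustration, the reproduction steps might correspond to configuration along these lines. This is only a sketch: the STUN address (taken from the 192.0.2.0/24 documentation range, so it should never answer), the peer name "localphone" and the other peer settings are assumptions, not values from the report.

```ini
; rtp.conf
[general]
icesupport=yes
stunaddr=192.0.2.1:3478   ; assumed unreachable address, for illustration only

; sip.conf
[localphone]              ; hypothetical peer name
type=friend
host=dynamic
icesupport=no             ; this peer does not need ICE/STUN
```

With settings like these, the report is that calls to or from the peer are still slow even though icesupport is off for it.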
Comments:
By: Michael Walton (mike@farsouthnet.com) 2013-10-22 02:36:41.364-0500

A suggested simple fix is to pass the chan_sip icesupport config option when calling ast_rtp_instance_new, so that the ICE plumbing can be bypassed for peers that don't need it.

By: Matt Jordan (mjordan) 2013-10-25 10:44:15.629-0500

That works around the problem for peers that don't need ICE or are configured to not use ICE; however, that doesn't actually fix the problem, which is that {{chan_sip}} blocks whenever an operation it depends on blocks.

I'm not sure there is a long-term fix for this in {{chan_sip}}, given its single-threaded architecture.

By: Michael Walton (mike@farsouthnet.com) 2013-10-30 03:18:33.197-0500

I'll create a patch with the "workaround", shall I? It does make chan_sip+ICE more usable by making the ICE switch work properly per-peer.

By: Matt Jordan (mjordan) 2013-10-30 08:20:49.204-0500

I think a workaround for {{chan_sip}} is entirely appropriate. It at least lets you pay the penalty for the behavior on a peer by peer basis.

By: Michael Walton (mike@farsouthnet.com) 2013-11-04 07:18:16.476-0600

This patch fixes chan_sip per-peer (or chan_pjsip per-endpoint) selection of ICE support, using the previously unused RTP property AST_RTP_PROPERTY_STUN to select ICE support hooks in res_rtp_asterisk.

By: Michael Walton (mike@farsouthnet.com) 2013-11-04 07:25:03.735-0600

This patch moves the setup of the ICE session from ast_rtp_new to ast_rtp_prop_set (only when setting AST_RTP_PROPERTY_STUN to true). This doesn't significantly change the order of execution, since both chan_sip and chan_pjsip call ast_rtp_instance_set_prop directly after a successful return from ast_rtp_instance_new. I see chan_gtalk sets AST_RTP_PROPERTY_STUN to 1, so I have attached a separate patch for it that just makes sure the STUN property is set before the RTCP property.
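
To make the idea concrete, here is a minimal, self-contained sketch of deferring ICE setup from the constructor to the property setter. It is not the attached patch and not the real res_rtp_asterisk code: every sketch_* name and the struct layout are hypothetical stand-ins, and the blocking STUN lookup is simulated with a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real Asterisk types. */
enum sketch_rtp_property {
    SKETCH_RTP_PROPERTY_STUN,   /* models AST_RTP_PROPERTY_STUN */
    SKETCH_RTP_PROPERTY_RTCP
};

struct sketch_rtp {
    bool ice_ready;      /* true once the ICE session has been created */
    bool rtcp_enabled;
    int  stun_lookups;   /* counts simulated STUN lookups */
};

/* Simulates the step that can block when the STUN server is unreachable. */
void sketch_ice_create(struct sketch_rtp *rtp)
{
    rtp->stun_lookups++;
    rtp->ice_ready = true;
}

/* Models the constructor after the change: no ICE work happens here. */
void sketch_rtp_new(struct sketch_rtp *rtp)
{
    rtp->ice_ready = false;
    rtp->rtcp_enabled = false;
    rtp->stun_lookups = 0;
}

/* Models the property setter: ICE is created only when STUN is switched on. */
void sketch_rtp_prop_set(struct sketch_rtp *rtp,
                         enum sketch_rtp_property prop, int value)
{
    switch (prop) {
    case SKETCH_RTP_PROPERTY_STUN:
        if (value && !rtp->ice_ready) {
            sketch_ice_create(rtp);
        }
        break;
    case SKETCH_RTP_PROPERTY_RTCP:
        rtp->rtcp_enabled = (value != 0);
        break;
    }
}
```

In this model, a peer whose channel driver never sets the STUN property to true never reaches sketch_ice_create, so it never pays for the lookup; a peer that does set it pays exactly once.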

By: Michael Walton (mike@farsouthnet.com) 2013-11-04 07:25:25.507-0600

Setup STUN before RTCP

By: Steve Davies (one47) 2015-08-05 05:10:58.898-0500

Hi, is there any reason why this patch seems to have gone quiescent?

IMHO it is part of a necessary solution to a real problem, and I plan to try to create an Asterisk 11.18 backport of it.

In fact it does not go far enough: even patched in this way, a broken STUN server will adversely affect a dial command of the form:
   Dial(SIP/noiceneeded&SIP/iuseice)
To fix it more completely, the triggering of the STUN (and TURN) lookup would need to be deferred to the point where the new channel that owns the RTP is running in its own thread (or something like that anyway; I need to research it more closely).


By: Joshua C. Colp (jcolp) 2015-08-05 05:34:26.819-0500

Deferring it is called trickle ICE, which gathers candidates in the background and then trickles them in the negotiation process. Adding support for this requires API changes, chan_sip changes, and pjnath changes. It does not have trickle ICE support. This means that when you are going to send the SDP you need all the candidates.

As for why this hasn't gotten anywhere yet: everything these days goes through Gerrit. Until someone takes ownership and takes the patch through the Gerrit process, this is waiting.

By: Steve Davies (one47) 2015-08-05 05:49:13.540-0500

Thanks Josh - I really struggle with Gerrit, but will see what I can do. Perhaps I just need more practice :)

Sadly this change is near code that has been refactored between 11, 12 and 13, so it won't be a simple cherry-pick either.

I understand trickle ICE, and that Asterisk is not ready for that yet, but I thought a compromise might be to decouple the STUN/TURN setup from the channel+RTP creation code path, so that a Dial or similar does not block on repeated, serialised failed STUN lookup attempts, which add up to multiples of the 9-second STUN timeout before a call is placed.
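
As a rough illustration of how the serialised timeouts add up, the sketch below totals the blocking time for one Dial with several peers. It is a simplified model, not Asterisk code: the sketch_* names are hypothetical, it assumes one failed lookup per peer, and it takes the 9-second figure from the comment above as the per-attempt timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed per-attempt timeout, taken from the 9-second figure quoted above. */
#define SKETCH_STUN_TIMEOUT_SECS 9

struct sketch_peer {
    const char *name;
    bool uses_ice;
};

/*
 * Seconds spent blocked by one dialing thread when the STUN server never
 * answers and lookups run one after another.
 * per_peer_switch == false models the unpatched behaviour, where every peer
 * triggers a lookup; true models the per-peer ICE workaround.
 */
int sketch_dial_delay_secs(const struct sketch_peer *peers, size_t count,
                           bool per_peer_switch)
{
    int total = 0;

    for (size_t i = 0; i < count; i++) {
        if (!per_peer_switch || peers[i].uses_ice) {
            total += SKETCH_STUN_TIMEOUT_SECS;
        }
    }
    return total;
}
```

Under these assumptions, Dial(SIP/noiceneeded&SIP/iuseice) blocks for 18 seconds without the per-peer switch and still for 9 seconds with it, because the ICE peer's lookup runs on the same code path that creates both channels.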


By: Joshua C. Colp (jcolp) 2015-08-05 05:52:23.292-0500

That is one way, but it would mean changing the threading model of either dialing or chan_sip to some degree, both of which are dangerous avenues to go down. There's also some Gerrit usage info at https://wiki.asterisk.org/wiki/display/AST/Gerrit+Usage