
Summary: ASTERISK-25317: asterisk sends too many stun requests
Reporter: Stefan Engström (StefanEng86)
Labels:
Date Opened: 2015-08-12 04:42:53
Date Closed: 2016-01-05 13:40:55.000-0600
Priority: Major
Regression?
Status: Closed/Complete
Components: Channels/chan_sip/WebSocket, Resources/res_rtp_asterisk
Versions: 13.5.0
Frequency of Occurrence: Constant
Related Issues:
Environment: Fedora 20 - x86_64, asterisk 13.5, pjprojects 2.4, sipml5 chrome
Attachments: (0) pjnathticket.txt
(1) wiresharkstunburst.PNG
Description: The use case is asterisk dialing a webrtc chan_sip peer who uses chrome

After upgrading to asterisk 13.5 and pjprojects 2.4, I still have an issue with asterisk sending too many stun requests (10-100 identical ones within a millisecond) after receiving a "SDP Answer" from a chrome webrtc client.

I do not know if this causes any real issues noticeable to the end user, but there are certainly unnecessary network spikes.

Note that this does _not_ happen when calling the same peer on a firefox browser.

This issue happens 100% of the time for me and only the magnitude of the stun-spam varies. Can anyone reproduce it?

Comments:

By: Asterisk Team (asteriskteam) 2015-08-12 04:42:54.718-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Rusty Newton (rnewton) 2015-09-16 17:59:54.832-0500

In reproduction I see the same thing. I'm not an expert on STUN so I'm going to see if someone else can look into it now that I verified it is reproducible.

I used Chrome 45.0.2454.85, the sipml5 live demo and Asterisk 13.5.0 with chan_sip.

By: Stefan Engström (StefanEng86) 2015-10-31 13:38:40.114-0500

It seems the magnitude of the stun-spam is proportional to the answer time, i.e. the time between "send INVITE" and "receive OK".

My understanding of the process is:

Before the chrome client answers, it sends stun requests at a 20 ms interval, which we pass to pjnath by calling pj_ice_sess_on_rx_pkt from __rtp_recvfrom. Pjnath produces stun responses, which are passed to asterisk through ast_rtp_on_ice_tx_pkt and are then sent back to the chrome client, great,

but when we receive a SIP OK for the INVITE, we do <something> which makes pjnath call ast_rtp_on_ice_tx_pkt 10-200 times in rapid succession, which makes asterisk send those 10-200 stun requests. I still have no idea why pjnath has piled up that many stun requests, but it certainly seems related to the above process.
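
A rough sanity check of the numbers above, as a minimal sketch only. It assumes one stun request arrives every 20 ms while the call is unanswered and that each request leaves exactly one pending item behind (an assumption at this point in the thread, not something confirmed). The function name is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: estimate how many stun requests pile up while the
 * call is ringing, ASSUMING one incoming request per interval_ms and that
 * each one is kept until the answer arrives. */
int estimated_burst_size(int answer_time_ms, int interval_ms)
{
    assert(interval_ms > 0);
    assert(answer_time_ms >= 0);
    return answer_time_ms / interval_ms;
}
```

With a 20 ms interval, answer times between 200 ms and 4 s would give 10 to 200 requests under these assumptions, which is in line with the observed range.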




By: Stefan Engström (StefanEng86) 2015-11-03 05:16:23.703-0600

It seems that in pjnath/src/pjnath/ice_session.c, in on_stun_rx_request (which is called many times between offer sent and answer received, since chrome sends stun requests in that time interval), there is code which populates a list called ice->early_check:
if (ice->rcand_cnt == 0) {
       /* We don't have answer yet, so keep this request for later */
       LOG4((ice->obj_name, "Received an early check for comp %d Ignoring it!!!",
             rcheck->comp_id));
       pj_list_push_back(&ice->early_check, rcheck);
} else ...


When we receive the remote candidates (chrome's), we call pj_ice_sess_start_check,
which creates a new stun request for each entry in the list early_check, which are all sent at once:

...
/* First, perform all pending triggered checks, simultaneously. */
   rcheck = ice->early_check.next;
   while (rcheck != &ice->early_check) {
       LOG4((ice->obj_name,
             "Performing delayed triggerred check for component %d",
             rcheck->comp_id));
...
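
The two excerpts above can be modelled in a few lines to show why the burst grows with the number of early requests. This is a toy sketch, not pjnath code: the struct, the function names and the fixed-size array are all hypothetical stand-ins, and only the "queue while rcand_cnt == 0, then send one request per queued entry" behaviour is taken from the excerpts.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EARLY_CHECKS 256   /* arbitrary cap for this sketch */

/* Simplified stand-in for the ICE session state; NOT the pjnath struct. */
struct toy_ice_session {
    int rcand_cnt;                       /* remote candidates known so far */
    int early_check[MAX_EARLY_CHECKS];   /* component ids of queued checks */
    size_t early_cnt;                    /* number of queued checks */
};

/* Models the on_stun_rx_request branch quoted above: while no answer is
 * known, every incoming request is kept for later. */
void toy_on_stun_rx_request(struct toy_ice_session *ice, int comp_id)
{
    if (ice->rcand_cnt == 0) {
        assert(ice->early_cnt < MAX_EARLY_CHECKS);
        ice->early_check[ice->early_cnt++] = comp_id;
    }
    /* else: the request is handled immediately (not modelled here) */
}

/* Models the start of pj_ice_sess_start_check quoted above: one outgoing
 * stun request per queued entry, all at once. Returns the burst size.
 * Emptying the queue afterwards is a choice made for this sketch. */
size_t toy_start_check(struct toy_ice_session *ice, int rcand_cnt)
{
    size_t burst = 0;

    ice->rcand_cnt = rcand_cnt;
    for (size_t i = 0; i < ice->early_cnt; i++) {
        burst++;   /* stands in for sending one stun request */
    }
    ice->early_cnt = 0;
    return burst;
}
```

Feeding this model 100 requests before the answer and then calling toy_start_check yields a burst of 100, which is the proportionality described earlier in the thread.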

I do not understand the purpose of this procedure in pjnath, but by commenting out the line pj_list_push_back(&ice->early_check, rcheck) in pjnath/src/pjnath/ice_session.c I could avoid the stun-spam, and everything seems to work. I have not found any alternative to this yet. Maybe one can disable these early checks by some other means.
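
One conceivable "other means", sketched purely for illustration: instead of dropping early checks altogether, keep only one entry per distinct check. This is not what pjnath does and not the fix referenced elsewhere in this ticket. All names here are hypothetical, and src_id is an assumed stand-in for the remote transport address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16   /* arbitrary cap for this sketch */

/* Hypothetical record: which component was checked, and from where. */
struct pending_check {
    int comp_id;
    int src_id;   /* assumed stand-in for the remote transport address */
};

struct pending_set {
    struct pending_check items[MAX_PENDING];
    size_t count;
};

/* Remember an early check only if an identical one is not already queued.
 * Returns true when the entry was added. */
bool pending_add_unique(struct pending_set *set, int comp_id, int src_id)
{
    for (size_t i = 0; i < set->count; i++) {
        if (set->items[i].comp_id == comp_id &&
            set->items[i].src_id == src_id) {
            return false;   /* duplicate: nothing new to send later */
        }
    }
    assert(set->count < MAX_PENDING);
    set->items[set->count].comp_id = comp_id;
    set->items[set->count].src_id = src_id;
    set->count++;
    return true;
}
```

Under this scheme, 100 identical early requests would leave a single queued entry, so the later flush would send one request instead of 100. Whether that would be acceptable for ICE in practice is not something this thread establishes.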



By: Stefan Engström (StefanEng86) 2015-12-22 04:00:45.387-0600

This issue *SHOULD* have been resolved by release 13.7 (due to the fix for ASTERISK-24146) - I cannot verify 100% since I masked the symptoms by patching pjprojects