[Home]

Summary:ASTERISK-26732: res_rtp_asterisk: Implement RTCP Multiplexing - breaking WebRTC in Chrome
Reporter:Dan Jenkins (danjenkins)Labels:
Date Opened:2017-01-19 03:23:36.000-0600Date Closed:2017-03-16 07:41:00
Priority:MajorRegression?
Status:Closed/CompleteComponents:Resources/res_rtp_asterisk
Versions:13.13.1 14.2.1 Frequency of
Occurrence
Constant
Related
Issues:
must be completed before resolvingASTERISK-26846 chan_sip: Add rtcp-mux support
Environment:Chrome 57 onwards Attachments:( 0) asterisk-ugly-rtcpmux.diff
Description:Chrome 57 has a breaking change when it comes to interop with WebRTC gateways. They've changed their previous "negotiate", to "require" when it comes to rtcp-mux

Asterisk, as I understand it does not have rtcp multiplexing and so will break when it comes to compatibility with WebRTC across all versions of Asterisk that supports WebRTC (as far as I understand, thats back to 11 I think)

We have a flag we can enable client side for the time being; and I'm trying to find out how long that flag will be available for - but thats no lomng term solution.

I wrote on the mailing list about the issue - http://lists.digium.com/pipermail/asterisk-dev/2017-January/076091.html

For comparison - I've got other systems using WebRTC which use RTPEngine (with Kamailio) and Freeswitch - both of these have enabled me to enable flags etc to get past this issue.

I don't know what the right move is going forward. I'm just reporting the issue - every single application out there that utilises Asterisk for WebRTC will have to either move platform or enable a flag client side in the hope that Asterisk will enable the feature set required for compatibility with Chrome

This affects Chrome 57 onwards - We're currently on Chrome 55 mid cycle - which means roughly 6 weeks until this becomes mainstream.

If you want help reproducing this, please let me know and we can have a conversation about URLs in somewhere less public.

The error you'll get will be something along the lines of "setRemoteDescriptionOnFailure
Failed to set remote answer sdp: Session error code: ERROR_CONTENT. Session error description: Failed to setup RTCP mux filter.."

Comments:By: Asterisk Team (asteriskteam) 2017-01-19 03:23:38.193-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Marek Cervenka (cervajs) 2017-02-03 03:54:01.140-0600

https://webrtcstandards.info/rtcp-mux/
https://medium.com/@nimbleape/webrtc-asterisk-and-chrome-57-a706fde33780#.uhw5s4mlk

By: Sebastian Gutierrez (sum) 2017-02-14 06:34:21.820-0600

May be removed on M59

https://groups.google.com/a/chromium.org/forum/#!msg/blink-dev/OP2SGSWF5lo/v7GOaWt_CQAJ



By: Lorenzo Miniero (lminiero) 2017-03-02 12:25:01.155-0600

I don't need this, as I always use my Janus in front of Asterisk which takes care of rtcp-mux for me, but thought I'd submit this ugly patch anyway, in case it can help people using Asterisk's built-in WebRTC functionality and are affected by this issue.

This is a VERY ugly patch, which tries to solve the issue in a somewhat simple way, and as it will be clearer only solves half of the problem: the other half would require a refactoring in the way Asterisk handles incoming traffic, and is beyond my expertise. Anyway, with a couple of quick tests on both outgoing and incoming calls, it seems to work on Chrome for me, so I'm sharing it anyway.

The way the patch works is the following:

1. added new rtcpmux boolean in sip_pvt (so chan_sip only, sorry, not familiar with chan_pjsip);
2. added new AST_RTP_PROPERTY_RTCPMUX ast_rtp_property (rtp_engine.h);
3. added new rtcpmux boolean in ast_rtp (res_rtp_asterisk.c);
3. if a=rtcp-mux is found in either process_sdp_a_audio or process_sdp_a_video, the private rtcpmux is set, and we use ast_rtp_instance_set_prop to set the AST_RTP_PROPERTY_RTCPMUX property to true; this in turn sets the rtcpmux in the related ast_rtp instance to true;
4. changed a couple of checks in res_rtp_asterisk.c so that it doesn't complain if no RTCP candidate is involved;
5. if rtcpmux is set, pj_ice_sess_send_data uses the RTP component instead of the RTCP component to send data to the peer;
6. incoming traffic: big TODO (more on this in a minute);
7. negotiation-wise, in all offers and answers originated by Asterisk we add a=rtcpmux for both audio and video (for answers should check if it was offered, but I'm lazy... told you it's ugly); anyway, rtcp-mux is only enabled if the peer added it themselves.

I was saying that incoming traffic is still a TODO, and this is for a simple reason. Normally, with rtcp-mux you parse the incoming packet, check the payload type, and use that to identify whether it's RTP or RTCP: in fact, since the headers don't match, there's a range of payload types you can't use as they'd match RTCP traffic. Once you identify the packet, you then treat it as RTP or RTCP in the code.

In Asterisk, you can't do that in an easy way, as traffic is not push based (network pushing data to your logic), but pull-based, with manual calls to ast_rtp_read that specify whether you want to read an RTP or a RTCP packet. Considering the ambiguity rtcp-mux introduces, you cannot know in advance if the next packet you'll get from the network buffer will be RTP or RTCP. This means that, at the moment, I believe the code is pulling both from the RTP component, and simply dropping the RTCP packets somewhere in the process (e.g., when failing to decrypt SRTP as it's really SRTCP).

Long story short, audio works in both directions, but RTCP is broken. The only way to get this working, I guess, is some kind of refactoring that works like this:

1. ast_rtp_read is called with rtcp=0;
2. data is read;
3. a check is done: if it's RTP, it's handled as usual; if it's RTCP, it's saved for later;
4. ast_rtp_read is called with rtcp=1;
5. the oldest RTCP packet from the queue is returned and removed from the buffer;
6. etc.etc.

Not exactly sure if I'm misunderstanding the way the media processing is supposed to work in Asterisk, and if this solution makes any sense. Anyway, none of that is in this patch, so I'll leave this to further discussion.

By: Lorenzo Miniero (lminiero) 2017-03-02 12:26:27.704-0600

PS: this ugly patch doesn't touch anything in pjnath either: if resources are allocated for the RTCP component and then rtcp-mux is negotiated, they should be discarded too.

By: Mark Michelson (mmichelson) 2017-03-07 14:31:29.483-0600

I have a review up for this feature here: https://gerrit.asterisk.org/#/c/5138

That link goes to the 13 version of the patch. If you want 14 or master, you can get to it from the "same topic" section on the right-hand side.

By: Dan Jenkins (danjenkins) 2017-03-07 18:02:45.292-0600

From the commit message - should I assume this only works for pjsip ? You know me - I'm ALL for moving forward with not adding things to chan_sip - but in this case; because people are already going to have to move from 11 to 13 (bad people if you're still using 11 but I know people are); let alone making them move over to pjsip for WebRTC? I haven't read the code yet as I have pretty poor internet connection at the moment so sorry If I've missed something

By: Joshua C. Colp (jcolp) 2017-03-07 18:07:53.590-0600

The change currently only supports chan_pjsip. It may get added to chan_sip in the future.

By: Dan Jenkins (danjenkins) 2017-03-07 18:26:50.810-0600

OK - I'm interested to know from the people watching this issue whether not having chan_sip support just means this is useless to them - and when Chrome does remove their flag; whether they just wouldn't be able to utilise WebRTC using Asterisk anymore - we know getting people to upgrade versions is hard enough; and then making them upgrade to pjsip too (Which they should be doing - but again people find bugs and reasons not to move just yet)

Thanks for the effort on this! Its much appreciated!

By: Sean Bright (seanbright) 2017-03-07 19:03:53.000-0600

I've only briefly reviewed Mark's patch, but the lion's share of the changes are happening in res_rtp_asterisk. I don't know level of effort, but I'm happy to try and get rtcp-mux support in chan_sip once Mark's patch lands.

(Ironically Mark also wrote "The lion's share of the changes in this commit are in res_rtp_asterisk.c" so it turns out I am a big fat phony)

By: Dan Jenkins (danjenkins) 2017-03-07 19:28:14.595-0600

Say it ain't so Sean!

By: Matthew Fredrickson (mattf) 2017-03-07 20:25:46.634-0600

I think from our perspective (core-team/Digium's dev team), we are not planning on adding chan_sip support.  That doesn't stop someone from the community from submitting it as well.  It's just that we're (per usual) putting forward development time into chan_pjsip.

Hope that clarifies things a bit.  [~seanbright], if you'd like to have a go at chan_sip support, I don't think anybody is going to stop you.

By: Dan Jenkins (danjenkins) 2017-03-07 20:40:42.357-0600

Yup- totally get that - and if you remember from AstriDevCon; I'm usually the advocate for not putting development time into adding new features into chan_sip - but this isn't a new feature. This is fixing a bug (even though its not truly a "bug" in asterisk - its been caused by Asterisk not following whats required in WebRTC) - and if I remember rightly - it was said that the core team would still work on fixing bugs. If you did a poll of people utilising WebRTC using Asterisk today; I'd bet a large proportion are using chan_sip - by not fixing this you'll be alienating all of those users (who will potentially leave using the project if they are forced to migrate to PJSIP - because lets face it, other projects are ahead of Asterisk when it comes to WebRTC). Right now I don't have hard numbers on users using Asterisk for WebRTC on chan_sip so I'm kinda waiting for people to proclaim "this won't fix my issue" enough....

(EDIT - sorry, got sidetracked - I worry that a lot of these chan_sip community users don't have the resources to fix this - and hence my belief that this being a bug, should have been fixed by core)

I'll stop worrying about this until people make a big enough noise about it.

As always - thank you for the work you all do on Asterisk - I may sound complainy but you know I do it from a place of love for the project :)

By: Ross Beer (rossbeer) 2017-03-08 04:00:37.156-0600

As RTCP is being worked on could the segfault caused by issue ASTERISK-26835 be resolved at the same time?

By: Joshua C. Colp (jcolp) 2017-03-08 04:06:47.250-0600

[~rossbeer] Fixing that issue was not a goal of this project so I don't believe it would be.

By: Marek Cervenka (cervajs) 2017-03-08 06:46:28.165-0600

please, please ... let chan_sip die. do not feed him
Two SIP stacks is pain for all (mainly for newcomers)

By: Sean Bright (seanbright) 2017-03-08 07:12:07.923-0600

Thanks [~rossbeer] and [~cervajs] for your constructive feedback on this issue.

By: James Van Vleet (jvanvleet) 2017-03-08 08:02:46.389-0600

Well since you asked this watcher moved to pjsip specifically because we were moving to webRTC.  I can understand the plight of people that might struggle getting off of chan_sip but if you are going to be running something as leading edge as webRTC you really need to run pjsip on Asterisk.  Things are still quite fluid and what is known as webRTC is really a large (and loose) set of features and standards - many of which have been ignored by the voip community for years.   This won't be the last new feature (this really is not a bug) that causes webRTC to not work.  

By: Sean Bright (seanbright) 2017-03-08 14:20:44.989-0600

Initial patch for chan_sip over at ASTERISK-26846 if you're feeling brave.

By: Andrew Nagy (tm1000) 2017-03-13 21:02:02.588-0500

I think it should also be mentioned that if one is using both chan_sip & pjsip at the same time the result is unknown upon with "driver" Asterisk will use for the websocket communication. Thus one should refer to ASTERISK-24106 which is a (resolved/fixed) ticket that disables chan_sip websockets. Which will be necessary to use pjsip only if [~seanbright]'s patch is not approved.

As to the question in this thread. I side with [~danjenkins]. As much as I like/love PJSIP this is a breaking change because Asterisk is (was?) not following the WebRTC spec. As such without Dan's blog a month ago I would have never figured out the error in Asterisk ([2017-03-13 18:33:09] WARNING[32113][C-0000004d]: res_rtp_asterisk.c:777 ast_rtp_ice_start: No RTCP candidates; skipping ICE checklist (0x7f4de4acb3d0))

So thank's Dan and all others who've contributed to this ticket.

By: Friendly Automation (friendly-automation) 2017-03-16 07:41:02.381-0500

Change 5138 merged by Joshua Colp:
Add rtcp-mux support

[https://gerrit.asterisk.org/5138|https://gerrit.asterisk.org/5138]

By: Friendly Automation (friendly-automation) 2017-03-16 08:58:57.382-0500

Change 5139 merged by Joshua Colp:
Add rtcp-mux support

[https://gerrit.asterisk.org/5139|https://gerrit.asterisk.org/5139]

By: Friendly Automation (friendly-automation) 2017-03-16 10:47:09.571-0500

Change 5140 merged by Joshua Colp:
Add rtcp-mux support

[https://gerrit.asterisk.org/5140|https://gerrit.asterisk.org/5140]

By: Friendly Automation (friendly-automation) 2017-03-17 08:06:18.336-0500

Change 5193 merged by zuul:
rtcp: Add PJSIP tests for RTCP-MUX support.

[https://gerrit.asterisk.org/5193|https://gerrit.asterisk.org/5193]

By: Friendly Automation (friendly-automation) 2017-03-18 05:37:25.160-0500

Change 5236 merged by Joshua Colp:
res_pjsip_asterisk.c: Fix compile error if libsrtp is not installed.

[https://gerrit.asterisk.org/5236|https://gerrit.asterisk.org/5236]

By: Friendly Automation (friendly-automation) 2017-03-18 05:37:43.489-0500

Change 5238 merged by Joshua Colp:
res_pjsip_asterisk.c: Fix compile error if libsrtp is not installed.

[https://gerrit.asterisk.org/5238|https://gerrit.asterisk.org/5238]

By: Friendly Automation (friendly-automation) 2017-03-18 05:37:58.793-0500

Change 5237 merged by Joshua Colp:
res_pjsip_asterisk.c: Fix compile error if libsrtp is not installed.

[https://gerrit.asterisk.org/5237|https://gerrit.asterisk.org/5237]

By: Friendly Automation (friendly-automation) 2017-03-18 05:38:13.401-0500

Change 5227 merged by Joshua Colp:
res_rtp_asterisk: Fix crash when RTCP is not present when DTLS is stopped.

[https://gerrit.asterisk.org/5227|https://gerrit.asterisk.org/5227]

By: Friendly Automation (friendly-automation) 2017-03-18 05:38:25.123-0500

Change 5228 merged by Joshua Colp:
res_rtp_asterisk: Fix crash when RTCP is not present when DTLS is stopped.

[https://gerrit.asterisk.org/5228|https://gerrit.asterisk.org/5228]

By: Friendly Automation (friendly-automation) 2017-03-18 05:38:37.750-0500

Change 5232 merged by Joshua Colp:
res_rtp_asterisk: Fix crash when RTCP is not present when DTLS is stopped.

[https://gerrit.asterisk.org/5232|https://gerrit.asterisk.org/5232]