[Home]

Summary:ASTERISK-28686: chan_sip strictrtp=yes fails when media source is changed: no audio
Reporter:Walter Doekes (wdoekes)Labels:
Date Opened:2020-01-13 03:49:00.000-0600Date Closed:2020-01-27 18:22:42.000-0600
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Channels/chan_sip/Interoperability
Versions:13.30.0 Frequency of
Occurrence
Frequent
Related
Issues:
is related toASTERISK-24349 strictrtp has trouble in NAT scenario's with 100->183->180->200
is related toASTERISK-20633 Asterisk SIP channel handling of MOH media re-INVITES not RFC 3264 compliant
Environment:Attachments:
Description:Hi!

When the SDP origin changes, the SDP session version from the new media is not necessarily higher than the existing session version.

chan_sip contains this bit:
{code}
       if (ast_test_flag(&p->flags[1], SIP_PAGE2_IGNORESDPVERSION) ||
           (p->sessionversion_remote < 0) ||
           (p->sessionversion_remote != rua_version)) {
               p->sessionversion_remote = rua_version;
       } else {
...
                       ast_debug(2, "Call %s responded to our reinvite without changing SDP version; ignoring SDP.\n", p->callid);
                       return FALSE;
{code}
That means that the updated SDP will be ignored for a change like this:
- Initial o= line (for example in 183):
{noformat}
o=root 1845921041 1845921041 IN IP4 193.x.x.215
{noformat}
- Updated o= line (for example in 200):
{noformat}
o=- 1126144851 1126144853 IN IP4 87.x.x.235
{noformat}

In this example, the new SDP sess_version 1126144851 is lower than 1845921041, causing the SDP to be ignored. And because the SDP is not processed, {{ast_rtp_instance_set_remote_address}} is never called, and the new RTP source is discarded.

The fix: do not only check sess_version, but also check the "unique parts" from the SDP. And if they change, then also update the SDP.

Basically this:
{noformat}
       if (ast_test_flag(&p->flags[1], SIP_PAGE2_IGNORESDPVERSION) ||
-           (p->sessionversion_remote < 0) ||
-           (p->sessionversion_remote < rua_version)) {
-               p->sessionversion_remote = rua_version;
+                       sess_version > p->sessionversion_remote ||
+                       strcmp(unique, S_OR(p->sessionunique_remote, ""))) {
+               p->sessionversion_remote = sess_version;
+               ast_string_field_set(p, sessionunique_remote, unique);
       } else {
{noformat}
Along with a marginally bigger patch (to be put on gerrit), we then get these:
{noformat}
[2020-01-13 10:31:20] VERBOSE[10720][C-00000001] chan_sip.c:
 Got SDP version 203027438 and unique parts [- 203027438 IN IP4 10.x.x.161]
[2020-01-13 10:31:20] VERBOSE[10720][C-00000001] chan_sip.c:
 Got SDP version 1845921041 and unique parts [root 1845921041 IN IP4 193.x.x.215]
[2020-01-13 10:31:31] VERBOSE[10720][C-00000001] chan_sip.c:
 Comparing SDP version 1845921041 -> 1126144853 and unique parts [root 1845921041 IN IP4 193.x.x.215] -> [- 1126144851 IN IP4 87.x.x.235]
{noformat}
And then everything works as intended.
Comments:By: Asterisk Team (asteriskteam) 2020-01-13 03:49:02.060-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Michael L. Young (elguero) 2020-01-13 07:54:47.130-0600

Hola Walter,

Would this be related to the analysis and discussion that took place in the issue ASTERISK-20633?

By: Walter Doekes (wdoekes) 2020-01-13 09:10:59.090-0600

Sure. Looks like this changeset fixes that issue.

I didn't read the discussion carefully. But as for the conclusion:
{quote}
Essentially, a re-INVITE is modifying an existing session, not creating a new one. Therefore, the SDP version number should follow what is established in the RFC3264.
{quote}
I would say that (1) changing the media peer (and thus, origin) is allowed and (2) the media peer won't know anything about the session-version from the previous/other peer. That would lead _me_ to conclude that my changes are legal and warranted.

By: Michael L. Young (elguero) 2020-01-13 09:29:46.930-0600

I am not quite so sure about that conclusion.  I don't think this change should be made.

The RFC provides clear guidance about modifying an existing session.  That is why the quote you put above says that there are limitations on modifying an "existing session".  The context from what this is referring to has to do with the modification of the "o=" line.  There are certain limitations.  One of them being that the "o=" line MUST not change except for the version number being _incremented_.

If you have strictrtp turned on, I would expect Asterisk to strictly enforce this.

For interop, the ignoresdpversion should be used.

I think it is important to read the entire discussion in order to understand the entire context around why I would not recommend this change.

By: Walter Doekes (wdoekes) 2020-01-13 10:31:49.593-0600

{quote}
The RFC provides clear guidance about modifying an existing session.
{quote}
Yes. But here there are different sessions going on.
{quote}
RFC3261: 15. Terminating a Session
  \[...\] The state of the session and the state of the
  dialog are very closely related.  When a session is initiated with an
  INVITE, each 1xx or 2xx response from a distinct UAS creates a
  dialog, and if that response completes the offer/answer exchange, it
  also creates a session.  As a result, each session is "associated"
  with a single dialog - the one which resulted in its creation.
{quote}
So, when I get a 183 with a Tariff Notice, chan_sip should create a dialog (+ media session). And then, when I get a 200 (from a different entity), it should create a new dialog (+ media session).

But, since chan_sip only copes with a single dialog and a single audio stream for a call, I'm perfectly fine with it overwriting the old (now obsolete) dialog (+media session).

*In all cases, discarding the SDP from the first 200 makes no sense.*

Now, if we look at this:
{quote}
RFC3261: 13.1 Overview
  A 2xx response to an INVITE establishes a session, and it also
  creates a dialog between the UA that issued the INVITE and the UA
  that generated the 2xx response.  Therefore, when multiple 2xx
  responses are received from different remote UAs (because the INVITE
  forked), each 2xx establishes a different dialog.  All these dialogs
  are part of the same call.
{quote}
Don't you read that as:
- multiple dialogs can be established;
- thus, multiple (media) sessions can be established;
- thus, we don't want to discard any SDP from different origins.

Because chan_sip can only handle a single stream, do you want it to only use the oldest one? Or use the latest and greatest one? (Which, in the real world, holds the actual audio source.)

I've tried to come up with a real-world problem that ignoring the updated SDP would fix, but I cannot. But maybe I'm overlooking something here...?

By: Walter Doekes (wdoekes) 2020-01-13 10:43:04.615-0600

And on top of all this, right now, the behaviour is inconsistent, which is even worse than (too) lax or (too) strict.

If the 2nd stream gets a session-version which is higher: it is accepted. If it's lower, it's (silently) rejected. This causes (hard to debug) "random" failures.

If you're adamant about this, perhaps you'd like to make a case to accept my patch, but then (loudly) reject the Answer if
{{p->sessionunique_remote && strcmp(unique, p->sessionunique_remote)}} ?

By: Walter Doekes (wdoekes) 2020-01-13 11:09:54.970-0600

As for your:
{quote}
If you have strictrtp turned on, I would expect Asterisk to strictly enforce this.
{quote}
The strictrtp is there _not_ to disallow the session media to be changed. It is to disallow unrelated strangers from enumerating my Asterisk-ports and stealing the audio stream. (RTPbleed, or however you want to call it.)

And, now, a short git-blame history lesson. _To emphasize that the current codebase was not the result of careful design, but rather grew organically._

{noformat}
commit 6aaa99230115240b6bcf757183d16244865e1232
Author: Russell Bryant <russell@russellbryant.com>
Date:   Wed Jan 16 21:53:10 2008 +0000

   Merge the changes from issue #10665 from the team/group/sip_session_timers branch.

   This set of changes introduces SIP session timers support (RFC 4028).  In short,
   this prevents stuck SIP sessions that were not properly torn down due to network
   or endpoint failures during an established SIP session.
...
+       if (p->sessionversion_remote < 0 || p->sessionversion_remote != rua_version) {
+               p->sessionversion_remote = rua_version;
+               p->session_modify = TRUE;
{noformat}
That used to be a `!=`, not a `>`. And it appears it was added to handle a problem with session (timer) updates, not to avoid the media stream getting updated.

And when {{SIP_PAGE2_IGNORESDPVERSION}} was added in {{91192e30c}}, it still was. That option was _not_ created for the use case described here, but for broken clients which didn't update the session-version _at all_.

First when change {{85e57521a}} was done (when others were facing issues due to broken clients again), the `!=` check was changed to a `>`. At this point, partial RFC reading created the broken situation where we are now. (Explicitly not blaming Kevin or anyone else here.)

{noformat}
commit 85e57521ab6eee3dc866e5882694f054f98b7186
Author: Kevin P. Fleming <kpfleming@digium.com>
Date:   Mon Jun 15 20:42:38 2009 +0000

   Accept T.38 re-INVITE responses with invalid SDP versions.
...
-       if (ast_test_flag(&p->flags[1], SIP_PAGE2_IGNORESDPVERSION)
-               || p->sessionversion_remote < 0
-               || p->sessionversion_remote != rua_version) {
+       if (ast_test_flag(&p->flags[1], SIP_PAGE2_IGNORESDPVERSION) ||
+           (p->sessionversion_remote < 0) ||
+           (p->sessionversion_remote < rua_version)) {
{noformat}

QED

By: Michael L. Young (elguero) 2020-01-13 12:35:01.261-0600

Sorry, I misspoke about the purpose of "strictrtp" and what to expect from it.  I wrote my response in haste.  It is in the title and sort of mislead me from the actual issue at hand.  I retract that statement.

Not sure if chan_sip is worth that much effort on my part anymore since I don't really use it.

The description of the issue is talking about an active session (audio) and changing the media source.  That is all that my comments were mainly focused on.  You are quoting from the SIP RFC 3261 which talks about sessions and modifying sessions.  What I am referring to is the SDP RFC 3264 which is very specific about modifying the SDP characteristics in an active session.

My point was that if we want to go down the path of history, there are several issues on this issue tracker where we have already had discussions around endpoints trying to change the media source on an active session and they do so by changing the "o=" line.  In those discussions the conclusion was that trying to change the "o=" line more than just incrementing the version number with an active session is not proper behavior.  Those conclusions are based on RFC3264 . We have directed people to use "ignoresdpversion=yes" if they so desire.

This just change just doesn't feel right but I am not going to declare that I am an expert or authority on this.

By: Walter Doekes (wdoekes) 2020-01-14 06:30:23.913-0600

{quote}
Not sure if chan_sip is worth that much effort on my part anymore since I don't really use it.
{quote}
I understand that meddling in the old driver is suboptimal. From a quick search through the tracker, it appears that PJSIP does not even have an option to allow media switching. (I do hope it doesn't behave as quirkily as chan_sip does though.)

However, we have to deal with the real world. And in particular, in my case where the 183 has a different media source than the actual 200, ignoring the 200 _sometimes_ is just wrong(tm).

{quote}
This just change just doesn't feel right [...]
{quote}

The only pragmatic thing to do here, is accept the fix, as the (mis)feature as it stands now cannot work reliably for anyone. (Because it _only_ takes the version into account and not the rest of the o= line.)

If people are interested, I might be persuaded into adding a denymediachange=yes option, to restore old behaviour (but then taking the entire o= line into account). And then, you might consider differentiating between a new media session after a provisional 1xx, vs. a new media session after the first final 200. But that feels to me like "a new feature", for which the effort might be better spent on chan_pjsip instead.

{quote}
You are quoting from the SIP RFC 3261 which talks about sessions and modifying sessions. What I am referring to is the SDP RFC 3264 which is very specific about modifying the SDP characteristics in an active session.
{quote}
Yes, but we're implementing both. There is no chan_sip driver without SIP or without SDP. And I'm arguing that the former forces you to implement multiple sessions (not going to do that). Or, if that's too much to ask (which it is), implement something that's as close to it as possible, for the greatest interoperability (which is, accepting the new media source).

{quote}
My point was that if we want to go down the path of history, there are several issues on this issue tracker where we have already had discussions around endpoints trying to change the media source on an active session and they do so by changing the "o=" line. In those discussions the conclusion was that trying to change the "o=" line more than just incrementing the version number with an active session is not proper behavior. Those conclusions are based on RFC3264 . We have directed people to use "ignoresdpversion=yes" if they so desire.
{quote}
I can see how it feels like defeat, if you're changing your stance about something you have been defending forever. But now that you realise the (undesired) randomness of the current behaviour, there are three sane positions you can take on this matter:
- fix it, by making it stricter
- fix it, by making it looser (my patch)
- don't care about chan_sip (also, my patch)


By: Walter Doekes (wdoekes) 2020-01-14 07:02:28.313-0600

As for PJSIP, I think George Joseph may have fixed both issues in ASTERISK-27936:
{noformat}
commit 880fbff6b73eba5c70c40edf8ca3d10069d8a6bd
Author: George Joseph <gjoseph@digium.com>
Date:   Mon Jun 18 20:22:17 2018 -0600

   res_pjsip_session:  Add ability to accept multiple sdp answers
...
follow_early_media_fork=yes
accept_multiple_sdp_answers=no
{noformat}
In pjsip lingo, I would currently be interested in the {{follow_early_media_fork=yes}} option (because the to-tags in the 183 is different from the one in the 200), while the reporter of ASTERISK-20633 wants {{accept_multiple_sdp_answers=yes}} (because the to-tag is the same).

By: Michael L. Young (elguero) 2020-01-14 08:12:17.397-0600

Hey Walter.  Not feeling defeat about anything.  I didn't think I was in a battle here.  I thought we were all trying to work together here.  I always learn from you and was just trying to understand things.  I freely admit my errors and learn from them.

The intention was to add context to this issue and bring forward information to help decide what is best for the community.  As mentioned in those issues linked and you mention on Gerrit, it has been acceptable all these years to use ignoresdpversion=yes.

In your scenario, you are referring to having a 183 then getting the 200.  I think this is a new scenario, that, from what I can recall, I haven't personally seen tackled.  Yes, the 200 has the final say on the media source.

The proposed change does not take into account a 183 and then getting a 200, right?  It is just going to accept any SDP change, if I understand the patch correctly.  Maybe that is acceptable.  The stance has always been that you can't have a 200 and then later on something up and changes the SDP on you without some form of checking or without you expressly saying that is okay with you by ignoring the sdp versioning (although I know that doesn't seem right either because we are not checking the IP part).  I am just pushing back to make sure it is thought all the way through, which I am sure you already have done because I know you do not make willy nilly changes.

Thanks.  I am just trying to work together by supplying context and information.

Edit: Just saw comments about pjsip.  That is great to know.  Thanks for adding that information.
Just more reason to be using pjsip over chan_sip.  My comments have all been centered around chan_sip.

By: Walter Doekes (wdoekes) 2020-01-14 11:02:27.822-0600

{quote}
The stance has always been that you can't have a 200 and then later on something up and changes the SDP on you without some form of checking or without you expressly saying that is okay with you by ignoring the sdp versioning
{quote}
Yes. While I can see where that is coming from, from the RFC 3264 point of view, I have still yet to find a use-case where doing that fixes a problem in SIP. Remember that SDP is encapsulated in SIP, which also keeps packet identification and order.

If/when the remote device decides to let someone else handle the media (establish an entirely new media session if you will), we have very little to gain by not trusting the remote device. It has the SIP identifiers needed to identify itself (from/to/callid). If an attacker has those identifiers it will most likely have the o= line as well. So if it wanted to abuse anything, it would just increment the sess-version and not be bothered by this check.

As for multiple simultaneous 200s (from different to-tags): Asterisk will choose the first one and BYE the others. At this point, the to-tag would be locked, and other 200s (with different to-tags) wouldn't influence the SDP-session state.

My concerns are:
- the `!=` check was added when session timers were added: there was likely a reason to ignore the there-are-no-updates SDP in those case; so I don't just want to ignoresdpversion=yes (it might open up strictrtp learning at every session-timer interval?)
- the current check is broken; so it only makes sense to fix it (in any direction, in this case the one that suits me best)

By: Friendly Automation (friendly-automation) 2020-01-27 18:22:43.572-0600

Change 13592 merged by Friendly Automation:
chan_sip: Always process updated SDP on media source change

[https://gerrit.asterisk.org/c/asterisk/+/13592|https://gerrit.asterisk.org/c/asterisk/+/13592]

By: Friendly Automation (friendly-automation) 2020-01-27 18:28:56.617-0600

Change 13691 merged by Friendly Automation:
chan_sip: Always process updated SDP on media source change

[https://gerrit.asterisk.org/c/asterisk/+/13691|https://gerrit.asterisk.org/c/asterisk/+/13691]

By: Friendly Automation (friendly-automation) 2020-01-27 18:29:40.772-0600

Change 13690 merged by Friendly Automation:
chan_sip: Always process updated SDP on media source change

[https://gerrit.asterisk.org/c/asterisk/+/13690|https://gerrit.asterisk.org/c/asterisk/+/13690]

By: Friendly Automation (friendly-automation) 2020-01-27 18:30:42.442-0600

Change 13692 merged by Friendly Automation:
chan_sip: Always process updated SDP on media source change

[https://gerrit.asterisk.org/c/asterisk/+/13692|https://gerrit.asterisk.org/c/asterisk/+/13692]