
Summary: ASTERISK-29253: Incorrect bridging on transfer
Reporter: Yury Kirsanov (lt_flash)
Labels:
Date Opened: 2021-01-20 00:24:41.000-0600
Date Closed: 2022-04-26 14:59:36
Priority: Minor
Regression?
Status: Closed/Complete
Components: Bridges/bridge_simple
Versions: 16.15.0 18.1.1
Frequency of Occurrence: Constant
Related Issues: is duplicated by ASTERISK-29273 Incorrect off-hold on ReINVITE via Replaces
Environment: Ubuntu Linux 18.04.5 LTS
Attachments: (0) bridge_simple.tar.gz
(1) bridge_softmix.tar.gz
(2) call_flow.txt
Description: We have an Asterisk server with one SIP device registered to it and one SIP trunk to another system. We also have another SIP device to call for test purposes.

Here's a simplified diagram of call flow with attended transfer we're trying to achieve:

SIP Device A -> Asterisk PBX -> SIP Trunk -> External User -> Attended transfer -> Asterisk PBX -> SIP Device B.

SIP Device A originates a call to a pre-defined number that's routed into the SIP trunk (TLS+SRTP). The remote party behind the SIP trunk ("External User") answers this call and then starts an attended transfer to SIP Device B on the Asterisk PBX. That SIP trunk uses a Re-INVITE with a Replaces header to complete the transfer. After External User initiates the attended transfer to SIP Device B, SIP Device A hears MOH. External User then tries to complete the transfer, connecting SIP Device A with SIP Device B. The call connects, but SIP Device A continues to hear Music On Hold while SIP Device B can hear what SIP Device A says.

Now, if we unload the 'bridge_simple' module, Asterisk PBX starts to use the 'bridge_softmix' module and connects calls correctly: SIP Device A can establish two-way communication with SIP Device B. But during the attended transfer no MOH is played at all, even though Asterisk shows messages like 'Starting music on hold'. In fact, without 'bridge_simple' no Music On Hold is played at all, even if we set up just a local extension that plays MOH, like this:

exten=>100,1,Answer()
exten=>100,n,MusicOnHold()

If we load bridge_simple then MOH is played fine but during transfer calls are bridged incorrectly.

I'm happy to provide full logs upon request as I don't want to edit them.

Also, if we use SIP REFER method for transferring calls there's another issue - when External User tries to finalize call transfer Asterisk drops established call to SIP Device B and immediately re-dials it. I believe this happens because SIP Trunk is not passing Replaces in Refer-To header, but that's another issue.
Comments:By: Asterisk Team (asteriskteam) 2021-01-20 00:24:41.726-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Joshua C. Colp (jcolp) 2021-01-20 03:55:02.928-0600

You haven't specified which SIP implementation is in use, as well please do sanitize logs and attach them. Attaching them means that any individual is able to investigate, see, and potentially resolve. By not sanitizing you limit yourself to just the members of the Sangoma Asterisk team unless someone takes interest and directly coordinates for log information. We leave this as a last resort as a result, and highly prefer attached logs.

As well, is anything behind NAT? Have you configured Asterisk to be behind NAT? What environment is this on? Virtualized? Do you have a timing module active? Have you verified everything without involving transfers?

I ask because some of your statements seem as though re-INVITEs for direct media may be going on.

By: Yury Kirsanov (lt_flash) 2021-01-20 05:17:28.670-0600

Hi Joshua,
Thanks for your reply.
1. We're using PJSIP implementation
2. We're NOT using any NAT, for this specific test I've made sure we're not using even any RTP proxies or anything like that, everything is on pure public IP addresses.
3. Direct media is not occurring because we have different legs of calls each connecting to Asterisk server, so we always have either 'SIP Device -> Asterisk' or 'SIP Device -> External User' or 'Asterisk -> SIP Device' scenario
4. There's no issue with timing, RTP stream is constant and sequence numbers are in appropriate order, no lost packets too.
5. I'm happy to attach logs but not directly to this case as they have sensitive information. In my previous tickets George was asking me to put logs on some shared storage and then send him a link via email. I'm happy to do so for you.
6. I have verified and am 100% sure that everything works as it should up to the point when the Re-INVITE with the 'Replaces:' header comes through to Asterisk asking to replace the Call-ID of 'Asterisk -> External user' with 'SIP Device A -> Asterisk', and 100% sure that when the 'bridge_simple' module is unloaded the call is connected just fine. I've also checked and confirmed that the Replaces header specifies the correct Call-ID for the replacement call.
7. Timing module is enabled by default and uses res_timing_timerfd module.

Thanks!

By: Joshua C. Colp (jcolp) 2021-01-20 05:29:10.666-0600

Is there an explicit reason you can't sanitize the logs of sensitive information? I'm trying to understand the reasoning to try to reduce the amount it happens. We (the Sangoma Asterisk Team) do have the ability to accept such logs, however I don't want the issue tracker to become a place where people just send logs directly to the Sangoma Asterisk Team out of convenience. I've noticed in recent times this has increased such that some people don't even bother and just directly email things to the Sangoma Asterisk Team. Over all it removes the ability for the open source community to help or investigate things unless they go to extra lengths and it decreases your chances of the issue being fixed.

By: Joshua C. Colp (jcolp) 2021-01-20 05:35:16.715-0600

And if you do still want to submit privately then the individual doing triage who will be in later can provide the information and ensure it is attached to things.

By: Yury Kirsanov (lt_flash) 2021-01-20 05:41:47.885-0600

Well I thought that it's better to provide full logs in PCAP format but of course I can edit them and attach here. I will post them as separate legs of calls soon, thanks.

By: Joshua C. Colp (jcolp) 2021-01-20 05:43:26.087-0600

We require additional debug to continue with triage of your issue. Please follow the instructions on the wiki [1] for how to collect debugging information from Asterisk. For expediency, where possible, attach the debug with a '.txt' file extension so that the debug will be usable for further analysis.

Thanks!

[1] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information

In regards to logs I'm referring to Asterisk logs as I've linked above, not just pcaps. Packet captures only give the SIP side of the signaling and don't show what Asterisk itself is doing with it.

By: Yury Kirsanov (lt_flash) 2021-01-20 07:06:30.415-0600

Please find attached the call flow of described call. Yes, both remote handsets are behind NAT on their side, but Asterisk and SIP trunk are purely on public IPs.


By: Joshua C. Colp (jcolp) 2021-01-20 07:09:32.609-0600

The Asterisk level debug that I mentioned is still needed for this to see what is going on within Asterisk itself.

By: Yury Kirsanov (lt_flash) 2021-01-20 07:14:07.081-0600

Joshua,
No worries, I will be able to provide this tomorrow as currently it's midnight here in Australia. Thanks!

By: Yury Kirsanov (lt_flash) 2021-01-21 04:25:49.756-0600

Please find attached full debug logs for both bridge types.

By: Yury Kirsanov (lt_flash) 2021-01-23 02:49:45.030-0600

Hi Joshua,
Did you have a chance to have a look at this issue? It's very important for us to resolve it. Thanks!

By: Joshua C. Colp (jcolp) 2021-01-23 04:57:26.619-0600

I am not the active person doing triage, and there is no time frame on when this would be looked into or resolved after triage.

By: Yury Kirsanov (lt_flash) 2021-01-24 19:36:26.055-0600

Ok but it looks to be quite a major bug to me that's why I thought someone would have a look into this because as soon as bridge_simple is unloaded everything starts to work as it should.

By: Joshua C. Colp (jcolp) 2021-01-25 04:21:56.984-0600

It's a bug that so far no one else has experienced, or that we've seen across other FreePBX users, Switchvox users, or plain Asterisk users. If people have and they comment then we'll certainly take that into consideration from a Sangoma perspective.

By: Yury Kirsanov (lt_flash) 2021-01-25 04:26:06.549-0600

Ok, no worries, I'm just a bit surprised as previously when I logged a bug "that no one else experienced" with segfault in Asterisk during T.38 exchange - that was actioned immediately, there were no links to other similar tickets and no one else commented in that ticket. Anyway, hopefully someone would have a look into this, otherwise it's unclear to me why you asked to provide details if there's no idea whether someone is going to check it. Thanks.

By: Joshua C. Colp (jcolp) 2021-01-25 04:36:22.790-0600

People, including Sangoma employees, can take a personal interest in issues and investigate/look at them if they wish. As for requesting information, the more information that is available immediately, the easier it is to solve issues if someone does look at this - including community members. If the information isn't available then it has to go back to you, the person has to wait, etc. Essentially the fewer blockers to solving an issue the better if it does get looked at.

By: Yury Kirsanov (lt_flash) 2021-01-25 04:48:36.936-0600

No worries, I will be hoping that someone would become interested in looking into strange bridge behaviour soon. Thanks again.

By: George Joseph (gjoseph) 2021-01-26 09:14:34.228-0600

Yury, Did this issue happen with earlier versions of Asterisk?  Can you pinpoint which version started having this issue?


By: Yury Kirsanov (lt_flash) 2021-01-26 09:25:02.935-0600

Hi George,
I've first encountered this issue with Asterisk version 16.5.0 and then I tried to compile every version in every branch and the issue happens in them all.

By: George Joseph (gjoseph) 2021-01-26 09:27:33.208-0600

Gotcha, thanks.


By: Yury Kirsanov (lt_flash) 2021-01-26 09:35:34.351-0600

George, sorry, I made a typo, I meant '16.15.0'. We were using this LTS branch for a long time and then we connected this SIP trunk and found the issue. So I don't know if this issue happens to any earlier versions than 16.15.0.

By: George Joseph (gjoseph) 2021-01-26 09:43:50.464-0600

I understand.   Is it possible for you to try it on earlier versions?


By: Yury Kirsanov (lt_flash) 2021-01-26 09:55:42.641-0600

Yes, no worries, I can compile any version you'd recommend. Which one would you like me to try?

By: Yury Kirsanov (lt_flash) 2021-01-27 22:32:31.329-0600

Hi George, sorry to disturb you, but what version of Asterisk would you like me to compile and test? I'm happy to use any of them. Thanks.

By: George Joseph (gjoseph) 2021-01-28 07:21:08.326-0600

Oops, I missed your earlier comment, sorry.   How about 16.12?   I'm just trying to determine if this issue is a recent regression.


By: Yury Kirsanov (lt_flash) 2021-01-28 07:49:42.406-0600

No worries, I will compile it and test tomorrow! Thanks!

By: Yury Kirsanov (lt_flash) 2021-01-29 02:08:22.069-0600

Hi George,
I've compiled Asterisk versions 16.0.0, 16.5.0 and 16.12.0 and tested bridge behaviour. Absolutely no difference at all, if bridge_simple is loaded - calls are connected incorrectly, if it's unloaded - calls are connected correctly, but no Music On Hold is available on the PBX.

By: George Joseph (gjoseph) 2021-01-29 08:05:57.682-0600

OK Yury.  Thanks for testing that.


By: Yury Kirsanov (lt_flash) 2021-01-29 08:29:54.834-0600

No worries, I'll be glad to help if anything else is needed!

By: George Joseph (gjoseph) 2021-02-01 08:33:47.302-0600

Adding [~igoro] to the watch list.


By: Igor Olhovskiy (IhorOlkhovskyi) 2021-02-01 09:12:59.836-0600

Just to add, this behavior confirmed for me on Asterisk 13.38, 13.25 on both chan_pjsip and chan_sip, and 16.16.0 with chan_pjsip.

By: Yury Kirsanov (lt_flash) 2021-02-01 09:56:15.836-0600

George, I've spoken to Ihor and he confirms that the behaviour is absolutely the same but to mention - they're using a different type of SIP Gateway, mine is Microsoft Teams, they're using some PBX.

By: George Joseph (gjoseph) 2021-02-01 11:30:40.404-0600

I'll add this info to our internal ticket.


By: Yury Kirsanov (lt_flash) 2021-02-15 05:36:13.749-0600

Hi, is there any chance for a fix on this one? Thanks!

By: Joshua C. Colp (jcolp) 2021-02-15 05:39:09.027-0600

The issue is open and accepted, but there is no one actively working on it and there is no time frame on when it will be looked into.

By: Yury Kirsanov (lt_flash) 2021-02-15 05:55:55.366-0600

Joshua, is there a way to do this on a paid basis? For example if our company is happy to pay for this to be resolved - how do we do it and how much would it cost?

By: Joshua C. Colp (jcolp) 2021-02-15 06:03:14.998-0600

You would need to contact Sangoma Sales[1] if you are referring to paying Sangoma to do so. I have no information on whether this would be something we would do, or the cost involved.

[1] https://www.sangoma.com/contact-us/

By: Yury Kirsanov (lt_flash) 2021-06-03 05:02:10.963-0500

Is there any update on this issue? Sangoma Sales didn't respond to my emails. Thanks.

By: Joshua C. Colp (jcolp) 2021-06-03 05:20:15.193-0500

Any updates regarding this issue from an open source perspective would be posted here. As there has been no comment, there is no update.

By: Jon Wright (jon.wright) 2022-04-16 06:51:06.818-0500

Yury - I've been struggling with this exact same problem with my deployments. We're a large college group over in the UK using multiple Asterisk deployments hooked into Nortel and Mitel telephone exchanges, and the attended transfers (which are continually used by our switchboard staff and are "external" to Asterisk) always result in the one-way MOH / incorrect bridging. I've also tried unloading 'bridge_simple' to work around the problem, but in our case it breaks the REFERs so we can't get the transfers to go through at all.

Josh - I imagine there are a lot more people out there affected by this issue than have reported it. I've been struggling with it for months and only realised it was a bridging bug when I stumbled across Yury's issue.

Really keen to get some traction behind a fix.

Kind Regards,

Jon

By: Igor Olhovskiy (IhorOlkhovskyi) 2022-04-16 07:01:31.076-0500

@Jon, I've found a workaround for this, but it really couldn't be a "production" solution.
What I did: I put a Kamailio SIP proxy between the legacy PBX and Asterisk for SIP signalling, and it changes every "hold" signalled via "sendonly" to "sendrecv". So Asterisk keeps the RTP connection with the legacy PBX.
But it doesn't work with the REFER mechanism.
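
For readers unfamiliar with the workaround, the transformation it describes can be illustrated in plain C. This is only a sketch of the idea, not the Kamailio configuration that was actually used (which is not shown in this thread). The function name `sdp_force_sendrecv` is hypothetical, and the sketch assumes hold is signalled purely with an `a=sendonly` attribute line; it does not handle `a=inactive` or other hold styles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: rewrites every "a=sendonly" attribute line in an
 * SDP body to "a=sendrecv". Both strings have the same length, so the
 * buffer is edited in place. Returns the number of lines rewritten.
 */
static int sdp_force_sendrecv(char *sdp)
{
    static const char from[] = "a=sendonly";
    static const char to[]   = "a=sendrecv";
    const size_t len = sizeof(from) - 1;
    int replaced = 0;
    char *line = sdp;

    assert(sdp != NULL);

    while (*line != '\0') {
        char *nl;

        /* Match only a complete attribute line (ended by CR, LF or NUL). */
        if (strncmp(line, from, len) == 0 &&
            (line[len] == '\r' || line[len] == '\n' || line[len] == '\0')) {
            memcpy(line, to, len);
            replaced++;
        }

        /* Move to the start of the next line, if there is one. */
        nl = strchr(line, '\n');
        if (nl == NULL) {
            break;
        }
        line = nl + 1;
    }
    return replaced;
}
```

Because the far end never sees a hold in this arrangement, Asterisk never starts MOH for that leg, which is presumably why the one-way audio disappears; it also explains why the approach is a workaround rather than a fix.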

By: Yury Kirsanov (lt_flash) 2022-04-16 07:07:12.477-0500

Guys, actually I do have a patch for this issue, we have paid some C developers and they fixed this issue. I will provide the patch here within a couple of hours, it's literally a couple of lines of code.

By: Yury Kirsanov (lt_flash) 2022-04-16 07:12:36.169-0500

This patch works as well for Asterisk 18, please just manually copy the lines with + sign into your bridge_simple.c and recompile. Let me know how it goes!

Hope this helps and hopefully it will be checked by Digium and included into Asterisk. I was waiting for them to do this and that's why I wasn't posting anything about the patch here.

{code}
Description: Unhold channels on join simple bridge
Last-Update: 2021-06-17

--- asterisk-16.15.0~dfsg.orig/bridges/bridge_simple.c
+++ asterisk-16.15.0~dfsg/bridges/bridge_simple.c
@@ -155,6 +155,15 @@ static int simple_bridge_join(struct ast
   ast_channel_unlock(c0);
   ast_channel_unlock(c1);

+       if (ast_channel_hold_state(c1) == AST_CONTROL_HOLD) {
+               ast_debug(1, "Channel %s simulating UNHOLD for bridge simple join.\n", ast_channel_name(c1));
+               ast_indicate(c1, AST_CONTROL_UNHOLD);
+       }
+       if (ast_channel_hold_state(c0) == AST_CONTROL_HOLD) {
+               ast_debug(1, "Channel %s simulating UNHOLD for bridge simple join.\n", ast_channel_name(c0));
+               ast_indicate(c0, AST_CONTROL_UNHOLD);
+       }
+
   if (!new_top) {
       /* Failure.  We'll just have to live with the current topology. */
       return 0;
{code}
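
As a reading aid, the logic of the patch above can be modelled outside Asterisk. The sketch below is not Asterisk code: `mock_channel`, `mock_indicate_unhold` and `unhold_on_join` are hypothetical stand-ins for the real channel structure, `ast_indicate(chan, AST_CONTROL_UNHOLD)` and the added lines in `simple_bridge_join()`. It only shows the rule the patch appears to apply: a channel that is still marked as held when it joins the two-party bridge is taken off hold first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the hold-related control values. */
enum hold_state { CTRL_UNHOLD = 0, CTRL_HOLD = 1 };

/* Hypothetical, minimal channel model with only the fields needed here. */
struct mock_channel {
    const char *name;
    enum hold_state hold_state;
    int unhold_indications; /* how many UNHOLDs this channel received */
};

/* Stand-in for indicating UNHOLD: clears the hold state and records it. */
static void mock_indicate_unhold(struct mock_channel *chan)
{
    chan->hold_state = CTRL_UNHOLD;
    chan->unhold_indications++;
}

/*
 * Mirrors the patch: check each of the two channels joining the bridge
 * and take it off hold if it is still held. Returns how many channels
 * were taken off hold.
 */
static int unhold_on_join(struct mock_channel *c0, struct mock_channel *c1)
{
    int unheld = 0;

    assert(c0 != NULL && c1 != NULL);

    if (c1->hold_state == CTRL_HOLD) {
        mock_indicate_unhold(c1);
        unheld++;
    }
    if (c0->hold_state == CTRL_HOLD) {
        mock_indicate_unhold(c0);
        unheld++;
    }
    return unheld;
}
```

In the scenario from the description, SIP Device A's channel would be the one still in the held state after the Re-INVITE with Replaces, so only that channel would receive the UNHOLD.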

By: Jon Wright (jon.wright) 2022-04-16 07:49:12.191-0500

Fantastic! Thanks Yury, I'm not back in the office until w/c 25th so will give it a go then and report back.

Igor - thanks for the tips too. Good to know people have come up with other workarounds. Like most, I'm keen to get REFERs working so will have a go with the source code mod.

Josh - if this all checks out, would be great to have it merged into the main bridge_simple.c source for future releases.

Kind Regards,

Jon

By: Joshua C. Colp (jcolp) 2022-04-16 08:24:02.928-0500

Any fixes need to be attached as a patch with license agreement signed, once done you can also put it up for review and then it will be reviewed for inclusion and merged. The process is documented on the wiki[1].

[1] https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process

By: Yury Kirsanov (lt_flash) 2022-04-16 09:00:17.703-0500

I'm not too interested in going through the whole process of creating a branch in Git and so on. Anyway, to have this resolved we had to pay independent programmers, so I'm just happy to have it working and am putting it here purely out of goodwill. If anyone's interested in reviewing this patch and merging it into Asterisk - they're welcome to do so. Otherwise use the patch at your own risk. It looks like Digium is not too interested in reviewing it and merging it into the main branch anyway. From their own contribution process page:

Do I have to put my patch up for code review?
In short, no. Attaching your patch as a code contribution to an issue in JIRA is all that is required.

Which I did. Thanks everyone, and I hope my patch helps you in resolving this issue.

By: Igor Olhovskiy (IhorOlkhovskyi) 2022-04-16 09:07:03.589-0500

Joshua,
I've spoken with Yury and got his permission to go with this patch on my behalf. I'll prepare a PR according to your guidelines.

By: Yury Kirsanov (lt_flash) 2022-04-16 09:10:05.669-0500

Yep, no worries at all, happy to have Ihor doing that and have this code checked by someone professional and merged into Asterisk if all good. I'm not a developer so it's nearly impossible for me to contribute code myself.

By: Joshua C. Colp (jcolp) 2022-04-16 11:05:36.117-0500

Indeed, putting it up as a patch on here is all that is required to make it available but that doesn't guarantee that it will go up for review and be merged until someone like yourself, [~IhorOlkhovskyi] or Sangoma does so.

Speaking for both people at Sangoma and other community developers, it can be extremely demoralizing when you say things like "is not too interested". We're always interested in seeing changes go in, but it's just not possible for us to do everything for everyone all the time. We aim for a middle ground where the community also helps to a point.

Heck, it's a Saturday where I am and not even during business working hours and I'm here responding and checking on things.

Also adding a note here that despite the license agreement not being signed, since the change is minimal and is really similar to existing code I'll allow it.

By: Igor Olhovskiy (IhorOlkhovskyi) 2022-04-16 11:23:34.269-0500

[~jcolp], just to say, I really appreciate the work that you (and of course the whole community) are doing! I'm well aware of how busy the core developers here can be and why this kind of thing can take so much time to resolve.
That's exactly why I offered my assistance here to follow your guidelines. That is how a community should work from my point of view - everybody does their best, so I can help the whole community profit from Yury's patch.
I think remarks like "not too interested" referred to some perhaps not very successful negotiations for commercial support, which are more a money question than a community one.
So, again - many thanks for the job you're doing, even on your day off!

By: Joshua C. Colp (jcolp) 2022-04-16 11:32:10.488-0500

Indeed, and thank you for taking this on.

By: Yury Kirsanov (lt_flash) 2022-04-16 11:44:47.001-0500

Joshua,
I've contacted Sangoma sales via email and phone - nobody wanted to discuss a possibility of fixing this issue on a paid basis with me. I've asked multiple times if anyone from Sangoma will be able to have a look at the issue - and got an answer 'Maybe some day, but no timeframes'. That's where my 'not too interested' is coming from. The issue has been here for more than a year and I don't agree that it was a minor issue as it was a major issue for our business here in Australia. So I've done all the steps in sorting out this issue for our company and the patch did help to resolve it. I can't guarantee it's a correct patch or that it will resolve the same issue for everyone as I'm not a developer, again - we had to hire someone to resolve it for us in our particular case. I'm thankful for your support and appreciate that you're replying to us on Saturday, but it's my weekend too and also we have a long weekend due to public holidays in Australia - it's Easter time. But I still found some time to present the solution to this problem even though nobody wanted to help us in resolving this. So please understand me too - I don't want to go through the whole process of submitting a proper Git patch due to the lack of response from Sangoma. I was very frustrated nobody wanted to help us even on a paid basis. I do understand that Asterisk is free software but still it's hard to understand why new features in Asterisk are given priority over fixing old bugs. And this bug comes from at least Asterisk 11. As you can see by comments above - a lot of people are affected by it. Again - my apologies if I offended you or somebody else and hope that the patch is correct for this issue.

I can also add that I was able to apply it to Asterisk 18.11.2 today and it is working fine under our circumstances. Thanks.

By: Sean Bright (seanbright) 2022-04-21 09:29:37.044-0500

The (slightly modified) patch has been [submitted for review|https://gerrit.asterisk.org/c/asterisk/+/18411].

By: Igor Olhovskiy (IhorOlkhovskyi) 2022-04-21 10:16:04.707-0500

Sean, thanks!
I was not able to do a PR cause still waiting for signing contributor agreement to finish :)


By: Friendly Automation (friendly-automation) 2022-04-26 14:59:37.935-0500

Change 18440 merged by Friendly Automation:
bridge_simple.c: Unhold channels on join simple bridge.

[https://gerrit.asterisk.org/c/asterisk/+/18440|https://gerrit.asterisk.org/c/asterisk/+/18440]

By: Friendly Automation (friendly-automation) 2022-04-26 15:01:47.118-0500

Change 18411 merged by Friendly Automation:
bridge_simple.c: Unhold channels on join simple bridge.

[https://gerrit.asterisk.org/c/asterisk/+/18411|https://gerrit.asterisk.org/c/asterisk/+/18411]

By: Friendly Automation (friendly-automation) 2022-04-26 15:09:00.303-0500

Change 18441 merged by Friendly Automation:
bridge_simple.c: Unhold channels on join simple bridge.

[https://gerrit.asterisk.org/c/asterisk/+/18441|https://gerrit.asterisk.org/c/asterisk/+/18441]

By: Friendly Automation (friendly-automation) 2022-04-26 15:09:06.082-0500

Change 18442 merged by Friendly Automation:
bridge_simple.c: Unhold channels on join simple bridge.

[https://gerrit.asterisk.org/c/asterisk/+/18442|https://gerrit.asterisk.org/c/asterisk/+/18442]