Summary: ASTERISK-28118: Not forwarding RTP packets / Packet loss
Reporter: Samuel For (samfun)
Labels: pjsip sip
Date Opened: 2018-10-18 09:43:43
Date Closed:
Priority: Major
Regression?:
Status: Open/New
Components: Channels/chan_sip/General
Versions: 13.23.1, 16.0.0
Frequency of Occurrence: Constant
Related Issues:
Environment: Ubuntu 18.04. Asterisk 16.0.0. chan_sip.
Attachments:
( 0) build_asterisk16.sh
( 1) config.zip
( 2) packet_loss_digitalocean.pcap
( 3) test2_ast13_console.txt
( 4) test2_ast13.pcap
( 5) test2_ast16_console.txt
( 6) test2_ast16.pcap
Description:
Hi there,

We set out to run performance benchmarks of Asterisk 16.0.0 on different cloud providers, but instead ran into a problem where Asterisk continuously drops RTP packets. We believe this to be a bug in Asterisk.

This is the setup:

CLIENT - Client which generates a call
SUBJECT - The machine we are benchmarking; it forwards the call from CLIENT to ITSP
ITSP - Fake ITSP that receives the call and plays MOH back

With this setup, SUBJECT shows packet loss on both outgoing legs, towards CLIENT and towards ITSP. A tcpdump on SUBJECT shows the packets arriving from the ITSP without loss, but SUBJECT does not forward all of them: about 3% are lost and never reach the outbound network interface. Hence, it does not seem to be an external networking issue.
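
One way to put a number on the loss described above is to look for gaps in the RTP sequence numbers on each leg of a capture. The following is a minimal Python sketch, not the method used for this report. The function names `parse_rtp_header` and `loss_stats` are hypothetical, and it assumes the UDP payloads of a single RTP stream have already been extracted from the pcap by some other tool.

```python
import struct


def parse_rtp_header(payload: bytes):
    """Return (sequence_number, timestamp, ssrc) from a raw RTP packet.

    Returns None if the payload is too short or is not RTP version 2.
    """
    if len(payload) < 12 or payload[0] >> 6 != 2:
        return None
    # Fixed 12-byte RTP header: flags, marker/payload type, seq, timestamp, SSRC
    _, _, seq, timestamp, ssrc = struct.unpack("!BBHII", payload[:12])
    return seq, timestamp, ssrc


def loss_stats(seqs):
    """Return (expected, received, lost) for a list of 16-bit sequence numbers.

    Sequence numbers are extended past the 16-bit wraparound by applying the
    signed 16-bit difference to the previous packet's extended value, so
    mild reordering and duplicates do not count as loss.
    """
    if not seqs:
        return 0, 0, 0
    ext = seqs[0]
    seen = {ext}
    for s in seqs[1:]:
        # Signed difference in the range [-32768, 32767]
        delta = (s - (ext & 0xFFFF) + 0x8000) % 0x10000 - 0x8000
        ext += delta
        seen.add(ext)
    expected = max(seen) - min(seen) + 1
    received = len(seen)
    return expected, received, expected - received
```

Running `loss_stats` once on the sequence numbers received from the ITSP and once on those sent towards CLIENT would show whether the gaps appear only on the outbound side; `lost / expected` gives the loss fraction for each leg.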

We thought this might be related to the cloud provider, so we tested on AWS EC2 (Xen and Nitro/KVM) and DigitalOcean (KVM); it failed on all of them.

The SUBJECT is not under any other load, so this happens from the first call.

Machine specs for all of them:
2 CPUs
4 GB RAM
Ubuntu 18.04
No firewall enabled
The machine is essentially idle during the test: load is 0.0 and CPU idle is 100%.

Please find a PCAP attached that shows this phenomenon.

CLIENT: 85.24.248.224
SUBJECT: 104.248.244.68
ITSP: 35.157.105.234

I've also attached the full configuration and the build script we used to set up the Asterisk machine.

We have the machines up and running for a few more days so I'm happy to collect any data or run any tests you deem appropriate.

Thanks!
Comments:

By: Asterisk Team (asteriskteam) 2018-10-18 09:43:45.034-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Samuel For (samfun) 2018-10-18 09:45:27.600-0500

- Build script to run on a brand new machine
- PCAP showing the problem
- Config.zip contains all the Asterisk conf used in this benchmark

By: Joshua C. Colp (jcolp) 2018-10-18 09:57:10.367-0500

Please also provide the console output with debug enabled (debug to console in logger.conf) and core set debug 9 done on the CLI, including SIP traffic as well (sip set debug on).
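
The request above could look roughly like the following. This is a sketch under assumptions: the set of levels on the console line is illustrative and should be merged with whatever logger.conf already contains; the only point is that debug is included.

```ini
; logger.conf (sketch): send debug messages to the console
[logfiles]
console => notice,warning,error,debug,verbose

; Then, on the Asterisk CLI:
;   logger reload        ; pick up the logger.conf change
;   core set debug 9     ; raise the core debug level
;   sip set debug on     ; include chan_sip SIP traffic in the output
```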

By: Joshua C. Colp (jcolp) 2018-10-18 09:59:02.753-0500

Oh, and did you install 15 or 13 afterwards under the same conditions thus confirming it was limited to 16?

By: Samuel For (samfun) 2018-10-18 10:45:33.147-0500

Hi again. The problem occurs with 13 as well; I just tested that. Please find pcaps and console output for both 13 and 16 attached.

By: Samuel For (samfun) 2018-10-18 10:46:04.335-0500

Test run #2 for both 13/16 with console output and pcaps.

By: Samuel For (samfun) 2018-10-18 10:50:28.343-0500

Interestingly, the pcap for ast13 does not show the packet loss. Instead, when playing the RTP stream from SUBJECT -> CLIENT, we see "Inserted silence" markers, which produce the clipping sound.

By: Joshua C. Colp (jcolp) 2018-10-18 11:10:25.184-0500

You appear to be using the jitterbuffer - if you disable it and let each side handle the buffering themselves (which they should do) does that then resolve the problem?
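
For reference, disabling the jitterbuffer in chan_sip could look like the fragment below. This is a sketch, not the reporter's configuration (the contents of config.zip are not shown here); the commented-out values are illustrative only.

```ini
; sip.conf (sketch): turn the jitterbuffer off for chan_sip channels
[general]
jbenable = no          ; do not use a jitterbuffer on the receiving side
;jbforce = no          ; illustrative: would force the JB even when not needed
;jbimpl = fixed        ; illustrative: JB implementation (fixed or adaptive)
;jbmaxsize = 200       ; illustrative: maximum JB length in milliseconds

; Then, on the Asterisk CLI:
;   sip reload
```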

By: Samuel For (samfun) 2018-10-18 14:08:10.096-0500

Yes, indeed, the packet loss goes away when disabling the JB.

Is there a known issue with the JB hence your recommendation to turn it off?

The ITSPs we've worked with usually have a fixed JB that is smaller than what suits 4G/LTE mobile phones, so we want to use the JB to de-jitter mobile SIP traffic before it reaches the ITSP.

By: Joshua C. Colp (jcolp) 2018-10-19 05:09:57.630-0500

I don't know of any current known issues, but the JB code is extremely old and infrequently used. It may have problems in your scenario or usage.

By: Samuel For (samfun) 2018-10-19 05:42:26.528-0500

Does chan_pjsip use the same jitter buffer as chan_sip, or the one included with PJSIP? If it uses a different JB, we could run some tests and provide those results as well.

By: Joshua C. Colp (jcolp) 2018-10-19 06:45:50.579-0500

Same jitterbuffer for everything.

By: George Joseph (gjoseph) 2018-10-26 13:28:13.413-0500

Samuel,

It'd be very helpful if you could test using chan_pjsip.  Although it's the same jitterbuffer there may be other factors involved.  Having example chan_pjsip configurations to test with will also make it easier for us to test.

thanks.


By: Samuel For (samfun) 2018-10-30 09:33:59.035-0500

Hi George,

I had torn down the lab environment after the conversation with Joseph but I set it up again today on new servers on DigitalOcean.

Using chan_sip I can reproduce the packet loss in 100% of the calls.

Using chan_pjsip, one out of 40 test calls showed packet loss. I suppose that could have been a real network glitch.

The chan_sip configuration was attached earlier.

By: George Joseph (gjoseph) 2018-10-30 15:30:16.319-0500

I'm acknowledging this issue, but because it only happens with chan_sip and chan_sip is supported only by the community, it'll be up to a community member to address it.