[Home]

Summary:ASTERISK-27748: Random webrtc calls without audio and ALAW<->OPUS transcoding errors
Reporter:Fran Vicente (fvicente)Labels:opus pjsip webrtc
Date Opened:2018-03-19 07:15:06Date Closed:2018-07-12 12:53:34
Priority:MajorRegression?
Status:Closed/CompleteComponents:Codecs/codec_opus Core/CodecInterface
Versions:15.3.0 Frequency of
Occurrence
Occasional
Related
Issues:
Environment:CentOS release 6.9 (Final) Linux fvicastwrtc.local 2.6.32-696.el6.x86_64 #1 SMP Tue Mar 21 19:29:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux Running on VMWare ESXi 6.0Attachments:( 0) capture2.zip
( 1) full
( 2) pjsip_extensions_pogp.conf
Description:I have an environment which uses an Asterisk 15.3.0 to bridge calls to WebRTC endpoints configured using OPUS codec. The calls are originated on another Asterisk, connected to this using ALAW codec.

All the calls run the same dialplan (that simply calls the webrtc client) and the same webrtc endpoints, but randomly the following errors shows on the Asterisk console, and I get no audio on the calls.

{code}
[Mar 19 11:53:25] WARNING[8195][C-00000002] translate.c: Out of buffer space
[Mar 19 11:53:25] DEBUG[8194][C-00000002] translate.c: Sample size different 160 vs 960
[Mar 19 11:53:25] DEBUG[8194][C-00000002] translate.c: Sample size different 160 vs 960
[Mar 19 11:53:25] ERROR[8195][C-00000002] codec_opus.c: Opus: Unable to parse packet for number of samples: corrupted stream
[Mar 19 11:53:25] WARNING[8195][C-00000002] translate.c: no samples for opustolin
[Mar 19 11:53:25] ERROR[8195][C-00000002] codec_opus.c: Opus: decoding: corrupted stream
[Mar 19 11:53:25] WARNING[8195][C-00000002] translate.c: Out of buffer space
[Mar 19 11:53:25] DEBUG[8194][C-00000002] translate.c: Sample size different 160 vs 960
[Mar 19 11:53:25] ERROR[8195][C-00000002] codec_opus.c: Opus: decoding: corrupted stream
{code}

The dialplan simply does:
{code}
exten => _X.,1,Dial(PJSIP/${EXTEN})
{code}

See the complete log and configuration attached.
Comments:By: Asterisk Team (asteriskteam) 2018-03-19 07:15:07.292-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Joshua C. Colp (jcolp) 2018-03-20 06:36:21.876-0500

A packet capture containing the payload would also be needed for this so it could be examined.

By: Fran Vicente (fvicente) 2018-03-21 05:48:22.868-0500

I have attached the pcap dumped on the Asterisk with WebRTC server in the zip, with the configuration files and keys generated using 'ast_tls_cert' and passphrase 'qwerty'.

10.100.1.17 is the Asterisk with WebRTC.
10.100.1.15 is other Asterisk which starts the call.
89.1.250.22 is the WebRTC  client.

Let me know if you need anything else.
Thanks

By: Joshua C. Colp (jcolp) 2018-03-28 05:16:10.744-0500

Okay, so the transcoding is as follows:

Asterisk ULAW to OPUS to send to WebRTC client
WebRTC OPUS to ULAW for ingress?

If so then what is the specific browser (including version) that is in use?

By: Joshua C. Colp (jcolp) 2018-03-28 05:23:10.762-0500

If you are also using a different opus definition in codecs.conf that would also be needed.

By: Fran Vicente (fvicente) 2018-03-28 05:44:07.858-0500

The client phone runs on Google Chrome Version 65.0.3325.181 (Official Build) (64-bit), and using SIPjs as signaling api. But I'm able to reproduce the same issue on Firefox 59.0.1, although It seems that the problem occurs less frequently than on Chrome.

The transcoding path show by "core show channel PJSIP/XXXXX" on the WebRTC channel is:
{code}
WriteTranscode: Yes (slin@48000)->(opus@48000)
 ReadTranscode: Yes (opus@48000)->(slin@48000)
{code}

I've tried with different opus configurations (including enabling or disabling fec, dtx, cbr, etc.) but for the provided example I've restored the original codecs.conf, that does not apply any configuration to the opus codec as it has the lines commented.


By: Joshua C. Colp (jcolp) 2018-04-03 07:21:08.409-0500

We use Opus daily on a mix of machines without any problems like this. It may be that the audio source feeding the Opus encoder is resulting in larger than normal sample sizes in Opus frames causing problems. Does this occur on multiple systems or just one?

By: Fran Vicente (fvicente) 2018-04-04 02:22:14.492-0500

We are able to reproduce the problem on multiple systems, but on all of them, the Asterisk servers are running virtualized under VMware ESXi. The hosts where the virtual machines run are mostly idle, so it shouldn't be a problem related with high resources utilization.

To discard problems related with packet loss, we have tried with clients running on the same local network, but the problem also occurs.

I found strange that once the error occurs, it continues until the call gets hung up.

By: Kevin Harwell (kharwell) 2018-04-11 16:58:58.685-0500

I didn't get the corruption error, but the audio was unintelligible (buzzing sound) when I ran the described scenario:

Asterisk2 (alaw) -> (alaw) Asterisk1 (opus) -> (opus) Endpoint

I tried substituting a couple of different codecs for alaw (ulaw, g722) and heard the same result. Also noticed the following in the debug, which is similar to the reporters:
{noformat}
[Apr 11 16:33:03] VERBOSE[7398][C-00000001] res_rtp_asterisk.c: Got  RTP packet from    10.24.22.255:4038 (type 107, seq 014493, ts 3091628534, len 000050)
[Apr 11 16:33:03] DEBUG[7396][C-00000001] translate.c: Sample size different 960 vs 160
[Apr 11 16:33:03] DEBUG[7396][C-00000001] res_rtp_asterisk.c: Ooh, format changed from none to alaw
[Apr 11 16:33:03] DEBUG[7396][C-00000001] res_rtp_asterisk.c: Starting RTCP transmission on RTP instance '0x7fb20001db88'
[Apr 11 16:33:03] VERBOSE[7396][C-00000001] res_rtp_asterisk.c: Sent RTP packet to      10.24.17.53:17884 (type 08, seq 028459, ts 000200, len 000160)
[Apr 11 16:33:03] DEBUG[7396][C-00000001] translate.c: Sample size different 960 vs 160
[Apr 11 16:33:03] VERBOSE[7396][C-00000001] res_rtp_asterisk.c: Sent RTP packet to      10.24.17.53:17884 (type 08, seq 028460, ts 000360, len 000160)
[Apr 11 16:33:03] VERBOSE[7396][C-00000001] res_rtp_asterisk.c: Got  RTP packet from    10.24.17.53:17884 (type 08, seq 009371, ts 000160, len 000160)
[Apr 11 16:33:03] VERBOSE[7398][C-00000001] res_rtp_asterisk.c: Got  RTP packet from    10.24.22.255:4038 (type 107, seq 014494, ts 3091629494, len 000050)
[Apr 11 16:33:03] DEBUG[7396][C-00000001] translate.c: Sample size different 160 vs 960
[Apr 11 16:33:03] DEBUG[7396][C-00000001] translate.c: Sample size different 960 vs 160
[Apr 11 16:33:03] VERBOSE[7396][C-00000001] res_rtp_asterisk.c: Sent RTP packet to      10.24.17.53:17884 (type 08, seq 028461, ts 000520, len 000160)
[Apr 11 16:33:03] DEBUG[7398][C-00000001] res_rtp_asterisk.c: Ooh, format changed from none to opus
[Apr 11 16:33:03] VERBOSE[7398][C-00000001] res_rtp_asterisk.c: Sent RTP packet to      10.24.22.255:4038 (type 107, seq 012029, ts 000672, len 000102)
[Apr 11 16:33:03] VERBOSE[7396][C-00000001] res_rtp_asterisk.c: Got  RTP packet from    10.24.17.53:17884 (type 08, seq 009372, ts 000320, len 000160)
[Apr 11 16:33:03] DEBUG[7396][C-00000001] translate.c: Sample size different 160 vs 960
[Apr 11 16:33:03] VERBOSE[7398][C-00000001] res_rtp_asterisk.c: Sent RTP packet to      10.24.22.255:4038 (type 107, seq 012030, ts 001632, len 000103)
[Apr 11 16:33:03] VERBOSE[7398][C-00000001] res_rtp_asterisk.c: Got  RTP packet from    10.24.22.255:4038 (type 107, seq 014495, ts 3091630454, len 000050)
[Apr 11 16:33:03] DEBUG[7396][C-00000001] translate.c: Sample size different 960 vs 160
[Apr 11 16:33:03] VERBOSE[7396][C-00000001] res_rtp_asterisk.c: Sent RTP packet to      10.24.17.53:17884 (type 08, seq 028462, ts 000680, len 000160)
[Apr 11 16:33:03] VERBOSE[7396][C-00000001] res_rtp_asterisk.c: Got  RTP packet from    10.24.17.53:17884 (type 08, seq 009373, ts 000480, len 000160)
[Apr 11 16:33:03] DEBUG[7396][C-00000001] translate.c: Sample size different 160 vs 960
[Apr 11 16:33:03] VERBOSE[7398][C-00000001] res_rtp_asterisk.c: Sent RTP packet to      10.24.22.255:4038 (type 107, seq 012031, ts 002592, len 000101)
{noformat}
Here's the basic _pjsip.conf_ endpoint configuration I used on ast1:
{noformat}
[endpoint_t](!)
type=endpoint
context=default
direct_media=no
rtcp_mux=yes

[aor_t](!)
type=aor
qualify_frequency=0
max_contacts=10

;; Phone endpoint using opus
[102](aor_t)
contact=sip:102@<endpoint IP>
mailboxes=102@default

[102](endpoint_t)
aors=102
allow=!all,opus

;; Asterisk endpoint using alaw
[ast2](aor_t)
contact=sip:<IP of ast2>

[ast2](endpoint_t)
aors=ast2
from_user=ast1
allow=!all,alaw
{noformat}
And for ast2:
{noformat}
[ast1](aor_t) ; uses same template definition as ast1
contact=sip:<IP of ast1>

[ast1](endpoint_t) ; uses same template definition as ast1
aors=ast1
from_user=ast2
allow=!all,alaw
{noformat}
I then originated a call from ast2 to ast1, which called endpoint 102:
{noformat}
CLI> channel originate PJSIP/102@ast1 application Playback demo-congrats
{noformat}

By: Kevin Harwell (kharwell) 2018-04-11 17:10:11.499-0500

ast1 was running 64 bit Ubuntu 14.04 and ast2 64 bit Ubuntu 16.04. Neither is a VM. Also this happens for me every time I run it.

By: Kevin Harwell (kharwell) 2018-04-11 17:13:44.305-0500

Oh forgot to mention, as you can probably tell from the configuration, the problem occurred for me using a regular SIP phone that supported opus. However, I did check using a webrtc endpoint as well and -heard no audio at all upon answering- I was able to hear audio fine.

By: Kevin Harwell (kharwell) 2018-06-28 17:16:25.516-0500

[~fvicente], Is this still a problem?

I went back and tested my scenario again and realized I had a configuration problem for the endpoint I was using (I needed to specify values for the bitrate and max_playback_rate in codecs.conf). Once I did the call/audio in both directions sounded as it should.

That being said I am now unable to replicate your problem. I've gone back too and taken a look at the given pcap and configuration files. I only see the one call negotiated in the pcap and it negotiates L16/8000, which does not correspond to the given log. I'm not seeing any sip trace where a call using opus was attempted.

In order to move forward with this issue we'll require a full Asterisk debug log that includes debug set to at least 5, the sip trace of an offending call, and rtp debug turned on. Also include a packet capture that corresponds to that debug log.

If you are not sure how, please follow the instructions on the wiki [1] for how to collect debugging information from Asterisk.

Thanks!

[1] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information



By: Kevin Harwell (kharwell) 2018-07-12 12:53:34.224-0500

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidlines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines