Summary: ASTERISK-25629: [patch] Native Packet-Loss Concealment (PLC)
Reporter: Alexander Traud (traud)
Labels: patch
Date Opened: 2015-12-15 14:00:17.000-0600
Date Closed:
Priority: Minor
Regression?:
Status: Open/New
Components: Codecs/codec_ilbc, Codecs/codec_speex, Codecs/General
Versions: 11.22.0, 13.9.1
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) native_plc.patch
Description: In VoIP/SIP, the RTP media is transported not via reliable TCP but via [UDP|https://en.wikipedia.org/wiki/User_Datagram_Protocol#Comparison_of_UDP_and_TCP]. This lets the data arrive as fast as possible. However, because UDP is used, packets might never arrive, or they might take a different route through the Internet, so that one packet overtakes earlier packets. Stated differently: packets might arrive late. In both cases, those packets are lost because they cannot be used to rebuild the media stream.

Therefore, VoIP-specific audio codecs like iLBC, Speex, and SILK/Opus are able to conceal lost packets. This is called Packet-Loss Concealment (PLC). Here, it is called native PLC because no additional source code has to be written; the underlying codec library already supports it. Additionally, Asterisk offers generic PLC when writing signed linear (slin) audio.

To conceal packet loss, the codec library must be aware that a packet got lost. Furthermore, because RTP packets are interdependent, late packets must not be forwarded to the library, so as not to confuse its state. If a late packet arrived, the library would not know that it has to discard it, because the library does not know the packet's RTP sequence number. The VoIP application (in our case Asterisk) has to discard late packets and indicate lost packets to the library. This is true for iLBC, Speex, SILK, and Opus.
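The bookkeeping described above can be sketched in a few lines of C. This is only an illustration, not the code from the attached patch: the names {{classify_seq}}, {{seq_verdict}}, and {{seq_result}} are hypothetical, and the sketch assumes that any packet more than half the 16-bit sequence space "behind" the expected number is late rather than far ahead.

```c
#include <stdint.h>

/* Hypothetical verdicts for an incoming RTP packet; not an Asterisk API. */
enum seq_verdict {
    SEQ_IN_ORDER,    /* exactly the packet we expected */
    SEQ_LOST_BEFORE, /* packets are missing in front of this one */
    SEQ_LATE         /* older than expected: discard, do not decode */
};

struct seq_result {
    enum seq_verdict verdict;
    unsigned int lost; /* number of packets to conceal before decoding */
};

/*
 * expected: the sequence number anticipated next
 * received: the sequence number of the packet that just arrived
 * RTP sequence numbers are 16 bit wide and wrap around from 65535 to 0.
 */
static struct seq_result classify_seq(uint16_t expected, uint16_t received)
{
    struct seq_result r = { SEQ_IN_ORDER, 0 };
    /* Distance modulo 65536; the cast makes the wrap-around explicit. */
    uint16_t distance = (uint16_t)(received - expected);

    if (distance == 0) {
        return r;
    }
    if (distance < 0x8000u) {
        /* Ahead of us: everything in between is missing. */
        r.verdict = SEQ_LOST_BEFORE;
        r.lost = distance;
    } else {
        /* Behind us (assumption: half the number space counts as "late"). */
        r.verdict = SEQ_LATE;
    }
    return r;
}
```

With expected = 65535 and received = 1, the sketch reports two lost packets (65535 and 0). With expected = 100 and received = 98, it reports a late packet, which the application would drop instead of handing it to the codec library.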

Long story short: contrary to [the documentation|https://wiki.asterisk.org/wiki/pages/viewpage.action?pageId=5243109], native PLC does not happen, with or without a jitter buffer.

All Long-Term-Support releases (including Asterisk 1.8.32.3, Asterisk 1.4.44, and Asterisk 1.2.40) were re-tested with a simple setup: two Wi-Fi access points bridged via Ethernet to the same DHCP server, using the same SSID and just WPA2-Personal, but transmitting on different 2.4 GHz channels (Wi-Fi roaming), plus a Wi-Fi enabled VoIP/SIP client moving around between those access points. Whenever the Wi-Fi client changes the access point, it has to authenticate again. That creates a packet loss of about one to several dozen packets. In those Asterisk releases, the iLBC transcoding module was passed through. The backtrace is
main/channel.c:ast_read(.)
 main/translate.c:ast_translate(.)
   main/translate.c:framein(.)
     codecs/codec_ilbc.c:framein(.)
but native PLC was not performed, because there was no indication of lost packets. Furthermore, late packets were not dropped but forwarded like ordinary packets. Those packets with unexpected RTP sequence numbers were double-checked via Wireshark » Menu » Telephony » RTP » Show All Streams » Analyse » Next non-OK.
Comments:

By: Alexander Traud (traud) 2015-12-15 14:10:47.985-0600

The attached patch fixes this issue for Asterisk 13 and is threefold:

The new code determines missing packets based on the RTP sequence number. This allows
(1) to detect the number of lost packets (to be forwarded to the transcoding library) and to
(2) discard late packets (which are not forwarded to the transcoding library).

My patch for ASTERISK-25353 now causes a regression, because: when packets are lost but recreated by PLC, the same number of frames was passed into the for-loop at once. For example, when 40 packets got lost, 40 frames are missing, and all 40 frames were transcoded in each step. With multi-level transcoding, again 40 frames were forwarded. This overflows internal buffers. For example, codec_resample is able to buffer only two packets from the Opus codec. When more than two packets are lost, the transcoding would not work. Therefore

(3) the loop was changed to transcode each frame individually.
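The difference between forwarding all recreated frames at once and forwarding them one by one can be shown with a toy model. This is not the code from the patch: {{stage}}, {{feed_batched}}, and {{feed_per_frame}} are hypothetical names, and the capacity of two frames is taken from the codec_resample example above as an assumption about the next translation step.

```c
#include <stddef.h>

/* Assumed: the next translation step can hold at most two frames. */
#define STAGE_CAPACITY 2

/* Toy model of a downstream translation step with a small input buffer. */
struct stage {
    size_t buffered;   /* frames currently held */
    size_t forwarded;  /* frames successfully passed on */
    size_t overflowed; /* frames dropped because the buffer was full */
};

static void stage_push(struct stage *s)
{
    if (s->buffered < STAGE_CAPACITY) {
        s->buffered++;
    } else {
        s->overflowed++;
    }
}

static void stage_drain(struct stage *s)
{
    s->forwarded += s->buffered;
    s->buffered = 0;
}

/* Old behaviour in this model: push every frame, then drain once. */
static void feed_batched(struct stage *s, size_t frames)
{
    size_t i;
    for (i = 0; i < frames; i++) {
        stage_push(s);
    }
    stage_drain(s);
}

/* Changed behaviour in this model: drain after every single frame. */
static void feed_per_frame(struct stage *s, size_t frames)
{
    size_t i;
    for (i = 0; i < frames; i++) {
        stage_push(s);
        stage_drain(s);
    }
}
```

In this model, feeding 40 frames in one batch forwards only 2 and drops 38, while feeding them individually forwards all 40. The real translator path is more involved; the sketch only illustrates why the loop granularity matters.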

By: Alexander Traud (traud) 2016-09-26 06:30:57.973-0500

The Resolution state was not changed when the change was reverted in Git. The change is not part of any branch yet. Therefore, this issue is neither closed nor fixed yet. I cannot find the button to change the Resolution state. I guess I am not allowed to change it back to Open.

By: Steve Davies . (stevedavies) 2016-10-05 06:35:54.379-0500

Hi,

Comment from a user - I applied this patch to my 13.10.0 system and it really didn't work well for me.

One case was a call which was being transcoded alaw -> (slin) -> g729.

The user on the g729 side called up an inband transfer (# on our system).  At that point :

{code:title=full log|borderStyle=solid}
[Oct  4 12:24:33] DTMF[28899][C-0000d7a3] channel.c: DTMF begin '#' received on SIP/41.221.230.19:5060-0001bd56
[Oct  4 12:24:33] DTMF[28899][C-0000d7a3] channel.c: DTMF begin passthrough '#' on SIP/41.221.230.19:5060-0001bd56
[Oct  4 12:24:33] DTMF[28899][C-0000d7a3] channel.c: DTMF end '#' received on SIP/41.221.230.19:5060-0001bd56, duration 60 ms
[Oct  4 12:24:33] DTMF[28899][C-0000d7a3] channel.c: DTMF end accepted with begin '#' on SIP/41.221.230.19:5060-0001bd56
[Oct  4 12:24:33] DTMF[28899][C-0000d7a3] channel.c: DTMF end '#' detected to have actual duration 61 on the wire, emulation will be triggered on SIP/41.221.230.19:5060-0001bd56
[Oct  4 12:24:33] DTMF[28899][C-0000d7a3] channel.c: DTMF end '#' has duration 61 but want minimum 80, emulating on SIP/41.221.230.19:5060-0001bd56
[Oct  4 12:24:33] DTMF[28899][C-0000d7a3] channel.c: DTMF end emulation of '#' queued on SIP/41.221.230.19:5060-0001bd56
[Oct  4 12:24:33] VERBOSE[28899][C-0000d7a3] file.c: <SIP/41.221.230.19:5060-0001bd56> Playing 'pbx-transfer.g729' (language 'en')
[Oct  4 12:24:33] VERBOSE[28544][C-0000d7a3] res_musiconhold.c: Started music on hold, class '2521', on channel 'Local/7278200@enswitch-phone-000134c9;2'
[Oct  4 12:24:33] NOTICE[28509][C-0000d7a3] translate.c: 18772 lost frame(s) 0/46762 (g729@8000)->(slin@8000)->(alaw@8000)
[Oct  4 12:24:33] VERBOSE[28509][C-0000d7a3] codec_g72x.c:     -- G.729 PLC
..
{code}

At that point my log has 18772 "G.729 PLC" messages all in the same second.

As far as I can tell the logic decided to ask g729 codec to construct 18772 "plc" frames.  That seems crazy and the result is not pretty.

Regards,
Steve


By: Alexander Traud (traud) 2016-10-18 10:19:30.104-0500

You found a bug for sure. Now, the cause must be found. In your case, the lost/late-packet detection does not work because the RTP sequence numbers do not advance as expected. Please describe your setup in greater detail, for example which channel driver you are using, like {{chan_sip}} or {{res_pjsip}}. What is the remote party: an Asterisk as well, or a VoIP/SIP phone? If possible, capture the RTP packets via [Wireshark|https://www.wireshark.org/#download] and look at the sequence numbers. Anything special? And so on … Apple summarized the parts that constitute a [good issue report…|http://web.archive.org/web/20160324163652/https://developer.apple.com/bug-reporting/using-bug-reporter/problem-detail/] If you cannot debug this yourself, I must be able to reproduce your scenario within minutes, thanks to a detailed step-by-step description. If all of the above is too complicated, just one question: what do you mean by in-band transfer? Is that a forwarding to another extension or a special app in your dialplan?
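When sequence numbers jump unexpectedly, a purely arithmetic gap calculation reports a very large number of "lost" packets. One possible guard, sketched here and not taken from the attached patch as far as this thread shows, is to treat implausibly large gaps as a restart of the stream instead of concealing them. The function name {{plc_frames_for_gap}} and the limit {{MAX_CONCEAL_FRAMES}} are hypothetical; the value 50 is an arbitrary assumption for illustration.

```c
#include <stdint.h>

/* Assumed upper bound of frames worth concealing; illustrative only. */
#define MAX_CONCEAL_FRAMES 50u

/*
 * Returns how many PLC frames to request from the codec library.
 * Sets *resync to 1 when the gap is implausibly large, so that the caller
 * can adopt the new sequence number without concealing anything.
 */
static unsigned int plc_frames_for_gap(uint16_t expected, uint16_t received,
                                       int *resync)
{
    /* Distance modulo 65536, as RTP sequence numbers wrap around. */
    uint16_t gap = (uint16_t)(received - expected);

    *resync = 0;
    if (gap >= 0x8000u) {
        return 0; /* late packet: nothing to conceal */
    }
    if (gap > MAX_CONCEAL_FRAMES) {
        *resync = 1; /* looks like a sequence jump, not real loss */
        return 0;
    }
    return gap;
}
```

With such a guard, a jump of 18772 sequence numbers would request no concealment frames and signal a resynchronization, while a gap of three would still request three frames. Whether a jump is really what happened in the reported call can only be confirmed with a packet capture.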