[Home]

Summary:ASTERISK-21246: [patch] use of rtpkeepalive uses CN packet with marker bit set, plus a ULAW payload instead of CN
Reporter:Peter Katzmann (pk16208)Labels:
Date Opened:2013-03-14 05:20:36Date Closed:2013-05-08 23:08:38
Priority:MajorRegression?
Status:Closed/CompleteComponents:Channels/chan_sip/General Resources/res_rtp_asterisk
Versions:1.8.20.1 11.3.0 Frequency of
Occurrence
Related
Issues:
is caused byASTERISK-17304 [patch] [regression] rtpkeepalive no longer works, but is still documented to work
Environment:ubuntu 10.04Attachments:( 0) asterisk.conf.tar.gz
( 1) asterisk-21246-rtp-cng-payload-error_1.8_v2.diff
( 2) asterisk-21246-rtp-cng-payload-error_1.8.diff
( 3) CallWithTranscoding
( 4) codecchange.0
( 5) codecchange.2
( 6) codecchange.3
( 7) codecchange-24032013.bz2
( 8) codecchange-24032013-udp.pcap
( 9) codecchange-filtered.pcap.gz
(10) codecchange-patched
(11) codecchange-patched_v2.txt
(12) codecchange-patched.pcap
(13) codecchange-patch-v2.pcap
(14) filtered1512-2.pcap
(15) filtered1512-2-leg.pcap
(16) myDebugLog
(17) myDebugLog
(18) negiot-switch
(19) vanilla-trace.txt
(20) ws-trace.txt
Description:See attached traces
During call asterisk switches the negotiated codec (alaw) in the rtp to ulaw.
If there is no renegotiation (like on my snom 720 sometimes), the user has a audible distortion.
Comments:By: Michael L. Young (elguero) 2013-03-14 09:11:09.847-0500

We require a complete debug log to help triage the issue. This document will provide instructions on how to collect debugging logs from an Asterisk machine for the purpose of helping bug marshals troubleshoot an issue: https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information



By: Michael L. Young (elguero) 2013-03-14 12:36:31.640-0500

Peter, has the Asterisk code been modified?  Are you running any patches on it?

By: Peter Katzmann (pk16208) 2013-03-15 02:33:09.026-0500

Michael yes,
one custom patch who processes the qsig aoc info of the patton smartnode,
i also use the patches for the debian package structure

Nothing more. But i will make a try to sort out that our custom patch do no harm


By: Peter Katzmann (pk16208) 2013-03-15 06:49:55.503-0500

Traces against vanilla asterisk

By: Peter Katzmann (pk16208) 2013-03-19 15:56:04.487-0500

Hello,
i upload a new debug file. In this case we are forcing transcoding from g722 to alaw and the glitches are gone


By: David Woolley (davidw) 2013-03-20 05:59:33.529-0500

If the phone included both codecs in its SDP, it is perfectly legitimate for the codec to change mid-stream.  This actually happens normally when people send RFC 2833 DTMF.

By: David Woolley (davidw) 2013-03-20 06:09:01.485-0500

A midstream change that can confuse some phones is if there is a discontinuity in the timestamps, but no corresponding change in SSRC.  At least as of early 1.6, Asterisk only set the marker bit in this case, because there was no mechanism to propagate the SSRC between incoming and outgoing channels, when non-natively bridged.  The marker bit is not reliably transmitted, and its semantics are an oiptional opportunity to flush latency buffers, not the invalidation of the time stamp.

We only found this problem on Cisco phones, and worked round it by incremetning SSRC whenever the marker bit was set.

By: Peter Katzmann (pk16208) 2013-03-20 06:41:46.013-0500

Hello,
i can deliver you a pcap file if it will help.
The snom phones with 8.4 and 8.7 firmware series are hte problem in out case. The Siemens OpenStage seems to work reliable.
But you said when Both sides Allow the codec, but only the patton side allows the usage of ulaw, the asterisk side has the following config:
disallow=all
allow=alaw h261 h263 h263p h264, and the phone was configured to answer only with one codec.


By: Rusty Newton (rnewton) 2013-03-27 17:31:56.063-0500

Please provide a pcap and corresponding DEBUG log for a call on vanilla Asterisk.

By: Peter Katzmann (pk16208) 2013-04-02 09:33:53.822-0500

High,
the traces from 15tee march are made against vanilla build.
the pcap file should be the corresponding one

By: David Woolley (davidw) 2013-04-03 05:31:39.401-0500

The vanilla-trace.txt files shows a 415 response which is not being acknowledged, and for which the corresponding INVITE has not been logged.  I can't find this in the PCAP.

There is a re-invite to A-law on the other leg, even though the re-invite seems to have been rejected.

By: Peter Katzmann (pk16208) 2013-04-03 07:19:09.989-0500

call to myDebugLog 15/Mar/13 12:49 PM and vanilla trace

By: David Woolley (davidw) 2013-04-03 07:51:33.416-0500

The pcap seems to be missing the most important leg and the debug is missing sip debug output.

However, my guess is that Asterisk has native bridged the call (packet to packet), because the negotiated codecs are compatible. 10.107.160.11 has then either unilaterally changed the codec, or has tried to re-invite to it (i.e. the 415 is the response to that re-invite, which somehow hasn't got logged), but had the re-invite rejected, then changed the codec anyway.

Because Asterisk is native bridging, it is trusting 10.107.160.11 to only use agreed codecs, so doesn't notice the codec change and therefore does not drop out of the native bridge and start transcoding.

A work around may be to enable a feature that is incopatible with native bridging (because it needs the media stream to be inspected).

An accurate and complete capture of the leg to 10.107.160.11 is really needed to confirm that it is at fault.

By: Peter Katzmann (pk16208) 2013-04-03 08:29:27.932-0500

Ok the call leg should be here soon
An asterisk debug log with sip ? Also no problem at all if necessary

By: Peter Katzmann (pk16208) 2013-04-03 08:30:06.276-0500

call leg to filtered1512-2

By: Rusty Newton (rnewton) 2013-04-04 17:50:59.104-0500

Peter we'll probably need an Asterisk log file with SIP debug turned on as well. That log should correlate with the traffic in your two latest pcaps:

filtered1512-2.pcap

filtered1512-2-leg.pcap

If you can't get that now, then you'll need to provide new pcaps with new Asterisk log files.

Essentially we need the pcaps to show all the SIP and RTP traffic. Then there needs to be corresponding Asterisk log files that contain SIP debug output along with the VERBOSE and DEBUG message types (both turned up to level 5).

Sorry for any confusion!






By: Peter Katzmann (pk16208) 2013-04-11 03:22:59.631-0500

Hello,
two new files uploaded. PCAP Trace and corrsponding debug log as suggested.

This trace has the branch/1.8 svn asterisk as base

peter

By: Rusty Newton (rnewton) 2013-04-17 17:37:48.016-0500

Peter,

The calls in the pcap don't seem to appear in codecchange.0 log file. The calls in the pcap have call-id  *3c2683f54ccd-awzivjl30opn* and *2eede5053b53a84e065150367d4ececa@10.0.0.5:5060*. I don't see these in the log file.



By: Peter Katzmann (pk16208) 2013-04-18 03:44:52.241-0500

Aaarg my fault, it was in codecchange.3

Will upload it immediate

By: Rusty Newton (rnewton) 2013-04-23 17:01:53.330-0500

I looked through the PCAP again along with the codecchange.3 debug. A few observations

* Asterisk settles on the ALAW codec for both call legs
* About ~20 seconds after the ALAW streams start; for no obvious reason Asterisk starts sending ULAW marker packets to both endpoints involved in the call.
* Asterisk is sending the ULAW packets to both end points, but the endpoints are only talking ALAW.
* The ULAW packets are being sent from and to the same ports used in the ALAW stream
* There was no re-negotiation of Codecs. The codec did not change.
* Asterisk starts re-inviting the caller, but SDP is same on both sides so no changes happen and Asterisk ignores the SDP.
* I don't know whats going on..

Are you positive this is from your vanilla, unmodified Asterisk install?

Please provide the rtp.conf, sip.conf and any other relevant configuration files for the peers/users involved.

Please provide a new debug log and pcap, but also enable RTP debug for the Asterisk log. (Sorry I didn't have you do that originally)

By: Michael L. Young (elguero) 2013-04-23 17:56:06.401-0500

I notice that those "ULAW" packets have a different SSRC and a different Seq.  The seq number doesn't even match the sequence of the "ALAW" packets.  The Marker bit is on and the Timestamp is 0.  Weird.

By: Peter Katzmann (pk16208) 2013-04-24 08:25:00.368-0500

As requested,
-complete asterisk config.
-New trace with rtp debug enabled
sip debug enabled
debug and verbose at level 7
-asterisk 1.8.21
- pcap trace


By: Michael L. Young (elguero) 2013-04-24 09:15:46.195-0500

Peter, thanks for the debug info.  The RTP debug has what we needed to pinpoint what is happening.

Rusty, looks like it might have something to do with comfort noise (CNG) in res/res_rtp_asterisk.  The debug message is wrong to start.  In looking at the code, the debug message is displayed after it increases the sequence number by one.  The PCAP has the correct seq number.  So, this part of the code needs to be looked at in order to make sure it is behaving properly (using the right codec for the CNG) as well as we need to correct the debug message.

By: Michael L. Young (elguero) 2013-04-24 10:34:01.161-0500

Hmmm... That is what I get for rushing to make comments and rushing at looking at code.  Debug message is correct... according to the code.  Digging into this a bit further now.

By: Michael L. Young (elguero) 2013-04-24 12:34:51.201-0500

Peter, can you give this patch a try and see if this corrects the problem.  [^asterisk-21246-rtp-cng-payload-error_1.8.diff]

Thanks

By: Peter Katzmann (pk16208) 2013-04-25 06:40:17.080-0500

Logfiles with asterisk-21246-rtp-cng-payload-error_1.8.diff.

Its different but still does change the codec

By: Michael L. Young (elguero) 2013-04-25 11:52:35.361-0500

Please try this updated patch if you can.  [^asterisk-21246-rtp-cng-payload-error_1.8_v2.diff]

I appreciate your testing.

What is actually supposed to be happening is that, since you have "rtpkeepalive" set to 20 secs, Asterisk will send an empty rtp packet.  We are using a CN packet to keep the rtp alive.

The bug is that we are setting the payload to ULAW when it should have been CN.  Also, we should not be setting the marker bit like we currently are doing.  I also have fixed the debug message that has been bothering me so that we get an accurate debug message which matches the actual seqno in the pcap.

Thanks for your testing and feedback.  I was also able to finally produce these packets by setting rtpkeepalive and everything appears to be working but it is always good to have another person test it in their environment.

By: Peter Katzmann (pk16208) 2013-04-26 09:30:51.043-0500

Just for your info.
No negative side effects observed

peter

By: Michael L. Young (elguero) 2013-04-26 10:10:37.857-0500

Peter,

Are you sure you are using only the second patch?  The pcap and the debug log look the same as if it was done with the first patch.

Also, just to clarify, has your audio distortion been fixed?

Thanks

By: Peter Katzmann (pk16208) 2013-04-26 11:10:23.705-0500

Michael,

Currently not so sure.
It seems that the debian build mixed both patches.
So i have to redo it and will send report on monday.

Currently i can't hear audio problems but will look harder :)


By: Peter Katzmann (pk16208) 2013-04-29 01:47:58.410-0500

New traces now with correctly applied v2 patch

By: Michael L. Young (elguero) 2013-04-30 07:56:38.765-0500

That looks more like it.  Thanks for helping with the testing.  We will work on getting this into the code base now.