[Home]

Summary:ASTERISK-23142: Large timestamp skew in RTP stream during blind transfer
Reporter:Filip Frank (frenk77)Labels:
Date Opened:2014-01-15 05:06:22.000-0600Date Closed:
Priority:MajorRegression?
Status:Open/NewComponents:Resources/res_rtp_asterisk
Versions:11.6.0 11.7.0 13.18.4 Frequency of
Occurrence
Related
Issues:
is related toASTERISK-25268 Neither a src change or marker after (attended) transfer
Environment:Attachments:( 0) asterisk_rtp_large_transfer_skew.patch
( 1) iptel207setting.txt
( 2) rtp_instance_debug.patch
( 3) rtp_timestamp_jump.pcap
Description:I have problem with asterisk 11, I call from phone A (ip 10.76.17.130 - iptel207) to phone B(iptel106). After answer the call on B, i use blind transfer asterisk feature to phone C(iptel500). Problem is after bridge with A and C, the timestamps in RTP from asterisk to phone A jumps. In first RTP packet with jumped timestamp the market bit is set, but SSRC is not changed. This is test situation, but in real i have a problem if A is Ericsson IMS(operator O2). They drop audio from our asterisk after timestamp jump and A dont hear  C. I my attached pcap trace bad RTP is with SSRC 0x445FBF7D. I see in wireshark big skew too....

I found this in RTP RFC3550:

" All packets from a synchronization source form part of the same
     timing and sequence number space, so a receiver groups packets by
     synchronization source for playback."

Then I think timestamps cant jump without SSRC change. We not use direct media. All RTPs flow over asterisk.
Comments:By: Filip Frank (frenk77) 2014-01-15 05:07:20.197-0600

Trace from my test call.

By: Filip Frank (frenk77) 2014-01-15 05:11:10.733-0600

sip setting for phone A - iptel207

By: Matt Jordan (mjordan) 2014-01-15 08:55:37.781-0600

I don't believe this is a bug.

Looking at SSRC 0x445FBF7D, at at 6.039734 a period of time occurs where Asterisk is not sending RTP to the destination. At 6.244873, we resume sending.

The timestamp jump for the SSRC in question reflects reality: during the time period where the bridge is broken and the transfer is being performed, there is a space in time where no RTP is sent from Asterisk to the other source. When a bridge is re-established, the delta in the timestamp is reflective of the last RTP that was sent. So the combination of SSRC, timestamp, and sequence number is correct for that stream.

From Asterisk's perspective, the RTP flow from itself to the UA that it established a session with hasn't changed. The fact that it is bridged with another source is immaterial: Asterisk is a B2BUA and the RTP stream established with the other UA in the bridge is an entirely different RTP stream. That shouldn't result in another SSRC/timestamp/seq no for the stream for the other UA, which is unbroken.



By: Filip Frank (frenk77) 2014-01-15 14:57:11.589-0600

I think milions of milisecond jump do not reflect reality of 200ms free space i RTP between bridge.....

By: Matt Jordan (mjordan) 2014-01-15 16:42:30.617-0600

Looking at the two packets I referred to:

|| Time || Source || Destination || SSRC || Seq || Time ||
| 6.039734 | 10.76.17.2 | 10.76.17.130 | 0x445FBF7D | 10492 | 23720 |
| 6.244873 | 10.76.17.2 | 10.76.17.130 | 0x445FBF7D | 10493 | 25360 |

So there isn't a jump here.

I did, however, take another look at the RTP stream using Wireshark's RTP analyzer, and it did point out the packet that you were referring to - which actually occurs at 7.374434. At that point in time, the timestamp does jump from 33680 to 1038379896, which definitely isn't correct. So the bug is at that point, and not where I originally looked.

As an aside, with large pcaps, it does help us if you point out where you think the problem is, as opposed to letting us dig through it and try to find it ourselves (or, in this case, miss it the first time around...)



By: Filip Frank (frenk77) 2014-01-16 01:26:45.696-0600

Sorry for this, i am not see the first short jump because i searched only big jumps. I try write you affected packet numbers in some next traces. And thank you for bug confirmation.

By: Filip Frank (frenk77) 2014-02-25 06:51:43.708-0600

I made attached rtp_instance_debug.patch to improve debugging, and with it, I've found that problem seems to be in res_rtp_asterisk.c, in ast_rtp_raw_write
function code(between preraw5 and preraw6 debug message in my patch):

if (ast_test_flag(frame, AST_FRFLAG_HAS_TIMING_INFO)) {
rtp->lastts = frame->ts * rate;
}

Here rtp->lastts is mistakenly changed, but I don't know whether it's because AST_FRFLAG_HAS_TIMING_INFO is errorenously
enabled, or it's another problem..

Here is a log with my debug patch:

Feb 25 13:38:22 pbxtest1 asterisk[28291]: VERBOSE[28341][C-00000000]: res_rtp_asterisk.c:2628 in ast_rtp_raw_write: RTP instance '0x7ff5206d8718', Sent RTP packet to      10.76.17.132:2234 (type 08, seq 023306, ts 350660800, len 000160)
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2502 in ast_rtp_raw_write: RTP preraw instance '0x7ff5206d8718', seqno is 23307, lastts is 350660800
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2515 in ast_rtp_raw_write: RTP preraw2 instance '0x7ff5206d8718', seqno is 23307, lastts is 350660800 , pred is 350660960
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2518 in ast_rtp_raw_write: RTP preraw3 instance '0x7ff5206d8718', seqno is 23307, lastts is 350660952 , pred is 350660960
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2524 in ast_rtp_raw_write: RTP preraw4 instance '0x7ff5206d8718', seqno is 23307, lastts is 350660960 , pred is 350660960
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2567 in ast_rtp_raw_write: RTP preraw5 instance '0x7ff5206d8718', seqno is 23307, lastts is 350660960
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2578 in ast_rtp_raw_write: RTP preraw6 instance '0x7ff5206d8718', seqno is 23307, lastts is 350660960
Feb 25 13:38:22 pbxtest1 asterisk[28291]: VERBOSE[28341][C-00000000]: res_rtp_asterisk.c:2628 in ast_rtp_raw_write: RTP instance '0x7ff5206d8718', Sent RTP packet to      10.76.17.132:2234 (type 08, seq 023307, ts 350660960, len 000160)
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2175 in ast_rtp_update_source: RTP instance, '0x7ff5206d8718', Setting the marker bit due to a source update
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28310][C-00000000]: res_rtp_asterisk.c:4004 in ast_rtp_remote_address_set: Setting RTCP address on RTP instance '0x7ff5206d8718'
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28310][C-00000000]: res_rtp_asterisk.c:3923 in ast_rtp_prop_set: Ignoring duplicate RTCP property on RTP instance '0x7ff5206d8718'
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2175 in ast_rtp_update_source: RTP instance, '0x7ff5206d8718', Setting the marker bit due to a source update
Feb 25 13:38:22 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2175 in ast_rtp_update_source: RTP instance, '0x7ff5206d8718', Setting the marker bit due to a source update
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2502 in ast_rtp_raw_write: RTP preraw instance '0x7ff5206d8718', seqno is 23308, lastts is 350660960
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2515 in ast_rtp_raw_write: RTP preraw2 instance '0x7ff5206d8718', seqno is 23308, lastts is 350660960 , pred is 350661120
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2518 in ast_rtp_raw_write: RTP preraw3 instance '0x7ff5206d8718', seqno is 23308, lastts is 350663440 , pred is 350661120
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2567 in ast_rtp_raw_write: RTP preraw5 instance '0x7ff5206d8718', seqno is 23308, lastts is 350663440
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2578 in ast_rtp_raw_write: RTP preraw6 instance '0x7ff5206d8718', seqno is 23308, lastts is 80
Feb 25 13:38:23 pbxtest1 asterisk[28291]: VERBOSE[28341][C-00000000]: res_rtp_asterisk.c:2628 in ast_rtp_raw_write: RTP instance '0x7ff5206d8718', Sent RTP packet to      10.76.17.132:2234 (type 08, seq 023308, ts 000080, len 000160)
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2502 in ast_rtp_raw_write: RTP preraw instance '0x7ff5206d8718', seqno is 23309, lastts is 80
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2515 in ast_rtp_raw_write: RTP preraw2 instance '0x7ff5206d8718', seqno is 23309, lastts is 80 , pred is 240
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2518 in ast_rtp_raw_write: RTP preraw3 instance '0x7ff5206d8718', seqno is 23309, lastts is 240 , pred is 240
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2567 in ast_rtp_raw_write: RTP preraw5 instance '0x7ff5206d8718', seqno is 23309, lastts is 240
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2578 in ast_rtp_raw_write: RTP preraw6 instance '0x7ff5206d8718', seqno is 23309, lastts is 240
Feb 25 13:38:23 pbxtest1 asterisk[28291]: VERBOSE[28341][C-00000000]: res_rtp_asterisk.c:2628 in ast_rtp_raw_write: RTP instance '0x7ff5206d8718', Sent RTP packet to      10.76.17.132:2234 (type 08, seq 023309, ts 000240, len 000160)
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2502 in ast_rtp_raw_write: RTP preraw instance '0x7ff5206d8718', seqno is 23310, lastts is 240
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2515 in ast_rtp_raw_write: RTP preraw2 instance '0x7ff5206d8718', seqno is 23310, lastts is 240 , pred is 400
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2518 in ast_rtp_raw_write: RTP preraw3 instance '0x7ff5206d8718', seqno is 23310, lastts is 400 , pred is 400
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2567 in ast_rtp_raw_write: RTP preraw5 instance '0x7ff5206d8718', seqno is 23310, lastts is 400
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2578 in ast_rtp_raw_write: RTP preraw6 instance '0x7ff5206d8718', seqno is 23310, lastts is 400
Feb 25 13:38:23 pbxtest1 asterisk[28291]: VERBOSE[28341][C-00000000]: res_rtp_asterisk.c:2628 in ast_rtp_raw_write: RTP instance '0x7ff5206d8718', Sent RTP packet to      10.76.17.132:2234 (type 08, seq 023310, ts 000400, len 000160)
Feb 25 13:38:23 pbxtest1 asterisk[28291]: DEBUG[28341][C-00000000]: res_rtp_asterisk.c:2502 in ast_rtp_raw_write: RTP preraw instance '0x7ff5206

By: Filip Frank (frenk77) 2014-02-25 06:54:16.347-0600

Added Patch for better debug RTP

By: Olle Johansson (oej) 2014-07-02 04:41:08.733-0500

There is definitely a problem here. For some reason we take the incoming time stamp as the source of timing. On my system we suddenly jump backwards in time. My call scenario:

Alice calls bob
Alice puts Bob on hold
Asterisk plays moh for bob
-> Every RTP packet has a TS that follows Alices incoming TS + 160 which is fine
Alice unholds bob
-> Asterisk suddenly takes the incoming TS as a source and resets TS on the outbound call leg

I also think the HAS_TIMING_INFO is the problem here. It overrides all other checks if the TS is sane
and just jumps in time. I wonder where that code came from, what it is supposed to solve. As Matt say
the RTP streams are unrelated and taking the timing from the FR as exact timing is bad.

By: Olle Johansson (oej) 2014-07-02 04:46:45.096-0500

This code was introduced in the jitterbuffer patch back in 2006
http://lists.digium.com/pipermail/svn-commits/2006-May/014123.html

I don't see how the jitterbuffer can or should affect our outbound stream.

By: Olle Johansson (oej) 2014-07-04 01:58:34.913-0500

There has been a discussion about this issue on the mailing list and it seems like we are in agreement of dumping the code that resets the time stamps on rtp writes. No one seems to remember why it was comitted as part of the jitter buffer work.



By: Filip Frank (frenk77) 2014-07-04 03:27:54.767-0500

I add my patch, we use it on about 70 of our customers PBX, without any problem. But we not using jitterbuffer for SIP, because it causes more errors in audio especially with transfers.

By: David Woolley (davidw) 2015-02-26 10:10:53.134-0600

I think the real problem here is that Asterisk is failing to pass the SSRC through and timestamps should be interpreted relative to a particular SSRC.  We did a hack for this by adding an option that added one to the transmitted SSRC whenever there was a source change (actually attempt to set the marker bit).  However we found that whilst this helped with some peers, it caused problems for others.

The transfer_skew patch, here, would actually be undesirable for anyone dealing with Lync.  We are debugging an older version of Asterisk that seems to be failing to honour the incoming timestamps and we are getting situation where a gap is introduced by Lync (it turns a single comfort noise frame into a silence frame followed by a gap in the media.  Our Asterisk is behaving as this patch would have it do, and Lync does not like the result, when the stream is fed back to it.  It goes into a recovery mode and deletes a second's worth of media after the original gap.

By: David Woolley (davidw) 2015-02-26 10:39:21.893-0600

For info:  I think the reason we are not getting timestamps passed through Is the won't fix issue ASTERISK-12042, but the jitter buffer in question is that in Lynx, not in Asterisk.

By: David Woolley (davidw) 2015-03-03 13:17:02.605-0600

On further analysis, ASTERISK-12042 only applies to timestamp and seqno, but the field that actually propagates the timestamp information is delivery and the problem we are having with this looks like it is that if a smoother is present, and there is a partial frame in its memory, it will continue to dead reckon the output value of delivery, based only on sample counts, ignoring the input one.  When there is a gap in the media, Lync sees a negative timestamp step because the value of delivery hasn't accounted for the gap, and resets its jitter buffer, dumping an extra second of media.

I haven't found a fix for this in any SVN version but our working version is rather obsolete, so I may not be able to test this on a supported version.

By: dtryba (dtryba) 2015-06-05 10:15:18.620-0500

I experience this problem (with an Ericcson IMS upstream) also when doing attended transfers.
Workaround I used is to set an other packetization interval towards the Ericsson (either 10 or 40ms work fine). This forces asterisk to create its own timestamps/ssrc, jumps timestamps (both backwards as forwards) in the offending input rtp streams (an ancient asterisk 1.4 with Aastra 67xx as reception and Gigaset DECT  as transfer target are have a high chance of triggering jumps) aren't propagated to the Ericcson.