[Home]

Summary:ASTERISK-21394: [patch] - Fundamental changes to CDR within single asterisk family (1.8) during externally initiated blind transfers with an h extension present
Reporter:Ishfaq Malik (ishmalik)Labels:
Date Opened:2013-04-09 03:45:22Date Closed:2013-04-30 17:42:38
Priority:MajorRegression?Yes
Status:Closed/CompleteComponents:CDR/General
Versions:1.8.18.0 1.8.20.0 1.8.21.0 Frequency of
Occurrence
Constant
Related
Issues:
must be completed before resolvingASTERISK-21773 Asterisk 1.8.22.0 Open Blockers
must be completed before resolvingASTERISK-21774 Asterisk 11.4.0 Open Blockers
is caused byASTERISK-19173 All blind transfers failing on 1.8.9.0-rc1
Environment:Centos 5Attachments:( 0) Basic_SIP_blind_transfer_debug_with_patch_test_results.txt
( 1) fix_missing_blindXfer_cdr
( 2) fix_missing_blindXfer_cdr2
( 3) Masterwith_patch.csv
( 4) queue_cdr_with_patch.rar
( 5) Two_recordings__are_identical__in_playback_and_bad_call_direction.pdf
Description:The way that asterisk handles the CDR for blind transfers has changed between version 1.8.7.0 and 1.8.18.0. The call path is: inbound call from SIP provider -> asterisk server -> SIP peer -> transfer to another SIP Peer. The transfer is done by using the dedicated keys on the hardware and not the features.conf method.

In version 1.8.7.0 this would produce 2 entries in the CDR, 1 showing the time the inbound call started and which context it was handles by, the second showing the the context the peer belongs to and looking like an extension to extension call.

We tried upgrading to 1.8.18.0 and did some regression testing and found that the CDR now only produces a single entry with the details all pertaining to the second leg of the call (bar the duration which is for the full life of the call). It has no reference to the inbound context that initially handled the call.

This is a pretty big change to be inside a single branch and I'm guessing it wasn't intentional.

Also, all our servers are installed using Puppet so the configuration on them all is the same and we use RealTime Architecture from a single database.

Here is an example (with sanitised data):

Here is how an inbound blind transferred call is recorded in 1.8.7.0

{noformat}
*************************** 1. row ***************************
  calldate: 2013-04-04 11:42:09
      clid:
       src: <my-mobile>
       dst: s
  dcontext: <inbound-number-context>
   channel: SIP/xxx.xxx.xxx.xxx-00002ba3
dstchannel: SIP/501-00002ba7
   lastapp: Dial
  lastdata: SIP/501&SIP/511&SIP/531,10
  duration: 22
   billsec: 12
disposition: ANSWERED
  amaflags: 3
accountcode:
 userfield:
*************************** 2. row ***************************
  calldate: 2013-04-04 11:42:31
      clid:
       src: <my-mobile>
       dst: 504
  dcontext: <peer-context>
   channel: SIP/xxx.xxx.xxx.xxx-00002ba3
dstchannel: SIP/504-00002bbb
   lastapp: Dial
  lastdata: SIP/504,20
  duration: 7
   billsec: 5
disposition: ANSWERED
  amaflags: 3
accountcode:
 userfield:
2 rows in set (0.03 sec)
{noformat}






And here is how it is recorded in 1.8.18.0
{noformat}
*************************** 1. row ***************************
  calldate: 2013-04-04 11:39:24
      clid:
       src: <my-mobile>
       dst: 503
  dcontext: <peer-context>
   channel: SIP/xxx.xxx.xxx.xxx-000000ea
dstchannel: SIP/503-000000ed
   lastapp: Dial
  lastdata: SIP/503,20
  duration: 34
   billsec: 9
disposition: ANSWERED
  amaflags: 3
accountcode:
 userfield:
{noformat}


In the newer version there is only one entry which has the combined call
time in it but the dcontext corresponds with the second leg of the call
which means we can't see what number was dialled.
Comments:By: Rusty Newton (rnewton) 2013-04-18 09:36:26.991-0500

We need to reproduce this. You'll need to help provide us with exact steps to do that.

The steps you described so far:

{quote}
The call path is: inbound call from SIP provider -> asterisk server -> SIP peer -> transfer to another SIP Peer. The transfer is done by using the dedicated keys on the hardware and not the features.conf method.
{quote}

Can you please describe how each of these steps happens and provide the dialplan, AGI/AMI steps that facilitates each?

If you can also provide an Asterisk debug log (VERBOSE at level 5 at least) and SIP PCAP of the calls described that will help as well.

Please attach any output or large blocks of information in separate .txt files to the issue.

By: Steve Davies (one47) 2013-04-25 04:19:50.516-0500

This behaviour is changed by the patch here:

https://reviewboard.asterisk.org/r/1685

SVN r352199

I will continue to investigate and see if I can find a solution to the original and the current issue.



By: Ishfaq Malik (ishmalik) 2013-04-25 04:43:05.504-0500

Steve, does this negate the need for more information from me?

By: Steve Davies (one47) 2013-04-25 05:25:48.329-0500

Additional information is probably not needed. Any SIP REFER will cause the issue with the simplest of dialplans.

Basically the above mentioned patch (correctly) tries to stop the 'h' extension running when a call is returned to the PBX as part of a technology-specific blind-transfer. It does this by setting the AST_FLAG_BRIDGE_HANGUP_DONT flag - Sadly this has side-effects other than just stopping the 'h' extension from running, including the discarding of CDR data that you observed.

I will upload a proposed patch against 1.8 that hopefully carries out the same function as the original patch, but without setting that flag. It is being uploaded as compiling, but untested, and I will feed-back my results. Note that I do not use 'h' extensions on my system here.


By: Steve Davies (one47) 2013-04-25 06:52:19.619-0500

Uploaded patch works for me, but probably needs to be checked by the people who had problems solved from ASTERISK-19173.

By: Matt Jordan (mjordan) 2013-04-25 09:46:39.364-0500

Thanks for running this down and creating a patch Steve. I think this goes to show just how brittle CDRs can be in the current bridging code.

In your patch, it looks like the following should be combined:

{noformat}
- if (ast_test_flag(&config->features_caller, AST_FEATURE_NO_H_EXTEN)) {
+ if (chan->_softhangup & (AST_SOFTHANGUP_ASYNCGOTO | AST_SOFTHANGUP_UNBRIDGE)) {
+ /*
+ * If the bridge was broken for a hangup that isn't real,
+ * then don't run the h extension, because the channel isn't
+ * really hung up. This should really only happen with AST_SOFTHANGUP_ASYNCGOTO,
+ * but it doesn't hurt to check AST_SOFTHANGUP_UNBRIDGE either.
+ */
+ h_context = NULL;
+ } if (ast_test_flag(&config->features_caller, AST_FEATURE_NO_H_EXTEN)) {
h_context = NULL;
} else if (ast_exists_extension(chan, chan->context, "h", 1,
S_COR(chan->caller.id.number.valid, chan->caller.id.number.str, NULL))) {
{noformat}

to:

{noformat}
- if (ast_test_flag(&config->features_caller, AST_FEATURE_NO_H_EXTEN)) {
+ /*
+ * If the bridge was broken for a hangup that isn't real,
+ * then don't run the h extension, because the channel isn't
+ * really hung up. This should really only happen with AST_SOFTHANGUP_ASYNCGOTO,
+ * but it doesn't hurt to check AST_SOFTHANGUP_UNBRIDGE either.
+ */
+ if ((chan->_softhangup & (AST_SOFTHANGUP_ASYNCGOTO | AST_SOFTHANGUP_UNBRIDGE)) || (ast_test_flag(&config->features_caller, AST_FEATURE_NO_H_EXTEN))) {
h_context = NULL;
} else if (ast_exists_extension(chan, chan->context, "h", 1,
S_COR(chan->caller.id.number.valid, chan->caller.id.number.str, NULL))) {
{noformat}

Otherwise, we could set {{h_context}} to NULL, fail the check for {{AST_FEATURE_NO_H_EXTEN}}, do a look up of the {{h}} extension, find that it exists, and end up setting {{h_context}} to something other than NULL.

By: Steve Davies (one47) 2013-04-25 09:56:13.460-0500

Yes and no. Good spot on the missing "else" but I did intentionally leave it as a separate condition for readability.

I am not sure what the Asterisk coding standards say about such things? I will fix that and re-upload my version. Your suggested change also works of course :)

By: Matt Jordan (mjordan) 2013-04-25 10:23:03.923-0500

There isn't anything defined in the coding standards about that - it gets pretty subjective quickly :-) Personally, I'm fine with it either way.

By: Rusty Newton (rnewton) 2013-04-25 11:24:56.204-0500

Ishfaq can you test the patch and report back?

By: Ishfaq Malik (ishmalik) 2013-04-26 03:26:57.364-0500

Have applied the patch to 1.8.21.0 and re-tested. CDR behaviour is back to what it was in 1.8.7.0

All good from here.

By: David Brillert (aragon) 2013-04-26 09:40:08.666-0500

Ishfaq, you need to look at the actions available in your ticket and enter feedback so this ticket goes back into open status.  In feedback status we could all be waiting for a while...

By: Ishfaq Malik (ishmalik) 2013-04-26 09:45:14.489-0500

Have applied the patch to 1.8.21.0 and re-tested. CDR behaviour is back to what it was in 1.8.7.0

All good from here.

(Apologies for not spotting the feedback button)

By: Matt Jordan (mjordan) 2013-04-26 21:17:05.164-0500

Given how tricky this particular section of code is, I'm going to go ahead and post this up to Review Board.

By: David Brillert (aragon) 2013-04-29 08:34:50.460-0500

I tested the patch.
It didn't appear to break anything but it didn't the issues in blind transfers in CDR either.
In a basic test.
1. Call inbound arriving from external trunk rings ext 6000
2. 6000 answers
3. 6000 does SIP blind transfer to the external trunk with dial plan 8001 to ring a remote device.
* Mixmonitor is used to record the call

Results:
1. Steps 2 and 3 both get logged by CDR with the same unique ID
2. Step 3 gets saved with internal call direction although the call was outgoing
3. Mixmonitor saves the same playback results to both call legs (presumably because the call legs have the same unique ID).

I'll attach a pdf showing the results in a reporting summary
I'll attach a verbose debug log with SIP debugging and the CDR record


By: David Brillert (aragon) 2013-04-29 08:48:31.260-0500

Test results uploaded

By: David Brillert (aragon) 2013-04-29 09:00:42.734-0500

Also the patch has no effect on fixing issue ASTERISK-21486
Attaching test results with patch also.
I presume that this patch does not include any caveats to handle blind transfers out of an agent extension...

In a basic test.
1. Call inbound arriving from external trunk rings queue and answered by agent at extension 6000
2. 6000 answers
3. 6000 does SIP blind transfer to the external trunk with dial plan 8001 to ring a remote device.

   Mixmonitor is used to record the call

Results:
1. The call to the queue gets logged
2. The SIP blind transfer leg does not appear anywhere in the CDR.  Making billing impossible.
3. Mixmonitor saves both legs of the SIP blind transfer

Test results uploaded as queue cdr with patch.rar

By: Matt Jordan (mjordan) 2013-04-29 09:35:08.027-0500

{quote}
1. Call inbound arriving from external trunk rings ext 6000
2. 6000 answers
3. 6000 does SIP blind transfer to the external trunk with dial plan 8001 to ring a remote device.
Mixmonitor is used to record the call
Results:
1. Steps 2 and 3 both get logged by CDR with the same unique ID
2. Step 3 gets saved with internal call direction although the call was outgoing
3. Mixmonitor saves the same playback results to both call legs (presumably because the call legs have the same unique ID).
I'll attach a pdf showing the results in a reporting summary
I'll attach a verbose debug log with SIP debugging and the CDR record
{quote}

This isn't the issue that the reporter was facing. I mentioned on ASTERISK-21486 that it would be worth testing this patch, but it sounds as if it doesn't resolve your issue.

A few other notes:
# A unqiue ID is not unique to a CDR. That is a channel unique ID. You can construct a full unique identifier based on the channel unique ID, the CDR sequence number, and - if you really need it - the linkedid.
# Asterisk has no concept of inbound versus outbound, nor will it ever. The concept of an inbound call versus an outbound call has to be performed by the dialplan writer. The fact that your billing system is flagging something as 'incoming' is a problem with your billing system.

Since your issues are not related to this problem, and since it appears as if fixing the issues you are facing would require fundamental changes in the behavior of Asterisk's CDRs, it does not appear as if there is anything else required on this issue.

By: David Brillert (aragon) 2013-04-29 10:09:36.465-0500

Hi Matt,

I posted details on my other bug report here because the two issues were marked related and I was hopeful this patch would fix both (not because I was trying to hijack this bug report).
{quote}
1.    A unique ID is not unique to a CDR. That is a channel unique ID. You can construct a full unique identifier based on the channel unique ID, the CDR sequence number, and - if you really need it - the linkedid.
{quote}
Fair enough, I will talk to the developers of the billing system.
{quote}
2.    Asterisk has no concept of inbound versus outbound, nor will it ever. The concept of an inbound call versus an outbound call has to be performed by the dialplan writer. The fact that your billing system is flagging something as 'incoming' is a problem with your billing system.
{quote}
Fair enough, I will talk to the developers of the billing system.

{quote}
'Since your issues are not related to this problem, and since it appears as if fixing the issues you are facing would require fundamental changes in the behavior of Asterisk's CDRs, it does not appear as if there is anything else required on this issue.'
{quote}

I understand that my issue is no longer related to this bug report since they appear to be two separate issues.
You should remove the relationship then so there is no confusion.

By: Rusty Newton (rnewton) 2013-04-29 16:20:47.602-0500

{quote}
I understand that my issue is no longer related to this bug report since they appear to be two separate issues.
You should remove the relationship then so there is no confusion.
{quote}

Removed