Summary: ASTERISK-24208: Channels with CDR Information Remain Active Even After ConfBridge Is Ended
Reporter: Frankie Chin (fchin)
Labels:
Date Opened: 2014-08-11 20:48:02
Date Closed: 2014-09-14 21:57:52
Priority: Critical
Regression?
Status: Closed/Complete
Components: Applications/app_confbridge
Versions: 13.0.0-beta1
Frequency of Occurrence: Constant
Related Issues: is related to ASTERISK-24241 crash: CDRs recursively attempt to update Party B information in a multi-party bridge, overrunning the stack
Environment: Ubuntu 10.04
Attachments:
( 0) full
( 1) leaked_skewed.7z.001
Description: I have one Asterisk instance running on a physical Ubuntu machine, and 20 other Asterisk instances running on virtual Ubuntu machines. The virtual instances are registered to the physical one using IAX.

AMI is used to originate calls to all the virtual Asterisks and join them into a conference bridge hosted on the physical Asterisk. Once all the participants have joined the conference, the physical Asterisk uses close to 90% of the CPU.

The real concern is that even after ending the conference (using the "confbridge kick [ID] all" CLI command), the channels with CDR information still remain active (as indicated by the "cdr show active" command). The physical Asterisk also keeps using 90% of the CPU. If I type "cdr set debug on" in the CLI console, the screen fills with a seemingly endless loop of activity (please see my comment).

This issue was first found when I was using Asterisk Version 12. I just tested it using Version 13 Beta 1 and the problem persists. Note: If I only invite 10 virtual Asterisks into the conference, then everything seems normal, i.e. CPU usage is around 1% and all the channels are cleared after the conference is ended.

The AMI Originate Action (for one virtual Asterisk):
Action: Originate
Channel: IAX2/vm1/1001
Exten: 1
Priority: 1
Context: conference
Async: true
CallerID: AMI
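
For anyone reproducing this, the same action has to be sent once per virtual Asterisk. The following is a minimal sketch that only builds the message text for 20 such actions; it does not log in to AMI or open a socket. The peer names vm1 to vm20 and the helper names build_originate and build_all are assumptions made for illustration, and the field values are copied from the action above.

```python
from typing import List


def build_originate(peer: str, exten: str = "1001") -> str:
    """Return one AMI Originate action as text (hypothetical helper)."""
    fields = [
        ("Action", "Originate"),
        ("Channel", f"IAX2/{peer}/{exten}"),
        ("Exten", "1"),
        ("Priority", "1"),
        ("Context", "conference"),
        ("Async", "true"),
        ("CallerID", "AMI"),
    ]
    # AMI messages are "Key: Value" lines separated by CRLF,
    # and each message is terminated by an empty line.
    return "".join(f"{key}: {value}\r\n" for key, value in fields) + "\r\n"


def build_all(count: int = 20) -> List[str]:
    """Build one Originate per peer; the names vm1..vmN are assumed."""
    return [build_originate(f"vm{i}") for i in range(1, count + 1)]


if __name__ == "__main__":
    # Print the first message so it can be compared with the action above.
    print(build_all()[0])
```

Each string would still need to be written to an authenticated AMI connection, which is left out here.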

The dial plan on the physical Asterisk looks like this:
[conference]
exten => 1,1,ConfBridge(1234,,,conf_menu)

The dial plan on the virtual Asterisk looks like this:
[internal]
exten => 1001,1,Answer
exten => 1001,n,Wait(180)
exten => 1001,n,Hangup()
Comments:

By: Frankie Chin (fchin) 2014-08-11 21:08:46.386-0500

The screenshot I attached didn't show up, so I copied a portion of the CLI console output after running the "cdr set debug on" command here...

[Edit by Rusty - Trimmed excessive inline debug and added noformat tags]
{noformat}
0xb4582df4 - Created CDR for channel IAX2/vm1-16680
0xb4582df4 - Transitioning CDR for IAX2/vm1-16680 from state NONE to Single
0xb4582df4 - Set answered time to 1407807886.152873
0xb4582df4 - Transitioning CDR for IAX2/vm1-16680 from state Single to Bridged
0xb4582df4 - Party A IAX2/vm1-16680 has new Party B IAX2/vm17-23455
0xb458327c - Created CDR for channel IAX2/vm1-16680
0xb458327c - Transitioning CDR for IAX2/vm1-16680 from state NONE to Single
0xb458327c - Set answered time to 1407807886.155718
0xb458327c - Transitioning CDR for IAX2/vm1-16680 from state Single to Bridged
0xb458327c - Party A IAX2/vm1-16680 has new Party B IAX2/vm17-23455
{noformat}
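
The excerpt above shows two CDR objects (0xb4582df4 and 0xb458327c) created for the same channel. To see how often that happens in a longer capture, a rough sketch like the following could count "Created CDR" lines per channel. It assumes the debug output has been saved as plain text in exactly the line format shown above; the function name count_created_cdrs is made up for this example.

```python
import re
from collections import Counter

# Matches lines such as:
#   0xb4582df4 - Created CDR for channel IAX2/vm1-16680
CREATED = re.compile(r"^(0x[0-9a-fA-F]+) - Created CDR for channel (\S+)")


def count_created_cdrs(lines):
    """Count how many CDR objects were created for each channel name."""
    counts = Counter()
    for line in lines:
        match = CREATED.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


# Sample taken from the trimmed console output in this comment.
SAMPLE = """\
0xb4582df4 - Created CDR for channel IAX2/vm1-16680
0xb4582df4 - Transitioning CDR for IAX2/vm1-16680 from state NONE to Single
0xb4582df4 - Party A IAX2/vm1-16680 has new Party B IAX2/vm17-23455
0xb458327c - Created CDR for channel IAX2/vm1-16680
0xb458327c - Transitioning CDR for IAX2/vm1-16680 from state NONE to Single
0xb458327c - Party A IAX2/vm1-16680 has new Party B IAX2/vm17-23455
"""

if __name__ == "__main__":
    for channel, created in count_created_cdrs(SAMPLE.splitlines()).items():
        print(channel, created)
```

A count that keeps growing for one channel after the conference has ended would match the endless activity described above.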

By: Matt Jordan (mjordan) 2014-08-12 09:26:41.717-0500

We require a complete debug log to help triage the issue. This document will provide instructions on how to collect debugging logs from an Asterisk machine for the purpose of helping bug marshals troubleshoot an issue: https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information



By: Matt Jordan (mjordan) 2014-08-12 09:28:16.779-0500

*Note*: All channels have CDRs, so this likely has nothing to do with CDRs. What's more, CDRs are built from channel snapshots, not the channels themselves, and do not affect the lifetime of the channels. If you have lingering channels, that's typically a channel reference leak.

In addition to the DEBUG log, please enable REF_DEBUG in your {{menuselect}} build options, and provide the refs log generated in your Asterisk log directory. Note that we don't need to see 100 channels - only a single channel that reproduces the problem is sufficient.

By: Frankie Chin (fchin) 2014-08-14 01:14:38.086-0500

Hi Matt, please find the attached "full" log as requested. I followed the steps on the "Collecting Debug Information" page, except that I didn't turn on SIP debug. I hope this helps. Please let me know if you want me to enable CDR debug with "cdr set debug on".

I have also enabled the REF_DEBUG option in menuselect, then rebuilt and reinstalled Asterisk. The refs log generated is extremely large, ~700 MB. Could you please explain in more detail how I can make it generate debug info for only a single channel? Thanks.

By: Frankie Chin (fchin) 2014-08-18 17:18:22.796-0500

I'm trying to assign this JIRA issue back to Matt Jordan.

By: Matt Jordan (mjordan) 2014-08-21 09:54:58.327-0500

# The FULL log is helpful. When this occurs, does 'core show channels' list any active channels?
# You can process a REF_DEBUG log to only show the leaked objects, which will be a lot smaller than the full file. In {{contrib/scripts}}, run the following:
{noformat}
contrib/scripts$ ./refcounter.py -n -f /var/log/asterisk/refs > leaked_skewed.txt
{noformat}
Attach the resulting file to this issue, which should be a lot smaller than several hundred megabytes.

By: Frankie Chin (fchin) 2014-08-21 21:35:25.720-0500

1. After ending the conference, "core show channels" listed 0 active channels.
2. Running the "refcounter.py" script this time only reduced the size from ~400MB to ~220MB. I'll try to attach the "leaked_skewed.txt" to this JIRA issue.

Btw, I found a way to avoid this issue, i.e. by turning off CDR logging in "cdr.conf".

By: Frankie Chin (fchin) 2014-08-21 21:41:32.986-0500

The "leaked_skewed.txt" is compressed using 7zip and attached to this issue.

By: Matt Jordan (mjordan) 2014-09-03 10:41:42.212-0500

I have a feeling this is related to ASTERISK-24241. Can you apply the patch on that issue and see if it alleviates some of the problem you're seeing?

By: Frankie Chin (fchin) 2014-09-03 17:41:21.860-0500

Hi Matt, thanks for the update. I'll try out the patch and let you know the outcome.

By: Frankie Chin (fchin) 2014-09-09 18:45:48.769-0500

Matt, I just tested the patch and it seems to be working well. The CPU usage was around 30% during the conference and went down to 1% after the conference was ended. "cdr set debug on" no longer showed endless activity. Thanks a lot!