Summary: ASTERISK-26259: audohook: Crash when duplicating frame
Reporter: Daniel Friedman (dani@3xton.com)
Labels:
Date Opened: 2016-08-02 08:32:03
Date Closed: 2016-08-04 16:41:26
Priority: Major
Regression?:
Status: Closed/Complete
Components: Core/General
Versions: 13.10.0
Frequency of Occurrence: Frequent
Related Issues:
Environment: Asterisk ARI dialer integration with Dynamics CRM and call recordings.
Attachments:
( 0) backtrace.txt
( 1) backtrace.txt
Description:
Hello,

I have integrated Asterisk 13.10.0 with our Microsoft Dynamics CRM. We use ARI to originate calls in our call center, with call recording enabled. The Asterisk service crashes once or twice a day, at peak load (200+ channels).

I am attaching a backtrace from the core dump of the last crash. Can you help me figure out what the problem is?

Thank you,

Daniel Friedman
Trixton LTD.
Comments:

By: Asterisk Team (asteriskteam) 2016-08-02 08:32:03.827-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Daniel Friedman (dani@3xton.com) 2016-08-02 08:37:40.550-0500

backtrace of a core dump

By: Joshua C. Colp (jcolp) 2016-08-02 08:43:22.848-0500

Thank you for taking the time to report this bug and helping to make Asterisk better. Unfortunately, we cannot work on this bug because your description did not include enough information. Please read over the Asterisk Issue Guidelines [1] which discusses the information necessary for your issue to be resolved and the format that information needs to be in. We would be grateful if you would then provide a more complete description of the problem. At a minimum, we need:

1. The specific steps or actions you took that caused you to encounter the problem.
2. The behavior you expected and the location of documentation that led you to that expectation.
3. The behavior you actually encountered.

To demonstrate the issue in detail, please include Asterisk log files generated per the instructions on the wiki [2]. If applicable, please ensure that protocol-level trace debugging is enabled, e.g., 'sip set debug on' if the issue involves chan_sip, and configuration information such as dialplan and channel configuration.

Thanks!

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

[2] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information



By: Daniel Friedman (dani@3xton.com) 2016-08-02 10:16:26.522-0500

Hello,

Thank you for your prompt reply.
I based this report on ASTERISK-24147: although that ticket was fixed, I seem to be facing the same situation.

It is very hard for me to provide more debug details because this is a loaded production server.
Here is the current status:

Setting max files open to 200000
System uptime: 5 hours, 18 minutes, 57 seconds
Last reload: 5 hours, 18 minutes, 57 seconds

49 active calls
98 active channels



I thought the backtrace of the core dump would be enough.

All I can add is that I am running Asterisk compiled from source, with all the necessary libraries, on CentOS 6.7 64-bit.
I am using the FreePBX framework with my own dialplan customizations.
Asterisk runs on a dedicated server with 8 CPUs and 24 GB of memory.
I also use NFS to share the recording files.

I hope that is sufficient for you.

Thank you,

Dani.

By: Joshua C. Colp (jcolp) 2016-08-02 10:19:33.091-0500

Can you describe in more detail how it is being used? For example I see either call recording or spying in use, but it's not mentioned in your description.

By: Daniel Friedman (dani@3xton.com) 2016-08-02 10:45:34.516-0500

Hi,

Yes, we record all calls and occasionally spy on some of them.
Furthermore, we hang up calls through our CRM if the agent tries to place a call through
another web page.
I will stop this behaviour on Sunday and test for a few days to see whether the crashes return.

Meanwhile, is the backtrace file that I provided sufficient? I ran this command to generate it:
gdb -se "asterisk" -ex "bt full" -ex "thread apply all bt" --batch -c core.pbx-il-2016-08-02T12:50:39+0300 > /tmp/backtrace.txt

thank you,

Dani
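
For reference, the gdb invocation quoted above can be wrapped in a small script that refuses to run when the core file is missing. This is a minimal sketch, not an official Asterisk tool: the function name collect_backtrace and the default output path are illustrative assumptions, and only the gdb options already shown in the command above are used.

```shell
#!/bin/sh
# Hypothetical wrapper around the gdb command quoted in this comment.
# Usage: collect_backtrace CORE_FILE [OUTPUT_FILE]
collect_backtrace() {
    core_file=$1
    out_file=${2:-/tmp/backtrace.txt}   # assumption: same output path as in the comment

    # Fail early instead of letting gdb produce an empty report.
    if [ ! -r "$core_file" ]; then
        echo "core file not readable: $core_file" >&2
        return 1
    fi

    # Same options as the original command: full backtrace of the crashing
    # thread, then a backtrace of every thread, in batch (non-interactive) mode.
    gdb -se "asterisk" -ex "bt full" -ex "thread apply all bt" \
        --batch -c "$core_file" > "$out_file"
}
```

It would be called as collect_backtrace core.pbx-il-2016-08-02T12:50:39+0300, which writes the report to /tmp/backtrace.txt unless a second argument names another file.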

By: Joshua C. Colp (jcolp) 2016-08-02 10:51:42.135-0500

Yes. It shows what appears to be memory corruption, so understanding how the system is being used helps narrow down which areas could be responsible.

By: Daniel Friedman (dani@3xton.com) 2016-08-04 08:24:46.226-0500

Hi,

An update:

We have MySQL replication of our CDR database. As I mentioned, I am using the FreePBX framework,
and it seems that the default install creates the CDR database with the MyISAM engine, which does
not appear to perform well on a loaded server. I have altered the cdr table to the InnoDB engine, and since then
the Asterisk server has not crashed (touch wood!).

This may be related to my hard disks (SATA), which are rather slow compared to SAS disks,
but the InnoDB engine helps a lot in reducing the time spent writing to the database.
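
The engine change described above could be applied roughly as follows. This is a sketch under assumptions: the schema and table names (asteriskcdrdb, cdr) are what a default FreePBX install typically uses and are not stated in this report, the function names are hypothetical, and the mysql client is assumed to read its credentials from its usual option file. The functions only print SQL, so the statements can be reviewed before being piped into mysql.

```shell
#!/bin/sh
# Assumed names (not taken from this report): override via the environment.
CDR_DB=${CDR_DB:-asteriskcdrdb}
CDR_TABLE=${CDR_TABLE:-cdr}

# Print a query that reports the table's current storage engine.
engine_query() {
    printf "SELECT ENGINE FROM information_schema.TABLES WHERE TABLE_SCHEMA='%s' AND TABLE_NAME='%s';\n" \
        "$CDR_DB" "$CDR_TABLE"
}

# Print the statement that converts the table to InnoDB.
alter_statement() {
    printf 'ALTER TABLE `%s`.`%s` ENGINE=InnoDB;\n' "$CDR_DB" "$CDR_TABLE"
}

# Review the output first, then apply it, for example:
#   engine_query    | mysql
#   alter_statement | mysql
```

Converting the engine rebuilds the table, so on a busy server it is probably safer to run the statement outside peak hours.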

Here is the system status since the last crash:

Setting max files open to 200000
System uptime: 1 day, 13 hours, 41 minutes, 18 seconds
Last reload: 3 hours, 7 minutes, 58 seconds

70 active calls
143 active channels

I definitely hit a memory leak bug, but since the I/O has improved significantly
I am no longer reaching it.

I will report again if I experience another crash.

Thank you,

Dani



By: Daniel Friedman (dani@3xton.com) 2016-08-10 03:54:50.766-0500

a backtrace

By: Daniel Friedman (dani@3xton.com) 2016-08-10 03:54:55.230-0500

Hello,

I had a crash this morning after almost a week up and running. I am attaching a backtrace of the core dump.
The Asterisk version is 13.9.1.
Can you take a look?

Thank you,

Daniel Friedman
Trixton LTD.

By: Asterisk Team (asteriskteam) 2016-08-10 03:54:55.453-0500

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Rusty Newton (rnewton) 2016-08-10 09:38:30.435-0500

Your backtrace appears to contain a memory corruption. We need one or both of the following items to continue investigation of the issue:
1. Valgrind output. See https://wiki.asterisk.org/wiki/display/AST/Valgrind for instructions on how to use Valgrind with Asterisk.
2. MALLOC_DEBUG output. See https://wiki.asterisk.org/wiki/display/AST/MALLOC_DEBUG+Compiler+Flag for instructions on how to use the MALLOC_DEBUG option.

Note that MALLOC_DEBUG and Valgrind are mutually exclusive options. Valgrind output is preferable, but will be more system resource intensive and may be difficult to get on a production system. In such a case, you may have better luck getting the necessary output from MALLOC_DEBUG.
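
As a rough illustration of the two options, the commands could look like the following. This is a sketch under assumptions, and the wiki pages linked above remain the authoritative instructions: AST_SRC is a hypothetical variable pointing at the Asterisk source tree used for the build, the function names are illustrative, and the functions only print the commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Assumption: location of the Asterisk source tree (not stated in this report).
AST_SRC=${AST_SRC:-/usr/src/asterisk}

# Option 1: run Asterisk in the foreground under Valgrind, using the
# suppressions file shipped in the source tree and sending Valgrind's own
# log to file descriptor 9, which the shell redirects to valgrind.txt.
valgrind_cmd() {
    printf 'valgrind --suppressions=%s/contrib/valgrind.supp --log-fd=9 asterisk -vvvvcg 9>valgrind.txt\n' \
        "$AST_SRC"
}

# Option 2: rebuild Asterisk with the MALLOC_DEBUG compiler flag enabled.
malloc_debug_cmds() {
    printf 'cd %s\n' "$AST_SRC"
    printf 'menuselect/menuselect --enable MALLOC_DEBUG menuselect.makeopts\n'
    printf 'make && make install\n'
}

# Print whichever set of commands applies, for example:
#   valgrind_cmd
#   malloc_debug_cmds
```

Only one of the two should be used at a time, since the note above says the options are mutually exclusive.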



By: Asterisk Team (asteriskteam) 2016-08-24 12:00:01.419-0500

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened, please create a new issue instead. If the new issue is related to this one, a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines