
Summary: ASTERISK-26443: Asterisk Random Core Dump / Crash
Reporter: Ted Pukas (nivindel)
Labels:
Date Opened: 2016-10-05 14:20:56
Date Closed: 2016-10-09 10:39:42
Priority: Major
Regression?:
Status: Closed/Complete
Components:
Versions: 13.11.2
Frequency of Occurrence: Frequent
Related Issues:
Environment: Fresh installation of FreePBX Distro 10.13.66-64-bit, Sangoma 4-Port FXO PCI Card, Dell PowerEdge T20
Attachments: ( 0) backtrace-1.txt
( 1) backtrace-2.txt
Description: Asterisk 13.11.2 will randomly crash/core dump, and restart.  Unfortunately, I have not been able to find a pattern to this behavior.  Crashes seem to happen at least 1-2 times per hour.

Please let me know if you require any more information.
Comments:

By: Asterisk Team (asteriskteam) 2016-10-05 14:20:57.051-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Ted Pukas (nivindel) 2016-10-05 14:29:15.204-0500

Backtrace.

By: Richard Mudgett (rmudgett) 2016-10-05 14:45:52.544-0500

Your PJPROJECT was built with assertions enabled.  It would be better if you built Asterisk with the bundled PJPROJECT instead: {{./configure --with-pjproject-bundled}}


By: Richard Mudgett (rmudgett) 2016-10-05 14:48:41.649-0500

Also:
https://wiki.asterisk.org/wiki/display/AST/Building+and+Installing+pjproject

By: Ted Pukas (nivindel) 2016-10-05 14:55:41.287-0500

I have run a 'yum install pjproject-debuginfo pjproject-devel' which I believe may install the necessary features from the shmz repository.  Will provide a backtrace after next crash.

By: Ted Pukas (nivindel) 2016-10-05 15:11:04.746-0500

New backtrace files after using yum to install pjproject-debuginfo & pjproject-devel -- Hopefully this provides the information that you are looking for.

There are two files, as there are two core dumps which are generated on every crash/restart of asterisk.

By: Richard Mudgett (rmudgett) 2016-10-05 15:23:19.254-0500

Those crashes you are having are assertion failures because your PJPROJECT is built with assertions enabled.  You really need to get it built correctly.  I provided a link earlier on how to do that.  Your best bet is to build Asterisk with the bundled pjproject as explained on that page.

By: Ted Pukas (nivindel) 2016-10-05 15:30:05.776-0500

I'm sorry, I'm such a novice.  Are you saying that rebuilding without assertions enabled will potentially fix these crashes?

By: Richard Mudgett (rmudgett) 2016-10-05 15:42:51.353-0500

It will fix your issue because the three backtraces you have posted are all assertion failures in PJPROJECT.
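
For readers unfamiliar with assertions, the following is a minimal sketch of the general C mechanism behind this kind of crash. It uses the standard `assert()` macro and the standard `NDEBUG` switch. The function names `check_packet_len` and `assertions_enabled` are hypothetical and are not PJPROJECT code; that PJPROJECT's own assertion macro behaves the same way is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical packet-length check, for illustration only.
 * Returns 0 when the length is acceptable, -1 otherwise.
 */
int check_packet_len(size_t len, size_t max_len)
{
    /*
     * Build with assertions enabled: if this condition is false,
     * assert() calls abort(), the process receives SIGABRT, and the
     * system typically writes a core dump.
     * Build with -DNDEBUG: this line expands to nothing.
     */
    assert(len <= max_len);

    /* Ordinary runtime check that is present in every build. */
    if (len > max_len) {
        return -1;
    }
    return 0;
}

/* Reports whether this file was compiled with assertions active. */
int assertions_enabled(void)
{
#ifdef NDEBUG
    return 0;
#else
    return 1;
#endif
}
```

Compiled normally, a call such as `check_packet_len(2000, 1500)` aborts the whole process at the `assert()` line. Compiled with `-DNDEBUG`, the same call returns -1 and the program keeps running. That difference is why unexpected input can bring down a build with assertions enabled, while a build with assertions disabled handles the same input through its normal error paths.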

By: Ted Pukas (nivindel) 2016-10-05 15:46:35.351-0500

Thank you Richard for your expertise and patience.  I will investigate recompiling.

By: Ted Pukas (nivindel) 2016-10-06 13:59:19.863-0500

Just an update: we believe that we may have isolated this issue.  We have a Polycom SoundPoint VVX310 unit which is located in a remote office. This phone is directly attached to a TP-Link Wireless bridge model TL-WA901ND (no ethernet connection is available in the room with the phone).  Since this phone has been disconnected the crashes have stopped.  We believe that somehow the TP-Link bridge is causing malformed SIP packets/requests and causing the core dump within PJSIP.  My understanding is that this is also causing the phone to reboot/crash.  We are going on 3 hours with no crash within Asterisk after disconnecting the phone, which seems like a new record.  We are going to test for 24 hours and I will comment back then.

Potentially Problematic Network Setup is as follows:

Asterisk < E(NAT) > Router < > Internet Backbone < > Router < W(NAT) > TP Link Bridge < E > Polycom VVX310 Phone

NAT = Network Address Translation
E = Ethernet Connection
W = Wireless Connection

By: Ted Pukas (nivindel) 2016-10-07 10:34:05.820-0500

24 Hours Later -- No Crashes, so it was 99.9% the TP-Link unit.  Going to replace with a different Wireless Bridge to see if that solves the problem.

By: Joshua C. Colp (jcolp) 2016-10-09 10:39:42.614-0500

Per the comments from Richard, this is a result of pjproject being built in developer mode, with assertions enabled, instead of with assertions disabled. The underlying assertion was triggered by the network conditions you saw, which caused traffic to arrive unexpectedly. As this is not a bug, I'm closing it out.