Summary: ASTERISK-26387: Asterisk segfaults shortly after starting even with no active calls.
Reporter: Harley Peters (hpeters63)
Labels:
Date Opened: 2016-09-18 14:40:39
Date Closed: 2016-10-31 11:38:22
Priority: Major
Regression?:
Status: Closed/Complete
Components:
Versions: 13.11.0 13.11.1 13.11.2
Frequency of Occurrence: Constant
Related Issues:
is related to ASTERISK-26344 Asterisk 13.11.0 + PJSIP crash
is related to ASTERISK-26516 pjsip: Memory corruption with possible memory leak.
is related to ASTERISK-26517 pjsip: Memory corruption with possible memory leak.
Environment: Debian 8.5
Attachments:
( 0) backtrace.txt
( 1) backtrace.txt
( 2) backtrace-GIT-13-64b43d4M.txt
( 3) backtrace-GIT-13-64b43d4-with-console-logging-files.txt
( 4) combined-GIT-13-64b43d4.log
( 5) console-GIT-13-64b43d4.log
( 6) debug-GIT-13-64b43d4.log
( 7) gdb.txt
( 8) gdb.txt
( 9) security-GIT-13-64b43d4.log
Description: Asterisk segfaults shortly (within minutes) after starting, even with no active calls.
segfault at 78 ip 00007fbeec6a0da4 sp 00007fbed27b2960 error 4 in libasteriskpj.so.2[7fbeec600000+16a000]

traps: asterisk[25526] general protection ip:7f1cd6ed9274 sp:7f1cbc0d9b50 error:0 in libpthread-2.19.so[7f1cd6ecf000+18000]

Asterisk 13.10.0 runs fine.
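
The first kernel log line above can be decoded a little further. The following is a minimal sketch, assuming the usual x86 Linux "segfault at ... ip ... error ... in LIB[BASE+SIZE]" layout and the standard x86 page-fault error-code bits; the helper name `decode_segfault` and the returned field names are made up for illustration and are not part of Asterisk or pjproject.

```python
import re

# Assumed layout of the x86 Linux kernel message quoted in the description:
#   segfault at ADDR ip IP sp SP error CODE in LIB[BASE+SIZE]
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its parts (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Instruction pointer relative to where the library was mapped.
        "offset_in_library": ip - base,
        # x86 page-fault error code: bit 0 = page was present (protection
        # fault), bit 1 = write access, bit 2 = fault raised in user mode.
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    info = decode_segfault(
        "segfault at 78 ip 00007fbeec6a0da4 sp 00007fbed27b2960 "
        "error 4 in libasteriskpj.so.2[7fbeec600000+16a000]"
    )
    print(info["library"], hex(info["offset_in_library"]))
```

Under these assumptions the line describes a user-mode read of an unmapped page at address 0x78, which is often what a field access through a NULL pointer looks like, at offset 0xa0da4 into libasteriskpj.so.2. With debug symbols available, that offset can typically be passed to a tool such as addr2line to find the source line, though a full backtrace remains the more reliable route.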
Comments:

By: Asterisk Team (asteriskteam) 2016-09-18 14:40:40.736-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: George Joseph (gjoseph) 2016-09-18 16:41:35.696-0500

The backtrace has no useful information in it.  You'll need to recompile with at least the DONT_OPTIMIZE and BETTER_BACKTRACES flags and make sure you start asterisk with the '-g' option.

Follow the guidelines here:
https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines
https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

Also, were you upgrading from an earlier version?
Were you changing from external pjproject to the internal bundled version?
Can you verify that the asterisk modules directory was cleaned out?




By: Harley Peters (hpeters63) 2016-09-18 18:21:54.186-0500

Not sure what happened to the first backtrace. I did this one the exact same way, but it looks a tad better.


By: Harley Peters (hpeters63) 2016-09-18 18:26:54.412-0500

I was upgrading from version 13.10.0.
Both are using the internal bundled version of pjproject.
I didn't clear out the modules directory the first time, but I retried with it cleared this time and it made no difference.
There have been no configuration changes by me between the previous working version and 13.11.


By: Richard Mudgett (rmudgett) 2016-10-11 15:50:06.259-0500

I have been unable to find a cause for the crash.  A patch was recently committed to the v13 branch to help gather more information.  The patch extends MALLOC_DEBUG to cover PJPROJECT memory pools when you use the --with-pjproject-bundled Asterisk configure script flag.

./configure --with-pjproject-bundled --enable-dev-mode

https://gerrit.asterisk.org/#/c/4030/

By: Harley Peters (hpeters63) 2016-10-11 22:10:23.014-0500

This is an updated backtrace from the GIT-13-64b43d4M version,
configured with ./configure --with-pjproject-bundled --enable-dev-mode

By: Joshua C. Colp (jcolp) 2016-10-12 05:34:30.640-0500

Is it possible to also get the Asterisk console log with debug enabled (debug in logger.conf) and a high debug level?

By: Harley Peters (hpeters63) 2016-10-12 10:12:36.659-0500

This is an updated backtrace from the GIT-13-64b43d4 version,
configured with ./configure --with-pjproject-bundled --enable-dev-mode

Including log files.


By: Harley Peters (hpeters63) 2016-10-17 16:43:49.577-0500

I have discovered the pjproject changeset (5349) that is causing the segfault. It appears to have to do with DNS.
If I revert the 5349 change, then Asterisk built against pjproject 2.5.5 runs stable without ever causing a segfault.
Unfortunately it is a rather large change and I have no idea why it's causing Asterisk to segfault.
Am I the only one with this problem?
There is nothing unusual about my system's DNS. It just uses a dnsmasq caching server which queries the Google DNS servers.


By: Richard Mudgett (rmudgett) 2016-10-17 18:09:20.056-0500

Both ASTERISK-26387 and ASTERISK-26344 are likely the same issue.  I have been studying the logs from both issues.  I have noticed that on the ASTERISK-26387 logs there are OPTIONS ping responses for endpoint qualification that are being processed by a different serializer than the request.  This can cause reentrancy problems (e.g. crashes).  The outgoing OPTIONS requests go out on a pjsip/default serializer while the response is processed by a pjsip/distributor serializer because the distributor cannot find the original serializer that sent the request.  I also noticed that when this happened the updated contact status was reporting for an endpoint that needed DNS resolution (sip:sbc.anveno.com was one).  On the ASTERISK-26344 logs a similar thing is happening but I see it for outbound registration requests needing DNS resolution.  The REGISTER response is being processed by a pjsip/distributor serializer while the request went out on a pjsip/outreg serializer.
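
The routing problem described above can be pictured with a toy model. This is only a sketch under stated assumptions: `Serializer`, `route_response`, and `demo` are hypothetical names, the lookup table stands in for whatever the distributor uses to find the sending serializer, and none of this is Asterisk's actual taskprocessor code. A serializer is modelled as a single worker thread that runs its tasks one at a time, so work is only ordered relative to other work on the same serializer.

```python
import queue
import threading


class Serializer:
    """Toy serializer: runs pushed tasks one at a time on its own thread."""

    def __init__(self, name):
        self.name = name
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel pushed by shutdown()
                break
            task(self)

    def push(self, task):
        self._tasks.put(task)

    def shutdown(self):
        self._tasks.put(None)
        self._thread.join()


def route_response(request_key, sent_on, fallback):
    """Return the serializer that sent the request, or the fallback if unknown."""
    return sent_on.get(request_key, fallback)


def demo():
    default = Serializer("pjsip/default")
    distributor = Serializer("pjsip/distributor")
    sent_on = {}   # request key -> serializer that sent it (assumed bookkeeping)
    handled = []
    lock = threading.Lock()

    def record(label):
        def task(serializer):
            with lock:
                handled.append((label, serializer.name))
        return task

    # Case 1: the sender is remembered, so the response stays on its serializer.
    sent_on["options-1"] = default
    default.push(record("request options-1"))
    route_response("options-1", sent_on, distributor).push(record("response options-1"))

    # Case 2: the sender cannot be found, so the response lands on the
    # fallback serializer and may run concurrently with the sender's work.
    default.push(record("request options-2"))
    route_response("options-2", sent_on, distributor).push(record("response options-2"))

    default.shutdown()
    distributor.shutdown()
    return dict(handled)


if __name__ == "__main__":
    for label, name in sorted(demo().items()):
        print(label, "->", name)
```

In the second case the request and its response run on different threads with no ordering between them, which is the kind of situation where shared per-endpoint or per-registration state can be touched from two places at once.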

By: George Joseph (gjoseph) 2016-10-17 19:16:00.075-0500

That 5349 patchset has broken a few things.  I remember going through a git bisect on pjproject last month and winding up at the same place.  That resulted in commit 6a5683cc277365ebb29ac4818ef96512a7b7a4ab



By: Friendly Automation (friendly-automation) 2016-10-28 17:15:15.629-0500

Change 4228 had a related patch set uploaded by Richard Mudgett:
bundled pjproject: Crashes while resolving DNS names.

[https://gerrit.asterisk.org/4228|https://gerrit.asterisk.org/4228]

By: Friendly Automation (friendly-automation) 2016-10-28 17:15:37.078-0500

Change 4229 had a related patch set uploaded by Richard Mudgett:
bundled pjproject: Crashes while resolving DNS names.

[https://gerrit.asterisk.org/4229|https://gerrit.asterisk.org/4229]

By: Friendly Automation (friendly-automation) 2016-10-28 17:15:54.209-0500

Change 4230 had a related patch set uploaded by Richard Mudgett:
bundled pjproject: Crashes while resolving DNS names.

[https://gerrit.asterisk.org/4230|https://gerrit.asterisk.org/4230]

By: Friendly Automation (friendly-automation) 2016-10-31 11:38:23.595-0500

Change 4228 merged by Joshua Colp:
bundled pjproject: Crashes while resolving DNS names.

[https://gerrit.asterisk.org/4228|https://gerrit.asterisk.org/4228]

By: Friendly Automation (friendly-automation) 2016-10-31 12:45:04.125-0500

Change 4229 merged by zuul:
bundled pjproject: Crashes while resolving DNS names.

[https://gerrit.asterisk.org/4229|https://gerrit.asterisk.org/4229]

By: Friendly Automation (friendly-automation) 2016-10-31 12:45:09.944-0500

Change 4230 merged by zuul:
bundled pjproject: Crashes while resolving DNS names.

[https://gerrit.asterisk.org/4230|https://gerrit.asterisk.org/4230]