Summary: ASTERISK-26706: Segfault in dial_target_free stasis_channels.c:1349
Reporter: Ross Beer (rossbeer)
Labels:
Date Opened: 2017-01-09 05:26:17.000-0600
Date Closed: 2017-02-13 10:29:30.000-0600
Priority: Major
Regression?:
Status: Closed/Complete
Components: Core/Stasis
Versions: 13.13.1
Frequency of Occurrence: Occasional
Related Issues:
is related to ASTERISK-26707 Segfault ast_json_free (p=0x7fb100000002) at json.c:190
is related to ASTERISK-26713 Segfault when removing object from cache
Environment: Fedora 23
Attachments:
( 0) backtrace_20160109_clean.txt
( 1) backtrace_20JAN17.txt
Description: Segfault dial_target_free (doomed=0x7efd5c007c10) at stasis_channels.c:1349

Comments:

By: Asterisk Team (asteriskteam) 2017-01-09 05:26:17.922-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Rusty Newton (rnewton) 2017-01-09 19:33:54.127-0600

I see the crash is in free. Will you be able to get valgrind or MALLOC_DEBUG output for this crash?

https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace#GettingaBacktrace-IdentifyingPotentialMemoryCorruption

By: Ross Beer (rossbeer) 2017-01-10 04:00:26.281-0600

I am unable to debug memory as the issue is on a production server.

By: Joshua C. Colp (jcolp) 2017-01-17 10:32:21.119-0600

As these issues all appear to be a result of memory corruption, I'm tracking them solely under this issue.

By: Richard Mudgett (rmudgett) 2017-01-17 10:54:31.988-0600

[~rossbeer] The best thing you can do to help find these memory corruption issues is to run with MALLOC_DEBUG \[1] on the production system.  Performance isn't degraded too much and it does a pretty good job of pointing out what is getting corrupted.  Otherwise these backtraces don't help too much with the possible exception of the one that crashed in free().

\[1] https://wiki.asterisk.org/wiki/display/AST/MALLOC_DEBUG+Compiler+Flag
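
To illustrate why a debugging allocator can point out what is getting corrupted while a plain backtrace cannot, here is a minimal sketch of the general guard-byte ("fence") technique. It is not Asterisk's MALLOC_DEBUG implementation; the names guarded_alloc, guarded_free, FENCE_LEN and FENCE_BYTE are hypothetical and exist only for this example. Each allocation is wrapped in known byte patterns, and the patterns are checked when the block is freed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fence parameters; any recognizable pattern works. */
#define FENCE_LEN  8
#define FENCE_BYTE 0xA5

/* Bookkeeping stored directly in front of each user buffer. */
struct guard_hdr {
    size_t size;                        /* user bytes requested */
    unsigned char low_fence[FENCE_LEN]; /* guards against underruns */
};

/* The low fence must end exactly where the user data starts. */
static_assert(sizeof(struct guard_hdr) == sizeof(size_t) + FENCE_LEN,
              "unexpected padding after low fence");

/* Allocate size bytes surrounded by a low and a high fence. */
void *guarded_alloc(size_t size)
{
    struct guard_hdr *hdr;
    unsigned char *user;

    if (size > SIZE_MAX - sizeof(*hdr) - FENCE_LEN) {
        return NULL; /* total size would overflow */
    }
    hdr = malloc(sizeof(*hdr) + size + FENCE_LEN);
    if (!hdr) {
        return NULL;
    }
    hdr->size = size;
    memset(hdr->low_fence, FENCE_BYTE, FENCE_LEN);
    user = (unsigned char *)(hdr + 1);
    memset(user + size, FENCE_BYTE, FENCE_LEN); /* high fence */
    return user;
}

/* Return 1 if every byte of the fence still holds the pattern. */
static int fence_intact(const unsigned char *fence)
{
    size_t i;

    for (i = 0; i < FENCE_LEN; i++) {
        if (fence[i] != FENCE_BYTE) {
            return 0;
        }
    }
    return 1;
}

/*
 * Free a block from guarded_alloc().
 * Returns 0 if both fences are intact, -1 if either was overwritten.
 * The memory is released in both cases.
 */
int guarded_free(void *ptr)
{
    unsigned char *user = ptr;
    struct guard_hdr *hdr;
    int ok;

    if (!ptr) {
        return 0;
    }
    hdr = (struct guard_hdr *)ptr - 1;
    ok = fence_intact(hdr->low_fence) && fence_intact(user + hdr->size);
    free(hdr);
    return ok ? 0 : -1;
}
```

In this sketch, writing one byte past the end of an 8-byte buffer lands in the high fence, so guarded_free() reports the damage for that specific allocation instead of crashing later somewhere unrelated. A real debugging allocator would typically also record where each block was allocated so the report can name the owner.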

By: Ross Beer (rossbeer) 2017-01-19 04:23:19.627-0600

I believe these issues may be related to issue ASTERISK-26731 as the servers which use 'full_backend_cache=yes' are the servers crashing. Servers which do not use this sorcery option are not having any such issue.

I am going to disable 'full_backend_cache' to test this theory.

By: Rusty Newton (rnewton) 2017-01-19 12:47:28.967-0600

Thanks, let us know what you find.

By: Ross Beer (rossbeer) 2017-01-20 08:57:11.950-0600

Another segfault; this looks to be memory related.

[Edit by Rusty - Don't post full traces inline, please attach to the issue with .txt extension. Attached as backtrace_20JAN17.txt]

By: Rusty Newton (rnewton) 2017-01-23 09:45:32.443-0600

Did this one happen with full_backend_cache disabled?

By: Ross Beer (rossbeer) 2017-01-23 09:52:35.965-0600

Yes, the full backend cache was disabled at the time.

By: Richard Mudgett (rmudgett) 2017-01-24 14:45:29.190-0600

The last backtrace is not very useful either.  Asterisk is optimized so the call stack makes no sense and the line numbers don't even match up with the reported Asterisk v14.2.1.

By: Rusty Newton (rnewton) 2017-01-24 16:28:51.514-0600

Ross, I'm not sure we'll be able to do anything here without MALLOC_DEBUG output.

By: Ross Beer (rossbeer) 2017-01-26 05:15:10.090-0600

I'm going to try updating from git once all of the reviews submitted by Richard Mudgett regarding memory corruption have been merged in Gerrit.

As this looks related to memory corruption, I hope that these resolve the issue.

Asterisk has been compiled with DONT_OPTIMIZE, so I am unsure why the backtrace appears to be optimized.

By: Ross Beer (rossbeer) 2017-02-13 10:12:17.898-0600

I have not had a reoccurrence of this issue since upgrading.

Can we resolve the ticket? I will re-open it if the issue happens again.

By: Richard Mudgett (rmudgett) 2017-02-13 10:29:30.290-0600

Closing per reporter request.