Summary: ASTERISK-27300: Asterisk crashes randomly (FRACK!, chan_sip)
Reporter: Alex A. Welzl (awelzl)
Labels:
Date Opened: 2017-09-28 03:05:39
Date Closed: 2020-01-14 11:14:15.000-0600
Priority: Major
Regression?:
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 13.13.0
Frequency of Occurrence: Occasional
Related Issues:
is related to ASTERISK-27321 Asterisk Crashing with FRACK Errors and Serious Network Trouble
is related to ASTERISK-27412 core: Audiohook freeing interpolated frame when it shouldn't.
Environment: 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u3 (2017-08-15) x86_64 OpenSSL 1.0.1t 1032 sip peers (chan_sip with TLS/SRTP only)
Attachments: ( 0) backtrace.txt
( 1) backtrace2.txt
Description: Asterisk crashes randomly (currently 1-2 times per week). Is this the same issue as ASTERISK-26699? If it is not, upgrading to another version would make no sense.
{noformat}
[2017-09-27 19:48:13] ERROR[11520] astobj2.c: FRACK!, Failed assertion bad magic number 0x0 for object 0x45f6410 (0)
[2017-09-27 19:48:13] VERBOSE[11520] logger.c: Got 16 backtrace records
[2017-09-27 19:48:13] VERBOSE[11520] logger.c: #0: [0x45b777] /usr/sbin/asterisk(__ao2_ref+0x1a7) [0x45b777]
[2017-09-27 19:48:13] VERBOSE[11520] logger.c: #1: [0x45da2a] /usr/sbin/asterisk() [0x45da2a]
[2017-09-27 19:48:13] VERBOSE[11520] logger.c: #2: [0x45e4a2] /usr/sbin/asterisk(__ao2_callback_data+0x12) [0x45e4a2]
[2017-09-27 19:48:13] VERBOSE[11520] logger.c: #3: [0x7f6dbeb5a163] /usr/lib/asterisk/modules/chan_sip.so(+0x68163) [0x7f6dbeb5a163]
[2017-09-27 19:48:13] VERBOSE[11520] logger.c: #4: [0x7f6dbeb5ad95] /usr/lib/asterisk/modules/chan_sip.so(+0x68d95) [0x7f6dbeb5ad95]
[2017-09-27 19:48:13] VERBOSE[11520] logger.c: #5: [0x7f6dbeb87911] /usr/lib/asterisk/modules/chan_sip.so(+0x95911) [0x7f6dbeb87911]
[2017-09-27 19:48:13] VERBOSE[11520] logger.c: #6: [0x7f6dbeb8a1cc]
{noformat}
Comments:

By: Asterisk Team (asteriskteam) 2017-09-28 03:05:40.917-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Richard Mudgett (rmudgett) 2017-09-28 09:59:21.209-0500

A FRACK is not a crash, though whatever causes it could lead to one.  The only information the FRACK you reported provides is that something in chan_sip is using an ao2 object after it was destroyed.  The FRACK backtrace records will be useful if you enable BETTER_BACKTRACES in menuselect.  Enabling DONT_OPTIMIZE in menuselect would also help.

This FRACK has nothing to do with ASTERISK-26699 as that FRACK was for chan_pjsip/res_pjsip not chan_sip.

Also be aware that chan_sip is under extended support and relies on the Asterisk community.
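
To illustrate the comment above, here is a minimal sketch of how a reference-counted object can carry a magic number so that use after destruction is detected and logged instead of going unnoticed. This is not Asterisk's astobj2 implementation: the names `struct obj`, `obj_alloc`, `obj_is_valid`, `obj_ref`, and the value of `OBJ_MAGIC` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not Asterisk's astobj2 code. */
#define OBJ_MAGIC 0xC0FFEE01u   /* arbitrary value chosen for this sketch */

struct obj {
    unsigned int magic;   /* OBJ_MAGIC while alive, zeroed on destroy */
    int refcount;
};

/* Allocate an object that starts with one reference. */
struct obj *obj_alloc(void)
{
    struct obj *o = calloc(1, sizeof(*o));
    assert(o != NULL);
    o->magic = OBJ_MAGIC;
    o->refcount = 1;
    return o;
}

/* Return 1 if the object looks alive; otherwise log and return 0. */
int obj_is_valid(const struct obj *o)
{
    if (o == NULL || o->magic != OBJ_MAGIC) {
        fprintf(stderr, "bad magic number 0x%x for object %p\n",
                o ? o->magic : 0u, (const void *)o);
        return 0;
    }
    return 1;
}

/*
 * Adjust the reference count by delta.
 * Returns the previous count, or -1 if the object is stale.
 */
int obj_ref(struct obj *o, int delta)
{
    int prev;

    if (!obj_is_valid(o)) {
        return -1;
    }
    prev = o->refcount;
    o->refcount += delta;
    if (o->refcount <= 0) {
        /* Mark the object destroyed. Real code would also free it here;
         * the sketch keeps the memory so the stale access stays defined. */
        o->magic = 0;
    }
    return prev;
}
```

Once the last reference is dropped, a later `obj_ref()` call on the same pointer returns -1 and logs a "bad magic number 0x0" line. Because the sketch never frees the storage, the stale access is harmless here. In real code the memory would already have been released, which is why the same mistake can escalate from a log message to a crash.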

By: Rusty Newton (rnewton) 2017-09-29 09:55:36.726-0500

Please read through the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines]. Attach any new debug or traces to the issue as txt files.

I'm setting this issue to Waiting For Feedback as we need further traces and logs.

By: Rusty Newton (rnewton) 2017-09-29 09:55:47.035-0500

Thank you for the crash report. However, we need more information to investigate the crash. Please provide:

1. A backtrace generated from a core dump using the instructions provided on the Asterisk wiki [1].
2. Specific steps taken that lead to the crash.
3. All configuration information necessary to reproduce the crash.

Thanks!

[1]: https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace



By: Alex A. Welzl (awelzl) 2017-09-30 04:24:19.812-0500

It will take some days to gather the requested details as I have to recompile the options into the system and wait until the issue occurs.

By: Asterisk Team (asteriskteam) 2017-10-17 12:00:01.264-0500

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

By: Alex A. Welzl (awelzl) 2017-10-23 01:20:23.664-0500

I have compiled Asterisk with all the necessary backtrace options and upgraded to version 13.13-cert6.
Now waiting for the crash to happen; current uptime is 3 days.

By: Asterisk Team (asteriskteam) 2017-10-23 01:20:24.021-0500

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: Richard Mudgett (rmudgett) 2017-10-23 11:11:07.090-0500

Why are you using Certified Asterisk [1]?  If you have a service level agreement then you should be contacting Digium Technical Support in accordance with the agreement.  Without a service level agreement, any fixes as a result of this issue will *not* trigger a new release of Certified Asterisk.

In addition, chan_sip is under extended support, which means it is supported by the Asterisk community.

[1] https://www.digium.com/products/asterisk/certified-asterisk

By: Alex A. Welzl (awelzl) 2017-10-23 14:39:27.197-0500

Thanks for your note. Yes, I am aware of that. We are planning to sign a service agreement soon.

By: Kevin Harwell (kharwell) 2017-10-24 16:30:41.012-0500

Going to move this back to "waiting on feedback" since we are still waiting on a backtrace and debug logs.

By: Alex A. Welzl (awelzl) 2017-10-26 14:12:10.619-0500

backtrace.txt from crash added.

By: Kevin Harwell (kharwell) 2017-10-26 15:19:17.899-0500

Looks like the peer goes away. You can see in Thread 323 that there is a FRACK on the peer, and then when it tries to use a reference from it, it crashes.

What else is happening on the system at the time? For instance, is the system reloading? Since this happens a couple of times a week, would it be possible to bump up debugging and attach a debug log? If the log ends up being large, we won't need the entire log, just the last few minutes before the crash.

Once it does crash again with debugging turned on, attach both the log and a new backtrace to the issue.

By: Alex A. Welzl (awelzl) 2017-10-26 15:30:24.199-0500

Thanks for the info.
What does "the peer goes away" mean?
The last documented crash happened during the night, so there was neither a reload nor high load on the system. I will enable debug logging and add the requested details after the next crash.

By: Kevin Harwell (kharwell) 2017-10-26 15:59:16.573-0500

{quote}What does "the peer goes away" mean?{quote}
It means the memory that stores the object that represents the peer is released/freed by the system.
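
As a rough illustration of that answer, and not a diagnosis of this issue, the sketch below shows a reference-counted peer whose memory is released when the last reference is dropped. The type `struct peer` and the helpers `peer_alloc`, `peer_ref`, and `peer_unref` are hypothetical and are not taken from chan_sip.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical peer type; not chan_sip's struct sip_peer. */
struct peer {
    int refcount;
    char name[32];
};

/* Allocate a peer holding one reference (the owner's). */
struct peer *peer_alloc(const char *name)
{
    struct peer *p = calloc(1, sizeof(*p));
    assert(p != NULL);
    p->refcount = 1;
    /* calloc zeroed the buffer, so the copy stays NUL-terminated. */
    strncpy(p->name, name, sizeof(p->name) - 1);
    return p;
}

/* Take an additional reference before using the peer. */
struct peer *peer_ref(struct peer *p)
{
    p->refcount++;
    return p;
}

/*
 * Drop one reference. Returns 1 if this call released the memory,
 * after which the pointer must not be used again.
 */
int peer_unref(struct peer *p)
{
    if (--p->refcount > 0) {
        return 0;
    }
    free(p);   /* the peer "goes away" here */
    return 1;
}
```

A user that calls `peer_ref()` before touching the peer keeps it alive even if the owner drops its reference in the meantime. A user that keeps a bare pointer without taking a reference ends up reading freed memory once the owner's `peer_unref()` returns 1.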

By: Alex A. Welzl (awelzl) 2017-11-07 11:01:39.213-0600

Added a new backtrace from the last crash.
Debug logging will be enabled shortly, and we'll wait for the next crash.
Does this backtrace show the same cause as the last one?

By: Richard Mudgett (rmudgett) 2017-11-07 11:20:07.074-0600

Yes.  It crashed in the exact same place.

By: Richard Mudgett (rmudgett) 2017-11-13 18:40:40.374-0600

ASTERISK-27412 has useful valgrind output that could be helpful concerning this FRACK.

By: Asterisk Team (asteriskteam) 2017-11-28 12:00:01.351-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines