Summary: ASTERISK-24885: Frequent Asterisk 13.2 core crash
Reporter: Oleg Maximenco (docusync)
Labels:
Date Opened: 2015-03-16 13:07:45
Date Closed: 2015-03-26 09:22:25
Priority: Critical
Regression?:
Status: Closed/Complete
Components:
Versions: 13.2.0
Frequency of Occurrence: Frequent
Related Issues: is related to ASTERISK-24941 Asterisk 13: ABI compatibility issue in res_pjsip_session breaks external modules
Environment: CentOS 7 / Kernel 3.10.0-123.2.e17.x96_64
Attachments:
( 0) backtrace-3-16_09am.txt
( 1) backtrace-3-16_12pm.txt
( 2) gdb-3-16_09am.txt
( 3) gdb-3-16_12pm.txt
Description: Asterisk crashes 3-5 times a day after upgrading from 13.1 to 13.2.

Comments:

By: Rusty Newton (rnewton) 2015-03-16 13:18:22.049-0500

Thank you for the crash report. However, we need more information to investigate the crash. Please provide:

1. A backtrace generated from a core dump using the instructions provided on the Asterisk wiki [1].
2. Specific steps taken that lead to the crash.
3. All configuration information necessary to reproduce the crash.

Thanks!

[1]: https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace



By: Rusty Newton (rnewton) 2015-03-16 13:18:46.363-0500

See the linked instructions. It is important that you get the trace without optimizations.

By: Rusty Newton (rnewton) 2015-03-16 13:26:21.876-0500

Additionally, are you already engaged with Digium technical support? The crash appears to involve DPMA.

By: Scott Griepentrog (sgriepentrog) 2015-03-16 13:43:53.390-0500

What is the version of DPMA being used?

By: Oleg Maximenco (docusync) 2015-03-16 15:25:09.008-0500

I'll try to recompile Asterisk with debug info enabled and produce another backtrace by the end of the week, but right now I'm thinking of downgrading to 13.1, at least temporarily, so the users will calm down a little.

Yes, I've already contacted Digium support about a problem related to transfers: in about 1 case out of 10 our employees experience a "frozen" transfer - a transferring person's extension stays forever on the phone's screen.

The DPMA version is 13.0.2.1.1_x86_64.

I've also started seeing this error message quite often (not sure whether it's related, but just in case):
[Mar 16 15:21:52] ERROR[25275]: taskprocessor.c:715 taskprocessor_push: tps is NULL!
[Mar 16 15:21:56] ERROR[19224]: taskprocessor.c:715 taskprocessor_push: tps is NULL!
[Mar 16 15:21:56] ERROR[29568]: taskprocessor.c:715 taskprocessor_push: tps is NULL!
[Mar 16 15:22:17] ERROR[24394]: taskprocessor.c:715 taskprocessor_push: tps is NULL!
[Mar 16 15:22:26] ERROR[24394]: taskprocessor.c:715 taskprocessor_push: tps is NULL!
[Mar 16 15:22:26] ERROR[19224]: taskprocessor.c:715 taskprocessor_push: tps is NULL!

Edit: the error message shows up when a phone tries to register with the system, and whenever it appears the registration request fails. Hopefully a restart will help.

Edit2: The restart helped. This does look related to DPMA, because nothing other than the DPMA-based functionality was affected.
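
To see whether the "tps is NULL!" errors come from a few threads or from many, the log lines quoted above can be tallied per bracketed ID. The following is a minimal sketch, not part of the original report: the regular expression is fitted only to the six lines shown here, the bracketed number is assumed to be a thread ID, and `count_by_thread` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted in this comment:
# [Mar 16 15:21:52] ERROR[25275]: taskprocessor.c:715 taskprocessor_push: tps is NULL!
# Assumption: the number in brackets after the level is a thread ID.
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>\w+)\[(?P<tid>\d+)\]: "
    r"(?P<src>\S+):(?P<line>\d+) (?P<func>\w+): (?P<msg>.*)$"
)

# The six lines from the comment, copied verbatim.
SAMPLE = [
    "[Mar 16 15:21:52] ERROR[25275]: taskprocessor.c:715 taskprocessor_push: tps is NULL!",
    "[Mar 16 15:21:56] ERROR[19224]: taskprocessor.c:715 taskprocessor_push: tps is NULL!",
    "[Mar 16 15:21:56] ERROR[29568]: taskprocessor.c:715 taskprocessor_push: tps is NULL!",
    "[Mar 16 15:22:17] ERROR[24394]: taskprocessor.c:715 taskprocessor_push: tps is NULL!",
    "[Mar 16 15:22:26] ERROR[24394]: taskprocessor.c:715 taskprocessor_push: tps is NULL!",
    "[Mar 16 15:22:26] ERROR[19224]: taskprocessor.c:715 taskprocessor_push: tps is NULL!",
]


def count_by_thread(lines, needle="tps is NULL!"):
    """Count log lines whose message contains `needle`, keyed by bracketed ID."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and needle in m.group("msg"):
            counts[m.group("tid")] += 1
    return counts


if __name__ == "__main__":
    for tid, n in count_by_thread(SAMPLE).most_common():
        print(tid, n)
```

On the six quoted lines this yields four distinct IDs, two of which appear twice, so the error is not confined to a single thread in this sample.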

By: Ross Beer (rossbeer) 2015-03-17 12:12:00.639-0500

I'm also seeing frequent crashes. The latest log message and the kernel messages show:

[Mar 17 16:22:22] ERROR[34986]: pjsip:0 <?>:          except.c .....!!!FATAL: unhand

Mar 17 10:19:30 inbound01 kernel: asterisk[24413]: segfault at 40 ip 00002b954241c478 sp 00002b954c209450 error 4 in libc-2.12.so[2b95423ea000+18a000]
Mar 17 14:27:51 inbound01 kernel: asterisk[31135] general protection ip:2b380a7c4710 sp:2b3b3de0d548 error:0 in libc-2.12.so[2b380a73b000+18a000]
Mar 17 16:22:22 inbound01 kernel: asterisk[34986]: segfault at 40 ip 00002b81e6bc1478 sp 00002b86967c8450 error 4 in libc-2.12.so[2b81e6b8f000+18a000]

The first is related to a lack of memory, so I think there may be a leak.
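
In the kernel lines above, the `ip` value is the faulting instruction address, and the bracketed pair after the library name is its load address and mapped size. Subtracting the load address from `ip` gives an offset inside `libc-2.12.so` that can be compared across crashes even though the library is mapped at a different address each time. Separately, "segfault at 40" reports a fault address of 0x40, and error code 4 denotes a user-mode read of an unmapped page. The sketch below is an illustration rather than anything posted in the thread; the regular expression covers only the two message shapes quoted here, and `fault_offset` is a hypothetical helper name.

```python
import re

# Covers the two shapes quoted above:
#   "... segfault at 40 ip 00002b95... sp ... error 4 in libc-2.12.so[base+size]"
#   "... general protection ip:2b38... sp:... error:0 in libc-2.12.so[base+size]"
KERNEL_FAULT = re.compile(
    r"ip[: ](?P<ip>[0-9a-f]+).* in (?P<lib>\S+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# The three kernel lines from the comment, copied verbatim.
SAMPLE = [
    "Mar 17 10:19:30 inbound01 kernel: asterisk[24413]: segfault at 40 ip 00002b954241c478 sp 00002b954c209450 error 4 in libc-2.12.so[2b95423ea000+18a000]",
    "Mar 17 14:27:51 inbound01 kernel: asterisk[31135] general protection ip:2b380a7c4710 sp:2b3b3de0d548 error:0 in libc-2.12.so[2b380a73b000+18a000]",
    "Mar 17 16:22:22 inbound01 kernel: asterisk[34986]: segfault at 40 ip 00002b81e6bc1478 sp 00002b86967c8450 error 4 in libc-2.12.so[2b81e6b8f000+18a000]",
]


def fault_offset(line):
    """Return (library, hex offset of ip within the library), or None if no match."""
    m = KERNEL_FAULT.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return m.group("lib"), hex(ip - base)


if __name__ == "__main__":
    for line in SAMPLE:
        print(fault_offset(line))
```

For the three quoted lines, the two "segfault at 40" entries resolve to the same offset, 0x32478, and the general protection fault resolves to 0x89710. That suggests the two segfaults hit the same libc instruction, although only a backtrace can show which Asterisk code path led there.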

By: Oleg Maximenco (docusync) 2015-03-17 16:24:40.188-0500

Rolling back to 13.1-cert1 has fixed the crashes. All other libraries (pjproject, libpri, dahdi, digium_phone, etc.) are unchanged, so I guess that makes it an Asterisk issue? I'll try to recompile 13.2 without optimizations on Tuesday; I just don't want to make our folks mad again so soon... :)

By: Rusty Newton (rnewton) 2015-03-26 09:22:06.715-0500

Oleg, thanks for the additional information.

Since this crash is related to DPMA, please contact Digium technical support and pursue this issue with them. They will take your new crash debug and ask for any other information they need to investigate the problem.

http://www.digium.com/en/support/contact

You might send them the link to this issue so they can take a look at what you already have here as well.

By: Matt Jordan (mjordan) 2015-04-06 11:00:49.916-0500

Note that this issue is probably related to ASTERISK-24941. A fix for the ABI issue that broke external modules that depend on PJSIP is going to go out today in 13.2.1/13.3.1.