Summary: ASTERISK-23339: Segfault in __ao2_find at astobj2.c, in find_interface at format.c
Reporter: David Brillert (aragon)
Labels:
Date Opened: 2014-02-21 08:52:55.000-0600
Date Closed: 2014-02-24 12:44:21.000-0600
Priority: Critical
Regression?
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 11.6.0 11.7.0
Frequency of Occurrence: Occasional
Related Issues:
is the original version of this clone: ASTERISK-22875 CLONE - Segfault in __ao2_find ()
duplicates: ASTERISK-23103 [patch] Crash in ast_format_cmp, in ao2_find
Environment: CentOS 5.9 64-bit
Attachments:
( 0) backtrace_unoptimized_feb_20_2014.txt
( 1) backtrace_unoptimized_feb_24_2014.txt
( 2) backtrace_unoptimized_feb_24_20141.txt
Description:
Segfault. Backtrace attached.
Asterisk was compiled with DONT_OPTIMIZE and BETTER_BACKTRACES

{noformat}
Program terminated with signal 11, Segmentation fault.
#0  0x000000000044f52a in __ao2_find (c=0x0, arg=0x2b7d71548120, flags=OBJ_POINTER) at astobj2.c:1237
1237    astobj2.c: No such file or directory.
       in astobj2.c
(gdb) bt
#0  0x000000000044f52a in __ao2_find (c=0x0, arg=0x2b7d71548120, flags=OBJ_POINTER) at astobj2.c:1237
#1  0x00000000004ec739 in find_interface (format=0x191fb94c) at format.c:107
#2  0x00000000004ed086 in format_cmp_helper (format1=0x191fb94c, format2=0x193abb18) at format.c:314
#3  0x00000000004ed1a0 in ast_format_cmp (format1=0x191fb94c, format2=0x193abb18) at format.c:339
#4  0x00000000004f0c4d in cmp_cb (obj=0x193abb18, arg=0x191fb94c, flags=8) at format_cap.c:56
#5  0x000000000044ef42 in internal_ao2_callback (c=0x19668558, flags=OBJ_POINTER, cb_fn=0x4f0c1d, arg=0x191fb94c, data=0x0,
   type=DEFAULT, tag=0x0, file=0x0, line=0, func=0x0) at astobj2.c:1101
{noformat}
Comments:

By: David Brillert (aragon) 2014-02-21 08:55:36.538-0600

I have reported this before, but I couldn't get the Asterisk RPMs to compile properly with some build options we used to use for 1.4 and 1.8.
Removing those build options allows me to get a nice backtrace :D Attached.
Had a production server dump on me 3 times yesterday in the same place.

By: Matt Jordan (mjordan) 2014-02-21 09:12:24.887-0600

Unfortunately, there's still a large number of symbols that aren't in your libraries:

{noformat}
#0  0x000000000044f51a in __ao2_find ()
No symbol table info available.
#1  0x00000000004ec5dd in find_interface ()
No symbol table info available.
#2  0x00000000004ecf2a in format_cmp_helper ()
No symbol table info available.
#3  0x00000000004ed044 in ast_format_cmp ()
No symbol table info available.
#4  0x00000000004f0af1 in cmp_cb ()
No symbol table info available.
#5  0x000000000044ef32 in internal_ao2_callback ()
No symbol table info available.
#6  0x000000000044f3c5 in __ao2_callback ()
No symbol table info available.
#7  0x000000000044f52e in __ao2_find ()
No symbol table info available.
#8  0x00000000004f121e in ast_format_cap_iscompatible ()
No symbol table info available.
{noformat}

We'll really need the debug symbols in everything to find out what is actually occurring.

That aside, it is exceedingly odd to see something crashing in a format comparison. What format modules are you using? Are they compatible with the version of Asterisk that you're using?

By: Corey Farrell (coreyfarrell) 2014-02-21 09:21:51.214-0600

Also please post a backtrace for all threads: 'thread apply all bt'.

By: David Brillert (aragon) 2014-02-21 11:26:17.887-0600

Found this http://lists.rpm.org/pipermail/rpm-list/2009-January/000127.html
So I will rebuild the rpm with '%define __strip /bin/true', wait for the next crash, and upload backtraces including 'thread apply all bt'.

By: Corey Farrell (coreyfarrell) 2014-02-21 11:44:22.389-0600

If you are using a CentOS or Fedora based system, you most likely just need to 'yum install asterisk-debuginfo'.  The debug symbols are normally split off into a separate add-on RPM.

By: David Brillert (aragon) 2014-02-24 10:34:36.890-0600

Crash this morning, this backtrace looks legit.

By: David Brillert (aragon) 2014-02-24 10:35:16.834-0600

Backtrace attached

By: David Brillert (aragon) 2014-02-24 10:43:44.500-0600

Use 'backtrace unoptimized feb 24 2014(1).txt', it's cleaner.

By: Corey Farrell (coreyfarrell) 2014-02-24 12:42:53.923-0600

I've reviewed the updated backtrace and found that this was already reported as ASTERISK-23103.  This ticket will be closed as a duplicate; you can watch ASTERISK-23103 for progress.

Note this issue has only been seen during SHUTDOWN_FAST, which is triggered by sending a kill signal to asterisk.  You are likely using an init script to stop or restart while calls are active.

{noformat}
Thread 35 (Thread 0x2b7d39c82ba0 (LWP 11489)):
#0  0x0000003da98c8b37 in unlink () from /lib64/libc.so.6
No symbol table info available.
#1  0x000000000044592b in really_quit (num=0, niceness=SHUTDOWN_FAST, restart=0) at asterisk.c:1875
{noformat}

By: Matt Jordan (mjordan) 2014-02-24 12:45:13.139-0600

Yup. We're setting the {{interfaces}} container to NULL after cleaning it up, and not checking for its existence elsewhere.

Cleaning things up on exit is nice, but boy have there been ripple effects to busy systems on shut down :-P
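
For readers following along, here is a minimal C sketch of the failure mode described above: a global container is set to NULL during shutdown, and a lookup that does not check for it dereferences the NULL pointer (the `c=0x0` in frame #0 of the backtrace). This is not the actual Asterisk code or the patch from ASTERISK-23103. The `iface_container` type and the `interfaces_init`, `interfaces_shutdown`, and `find_interface_guarded` functions are hypothetical stand-ins that only illustrate the kind of guard that was missing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the ao2 container of format interfaces. */
struct iface_container {
    const char *names[4];
    size_t count;
};

static struct iface_container storage;

/* Global registry pointer; NULL before init and after shutdown cleanup. */
static struct iface_container *interfaces;

/* Hypothetical init: register two example interface names. */
void interfaces_init(void)
{
    storage.names[0] = "silk";
    storage.names[1] = "celt";
    storage.count = 2;
    interfaces = &storage;
}

/* Hypothetical shutdown: clean up, then set the global to NULL. */
void interfaces_shutdown(void)
{
    storage.count = 0;
    interfaces = NULL;
}

/*
 * Lookup that tolerates a missing container.
 * Returns 1 if the name is registered, 0 if it is not found
 * or if the container has already been torn down.
 */
int find_interface_guarded(const char *name)
{
    struct iface_container *c = interfaces;
    size_t i;

    if (c == NULL) {
        /* Without this check, c->count below would crash after shutdown. */
        return 0;
    }
    for (i = 0; i < c->count; i++) {
        if (strcmp(c->names[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Calling `find_interface_guarded("silk")` after `interfaces_shutdown()` returns 0 instead of crashing. A NULL check on its own does not make teardown safe when other threads are still running lookups; a real fix would presumably also need locking or reference counting, which this sketch leaves out.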