Summary: ASTERISK-22353: Random Asterisk Segmentation Fault
Reporter: Freetech Solutions (freetech)
Labels:
Date Opened: 2013-08-21 10:17:54
Date Closed: 2013-10-10 11:45:12
Priority: Major
Regression?:
Status: Closed/Complete
Components:
Versions: 1.8.23.0
Frequency of Occurrence:
Related Issues:
Environment:
CentOS release 5.7 (Final)
Kernel 2.6.18-238.12.1.el5 #1 SMP Tue May 31 13:23:01 EDT 2011 i686 i686 i386 GNU/Linux
Asterisk 1.8.23.0
DAHDI Version: 2.6.1 Echo Canceller: HWEC, OSLEC
libpri-1.4.14-0
libpri-devel-1.4.14-0
libopenr2-1.3.2-1
libopenr2-devel-1.3.2-1
Cards Installed: 03:08.0 Communication controller: Digium, Inc. Wildcard TE420 quad-span T1/E1/J1 card 3.3V (PCI-Express) (5th gen) (rev 02)
Elastix 2.4 web framework
Attachments:
( 0) backtrace.txt
( 1) full_21082013.gz
( 2) system.conf
Description:
Asterisk crashes at random intervals and leaves a core dump each time:

-rw------- 1 asterisk asterisk  26M Feb 13  2013 core.conci-elx.example.com-2013-02-13T18:01:49-0300
-rw------- 1 asterisk asterisk  539 Aug 20 19:41 core.conci-elx.example.com-2013-07-01T15:08:54-0300
-rw------- 1 asterisk asterisk  48M Jul  2 11:50 core.conci-elx.example.com-2013-07-02T11:50:07-0300
-rw------- 1 asterisk asterisk  58M Jul  8 18:35 core.conci-elx.example.com-2013-07-08T18:35:31-0300
-rw------- 1 asterisk asterisk  99M Jul 15 17:15 core.conci-elx.example.com-2013-07-15T17:15:04-0300
-rw------- 1 asterisk asterisk  86M Jul 18 12:23 core.conci-elx.example.com-2013-07-18T12:23:12-0300
-rw------- 1 asterisk asterisk  90M Jul 22 19:30 core.conci-elx.example.com-2013-07-22T19:30:07-0300
-rw------- 1 asterisk asterisk  43M Jul 23 11:23 core.conci-elx.example.com-2013-07-23T11:23:39-0300
-rw------- 1 asterisk asterisk  87M Aug 13 17:15 core.conci-elx.example.com-2013-08-13T17:15:40-0300
-rw------- 1 asterisk asterisk  544 Aug 20 19:13 core.conci-elx.example.com-2013-08-20T17:15:44-0300

The attached GDB output was taken from the file "core.conci-elx.example.com-2013-08-21T11:08:39-0300".

Full log at the moment of the coredump shows nothing abnormal:

[Aug 21 11:08:31] VERBOSE[4439] res_agi.c:     -- <DAHDI/9-1>AGI Script hangup.agi completed, returning 0
[Aug 21 11:08:31] VERBOSE[4439] pbx.c:     -- Executing [s@macro-hangupcall:51] Hangup("DAHDI/9-1", "") in new stack
[Aug 21 11:08:31] VERBOSE[4439] app_macro.c:   == Spawn extension (macro-hangupcall, s, 51) exited non-zero on 'DAHDI/9-1' in macro 'hangupcall'
[Aug 21 11:08:31] VERBOSE[4439] features.c:   == Spawn extension (ext-queues, h, 1) exited non-zero on 'DAHDI/9-1'
[Aug 21 11:08:32] VERBOSE[4851] pbx.c:     -- Executing [s@ivr-3:9] Set("DAHDI/24-1", "TIMEOUT(digit)=3") in new stack
[Aug 21 11:08:32] VERBOSE[4851] func_timeout.c:     -- Digit timeout set to 3.000
[Aug 21 11:08:32] VERBOSE[4851] pbx.c:     -- Executing [s@ivr-3:10] Set("DAHDI/24-1", "TIMEOUT(response)=3") in new stack
[Aug 21 11:08:32] VERBOSE[4851] func_timeout.c:     -- Response timeout set to 3.000
[Aug 21 11:08:32] VERBOSE[4851] pbx.c:     -- Executing [s@ivr-3:11] Set("DAHDI/24-1", "__IVR_RETVM=") in new stack
[Aug 21 11:08:32] VERBOSE[4851] pbx.c:     -- Executing [s@ivr-3:12] ExecIf("DAHDI/24-1", "1?Background(en/cc_01_central)") in new stack
[Aug 21 11:08:32] VERBOSE[4851] file.c:     -- <DAHDI/24-1> Playing 'en/cc_01_central.alaw' (language 'en')
[Aug 21 11:08:39] VERBOSE[4439] res_musiconhold.c:     -- Started music on hold, class 'freetech', on SIP/2003-00000000
[Aug 21 11:08:39] VERBOSE[4439] pbx.c:   == Spawn extension (ext-queues, 7005, 10) exited non-zero on 'DAHDI/9-1'
[Aug 21 11:08:39] DEBUG[4439] chan_dahdi.c: disconnecting MFC/R2 call on chan 9
[Aug 21 11:08:39] DEBUG[4439] chan_dahdi.c: ast cause 16 resulted in openr2 cause 6/Normal Clearing
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - Bits changed from 0x00 to 0x08
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - CAS Rx << [CLEAR FORWARD] 0x08
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - Call ended
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - CAS Tx >> [IDLE] 0x08
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - CAS Raw Tx >> 0x09
[Aug 21 11:08:39] VERBOSE[1169] chan_dahdi.c: MFC/R2 call end on channel 9
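
To check what each thread logged just before the crash, the excerpt can be split by the bracketed number after the log level. The following is a minimal sketch, assuming every line follows the `[timestamp] LEVEL[number] file.c: message` layout seen above and that the bracketed number identifies the logging thread. The names `parse_line` and `group_by_thread` are illustrative and are not part of Asterisk.

```python
import re
from collections import defaultdict

# Layout assumed from the excerpt: "[Mon DD HH:MM:SS] LEVEL[number] file.c: message"
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>[A-Z]+)\[(?P<thread>\d+)\] "
    r"(?P<source>[\w.]+): (?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def group_by_thread(lines):
    """Map each bracketed thread number to the messages it logged, in file order."""
    threads = defaultdict(list)
    for line in lines:
        fields = parse_line(line)
        if fields:
            threads[fields["thread"]].append(fields["message"].strip())
    return dict(threads)


# Three lines copied from the excerpt above.
sample = [
    "[Aug 21 11:08:39] DEBUG[4439] chan_dahdi.c: disconnecting MFC/R2 call on chan 9",
    "[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - Call ended",
    "[Aug 21 11:08:39] VERBOSE[1169] chan_dahdi.c: MFC/R2 call end on channel 9",
]
by_thread = group_by_thread(sample)
print(by_thread["1169"][-1])
```

Running the same grouping over the full log would show the last message from each thread, which makes it easier to see whether the crashing thread logged anything at all.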

This crash disconnects all current calls and logs off all dynamic agents involved.

Attached are backtrace.txt, which holds the gdb output of "thread apply all bt", and system.conf, an example of the DAHDI configuration.
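
For reference, a backtrace like the attached one can be produced non-interactively. This is a minimal sketch that only builds the command line and does not run it. The `-batch` and `-ex` options are standard gdb options, the binary path is the one shown in the core file header, and `core.example` is a placeholder for the real core file name. The helper name `gdb_backtrace_command` is illustrative.

```python
import shlex


def gdb_backtrace_command(binary, core_file):
    """Build a non-interactive gdb invocation that prints a backtrace for every thread."""
    return [
        "gdb", "-batch",
        "-ex", "thread apply all bt",
        binary, core_file,
    ]


# "core.example" is a placeholder, not one of the core files listed in this report.
cmd = gdb_backtrace_command("/usr/sbin/asterisk", "core.example")
print(" ".join(shlex.quote(part) for part in cmd))
```

Redirecting the output of the printed command to a file gives something that can be attached to the issue.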
Comments:

By: Freetech Solutions (freetech) 2013-08-21 10:19:46.652-0500

gdb output and system.conf dahdi configuration.

By: Freetech Solutions (freetech) 2013-08-21 11:07:49.703-0500

Full log gzipped.

By: Matt Jordan (mjordan) 2013-08-21 11:11:29.738-0500

From your backtrace, this appears to be happening from a call in {{libopenr2}}:

{noformat}
Core was generated by `/usr/sbin/asterisk -f -U asterisk -G asterisk -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
#0  0x0014c4e5 in vfprintf () from /lib/libc.so.6
#0  0x0014c4e5 in vfprintf () from /lib/libc.so.6
No symbol table info available.
#1  0x00156742 in fprintf () from /lib/libc.so.6
No symbol table info available.
#2  0x00949d59 in ?? () from /usr/lib/libopenr2.so.3
No symbol table info available.
#3  0x00000000 in ?? ()
No symbol table info available.
{noformat}

This may be a problem in that library and not Asterisk. Can you install that library with debug symbols?




By: Freetech Solutions (freetech) 2013-08-21 12:02:30.810-0500

(Sorry for the multiple comments.)

The system is in production; I only have the mfcr2 logging flag enabled, as in the example below for one of the telco frames (E1 spans):

usecallerid=yes
hidecallerid=no
callwaiting=no
usecallingpres=yes
callwaitingcallerid=no
threewaycalling=yes
transfer=yes
canpark=yes
cancallforward=yes
callreturn=yes
relaxdtmf=yes
rxgain=1.0
txgain=0.0
signalling=mfcr2
mfcr2_variant=ar
mfcr2_get_ani_first=no
mfcr2_max_ani=13
mfcr2_max_dnis=14
mfcr2_category=national_subscriber
mfcr2_call_files=yes
mfcr2_logdir=span1
mfcr2_logging=all
mfcr2_mfback_timeout=-1
mfcr2_metering_pulse_timeout=-1
group=0
faxdetect=both
faxbuffers=>12,half
context=from-pstn
channel => 1-15,17-31

I'm trying to find a matching error on the R2 side for the last thread:

Thread 1 (Thread 0xb6a4fb90 (LWP 899)): ----------------> would be THREAD 3064265616 in mfcr2 logs
#0 0x0014c4e5 in vfprintf () from /lib/libc.so.6
#1 0x00156742 in fprintf () from /lib/libc.so.6
#2 0x00949d59 in ?? () from /usr/lib/libopenr2.so.3
#3 0x00000000 in ?? ()
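
The mapping noted above between the gdb thread identifier and the number in the mfcr2 logs is a hexadecimal-to-decimal conversion. A minimal sketch of that arithmetic follows; the function name `thread_pointer_to_decimal` is illustrative, and it assumes the mfcr2 logs print the same value as an unsigned decimal.

```python
def thread_pointer_to_decimal(hex_id):
    """Convert a gdb thread identifier such as '0xb6a4fb90' to unsigned decimal."""
    return int(hex_id, 16)


print(thread_pointer_to_decimal("0xb6a4fb90"))  # prints 3064265616
```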

It may be a good idea to open up an issue for openR2 as well.

By: Rusty Newton (rnewton) 2013-09-05 18:37:40.279-0500

Let us know when you are able to get libopenr2 installed with debug symbols.

I'll contact the MFC/R2 Asterisk support maintainer to see if he has any ideas in the meantime.

By: Freetech Solutions (freetech) 2013-09-05 18:55:33.961-0500

Thanks, Newton. I have a maintenance window this Friday night. I plan to update libopenr2 (1.3.2* to 1.3.3*) and install it with debug symbols.

I will keep an eye on it all day Saturday and through next week, and report back ASAP.

Rgds,



By: Freetech Solutions (freetech) 2013-09-16 18:07:31.166-0500

We have not yet been able to reproduce the issue since updating libopenr2 and installing its debug symbols.
It has been running as expected for the whole past week.

It seems that the update from 1.3.2 to 1.3.3 solved the problem.
If there are no new reports, we assume this item can be closed.

Current libraries running:
libopenr2-1.3.3-0
libopenr2-devel-1.3.3-0
libopenr2-debuginfo-1.3.3-0

Rgds,

By: Rusty Newton (rnewton) 2013-09-23 22:41:37.410-0500

Give it another week and let us know. Thanks.

By: Freetech Solutions (freetech) 2013-10-10 09:38:04.900-0500

No new reports of coredumps for this environment.

Rgds,

By: Richard Mudgett (rmudgett) 2013-10-10 11:45:12.816-0500

Updating to a newer version of libopenr2 resolved the issue.