Summary: | ASTERISK-22353: Random Asterisk Segmentation Fault | ||
Reporter: | Freetech Solutions (freetech) | Labels: | |
Date Opened: | 2013-08-21 10:17:54 | Date Closed: | 2013-10-10 11:45:12 |
Priority: | Major | Regression? | |
Status: | Closed/Complete | Components: | |
Versions: | 1.8.23.0 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | CentOS release 5.7 (Final)
Kernel 2.6.18-238.12.1.el5 #1 SMP Tue May 31 13:23:01 EDT 2011 i686 i686 i386 GNU/Linux
Asterisk 1.8.23.0
DAHDI Version: 2.6.1 Echo Canceller: HWEC, OSLEC
libpri-1.4.14-0
libpri-devel-1.4.14-0
libopenr2-1.3.2-1
libopenr2-devel-1.3.2-1
Cards Installed: 03:08.0 Communication controller: Digium, Inc. Wildcard TE420 quad-span T1/E1/J1 card 3.3V (PCI-Express) (5th gen) (rev 02)
Elastix 2.4 web framework | Attachments: | ( 0) backtrace.txt ( 1) full_21082013.gz ( 2) system.conf |
Description: | Randomly, asterisk is generating a coredump:
-rw------- 1 asterisk asterisk 26M Feb 13 2013 core.conci-elx.example.com-2013-02-13T18:01:49-0300
-rw------- 1 asterisk asterisk 539 Aug 20 19:41 core.conci-elx.example.com-2013-07-01T15:08:54-0300
-rw------- 1 asterisk asterisk 48M Jul 2 11:50 core.conci-elx.example.com-2013-07-02T11:50:07-0300
-rw------- 1 asterisk asterisk 58M Jul 8 18:35 core.conci-elx.example.com-2013-07-08T18:35:31-0300
-rw------- 1 asterisk asterisk 99M Jul 15 17:15 core.conci-elx.example.com-2013-07-15T17:15:04-0300
-rw------- 1 asterisk asterisk 86M Jul 18 12:23 core.conci-elx.example.com-2013-07-18T12:23:12-0300
-rw------- 1 asterisk asterisk 90M Jul 22 19:30 core.conci-elx.example.com-2013-07-22T19:30:07-0300
-rw------- 1 asterisk asterisk 43M Jul 23 11:23 core.conci-elx.example.com-2013-07-23T11:23:39-0300
-rw------- 1 asterisk asterisk 87M Aug 13 17:15 core.conci-elx.example.com-2013-08-13T17:15:40-0300
-rw------- 1 asterisk asterisk 544 Aug 20 19:13 core.conci-elx.example.com-2013-08-20T17:15:44-0300
GDB output belongs to file "core.conci-elx.example.com-2013-08-21T11:08:39-0300".
Full log at the moment of the coredump shows nothing abnormal:
[Aug 21 11:08:31] VERBOSE[4439] res_agi.c: -- <DAHDI/9-1>AGI Script hangup.agi completed, returning 0
[Aug 21 11:08:31] VERBOSE[4439] pbx.c: -- Executing [s@macro-hangupcall:51] Hangup("DAHDI/9-1", "") in new stack
[Aug 21 11:08:31] VERBOSE[4439] app_macro.c: == Spawn extension (macro-hangupcall, s, 51) exited non-zero on 'DAHDI/9-1' in macro 'hangupcall'
[Aug 21 11:08:31] VERBOSE[4439] features.c: == Spawn extension (ext-queues, h, 1) exited non-zero on 'DAHDI/9-1'
[Aug 21 11:08:32] VERBOSE[4851] pbx.c: -- Executing [s@ivr-3:9] Set("DAHDI/24-1", "TIMEOUT(digit)=3") in new stack
[Aug 21 11:08:32] VERBOSE[4851] func_timeout.c: -- Digit timeout set to 3.000
[Aug 21 11:08:32] VERBOSE[4851] pbx.c: -- Executing [s@ivr-3:10] Set("DAHDI/24-1", "TIMEOUT(response)=3") in new stack
[Aug 21 11:08:32] VERBOSE[4851] func_timeout.c: -- Response timeout set to 3.000
[Aug 21 11:08:32] VERBOSE[4851] pbx.c: -- Executing [s@ivr-3:11] Set("DAHDI/24-1", "__IVR_RETVM=") in new stack
[Aug 21 11:08:32] VERBOSE[4851] pbx.c: -- Executing [s@ivr-3:12] ExecIf("DAHDI/24-1", "1?Background(en/cc_01_central)") in new stack
[Aug 21 11:08:32] VERBOSE[4851] file.c: -- <DAHDI/24-1> Playing 'en/cc_01_central.alaw' (language 'en')
[Aug 21 11:08:39] VERBOSE[4439] res_musiconhold.c: -- Started music on hold, class 'freetech', on SIP/2003-00000000
[Aug 21 11:08:39] VERBOSE[4439] pbx.c: == Spawn extension (ext-queues, 7005, 10) exited non-zero on 'DAHDI/9-1'
[Aug 21 11:08:39] DEBUG[4439] chan_dahdi.c: disconnecting MFC/R2 call on chan 9
[Aug 21 11:08:39] DEBUG[4439] chan_dahdi.c: ast cause 16 resulted in openr2 cause 6/Normal Clearing
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - Bits changed from 0x00 to 0x08
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - CAS Rx << [CLEAR FORWARD] 0x08
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - Call ended
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - CAS Tx >> [IDLE] 0x08
[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - CAS Raw Tx >> 0x09
[Aug 21 11:08:39] VERBOSE[1169] chan_dahdi.c: MFC/R2 call end on channel 9
This crash disconnects all current calls and logs off all dynamic agents involved. Attached are backtrace.txt, the gdb output generated with "thread apply all bt", and system.conf, an example DAHDI configuration. | ||
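The log excerpt in the description interleaves entries from three bracketed IDs (4439, 4851 and 1169), so it can be easier to follow when grouped by that ID. The following is a minimal Python sketch, assuming only the line layout visible in the excerpt. The names LOG_LINE and group_by_thread are hypothetical and not part of Asterisk.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpt: [timestamp] LEVEL[id] source.c: message
LOG_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z]+)\[(?P<tid>\d+)\]\s+"
    r"(?P<src>[\w.]+):\s*(?P<msg>.*)"
)

def group_by_thread(lines):
    """Return {bracketed id: [(timestamp, source, message), ...]} in input order."""
    groups = defaultdict(list)
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m:  # skip anything that does not look like a log line
            groups[int(m["tid"])].append((m["ts"], m["src"], m["msg"]))
    return dict(groups)

# Three lines copied from the excerpt in the description.
sample = [
    "[Aug 21 11:08:39] DEBUG[4439] chan_dahdi.c: disconnecting MFC/R2 call on chan 9",
    "[Aug 21 11:08:39] DEBUG[1169] chan_dahdi.c: Chan 9 - Call ended",
    "[Aug 21 11:08:39] VERBOSE[1169] chan_dahdi.c: MFC/R2 call end on channel 9",
]

for tid, entries in group_by_thread(sample).items():
    print(tid, len(entries))
```

Run against the full attached log, the same grouping would show what each thread touching channel 9 did in the final seconds before the crash.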
Comments: | By: Freetech Solutions (freetech) 2013-08-21 10:19:46.652-0500
gdb output and system.conf dahdi configuration.

By: Freetech Solutions (freetech) 2013-08-21 11:07:49.703-0500
Full log gzipped.

By: Matt Jordan (mjordan) 2013-08-21 11:11:29.738-0500
From your backtrace, this appears to be happening from a call in {{libopenr2}}:
{noformat}
Core was generated by `/usr/sbin/asterisk -f -U asterisk -G asterisk -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
#0 0x0014c4e5 in vfprintf () from /lib/libc.so.6
#0 0x0014c4e5 in vfprintf () from /lib/libc.so.6
No symbol table info available.
#1 0x00156742 in fprintf () from /lib/libc.so.6
No symbol table info available.
#2 0x00949d59 in ?? () from /usr/lib/libopenr2.so.3
No symbol table info available.
#3 0x00000000 in ?? ()
No symbol table info available.
{noformat}
This may be a problem in that library and not Asterisk. Can you install that library with debug symbols?

By: Freetech Solutions (freetech) 2013-08-21 12:02:30.810-0500
(sorry for the multiple comments)
The system is in production. I only have the mfcr2 logging flag enabled, as in the example below from one of the telco frames:
usecallerid=yes
hidecallerid=no
callwaiting=no
usecallingpres=yes
callwaitingcallerid=no
threewaycalling=yes
transfer=yes
canpark=yes
cancallforward=yes
callreturn=yes
relaxdtmf=yes
rxgain=1.0
txgain=0.0
signalling=mfcr2
mfcr2_variant=ar
mfcr2_get_ani_first=no
mfcr2_max_ani=13
mfcr2_max_dnis=14
mfcr2_category=national_subscriber
mfcr2_call_files=yes
mfcr2_logdir=span1
mfcr2_logging=all
mfcr2_mfback_timeout=-1
mfcr2_metering_pulse_timeout=-1
group=0
faxdetect=both
faxbuffers=>12,half
context=from-pstn
channel => 1-15,17-31
I'm trying to find an error on the R2 side for the last thread:
Thread 1 (Thread 0xb6a4fb90 (LWP 899)): ----------------> would be THREAD 3064265616 in mfcr2 logs
#0 0x0014c4e5 in vfprintf () from /lib/libc.so.6
#1 0x00156742 in fprintf () from /lib/libc.so.6
#2 0x00949d59 in ?? () from /usr/lib/libopenr2.so.3
#3 0x00000000 in ?? ()
It may be a good idea to open up an issue for openR2 as well.

By: Rusty Newton (rnewton) 2013-09-05 18:37:40.279-0500
Let us know when you are able to get libopenr2 installed with debug symbols. I'll contact the MFC/R2 Asterisk support maintainer to see if he has any ideas in the meantime.

By: Freetech Solutions (freetech) 2013-09-05 18:55:33.961-0500
Thanks Newton, I have a maintenance window this Friday night. I'm planning to update libopenr2 (1.3.2* to 1.3.3*) and install it with debug symbols. I will keep an eye on it throughout Saturday and next week, and report back asap. Rgds,

By: Freetech Solutions (freetech) 2013-09-16 18:07:31.166-0500
We have not been able to reproduce the issue since updating libopenr2 and installing it with debug symbols. It has been running as expected for the whole past week. It seems that the update from 1.3.2 to 1.3.3 solved the problem. We assume that if there are no new reports, this item can be closed. Current libraries running:
libopenr2-1.3.3-0
libopenr2-devel-1.3.3-0
libopenr2-debuginfo-1.3.3-0
Rgds,

By: Rusty Newton (rnewton) 2013-09-23 22:41:37.410-0500
Give it another week and let us know. Thanks.

By: Freetech Solutions (freetech) 2013-10-10 09:38:04.900-0500
No new reports of coredumps for this environment. Rgds,

By: Richard Mudgett (rmudgett) 2013-10-10 11:45:12.816-0500
Updating to a newer version of libopenr2 resolved the issue. |
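The correspondence noted in the 2013-08-21 12:02 comment, between gdb's "Thread 0xb6a4fb90" and "THREAD 3064265616" in the mfcr2 logs, is the same value written in hexadecimal and in decimal. A short Python check follows. The helper name gdb_thread_to_decimal is hypothetical and chosen only for illustration.

```python
def gdb_thread_to_decimal(hex_id: str) -> int:
    """Convert a gdb thread identifier such as '0xb6a4fb90' to its decimal form."""
    return int(hex_id, 16)

# Value taken from the backtrace quoted in the comment.
print(gdb_thread_to_decimal("0xb6a4fb90"))  # 3064265616
```

The reverse direction, from a decimal ID in the mfcr2 logs to the form gdb prints, is format(3064265616, "#x").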