Summary: ASTERISK-15895: Random crashes
Reporter: E. Versaevel (erikje)
Labels:
Date Opened: 2010-03-30 10:16:51
Date Closed: 2010-07-21 11:20:27
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Resources/res_timing_pthread
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) gdb.txt
             ( 1) gdb_31-03-2010.txt
Description: Asterisk randomly crashes on me.
I'm running Ubuntu Server and have tried Asterisk 1.6.1.17 and 1.6.1.18.

Asterisk randomly crashes; my first backtraces pointed at:
#0  0xb7f6d410 in __kernel_vsyscall ()
#1  0xb7e42085 in raise () from /lib/tls/i686/cmov/libc.so.6
#2  0xb7e43a01 in abort () from /lib/tls/i686/cmov/libc.so.6
#3  0xb7e7ab7c in ?? () from /lib/tls/i686/cmov/libc.so.6
#4  0xb7f04138 in __fortify_fail () from /lib/tls/i686/cmov/libc.so.6
#5  0xb7f040f0 in __stack_chk_fail () from /lib/tls/i686/cmov/libc.so.6
#6  0xb7af5d24 in __ast_pthread_mutex_unlock (filename=0x2449 <Address 0x2449 out of bounds>, lineno=-1239132680, func=0x1 <Address 0x1 out of bounds>,
   mutex_name=0x0, t=0xb62455f0) at /usr/src/asterisk-1.6.1.18/include/asterisk/lock.h:731

But that is because of the stack protector, so I recompiled Asterisk without optimization and with the stack protector disabled to generate the attached backtrace.

Comments:By: E. Versaevel (erikje) 2010-03-31 03:00:56

Uploaded a new trace for a crash this morning. I'm getting the feeling it's related to a dialplan reload or sip reload.

I've got an update script that writes a new config every 5 minutes and does a sip reload/dialplan reload if necessary. I suspect it crashes Asterisk when the reload happens while a call is traversing the dialplan, or something similar.

By: E. Versaevel (erikje) 2010-03-31 03:02:45

Sorry, I appended to yesterday's gdb output, so the new trace is at the end of the 2nd file.

By: E. Versaevel (erikje) 2010-03-31 09:03:48

Easily reproducible by sending Asterisk a lot of calls and issuing dialplan reloads.

By: E. Versaevel (erikje) 2010-03-31 09:42:46

-= 12460 extensions (69791 priorities) in 6960 contexts. =-

By: Leif Madsen (lmadsen) 2010-04-27 13:32:20

Thanks for the information! Would you mind providing some valgrind output when you're reproducing the issue? Note that valgrind will make the system incredibly slow, so you should only do this after hours or on a test environment.

Thanks!

By: E. Versaevel (erikje) 2010-04-28 04:33:00

Tiny problem with that: I recompiled Asterisk with fd_timer, which fixed my issue, and the system is in production now, so I'm unable to perform that test at the moment. Can you explain what I should do with valgrind?

By: Leif Madsen (lmadsen) 2010-05-25 14:40:20

Information on how to use valgrind is available in doc/backtrace.txt of your Asterisk source. Since switching away from res_timing_pthread seems to have resolved your issue, there isn't much point in leaving this issue open unless you are able to reproduce the problem and provide the valgrind information.

I'm going to close this issue for now, but if you are able to provide the information please reopen this issue. Thanks!

By: Digium Subversion (svnbot) 2010-07-21 11:15:10

Repository: asterisk
Revision: 278465

U   trunk/res/res_timing_pthread.c

------------------------------------------------------------------------
r278465 | russell | 2010-07-21 11:14:59 -0500 (Wed, 21 Jul 2010) | 41 lines

Use poll() instead of select() in res_timing_pthread to avoid stack corruption.

This code did not properly check FD_SETSIZE to ensure that it did not try to
select() on fds that were too large.  Switching to poll() removes the limitation
on the maximum fd value.

(closes issue ASTERISK-14848)
Reported by: keiron

(closes issue ASTERISK-15960)
Reported by: Eddie Edwards

(closes issue ASTERISK-15349)
Reported by: Hubguru

(closes issue ASTERISK-14670)
Reported by: flop

(closes issue ASTERISK-12249)
Reported by: falves11

(closes issue ASTERISK-13973)
Reported by: vrban

(closes issue ASTERISK-15971)
Reported by: aleksey2000

(closes issue ASTERISK-14385)
Reported by: kowalma

(closes issue ASTERISK-16185)
Reported by: dcabot

(closes issue ASTERISK-16085)
Reported by: glwgoes

(closes issue ASTERISK-15895)
Reported by: erikje

possibly other issues, too ...

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=278465
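
The commit message above says the fix replaces select() with poll() because file descriptors at or above FD_SETSIZE cannot safely be placed in an fd_set. The following is a minimal sketch of that idea only, not the actual change made to res_timing_pthread.c. The helper names fd_fits_in_fd_set and wait_readable_poll are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd may be stored in an fd_set, 0 otherwise.
 * Calling FD_SET() with fd >= FD_SETSIZE writes outside the fd_set, which can
 * corrupt the stack when the fd_set is a local variable. */
int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: waits up to timeout_ms milliseconds for fd to become
 * readable. Returns 1 if readable, 0 on timeout (or if only a non-POLLIN
 * event such as POLLHUP was reported), -1 on error.
 * poll() takes an array of descriptors instead of a fixed-size bitmap, so the
 * numeric value of fd is not limited by FD_SETSIZE. */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int res;

    assert(fd >= 0); /* poll() silently ignores negative descriptors */

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    res = poll(&pfd, 1, timeout_ms);
    if (res < 0) {
        return -1;
    }
    return (res > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Code that has to keep using select() could call a check like fd_fits_in_fd_set() before FD_SET() and refuse descriptors that do not fit. Switching to poll(), as the commit does, removes the need for that check.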

By: Digium Subversion (svnbot) 2010-07-21 11:20:26

Repository: asterisk
Revision: 278479

_U  branches/1.6.2/
U   branches/1.6.2/res/res_timing_pthread.c

------------------------------------------------------------------------
r278479 | russell | 2010-07-21 11:20:17 -0500 (Wed, 21 Jul 2010) | 48 lines

Merged revisions 278465 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r278465 | russell | 2010-07-21 11:15:00 -0500 (Wed, 21 Jul 2010) | 41 lines
 
 Use poll() instead of select() in res_timing_pthread to avoid stack corruption.
 
 This code did not properly check FD_SETSIZE to ensure that it did not try to
 select() on fds that were too large.  Switching to poll() removes the limitation
 on the maximum fd value.
 
 (closes issue ASTERISK-14848)
 Reported by: keiron
 
 (closes issue ASTERISK-15960)
 Reported by: Eddie Edwards
 
 (closes issue ASTERISK-15349)
 Reported by: Hubguru
 
 (closes issue ASTERISK-14670)
 Reported by: flop
 
 (closes issue ASTERISK-12249)
 Reported by: falves11
 
 (closes issue ASTERISK-13973)
 Reported by: vrban
 
 (closes issue ASTERISK-15971)
 Reported by: aleksey2000
 
 (closes issue ASTERISK-14385)
 Reported by: kowalma
 
 (closes issue ASTERISK-16185)
 Reported by: dcabot
 
 (closes issue ASTERISK-16085)
 Reported by: glwgoes
 
 (closes issue ASTERISK-15895)
 Reported by: erikje
 
 possibly other issues, too ...
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=278479