Summary: ASTERISK-16185: Asterisk cores @ 110 calls
Reporter: Dave Cabot (dcabot)
Labels:
Date Opened: 2010-06-01 12:35:32
Date Closed: 2010-07-21 11:20:25
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) gdb_bt-4.txt
( 1) gdb_bt_full.txt
( 2) gdb_bt_full-3.txt
( 3) gdb_bt_full-4.txt
( 4) gdb-output.tar.gz
( 5) thread_apply_all_bt-4.txt
( 6) uac.xml
Description: Load testing an IVR with SIPp. Asterisk dumps core at 110 calls. The IVR uses PHP FastAGI.

****** STEPS TO REPRODUCE ******

sudo sipp -sf uac.xml -m 1000 -r 5 -l 200 -s 2393629172 192.168.20.51 -i 192.168.25.135 --trace_err



****** ADDITIONAL INFORMATION ******

Connected to Asterisk 1.6.2.7 currently running on asterisk-1 (pid = 11995)
asterisk-1*CLI> core show sysinfo
asterisk-1*CLI>
System Statistics
-----------------
 System Uptime:             85 hours
 Total RAM:                 2075260 KiB
 Free RAM:                  855736 KiB
 Buffer RAM:                181928 KiB
 Total Swap Space:          2096472 KiB
 Free Swap Space:           2096472 KiB

 Number of Processes:       236


[iot-fmy-qa root@asterisk-1 ~]# ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 32762
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32762
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
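
The open-files limit shown above (4096) is worth comparing with FD_SETSIZE, because the fix committed later in this thread attributes the crash to select() being called on descriptors that were too large for an fd_set. Below is a minimal sketch of that comparison, assuming a POSIX system (FD_SETSIZE is typically 1024 on Linux with glibc). The helper names `limit_exceeds_fd_setsize` and `nofile_exceeds_fd_setsize` are hypothetical and are not part of Asterisk.

```c
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper: returns 1 if a soft open-files limit permits
 * descriptor values that do not fit in an fd_set, 0 otherwise.
 * Valid descriptors for select() are 0 .. FD_SETSIZE - 1, so any limit
 * above FD_SETSIZE allows descriptors that select() cannot handle. */
static int limit_exceeds_fd_setsize(rlim_t soft_limit)
{
    return soft_limit > (rlim_t) FD_SETSIZE ? 1 : 0;
}

/* Hypothetical helper: applies the check to the current process.
 * Returns 1 or 0 as above, or -1 if getrlimit() fails. */
static int nofile_exceeds_fd_setsize(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return -1;
    }
    /* RLIM_INFINITY compares greater than FD_SETSIZE, so it reports 1. */
    return limit_exceeds_fd_setsize(rl.rlim_cur);
}
```

A result of 1 does not mean a process will crash; it only means the process can be handed descriptors that are unsafe to pass to FD_SET() and select().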
Comments:

By: Paul Belanger (pabelanger) 2010-06-01 12:52:13

Your backtrace is optimized out (see below); please upload a new one.
---
Thank you for your bug report. In order to move your issue forward, we require a backtrace from the core file produced after the crash. Please see the doc/backtrace.txt file in your Asterisk source directory.

Also, be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then:

make install

after enabling, reproduce the crash, and then execute the instructions in doc/backtrace.txt.

When complete, attach that file to this issue report. Thanks!

By: Dave Cabot (dcabot) 2010-06-01 13:10:24

Changing the compile flag changes the behavior. Now Asterisk takes about 104 SIP calls, loses contact with its IAX trunks, and doesn't do much else.

 158 calls (limit 200)                  Peak was 200 calls, after 40 s
 0 Running, 365 Paused, 50 Woken up
 0 dead call msg (discarded)            4 out-of-call msg (discarded)        
 3 open sockets                        
 1040 Total RTP pckts sent              0.000 last period RTP rate (kB/s)

                                Messages  Retrans   Timeout   Unexpected-Msg
     INVITE ---------->         651       2590      389                
        100 <----------         104       0         0         0        
        180 <----------         0         0         0         0        
        183 <----------         0         0         0         0        
        200 <----------  E-RTD1 104       0         0         0        
        ACK ---------->         104       0                            
      Pause [    10.0s]         104                           0        
             [ NOP ]              
      Pause [    20.0s]         104                           0        
        BYE ---------->         104       936       104                
        200 <----------         0         0         0         0        

      Pause [   2000ms]         0                             0        
------ [+|-|*|/]: Adjust rate ---- [q]: Soft exit ---- [p]: Pause traffic -----

Last Error: Aborting call on UDP retransmission timeout for Call-ID '493...

By: Dave Cabot (dcabot) 2010-06-01 13:15:16

I was able to make it crash, but it's definitely not as consistent as it was with the DONT_OPTIMIZE compiler flag off.

By: Dave Cabot (dcabot) 2010-06-01 14:10:19

Uploaded a tar file with the three gdb outputs (bt, bt full, and thread apply all bt).

By: Leif Madsen (lmadsen) 2010-06-02 11:07:34

Please attach your backtraces as separate .txt files and not as an archive.

By: Dave Cabot (dcabot) 2010-06-02 11:34:17

Done as requested.

By: Leif Madsen (lmadsen) 2010-06-08 10:28:10

I see you're using res_timing_pthread. Can you try disabling that and using a different timing source and then try to reproduce?

By: Dave Cabot (dcabot) 2010-06-08 12:01:52

I'm able to take it up to 500 calls without a core dump. It looks like that resolves it. Thanks.

By: Leif Madsen (lmadsen) 2010-06-14 14:11:40

I'm closing this issue as suspended for now unless someone is willing to continue debugging why this occurs with res_timing_pthread. If so, please request a marshal to reopen the issue.

By: Digium Subversion (svnbot) 2010-07-21 11:15:08

Repository: asterisk
Revision: 278465

U   trunk/res/res_timing_pthread.c

------------------------------------------------------------------------
r278465 | russell | 2010-07-21 11:14:59 -0500 (Wed, 21 Jul 2010) | 41 lines

Use poll() instead of select() in res_timing_pthread to avoid stack corruption.

This code did not properly check FD_SETSIZE to ensure that it did not try to
select() on fds that were too large.  Switching to poll() removes the limitation
on the maximum fd value.

(closes issue ASTERISK-14848)
Reported by: keiron

(closes issue ASTERISK-15960)
Reported by: Eddie Edwards

(closes issue ASTERISK-15349)
Reported by: Hubguru

(closes issue ASTERISK-14670)
Reported by: flop

(closes issue ASTERISK-12249)
Reported by: falves11

(closes issue ASTERISK-13973)
Reported by: vrban

(closes issue ASTERISK-15971)
Reported by: aleksey2000

(closes issue ASTERISK-14385)
Reported by: kowalma

(closes issue ASTERISK-16185)
Reported by: dcabot

(closes issue ASTERISK-16085)
Reported by: glwgoes

(closes issue ASTERISK-15895)
Reported by: erikje

possibly other issues, too ...

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=278465
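
The commit message above describes replacing select() with poll() because select() is limited to descriptors below FD_SETSIZE. The sketch below illustrates that general technique on a single descriptor. It is not the code from r278465; the helper names `fd_fits_in_fd_set` and `wait_readable` are hypothetical, and a POSIX system is assumed.

```c
#include <poll.h>
#include <sys/select.h>

/* Hypothetical helper: returns 1 if fd may be stored in an fd_set.
 * Calling FD_SET() with fd >= FD_SETSIZE writes outside the fd_set,
 * which is the kind of stack corruption the commit message refers to. */
static int fd_fits_in_fd_set(int fd)
{
    return (fd >= 0 && fd < FD_SETSIZE) ? 1 : 0;
}

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if data can be read, 0 on timeout, and -1 if poll() fails
 * or reports only a condition other than POLLIN (e.g. POLLERR).
 * poll() takes the descriptor as a plain int in struct pollfd, so the
 * descriptor value is not bounded by FD_SETSIZE. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int res;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    res = poll(&pfd, 1, timeout_ms);
    if (res < 0) {
        return -1;
    }
    if (res == 0) {
        return 0;
    }
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

With select(), a caller would need a guard such as `fd_fits_in_fd_set()` before every FD_SET(); with poll(), that guard is unnecessary.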

By: Digium Subversion (svnbot) 2010-07-21 11:20:25

Repository: asterisk
Revision: 278479

_U  branches/1.6.2/
U   branches/1.6.2/res/res_timing_pthread.c

------------------------------------------------------------------------
r278479 | russell | 2010-07-21 11:20:17 -0500 (Wed, 21 Jul 2010) | 48 lines

Merged revisions 278465 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r278465 | russell | 2010-07-21 11:15:00 -0500 (Wed, 21 Jul 2010) | 41 lines
 
 Use poll() instead of select() in res_timing_pthread to avoid stack corruption.
 
 This code did not properly check FD_SETSIZE to ensure that it did not try to
 select() on fds that were too large.  Switching to poll() removes the limitation
 on the maximum fd value.
 
 (closes issue ASTERISK-14848)
 Reported by: keiron
 
 (closes issue ASTERISK-15960)
 Reported by: Eddie Edwards
 
 (closes issue ASTERISK-15349)
 Reported by: Hubguru
 
 (closes issue ASTERISK-14670)
 Reported by: flop
 
 (closes issue ASTERISK-12249)
 Reported by: falves11
 
 (closes issue ASTERISK-13973)
 Reported by: vrban
 
 (closes issue ASTERISK-15971)
 Reported by: aleksey2000
 
 (closes issue ASTERISK-14385)
 Reported by: kowalma
 
 (closes issue ASTERISK-16185)
 Reported by: dcabot
 
 (closes issue ASTERISK-16085)
 Reported by: glwgoes
 
 (closes issue ASTERISK-15895)
 Reported by: erikje
 
 possibly other issues, too ...
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=278479