
Summary: ASTERISK-21753: Seg Fault while attempting to queue AST_CONTROL_SRCCHANGE on a NULL channel when handling an incoming SIP ACK over TCP
Reporter: Mathieu Boyer (thieums63)
Labels:
Date Opened: 2013-05-02 10:49:07
Date Closed: 2017-07-25 16:59:07
Priority: Major
Regression?:
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 1.8.21.0 10.12.2 11.3.0
Frequency of Occurrence: Frequent
Related Issues:
Environment: Debian Wheezy Kernel 3.2.0-4-rt-amd64 #1 SMP PREEMPT RT Debian 3.2.41-2 x86_64 GNU/Linux Bi Xeon E5620 @ 2.40 Ghz 12 G RAM
Attachments:
( 0) backtrace1.txt
( 1) backtrace2.txt
( 2) crash1.pcapng
( 3) crash1.PNG
( 4) crash2.pcapng
( 5) crash2.PNG
( 6) crash3.full.log.tar.bz2
( 7) crash3.log
( 8) crash3.pcapng
( 9) crash3.PNG
(10) crash5.pcapng
Description: Asterisk is performing pure SIP transit calls between an Avaya Communication Manager 5.2.1 (ACM) and a SIP carrier (COLT).
From the Asterisk (192.168.69.9) point of view:
- ACM (192.168.69.10) is a SIP TCP peer (SIP UDP is not supported in this ACM release)
- COLT (192.168.254.5 / 192.168.253.254) is a SIP UDP peer

directrtpsetup = yes

This Asterisk instance handles around 300,000 calls per day.

Asterisk crashes (seg fault) each time on a similar SIP message: an incoming SIP TCP ACK with a session description, CSeq = "2 ACK", from ACM.

Crash frequency: between 0 and 2 per day.

I can't reproduce the issue (I generated the same scenario with sipp and it doesn't crash).

I'm providing two examples (11.3.0 backtrace, pcap file).




Comments:

By: Rusty Newton (rnewton) 2013-05-08 14:33:40.181-0500

It may be difficult on a high volume system, but another helpful data point would be a DEBUG & VERBOSE (level 5 at least) log from the moments right up to the crash.

https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information

Expect very large logs to be generated. We only need the small chunk where the crash happens.


By: Matt Jordan (mjordan) 2013-05-09 09:59:20.476-0500

I'm assuming in the call flow that ASTERISK is at 192.168.69.9, and so the ACK is in response to a 200 OK that we sent in response to an inbound INVITE request. Is that correct?

What's interesting here is how fast the ACK is coming back. It's almost like the SIP dialog isn't yet associated with a channel, as in both cases the channel pointer is NULL. That's awfully strange however, as the channel had to have been created when we sent back the 200 OK.
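
To make the suspected failure mode concrete, here is a minimal sketch in C. Everything in it is a made-up stand-in: `struct channel`, `struct sip_dialog`, `queue_control()` and `queue_srcchange_if_owned()` are hypothetical names for illustration and are not the real chan_sip.c types or functions. It only shows why queuing a control frame through a dialog whose channel pointer is NULL crashes, and what a guarded call could look like.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real chan_sip.c types. */
enum control_frame { CONTROL_SRCCHANGE = 1 };

struct channel {
    int queued_controls;    /* number of control frames queued so far */
};

struct sip_dialog {
    struct channel *owner;  /* NULL when no channel is attached to the dialog */
};

/* Queues a control frame. It dereferences chan, so calling it with a
 * NULL channel would segfault, as in the reported backtraces. */
void queue_control(struct channel *chan, enum control_frame frame)
{
    (void)frame;            /* the frame type is not used in this sketch */
    chan->queued_controls++;
}

/* Guarded variant: does nothing when the dialog has no owner channel.
 * Returns true if the frame was queued, false if it was skipped. */
bool queue_srcchange_if_owned(struct sip_dialog *dialog)
{
    if (dialog == NULL || dialog->owner == NULL) {
        return false;
    }
    queue_control(dialog->owner, CONTROL_SRCCHANGE);
    return true;
}
```

Calling `queue_srcchange_if_owned()` on a dialog with `owner == NULL` returns false instead of crashing. In a multithreaded server a check like this would presumably also have to be made while holding whatever lock protects the dialog, otherwise the owner could still go away between the check and the call. That part is an assumption and is not shown here.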

By: Mathieu Boyer (thieums63) 2013-05-11 03:31:47.710-0500

@Rusty: I will try to get a verbose and debug trace. Asterisk is currently compiled with the DONT_OPTIMIZE and DEBUG flags. As the load is increasing quickly, I can't put as much traffic on Asterisk as I would like. Is it OK if I provide ONLY the debug/verbose log and .pcap, without backtraces?
@Matt: yes, that's correct.

By: Rusty Newton (rnewton) 2013-05-13 12:35:01.634-0500

@Mathieu, yeah it's okay to just get the log and pcap since you already have backtraces. You don't need those compiler options set for getting a log and pcap.

By: Mathieu Boyer (thieums63) 2013-05-14 09:01:18.127-0500

The log file includes VERBOSE & DEBUG (level 5) output.
It was produced by grepping for the string "C-000288ca".

We get a re-INVITE at 13:14:52:
[May 14 13:14:52] DEBUG[29928][C-000288ca] chan_sip.c: **** Received INVITE (5) - Command in SIP INVITE
And then a BYE at 13:14:52:
[May 14 13:14:52] DEBUG[26494][C-000288ca] chan_sip.c: **** Received BYE (8) - Command in SIP BYE
As I don't see a BYE from 192.168.69.10 in the pcap, I guess it comes from the carrier side (not captured).

The first line below is the last one Asterisk wrote before it crashed; the second shows it respawning thanks to safe_asterisk:
[May 14 13:14:52] DEBUG[29928][C-000288ca] chan_sip.c: We're settling with these formats: (alaw)
[May 14 13:14:57] Asterisk 11.3.0 built by exploit @ vil-asterisk01 on a x86_64 running Linux on 2013-05-11 08:32:14 UTC

I just changed my tshark capture filter to capture all SIP signaling (-f "port 5060").
I'll provide a full SIP trace at the next seg fault.

Do you still need DEBUG/VERBOSE traces?



By: Mathieu Boyer (thieums63) 2013-05-15 02:32:17.251-0500

Full SIP trace.
192.168.69.10 = ACM (Avaya Communication Manager 5.2.1)
192.168.69.9 = Asterisk 11.3.0
192.168.254.29 = SIP carrier (COLT)

By: Rusty Newton (rnewton) 2013-05-17 16:52:46.673-0500

Thanks, I'm sure all that debug will be helpful. For the VERBOSE and DEBUG log, it would be nice to see everything happening right before the crash and not just the particular channel you grepped out.

By: Mathieu Boyer (thieums63) 2013-05-18 04:21:39.488-0500

Full VERBOSE and DEBUG trace of the minute in which the crash occurred.
Generated by the grep command: grep "\[May 14 13:14" full.1 > crash3.full.log

By: Rusty Newton (rnewton) 2013-05-22 09:54:33.560-0500

Thanks!

By: Rusty Newton (rnewton) 2017-07-25 16:59:07.680-0500

We are cleaning up the issue tracker.

Closing this out as there has been no movement on it for many years and it was filed against versions that are all no longer under bug fix support.

If an issue exists with a supported version of Asterisk please open a new issue and provide debug from a recent supported version. Thanks!