
Summary: ASTERISK-21316: Segfault on ast_channel_tech(chan)->send_digit_begin
Reporter: Ashley Winters (awinters)
Labels:
Date Opened: 2013-03-25 14:34:41
Date Closed: 2018-01-02 08:44:23.000-0600
Priority: Critical
Regression?:
Status: Closed/Complete
Components: Core/Channels
Versions: 11.2.1
Frequency of Occurrence: Occasional
Related Issues:
Environment: CentOS 6.3
Attachments: (0) gdb-send_digit_begin-segfault.txt
(1) unlocked-send_digit-race.patch
Description: Calling {{ast_channel_tech(chan)}} multiple times in a row while chan is unlocked is a race condition. I experienced a segfault when the tech changed to {{null_tech}} between the null check and the function-pointer dereference.
Comments:
By: Ashley Winters (awinters) 2013-03-25 14:36:07.206-0500

GDB trace showing the NULL dereference.

By: Ashley Winters (awinters) 2013-03-25 15:19:20.180-0500

Not the most elegant solution, but it should take care of the NULL dereference.

By: Rusty Newton (rnewton) 2013-03-29 09:52:39.778-0500

Thanks! Acknowledging.

By: Matt Jordan (mjordan) 2013-03-29 11:12:26.556-0500

I am curious how a digit managed to get put on a zombie channel in the first place:

{noformat}
#1  0x0000000000472b22 in ast_senddigit_begin (chan=0x7faaa00462b8, digit=35 '#') at channel.c:4750
4750 if (!ast_channel_tech(chan)->send_digit_begin(chan, digit))
(gdb) p chan->name
$1 = (const ast_string_field) 0x7faaa003a63a "AsyncGoto/SIP/sansay-sd-00002e78<ZOMBIE>"
(gdb) p *chan->tech
{noformat}

That channel is going to die - what queued the DTMF digit on it?

By: Ashley Winters (awinters) 2013-03-29 12:56:08.034-0500

It's a bridged call. We have a manager event listener which can trigger unbridge upon DTMF. It's like a dynamic features.conf for our multi-tenant IVR. So, sequence of events as best I can piece together:

1. Bridge channel A and B
2. DTMF # received on A
3. Simultaneously ChannelRedirect both A and B to dialplan locations, triggering unbridge
4. After {{ast_channel_bridge}} returns, the channels are not yet marked as ZOMBIE
6. Towards the end of {{ast_channel_bridge}}, it finally checks whether DTMF '#' should have triggered a feature and, if not, forwards the digit to the soon-to-be-ZOMBIE channel B
7. In between the NULL check of {{!ast_channel_tech(chan)->send_digit_begin}} and the segfault location, channel B is masqueraded on another thread
8. Segfault
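
The last steps of this sequence can be modelled in a single thread. The sketch below is not Asterisk code: {{struct chan}}, {{struct chan_tech}}, {{demo_tech}}, {{masquerade}} and {{senddigit_begin_racy}} are hypothetical, simplified stand-ins, and the {{interleave}} callback is an assumption that takes the place of the other thread so the interleaving is deterministic. Where the real code would call a NULL function pointer, the model returns {{WOULD_SEGFAULT}} instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Asterisk's channel structures. */
struct chan;

struct chan_tech {
    int (*send_digit_begin)(struct chan *c, char digit);
};

struct chan {
    const struct chan_tech *tech;
};

enum outcome { DIGIT_SENT, DIGIT_UNSUPPORTED, WOULD_SEGFAULT };

/* A tech callback that accepts every digit. */
static int demo_send_digit_begin(struct chan *c, char digit)
{
    (void)c;
    (void)digit;
    return 0;
}

static const struct chan_tech demo_tech = { demo_send_digit_begin };
static const struct chan_tech null_tech = { NULL }; /* no callbacks at all */

/* Stands in for the masquerade performed by another thread (step 7). */
static void masquerade(struct chan *c)
{
    c->tech = &null_tech;
}

/*
 * Mirrors the two separate reads of the tech pointer on an unlocked channel.
 * If interleave is not NULL, it runs between the check and the call.
 */
static enum outcome senddigit_begin_racy(struct chan *c, char digit,
                                         void (*interleave)(struct chan *))
{
    assert(c != NULL);

    if (!c->tech->send_digit_begin) {   /* first read: the NULL check */
        return DIGIT_UNSUPPORTED;
    }
    if (interleave) {
        interleave(c);                  /* the other thread gets to run here */
    }
    if (!c->tech->send_digit_begin) {   /* second read: the call site */
        return WOULD_SEGFAULT;          /* real code would call NULL here */
    }
    c->tech->send_digit_begin(c, digit);
    return DIGIT_SENT;
}
```

Passing {{NULL}} as {{interleave}} gives the normal path. Passing {{masquerade}} makes the second read see {{null_tech}}, which is the state the backtrace was captured in.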

By: Etienne Lessard (hexanol) 2014-07-04 14:48:23.323-0500

I'm getting the same segfault on the latest Asterisk 11 version, i.e. 11.10.2.

My scenario is the following:

Given I have a queue with a member Local/123@something
Given the extension 123@something does a Dial(SIP/abcdef)
When someone calls the queue
Then it calls SIP/abcdef
When SIP/abcdef answers the call at almost the same time as the caller presses a DTMF key
Then Asterisk segfaults

It's kinda hard to reproduce manually. I've seen the crash twice on a production Asterisk, but it's easier to reproduce if you add a small sleep between the

{noformat}
if (!ast_channel_tech(chan)->send_digit_begin)
{noformat}

and

{noformat}
if (!ast_channel_tech(chan)->send_digit_begin(chan, digit))
{noformat}

statements in ast_senddigit_begin.
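
One way to make the window between those two statements harmless is to read the tech pointer once and use that snapshot for both the check and the call. The sketch below is an illustration under assumptions, not the attached patch and not Asterisk code: the types, {{demo_tech}}, {{swap_to_null_tech}} and {{senddigit_begin_snapshot}} are hypothetical, and the {{widen_window}} callback plays the role of the added sleep plus the other thread. Two caveats apply: the snapshot still invokes the old tech's callback on a channel whose tech has since changed, and in real multi-threaded code an unlocked read is not a substitute for synchronization, so holding the channel lock (Asterisk provides {{ast_channel_lock}} and {{ast_channel_unlock}}) around the check and the call may be the more appropriate fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Asterisk's channel structures. */
struct chan;

struct chan_tech {
    int (*send_digit_begin)(struct chan *c, char digit);
};

struct chan {
    const struct chan_tech *tech;
    char last_digit;                /* records what the callback received */
};

static int demo_send_digit_begin(struct chan *c, char digit)
{
    c->last_digit = digit;
    return 0;
}

static const struct chan_tech demo_tech = { demo_send_digit_begin };
static const struct chan_tech null_tech = { NULL }; /* no callbacks at all */

/* Stands in for the tech change that happens while the caller sleeps. */
static void swap_to_null_tech(struct chan *c)
{
    c->tech = &null_tech;
}

/*
 * Reads c->tech exactly once. Returns -1 if the snapshot has no
 * send_digit_begin callback, otherwise the callback's return value.
 */
static int senddigit_begin_snapshot(struct chan *c, char digit,
                                    void (*widen_window)(struct chan *))
{
    const struct chan_tech *tech;

    assert(c != NULL);
    tech = c->tech;                 /* the only read of the tech pointer */

    if (!tech->send_digit_begin) {
        return -1;
    }
    if (widen_window) {
        widen_window(c);            /* c->tech may change; tech does not */
    }
    return tech->send_digit_begin(c, digit);
}
```

With {{swap_to_null_tech}} passed as {{widen_window}}, the call still goes through the snapshot and never reaches a NULL function pointer, even though {{c->tech}} points at {{null_tech}} afterwards.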


By: Joshua C. Colp (jcolp) 2017-12-19 04:51:03.078-0600

Have you experienced this under Asterisk 13? The bridging code involved in the scenario you mentioned has been completely rewritten, so I don't think that specific scenario should happen now.

By: Asterisk Team (asteriskteam) 2018-01-02 08:44:23.946-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines