
Summary: ASTERISK-24473: Crash if a PRI DAHDI span is removed in the middle of a call
Reporter: Tzafrir Cohen (tzafrir)
Labels:
Date Opened: 2014-10-30 12:01:25
Date Closed:
Priority: Major
Regression?:
Status: Open/New
Components: Channels/chan_dahdi
Versions: 12.0.0, 13.0.0-beta3
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) 90-create-calls
Description: Asterisk now includes support for destroying DAHDI channels and PRI spans when the device that includes them disappears (try 'dahdi_span_assignments remove', or, well, disconnect an Astribank).

(On a different note: some work is needed to support mfcr2 and ss7 channels)

However, this fails when a channel that is part of a span is in the middle of a call: destroy_channel() is called with now=1.

One example backtrace:
{noformat}
#0  0x0039e560 in pthread_mutex_trylock () from /lib/libpthread.so.0
#1  0x04fb11b0 in dahdi_read (ast=0xb72d27b4) at chan_dahdi.c:9207
#2  0x080d0a7b in __ast_read (chan=0xb72d27b4, dropaudio=0) at channel.c:4054
#3  0x080d5cfb in ast_read (c0=0x90a6cec, c1=0xb72d27b4, config=0xb5ee7104, fo=0xb5ee6b60, rc=0xb5ee6b5c) at channel.c:4408
#4  ast_generic_bridge (c0=0x90a6cec, c1=0xb72d27b4, config=0xb5ee7104, fo=0xb5ee6b60, rc=0xb5ee6b5c) at channel.c:7630
#5  ast_channel_bridge (c0=0x90a6cec, c1=0xb72d27b4, config=0xb5ee7104, fo=0xb5ee6b60, rc=0xb5ee6b5c) at channel.c:8105
#6  0x08114482 in ast_bridge_call (chan=0x90a6cec, peer=0xb72d27b4, config=0xb5ee7104) at features.c:4489
#7  0x027807ee in dial_exec_full (chan=0x90a6cec, data=<value optimized out>, peerflags=0xb5ee7958, continue_exec=0x0) at app_dial.c:3047
#8  0x02781bfe in dial_exec (chan=0x90a6cec, data=0xb5ee9cbc "DAHDI/g0/1002,300,Tt") at app_dial.c:3130
#9  0x081789bd in pbx_exec (c=0x90a6cec, app=0x9060c80, data=0xb5ee9cbc "DAHDI/g0/1002,300,Tt") at pbx.c:1622
#10 0x08184556 in pbx_extension_helper (c=0x90a6cec, con=0x0, context=0x90a78a0 "macro-dialout-trunk", exten=0x90a78f0 "s", priority=22, label=0x0,
   callerid=0xb7201f48 "1001", action=E_SPAWN, found=0xb5eebfb8, combined_find_spawn=1) at pbx.c:4915
#11 0x00e315ac in _macro_exec (chan=<value optimized out>, data=0x90a78f0 "s", exclusive=0) at app_macro.c:412
#12 0x081789bd in pbx_exec (c=0x90a6cec, app=0x904edf0, data=0xb5eee9ac "dialout-trunk,1,1002,,off") at pbx.c:1622
#13 0x08184556 in pbx_extension_helper (c=0x90a6cec, con=0x0, context=0x90a78a0 "macro-dialout-trunk", exten=0x90a78f0 "s", priority=5, label=0x0,
   callerid=0xb7201f48 "1001", action=E_SPAWN, found=0xb5ef0ad8, combined_find_spawn=1) at pbx.c:4915
#14 0x0818c450 in ast_spawn_extension (c=0x90a6cec, args=0x0) at pbx.c:6037
#15 __ast_pbx_run (c=0x90a6cec, args=0x0) at pbx.c:6512
#16 0x0818e039 in ast_pbx_run_args (c=0x90a6cec) at pbx.c:6890
#17 ast_pbx_run (c=0x90a6cec) at pbx.c:6899
#18 0x04fbd5a9 in __analog_ss_thread (data=0xb728d268) at sig_analog.c:2164
#19 0x081d71f7 in dummy_start (data=0x8cd8cb0) at utils.c:1192
#20 0x0039cb39 in start_thread () from /lib/libpthread.so.0
#21 0x002d8ace in error_at_line () from /lib/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
{noformat}

In frame 1, after this assignment:
p = ast_channel_tech_pvt(ast);

p is NULL.
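
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern the backtrace suggests: the private structure is detached while a read callback is still being invoked, so the callback sees a NULL pointer. All names prefixed with demo_ are hypothetical stand-ins and are not the real chan_dahdi or Asterisk core code. The NULL check shown only turns the crash into an error return; it presumably does not close the race in real, multi-threaded code, where the pvt could be destroyed right after the check.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's private structure (the "pvt"). */
struct demo_pvt {
    int channel;                 /* DAHDI channel number */
};

/* Hypothetical stand-in for a channel that points at its pvt. */
struct demo_chan {
    struct demo_pvt *tech_pvt;   /* NULL once the pvt has been destroyed */
};

/* Stand-in for the accessor used in frame 1 of the backtrace. */
static struct demo_pvt *demo_chan_tech_pvt(const struct demo_chan *chan)
{
    return chan->tech_pvt;
}

/* Stand-in for an immediate destroy (now=1): detach the pvt even mid-call. */
static void demo_destroy_now(struct demo_chan *chan)
{
    chan->tech_pvt = NULL;
}

/*
 * Stand-in for a read callback.
 * Returns the channel number on success, or -1 if the pvt is already gone.
 * Without the NULL check, p->channel would dereference a NULL pointer.
 */
static int demo_read(const struct demo_chan *chan)
{
    struct demo_pvt *p = demo_chan_tech_pvt(chan);

    if (p == NULL) {
        return -1;
    }
    return p->channel;
}
```

Calling demo_read() before demo_destroy_now() returns the channel number; calling it afterwards returns -1 instead of crashing.
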
Comments:

By: Matt Jordan (mjordan) 2014-11-05 14:41:43.587-0600

So, I'm not surprised about this in the least. We knew when many of these commands were added that running them while an active {{ast_channel}} was using the {{pvt}} was going to cause severe problems. Take, for example, {{pri destroy span}}:

{code}
               e->command = "pri destroy span";
               e->usage =
                       "Usage: pri destroy span <span>\n"
                       "       Destorys D-channel of span and its B-channels.\n"
                       "       DON'T USE THIS UNLESS YOU KNOW WHAT YOU ARE DOING.\n";
               return NULL;
{code}

Or {{dahdi destroy channels}}:

{code}
       case CLI_INIT:
               e->command = "dahdi destroy channels";
               e->usage =
                       "Usage: dahdi destroy channels <from_channel> [<to_channel>]\n"
                       "       DON'T USE THIS UNLESS YOU KNOW WHAT YOU ARE DOING.  Immediately removes a given channel, whether it is in use or not\n";
               return NULL;
       case CLI_GENERATE:
               return NULL;
       }
{code}

Those large, all caps messages are there exactly because of this problem. Mucking with the private {{chan_dahdi}} information while an {{ast_channel}} is in use will crash. Asterisk is not designed for this feature.

I'm not saying this can't be fixed: it's software. But I think this warrants some serious consideration, because I don't see any way to fix this without massive overhauls in {{chan_dahdi}}. I think a proposal for how you would like this system to function in {{chan_dahdi}} would be appropriate, and should be started on the {{asterisk-dev}} list.
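
As an illustration only of what "not destroying the pvt while a channel is using it" could look like, the following sketch defers the actual free until the last user lets go, using a plain reference count. This is not a proposal from this issue and not how chan_dahdi works; every demo_ name is hypothetical, and the code is single-threaded (a real implementation would need locking or atomic counters).

```c
#include <stdlib.h>

/* Hypothetical pvt with a reference count and a "span removed" flag. */
struct demo_pvt {
    int refcount;    /* one reference for the span, one per active call */
    int span_gone;   /* set once the underlying span has been removed */
};

/* Allocate a pvt holding the span's own reference. Returns NULL on failure. */
struct demo_pvt *demo_pvt_alloc(void)
{
    struct demo_pvt *p = calloc(1, sizeof(*p));

    if (p != NULL) {
        p->refcount = 1;
    }
    return p;
}

/* Take a reference, e.g. when a call starts using the pvt. */
void demo_pvt_ref(struct demo_pvt *p)
{
    p->refcount++;
}

/* Drop a reference. Returns 1 if this freed the pvt, 0 otherwise. */
int demo_pvt_unref(struct demo_pvt *p)
{
    if (--p->refcount > 0) {
        return 0;
    }
    free(p);
    return 1;
}

/*
 * Span removal: mark the pvt as dead and drop the span's reference.
 * If a call still holds a reference, the memory stays valid until that
 * call hangs up. Returns 1 if the pvt was freed immediately, 0 otherwise.
 */
int demo_span_removed(struct demo_pvt *p)
{
    p->span_gone = 1;
    return demo_pvt_unref(p);
}
```

With this shape, a read callback could check span_gone and hang up cleanly, and the final demo_pvt_unref() at hangup would release the memory.
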

By: Tzafrir Cohen (tzafrir) 2017-03-14 05:07:28.611-0500

Just wanted to post the script I used to stress-test Asterisk as it had some non-trivial things in it. The basic setup is Astribanks (a 4-port PRI and 8-port BRI) each configured such that the odd ports are TE and the even ports are NT, and each pair of ports is cross-linked (timing issues? I don't care here).

All channels are configured to be in a context in which extension 1000 runs Echo.

This script (90-create-calls), when copied to /usr/share/dahdi/span-config.d, will create calls in all channels (originally; I later reduced the load, as you can see). It runs as soon as a span is created, so just use:

 dahdi_span_assignments add

and you have calls.

It also attempts to create a call in the D-channel, but that has no effect. The tricky bit was that running 'asterisk -rx' for all the channels took too long and the udev hook timed out, so I replaced it with a direct write to the socket. And all's well (that is, potentially, up until I unassign the span).