[Home]

Summary:ASTERISK-24590: Segmentation fault in get_multiple_fields_as_var_list
Reporter:Anton Evzhakov (Anber)Labels:
Date Opened:2014-12-03 09:11:56.000-0600Date Closed:2015-01-27 12:47:35.000-0600
Priority:CriticalRegression?
Status:Closed/CompleteComponents:Core/Bridging Resources/res_stasis
Versions:12.7.1 Frequency of
Occurrence
Frequent
Related
Issues:
is related toASTERISK-24592 Random crashes with preceding ERROR "Failed to encode JSON object"
Environment:uname -a Linux astratel 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/LinuxAttachments:( 0) backtrace.txt
( 1) lastmoment.log
Description:Sometimes asterisk crashes with SIGSEGV while closing bridge connected to stasis.

{code:title=backtrace.txt|borderStyle=solid}
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000005babc1 in get_multiple_fields_as_var_list (object=0x0, object_field=0x1) at sorcery.c:1123
1123 if (object_field->multiple_handler(object, &tmp)) {
#0  0x00000000005babc1 in get_multiple_fields_as_var_list (object=0x0, object_field=0x1) at sorcery.c:1123
       tmp = 0x0
#1  0x00000000005bac27 in ast_sorcery_objectset_create2 (sorcery=0x7f8a0001dd68, object=0x7f8a00005858, flags=AST_HANDLER_PREFER_STRING) at sorcery.c:1133
       details = 0x0
       object_type = 0x991
       i = {c = 0x7f89e8767dd0, last_node = 0x4828c0 <bridge_channel_wait+108>, complete = 134349144, flags = 32653}
       object_field = 0x7f8a4003fcd8
       head = 0x7f8c340382b8
       tail = 0x7f89e8767d70
#2  0x00000000005c5f38 in cache_udpate (cached_entry=0x5bac27 <ast_sorcery_objectset_create2+28>, eid=0x7f89e8767d00, new_snapshot=0x0) at stasis_cache.c:451
       new_max = 140230682336616
       new_elems = 0x991
       res = -394887856
       old_snapshot = 0x5c5f38 <cache_udpate+440>
       is_remote = 32649
       idx = 0
       __PRETTY_FUNCTION__ = "cache_udpate"
#3  0x00000000004c9acb in ast_channel_state_set (chan=0x7f8a4003fcd8, value=AST_STATE_DOWN) at channel_internal_api.c:952
No locals.
#4  0x00000000004828c0 in bridge_channel_wait (bridge_channel=0x7f89e8768700) at bridge_channel.c:2282
       __p = 0
       __x = 0
       ms = 0
       outfd = 0
       chan = 0x0
       __PRETTY_FUNCTION__ = "bridge_channel_wait"
#5  0x000000000046aa04 in bridge_reconfigured (bridge=0x7f8d08020158, colp_update=32649) at bridge.c:1485
No locals.
#6  0x00000000005e7705 in udptl_rx_packet (s=0x7f8a4003fcd8, buf=0x0, len=0) at udptl.c:504
       j = 0
       l = 32649
       x = 32
       stat1 = 0
       stat2 = 0
       i = 32653
       ptr = 4729024
       seq_no = 0
       ifp = 0x0
       data = 0x0
       ifp_len = 0
       repaired = {534, -1, 134573088, 32653, 0, 0, 4514738, 0, -394885376, 32649, 0, 0, 4630977, 0, 134349144, 32653}
       bufs = {0x7f8d080174b0 "bridge_channel_depart_thread started at [ 1713] bridge.c ast_bridge_impart()", 0x51954a <format_cap_init+120> "\340H\211P\b\353\fH\213E\330H\307@\b", 0x7f89e8768700 "", 0x9f608fe682867fd9 <error: Cannot access memory at address 0x9f608fe682867fd9>, 0x0, 0x0, 0x7f89e87689c0 "\300I\333\350\211\177", 0x7f89e8768700 "", 0x9f608fe683067fd9 <error: Cannot access memory at address 0x9f608fe683067fd9>, 0x60735fb6924e7fd9 <error: Cannot access memory at address 0x60735fb6924e7fd9>, 0x7f8d00000000 " ", 0x7f89e8767ef0 "", 0x0, 0x7f8d175f2ba2 <error: Cannot access memory at address 0x7f8d175f2ba2>, 0x0}
       lengths = {3900079616, 32649, 4631044, 0, 3900079824, 32649, 134349144, 32653, 0, 0, 134349144, 32653, 3900079824, 32649, 6190853}
       span = 564
       entries = -394885376
       ifp_no = 1073751266
       __PRETTY_FUNCTION__ = "udptl_rx_packet"
#7  0x00007f8d15dd1182 in ?? ()
No symbol table info available.
#8  0x0000000000000000 in ?? ()
No symbol table info available.
{code}

{quote}
Thread 1 (LWP 7785):
#0  0x00000000005babc1 in get_multiple_fields_as_var_list (object=0x0, object_field=0x1) at sorcery.c:1123
#1  0x00000000005bac27 in ast_sorcery_objectset_create2 (sorcery=0x7f8a0001dd68, object=0x7f8a00005858, flags=AST_HANDLER_PREFER_STRING) at sorcery.c:1133
#2  0x00000000005c5f38 in cache_udpate (cached_entry=0x5bac27 <ast_sorcery_objectset_create2+28>, eid=0x7f89e8767d00, new_snapshot=0x0) at stasis_cache.c:451
#3  0x00000000004c9acb in ast_channel_state_set (chan=0x7f8a4003fcd8, value=AST_STATE_DOWN) at channel_internal_api.c:952
#4  0x00000000004828c0 in bridge_channel_wait (bridge_channel=0x7f89e8768700) at bridge_channel.c:2282
#5  0x000000000046aa04 in bridge_reconfigured (bridge=0x7f8d08020158, colp_update=32649) at bridge.c:1485
#6  0x00000000005e7705 in udptl_rx_packet (s=0x7f8a4003fcd8, buf=0x0, len=0) at udptl.c:504
#7  0x00007f8d15dd1182 in ?? ()
#8  0x0000000000000000 in ?? ()
{quote}
Comments:By: Matt Jordan (mjordan) 2014-12-03 09:43:29.246-0600

Thank you for your bug report. In order to move your issue forward, we require a backtrace[1] from the core file produced after the crash. Also, be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then:

make install

After enabling, reproduce the crash, and then execute the backtrace[1] instructions. When complete, attach that file to this issue report.

[1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

Please provide a full backtrace as indicated in the instructions. Attach it to this issue as a separate file.

By: Matt Jordan (mjordan) 2014-12-03 09:48:10.522-0600

Also - {{udptl_rx_packet}} at line 504 does not call {{bridge_reconfigured}}. In fact, it doesn't make any sense for the UDPTL stack to be calling directly into the bridging code.

In 12.7.1, {{udptl}} at lines 503-505 contain the following:

{code}
503: if (ptr + 1 > len)
504: return -1;
505: entries = buf[ptr++];
{code}

You will need to reproduce whatever issue you are having on an unmodified version of Asterisk. Since UDPTL has no business calling into the bridging layer directly, that may be difficult to do.

By: Anton Evzhakov (Anber) 2014-12-03 09:48:20.289-0600

Backtrace

By: Anton Evzhakov (Anber) 2014-12-03 09:52:41.428-0600

Sorry for wrong backtrace


By: Anton Evzhakov (Anber) 2014-12-03 12:19:14.794-0600

Also I have many errors in logs like this
{quote}
ERROR[8228]: ari/ari_websockets.c:171 ast_ari_websocket_session_write: Failed to encode JSON object
{quote}
and this
{quote}
ERROR[7785]: astobj2.c:149 INTERNAL_OBJ: bad magic number for object 0x7f8a4003fcd8. Object is likely destroyed.
{quote}

I attached correct backtrace file



By: Matt Jordan (mjordan) 2014-12-03 12:39:18.672-0600

Looking at your backtrace, the channel appears to already be destroyed or otherwise in a rather bad state.

Can you provide the ARI calls you're using that reproduce this scenario?

By: Anton Evzhakov (Anber) 2014-12-03 12:58:38.962-0600

Sorry, but I can't. I have thousands of calls per day, but asterisk crashes only 4-5 times per day on random calls.

By: Anton Evzhakov (Anber) 2014-12-03 12:59:11.889-0600

Debug log

By: Rusty Newton (rnewton) 2015-01-09 17:09:00.826-0600

Thanks. The bad magic number messages may point to reference counting issues and help track down the bug.

Can you follow the [Reference Count Debugging|https://wiki.asterisk.org/wiki/display/AST/Reference+Count+Debugging] guide and provide the generated "refs" log after your crash?

By: Rusty Newton (rnewton) 2015-01-09 17:11:28.413-0600

I forgot to mention.. in addition, if possible you'll want to make sure this issue occurs in 13.x as you probably already know that 12 is in Security Fix only support. [Asterisk Versions|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions].

By: Matt Jordan (mjordan) 2015-01-27 12:47:23.711-0600

I'm not able to reproduce this, and I can't see how we would go from the UDPTL stack to a bridge reconfiguration.

Since the backtrace doesn't make sense and we don't know have any information on how to reliably reproduce this issue, I'm going to close this out as "Can't Reproduce". If you can provide either what [~rnewton] requested or a way to reliably reproduce the problem, we'll be happy to re-open the issue.