[Home]

Summary:ASTERISK-28294: Segmentation Fault on strchr
Reporter:Salah Ahmed (rubel)Labels:fax
Date Opened:2019-02-15 16:37:10.000-0600Date Closed:2019-02-26 09:21:07.000-0600
Priority:MinorRegression?
Status:Closed/CompleteComponents:Core/Logging
Versions:13.24.1 Frequency of
Occurrence
Related
Issues:
Environment:Attachments:
Description:Hi,

Recently we have experiencing a crash on our system. gdb core dump report following,
{code}
#1  0x000000000053a47f in logger_strip_verbose_magic (message=0x7feda0004302 "", level=-1) at logger.c:1366
       begin = 0x7feda0032000 <error: Cannot access memory at address 0x7feda0032000>
       end = 0x7feda0031fff ""
       stripped_message = 0x7fee18000a80 " (keepalive-fax, keepalive-fax, 7) exited non-zero on 'PJSIP/ser02-000001b1'\n\020Feb 14 20:04:36\bVERBOSE\006pbx.c\016__ast_pbx_run5\340N\240\355\177\027\027\065\320O\240\355\177\200I\240\355\177Q\001x\240\355\177\240\025\240\355\177\350G\240\355\177&l\376zpd\300@\240\355\177\377\377\377\377\300\004\064SUCCESS@(\240\355\177\261\240\t\240\355\177\340I\240"...
       dst = 0x7fee1800480b ""
       magic = 0 '\000'
       __PRETTY_FUNCTION__ = "logger_strip_verbose_magic"
#2  0x000000000053a5d2 in logger_print_normal (logmsg=0x7feda0004280) at logger.c:1400
       chan = 0x0
       buf = "Executing [keepalive-fax@keepalive-fax:7] ExecIf(\"PJSIP/cnhk1ser02-000001b1\", \"1?Busy()\") in new stack\n\000ive_handler\") in new stack\n\000RE ms.hostname LIKE ?    AND ms.service = vcc    AND IFNULL(ms.force"...
       v = 0x0
       tmpmsg = 0x7feda0004298 "\340B"
       level = -1
       __PRETTY_FUNCTION__ = "logger_print_normal"
#3  0x000000000053b9e4 in logger_thread (data=0x0) at logger.c:1617
       next = 0x0
       msg = 0x7feda0004280
       __PRETTY_FUNCTION__ = "logger_thread"
#4  0x00000000005f6640 in dummy_start (data=0x2d76570) at utils.c:1238
       __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, 5944777430901416190, 0, 140660921573472, 0, 140660859598592, 5944777430876250366, -5954807578999037698},
             __mask_was_saved = 0}}, __pad = {0x7fee2891eef0, 0x0, 0x0, 0x0}}
       __cancel_routine = 0x450f0b <ast_unregister_thread>
       __cancel_arg = 0x7fee2891f700
       __not_first_call = 0
       ret = 0x0
       a = {start_routine = 0x53b821 <logger_thread>, data = 0x0, name = 0x2d766d0 "logger_thread        started at [ 1705] logger.c init_logger()"}
#5  0x00007fee2a71f064 in start_thread (arg=0x7fee2891f700) at pthread_create.c:309
       __res = <optimized out>
       pd = 0x7fee2891f700
       now = <optimized out>
       unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140660859598592, -5954806884488486658, 0, 140660921573472, 0, 140660859598592, 5944777430903513342, 5944771658510852350},
             mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
       not_first_call = <optimized out>
       pagesize_m1 = <optimized out>
       sp = <optimized out>
       freesize = <optimized out>
       __PRETTY_FUNCTION__ = "start_thread"
#6  0x00007fee29a0762d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
{code}

From logger.c file,
{code}
       do {
               end = strchr(begin, magic);
if (end) {
                       size_t len = end - begin;
                       memcpy(dst, begin, len);
                       begin = end + 1;
dst += len;
               } else {
                       strcpy(dst, begin); /* safe */
                       break;
               }
} while (1);
{code}

It seems, If magic is 0 then there have possibility for this crash.

Thank You,
Comments:By: Asterisk Team (asteriskteam) 2019-02-15 16:37:11.711-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: George Joseph (gjoseph) 2019-02-19 09:03:13.800-0600

Can't seem to reproduce this.  Can you provide the a dialplan snippet that causes the crash and what level is  "verbose" at?


By: Salah Ahmed (rubel) 2019-02-19 09:33:08.087-0600

Hello George,

Actually I didn't reproduce this in my environment. This logic is seems suspicious to me after some gdb analysis. If there have no possibility of 0 magic value then, somehow this value set to 0 in my environment and that causes the crash.

Just to share, In Frame-2 logmsg has following data,

{code}
(gdb) p *logmsg
$2 = {type = LOGMSG_VERBOSE, level = 5, line = 4350, lwp = 18847, callid = 0x7feda0001298, __field_mgr_pool = 0x7feda00042e0, date = 0x7feda0004368 "Feb 14 20:04:36",
 file = 0x7feda0004384 "pbx.c", function = 0x7feda000438c "__ast_pbx_run", message = 0x7feda0004302 "", level_name = 0x7feda000437a "VERBOSE", __field_mgr = {
   last_alloc = 0x7feda000438c "__ast_pbx_run", header = 0x7feda0005010}, list = {next = 0x0}}
{code}

This message = "" create level = -1 and eventually it set 0 to magic value.

Thank You,  

By: Joshua C. Colp (jcolp) 2019-02-25 08:37:52.321-0600

Looking at the code which produces the underlying logger message - the magic should always be non-zero, so the question is really how it was zero. Did the system run out of memory at the time?

By: Salah Ahmed (rubel) 2019-02-25 15:46:49.246-0600

Hello,

It seems something happened on that particular machine, This version of asterisk running all other machines without any issue with identical configuration. And we could not reproduce this issue on our local environment. So, this could be a hardware issue. Thanks for your responses.

Thanks

By: Joshua C. Colp (jcolp) 2019-02-26 09:21:07.785-0600

If this occurs again and you can get full information (system level, Asterisk logs, everything) then we can look into this again, otherwise there's nothing much to act on and this is not something that anyone else has seen either.